ssr

An R package for semi-supervised regression.

The ssr package implements Co-training by Committee and self-learning, two semi-supervised learning (SSL) algorithms for regression. In semi-supervised learning, algorithms learn a model's parameters not only from labeled data but also from unlabeled data. In many applications, labeling data is difficult, expensive, or time-consuming. Semi-supervised methods therefore learn by combining the limited labeled data points with the unlabeled ones.
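
To make the idea concrete, here is a minimal self-learning loop in base R. It is a sketch of the general technique only, not the ssr implementation: the synthetic data, the `n_iter` and `batch` values, and the random choice of which unlabeled points to pseudo-label are all assumptions made for illustration.

```r
# Minimal self-learning sketch in base R (illustrative; not how ssr does it).
set.seed(1)

# Made-up data: y depends linearly on x plus noise.
n <- 200
x <- runif(n)
dat <- data.frame(x = x, y = 3 * x + rnorm(n, sd = 0.1))

L <- dat[1:10, ]                      # small labeled set
U <- dat[11:n, "x", drop = FALSE]     # unlabeled pool (labels removed)

n_iter <- 5    # hypothetical number of self-learning rounds
batch  <- 20   # hypothetical number of points pseudo-labeled per round

for (i in seq_len(n_iter)) {
  if (nrow(U) == 0) break
  fit <- lm(y ~ x, data = L)                      # train on the current labeled set
  take <- sample(nrow(U), min(batch, nrow(U)))    # simplification: pick points at random
  pseudo <- U[take, , drop = FALSE]
  pseudo$y <- predict(fit, newdata = pseudo)      # label them with the model's own predictions
  L <- rbind(L, pseudo)                           # grow the labeled set
  U <- U[-take, , drop = FALSE]                   # shrink the unlabeled pool
}

final_fit <- lm(y ~ x, data = L)
coef(final_fit)
```

With ordinary least squares the pseudo-labels sit on the fitted line, so the refit is essentially unchanged; the loop only shows the mechanics. Base regressors that are not plain least-squares fits are where the added points can shift the model.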

The ssr package provides the following functionalities:

  • Train Co-training by Committee models.
  • Train self-learning models.
  • Track and plot performance during training.
  • Generate plots to quickly visualize the results.
  • Specify the base regressors used by the Co-training committee and self-learning, taken from the caret package, other packages, or custom functions.
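
As a rough illustration of the last point, a custom regressor can be written as a fitting function plus an S3 predict method. The sketch below assumes that a regressor is called like lm, as `f(formula, data)`, and that predictions are obtained through `predict()`; that calling convention is an assumption here, so check the package vignettes for the exact contract. The name `meanRegressor` is hypothetical.

```r
# A hypothetical custom regressor that always predicts the training mean.
# Assumption: the fitting function is called as f(formula, data), like lm().
meanRegressor <- function(formula, data) {
  formula <- as.formula(formula)          # also accepts a string such as "y ~ ."
  response <- all.vars(formula)[1]        # name of the left-hand-side variable
  structure(list(mu = mean(data[[response]])), class = "meanRegressor")
}

# S3 method so that predict(model, newdata) works on the fitted object.
predict.meanRegressor <- function(object, newdata, ...) {
  rep(object$mu, nrow(newdata))
}

toy <- data.frame(x = 1:4, y = c(2, 4, 6, 8))
m <- meanRegressor(y ~ x, toy)
predict(m, toy)   # four copies of mean(toy$y)
```

Under that assumption, such a function could sit in the regressors list next to lm, for example `list(linearRegression = lm, baseline = meanRegressor)`.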

Installation

You can install the ssr package from CRAN:

install.packages("ssr")

Or you can install the development version from GitHub:

# install.packages("devtools")
devtools::install_github("enriquegit/ssr")

Example

The following example shows how to train a Co-training Committee of two regressors: a linear model and a KNN regressor.

library(ssr)

dataset <- friedman1 # Load friedman1 dataset.

set.seed(1234)

# Prepare the data.
split1 <- split_train_test(dataset, pctTrain = 70)
split2 <- split_train_test(split1$trainset, pctTrain = 5)
L <- split2$trainset
U <- split2$testset[, -11] # Remove the labels.
testset <- split1$testset

# Define list of regressors.
regressors <- list(linearRegression=lm, knn=caret::knnreg)

# Fit the model.
model <- ssr("Ytrue ~ .", L, U, regressors = regressors, testdata = testset)

# Plot RMSE.
plot(model)

# Get the predictions on the testset.
predictions <- predict(model, testset)

# Calculate RMSE on the test set.
sqrt(mean((predictions - testset$Ytrue)^2))
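
The last line of the example computes the root mean squared error (RMSE). A small helper with a worked toy case may make the formula easier to check by hand; the function name `rmse` and the numbers below are made up for illustration and are not part of the ssr package.

```r
# Hypothetical helper: root mean squared error between two numeric vectors.
rmse <- function(predicted, actual) {
  stopifnot(length(predicted) == length(actual))
  sqrt(mean((predicted - actual)^2))
}

# Toy values: errors are 1, 0, -2; squared errors are 1, 0, 4; their mean is 5/3.
predicted <- c(2, 4, 6)
actual    <- c(1, 4, 8)
rmse(predicted, actual)   # sqrt(5/3), about 1.29
```

Lower values mean the predictions are closer to the true responses, and the result is in the same units as the response variable.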

For detailed explanations and more examples, refer to the package vignettes.

Citation

To cite package ssr in publications use:

Enrique Garcia-Ceja (2019). ssr: Semi-Supervised Regression Methods.
R package https://CRAN.R-project.org/package=ssr

BibTeX entry for LaTeX:

@Manual{enriqueSSR,
    title = {ssr: Semi-Supervised Regression Methods},
    author = {Enrique Garcia-Ceja},
    year = {2019},
    note = {R package},
    url = {https://CRAN.R-project.org/package=ssr},
  }