`LARS`

The `LARS` class implements the least-angle regression (LARS) algorithm for L1-penalized and L2-penalized linear regression. `LARS` can also solve the LASSO (least absolute shrinkage and selection operator) problem. The LARS algorithm is a path algorithm, and thus will recover solutions for all L1 penalty parameters greater than or equal to the given L1 penalty parameter.

Simple usage example:

``````
// Train a LARS model on random numeric data and make predictions.

// The data is uniform random and the responses are Gaussian random; this uses
// 10-dimensional data.
// Replace with a data::Load() call or similar for a real application.
arma::mat dataset(10, 1000, arma::fill::randu); // 1000 points.
arma::rowvec responses = arma::randn<arma::rowvec>(1000);
arma::mat testDataset(10, 500, arma::fill::randu); // 500 test points.

mlpack::LARS lars(true, 0.1 /* L1 penalty */); // Step 1: create model.
lars.Train(dataset, responses);                // Step 2: train model.
arma::rowvec predictions;
lars.Predict(testDataset, predictions);        // Step 3: use model to predict.

// Print some information about the test predictions.
std::cout << arma::accu(predictions > 0.7) << " test points predicted to have"
    << " responses greater than 0.7." << std::endl;
std::cout << arma::accu(predictions < 0) << " test points predicted to have "
    << "negative responses." << std::endl;
``````

More examples are given in the Simple Examples section below.

Constructors

• `lars = LARS(useCholesky=true, lambda1=0.0, lambda2=0.0, tolerance=1e-16, fitIntercept=true, normalizeData=true)`
• Initialize the model without training.
• `Train()` must be called before `Predict()` can be used.

• `lars = LARS(data, responses, colMajor=true, useCholesky=true, lambda1=0.0, lambda2=0.0, tolerance=1e-16, fitIntercept=true, normalizeData=true)`
• Train model on the given data and responses, using the given settings for hyperparameters.

• `lars = LARS(data, responses, colMajor, useCholesky, gramMatrix, lambda1=0.0, lambda2=0.0, tolerance=1e-16, fitIntercept=true, normalizeData=true)`
• Train model on the given data and responses, using a precomputed Gram matrix (`gramMatrix`, equivalent to `data * data.t()`).
• Using a precomputed Gram matrix can save time, if it has already been computed.
• Note: any precomputed Gram matrix must also match the settings of `fitIntercept` and `normalizeData`; so, if both are `true`, then `gramMatrix` must be computed on mean-centered data whose features are normalized to have unit variance. In addition, if `lambda2 > 0`, then it is expected that `lambda2` is added to each element on the diagonal of `gramMatrix`.
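
The note above can be illustrated with a minimal sketch. It uses plain standard-library C++ rather than Armadillo, and `GramWithRidge()` is a hypothetical helper, not an mlpack function. It assumes `fitIntercept` and `normalizeData` are both `false`, so no centering or scaling is applied before forming the Gram matrix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row i holds feature i for every point (one point per column), which is the
// column-major layout described above.
using Matrix = std::vector<std::vector<double>>;

// Hypothetical helper (not part of mlpack): computes data * data^T and adds
// lambda2 to each element on the diagonal.
Matrix GramWithRidge(const Matrix& data, const double lambda2)
{
  const std::size_t dims = data.size();
  Matrix gram(dims, std::vector<double>(dims, 0.0));
  for (std::size_t i = 0; i < dims; ++i)
  {
    // Every feature must be observed for the same number of points.
    assert(data[i].size() == data[0].size());
    for (std::size_t j = 0; j <= i; ++j)
    {
      double dot = 0.0;
      for (std::size_t k = 0; k < data[i].size(); ++k)
        dot += data[i][k] * data[j][k];

      // The Gram matrix is symmetric, so fill both triangles at once.
      gram[i][j] = dot;
      gram[j][i] = dot;
    }
    gram[i][i] += lambda2;
  }
  return gram;
}
```

If `fitIntercept` or `normalizeData` is `true`, the same computation would have to be applied to the centered and normalized data instead.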

Constructor Parameters:

| name | type | description | default |
|------|------|-------------|---------|
| `data` | `arma::mat` | Training matrix. | (N/A) |
| `responses` | `arma::rowvec` | Training responses (e.g. values to predict). Should have length `data.n_cols`. | (N/A) |
| `colMajor` | `bool` | Should be set to `true` if `data` is column-major. Passing row-major data can avoid a transpose operation. | `true` |
| `useCholesky` | `bool` | If `true`, use the Cholesky decomposition of the Gram matrix to solve linear systems (as opposed to the full Gram matrix). | `true` |
| `gramMatrix` | `arma::mat` | Precomputed Gram matrix of `data` (i.e. `data * data.t()` for column-major data). | (N/A) |
| `lambda1` | `double` | L1 regularization penalty parameter. | `0.0` |
| `lambda2` | `double` | L2 regularization penalty parameter. | `0.0` |
| `tolerance` | `double` | Tolerance on feature correlations for convergence. | `1e-16` |
| `fitIntercept` | `bool` | If `true`, an intercept term will be included in the model. | `true` |
| `normalizeData` | `bool` | If `true`, data will be normalized before fitting the model. | `true` |

As an alternative to passing hyperparameters, each hyperparameter can be set with a standalone method. The following functions can be used before calling `Train()` to set hyperparameters:

• `lars.UseCholesky() = useChol;` will set whether or not the Cholesky decomposition will be used during training to `useChol`.
• `lars.Lambda1() = lambda1;` will set the L1 regularization penalty parameter to `lambda1`.
• `lars.Lambda2() = lambda2;` will set the L2 regularization penalty parameter to `lambda2`.
• `lars.Tolerance() = tol;` will set the convergence tolerance to `tol`.
• `lars.FitIntercept(fitIntercept);` will set whether an intercept will be fit to `fitIntercept`. If an external Gram matrix has been specified, this will throw an exception.
• `lars.NormalizeData(normalizeData);` will set whether data should be normalized to `normalizeData`. If an external Gram matrix has been specified, this will throw an exception.

Notes:

• The `lambda1` parameter implicitly controls the sparsity of the model; for more sparse models (i.e. fewer nonzero weights), specify a larger `lambda1`.

• Setting `lambda1` or `lambda2` too small may cause the model to overfit, while setting either too large may cause it to underfit. Because LARS is a path algorithm, the `SelectBeta()` function can be used to select models with different values of `lambda1`. For tuning `lambda2`, automatic hyperparameter tuning can be used.

• It is recommended to set both `fitIntercept` and `normalizeData` to `true`, in accordance with the original LARS algorithm. `fitIntercept` can be set to `false` if the features and responses are already mean-centered, and `normalizeData` can be set to `false` if the features already have unit variance. Setting either option to `false` can provide a small speedup.

• `useCholesky` should generally be set to `true` and in most situations will result in faster training.
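
As a rough illustration of what "mean-centered" and "unit-variance" mean for a single feature, the sketch below standardizes one row of a column-major dataset using only the standard library. `StandardizeFeature()` is a hypothetical helper, not an mlpack function, and it assumes the population variance (dividing by the number of points); the exact convention mlpack uses internally is not stated in this document.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of mlpack): shifts one feature to zero mean
// and rescales it to unit variance.
// Assumption: the population variance (divide by N) is used.
void StandardizeFeature(std::vector<double>& feature)
{
  assert(!feature.empty());
  const double n = static_cast<double>(feature.size());

  double mean = 0.0;
  for (const double v : feature)
    mean += v;
  mean /= n;

  double variance = 0.0;
  for (const double v : feature)
    variance += (v - mean) * (v - mean);
  variance /= n;

  // A constant feature has zero variance; only center it in that case.
  const double scale = (variance > 0.0) ? std::sqrt(variance) : 1.0;
  for (double& v : feature)
    v = (v - mean) / scale;
}
```

Data preprocessed this way is the kind of input for which `fitIntercept` and `normalizeData` could plausibly be set to `false`, provided the responses are mean-centered as well.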

Training

If training is not done as part of the constructor call, it can be done with the `Train()` function:

• `lars.Train(data, responses, colMajor=true, useCholesky=true, lambda1=0.0, lambda2=0.0, tolerance=1e-16, fitIntercept=true, normalizeData=true)`
• Train the model on the given data.

• `lars.Train(data, responses, colMajor, useCholesky, gramMatrix, lambda1=0.0, lambda2=0.0, tolerance=1e-16, fitIntercept=true, normalizeData=true)`
• Train model on the given data and responses, using a precomputed Gram matrix (`gramMatrix`, equivalent to `data * data.t()`).
• Using a precomputed Gram matrix can save time, if it has already been computed.
• Note: any precomputed Gram matrix must also match the settings of `fitIntercept` and `normalizeData`; so, if both are `true`, then `gramMatrix` must be computed on mean-centered data whose features are normalized to have unit variance. In addition, if `lambda2 > 0`, then it is expected that `lambda2` is added to each element on the diagonal of `gramMatrix`.

Types of each argument are the same as in the table for constructors above.

Notes:

• Training is not incremental. A second call to `Train()` will retrain the model from scratch.

• `Train()` returns the squared error (loss) of the model on the training set as a `double`. To obtain the MSE, divide by the number of training points.
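
The relationship between the returned squared error and the MSE can be sketched in plain standard-library C++. Both functions below are hypothetical helpers for illustration only and do not call mlpack; `SquaredError()` stands in for the total loss that `Train()` is described as returning.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: total squared error between predictions and the true
// responses (the kind of value Train() is described as returning).
double SquaredError(const std::vector<double>& predictions,
                    const std::vector<double>& responses)
{
  assert(predictions.size() == responses.size());
  double loss = 0.0;
  for (std::size_t i = 0; i < predictions.size(); ++i)
  {
    const double residual = predictions[i] - responses[i];
    loss += residual * residual;
  }
  return loss;
}

// Hypothetical helper: converts a total squared error into the MSE by
// dividing by the number of points.
double MeanSquaredError(const double squaredError, const std::size_t numPoints)
{
  assert(numPoints > 0);
  return squaredError / static_cast<double>(numPoints);
}
```

With an actual model, the equivalent computation would be the value returned by `Train()` divided by `data.n_cols` for column-major data.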

Prediction

Once a `LARS` model is trained, the `Predict()` member function can be used to make predictions for new data.

• `double predictedValue = lars.Predict(point)`
• (Single-point)
• Make a prediction for a single point, returning the predicted value.

• `lars.Predict(data, predictions, colMajor=true)`
• (Multi-point)
• Make predictions for a set of points.
• The prediction for data point `i` can be accessed with `predictions[i]`.

Prediction Parameters:

| usage | name | type | description |
|-------|------|------|-------------|
| single-point | `point` | `arma::vec` | Single point for prediction. |
| multi-point | `data` | `arma::mat` | Set of column-major points for prediction. |
| multi-point | `predictions` | `arma::rowvec&` | Vector of `double`s to store predictions into. Will be set to length `data.n_cols`. |
| multi-point | `colMajor` | `bool` | Should be set to `true` if `data` is column-major. Passing row-major data can avoid a transpose operation. (Default `true`.) |

Other Functionality

• A `LARS` model can be serialized with `data::Save()` and `data::Load()`.

• `lars.Beta()` will return an `arma::vec` with the model parameters. This will have length equal to the dimensionality of the model. Note that `lars.Beta()` can be changed to a different model on the LARS path using the `lars.SelectBeta()` method.

• `lars.Intercept()` will return a `double` representing the fitted intercept term, or 0 if `lars.FitIntercept()` is `false`.

• `lars.ActiveSet()` will return a `std::vector<size_t>&` containing the indices of nonzero dimensions in the model parameters (`lars.Beta()`).

• `lars.ComputeError(data, responses, colMajor=true)` will return a `double` containing the squared error of the model on `data`, given that the true responses are `responses`. To obtain the MSE, divide by the number of points in `data`.
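
To make the roles of the weights, intercept, and active set concrete, here is a minimal standard-library sketch. It assumes the usual linear-model form (prediction = intercept + weights · point) and that the weights are expressed in the scale of the input point. Both functions are hypothetical helpers, not mlpack calls, and the ordering of indices returned by `lars.ActiveSet()` may differ from the ascending order used here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: indices of the nonzero weights, in ascending order.
std::vector<std::size_t> NonzeroIndices(const std::vector<double>& beta)
{
  std::vector<std::size_t> active;
  for (std::size_t i = 0; i < beta.size(); ++i)
  {
    if (beta[i] != 0.0)
      active.push_back(i);
  }
  return active;
}

// Hypothetical helper: prediction of a linear model for a single point,
// assuming prediction = intercept + sum_i beta[i] * point[i].
double LinearPrediction(const std::vector<double>& beta,
                        const double intercept,
                        const std::vector<double>& point)
{
  assert(beta.size() == point.size());
  double value = intercept;
  for (std::size_t i = 0; i < beta.size(); ++i)
    value += beta[i] * point[i];
  return value;
}
```

A sparser model (larger `lambda1`) corresponds to fewer indices returned by `NonzeroIndices()`, and dimensions outside that set do not affect the prediction.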

The LARS Path

LARS is a path (or stepwise) algorithm, meaning it adds one feature at a time to the model. This in turn means that when we train a LARS model with `lambda1` set to `l`, we also recover every possible LARS model on the same data with a `lambda1` greater than `l`.

The `LARS` class provides a way to access all of the models on the path, and switch between them for prediction purposes:

• `lars.BetaPath()` returns a `std::vector<arma::vec>&` containing each set of model weights on the LARS path.

• `lars.InterceptPath()` returns a `std::vector<double>&` containing each intercept value on the LARS path. These values are only meaningful if `lars.FitIntercept()` is `true`.

• `lars.LambdaPath()` returns a `std::vector<double>&` containing each `lambda1` value that is associated with each element in `lars.BetaPath()` and `lars.InterceptPath()`. That is, `lars.LambdaPath()[i]` is the `lambda1` value corresponding to the model defined by `lars.BetaPath()[i]` and `lars.InterceptPath()[i]`.

• `lars.SelectBeta(lambda1)` will set the model weights (`lars.ActiveSet()`, `lars.Beta()` and `lars.Intercept()`) to the path location with L1 penalty `lambda1`. This is equivalent to calling `lars.Train(data, responses, colMajor, useCholesky, lambda1)`, but much more efficient. `lambda1` cannot be less than `lars.Lambda1()`, or an exception will be thrown.

• `lars.SelectedLambda1()` returns the currently selected L1 regularization penalty parameter.

• For any value `lambda1` between `lars.LambdaPath()[i]` and `lars.LambdaPath()[i + 1]`, the corresponding model is a linear interpolation between `lars.BetaPath()[i]` and `lars.BetaPath()[i + 1]` (and `lars.InterceptPath()[i]` and `lars.InterceptPath()[i + 1]`). This exact linear interpolation is what is computed by `lars.SelectBeta(lambda1)`.
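
The linear interpolation described in the last item can be sketched as follows. This is a standard-library illustration, not mlpack's implementation: `PathModel` and `InterpolateOnPath()` are hypothetical names, and the sketch assumes that the `lambda1` values on the path are sorted in decreasing order and that the requested `lambda1` lies within the range covered by the path.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for one model on the path.
struct PathModel
{
  std::vector<double> beta;
  double intercept;
};

// Hypothetical helper (not mlpack's implementation).  Assumes lambdaPath is
// sorted in decreasing order and that
// lambdaPath.back() <= lambda1 <= lambdaPath.front().
PathModel InterpolateOnPath(const std::vector<double>& lambdaPath,
                            const std::vector<std::vector<double>>& betaPath,
                            const std::vector<double>& interceptPath,
                            const double lambda1)
{
  assert(!lambdaPath.empty());
  assert(betaPath.size() == lambdaPath.size());
  assert(interceptPath.size() == lambdaPath.size());
  assert(lambda1 <= lambdaPath.front() && lambda1 >= lambdaPath.back());

  // A path with a single entry leaves nothing to interpolate.
  if (lambdaPath.size() == 1)
    return { betaPath[0], interceptPath[0] };

  // Find the segment [i, i + 1] whose lambda range contains lambda1.
  std::size_t i = 0;
  while (i + 2 < lambdaPath.size() && lambda1 < lambdaPath[i + 1])
    ++i;

  // t = 0 gives model i; t = 1 gives model i + 1.
  const double span = lambdaPath[i] - lambdaPath[i + 1];
  const double t = (span > 0.0) ? (lambdaPath[i] - lambda1) / span : 0.0;

  PathModel result;
  result.beta.resize(betaPath[i].size());
  for (std::size_t d = 0; d < result.beta.size(); ++d)
    result.beta[d] = (1.0 - t) * betaPath[i][d] + t * betaPath[i + 1][d];
  result.intercept = (1.0 - t) * interceptPath[i] + t * interceptPath[i + 1];
  return result;
}
```

Because only a search over the path and one weighted sum are needed, selecting a model this way is far cheaper than retraining, which is consistent with the efficiency claim made for `lars.SelectBeta()` above.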

Simple Examples

See also the simple usage example above for a minimal use of the `LARS` class.

Train a LARS model in the constructor, and print the MSE on training and test data for each set of weights in the path.

``````
// See https://datasets.mlpack.org/wave_energy_farm_100.csv.
arma::mat data;
mlpack::data::Load("wave_energy_farm_100.csv", data, true);

// Split the last row off: it is the responses.  Also, normalize the responses
// to [0, 1].
arma::rowvec responses = data.row(data.n_rows - 1);
responses /= responses.max();
data.shed_row(data.n_rows - 1);

// Split into a training and test dataset.  20% of the data is held out as a
// test set.
arma::mat trainingData, testData;
arma::rowvec trainingResponses, testResponses;
mlpack::data::Split(data, responses, trainingData, testData, trainingResponses,
    testResponses, 0.2);

// Train a LARS model with lambda1 = 1e-5 and lambda2 = 1e-6.
mlpack::LARS lars(trainingData, trainingResponses, true, true, 1e-5, 1e-6);

// Iterate over all the models in the path.
const size_t pathLength = lars.BetaPath().size();
for (size_t i = 0; i < pathLength; ++i)
{
  // Use the i'th model in the path.
  lars.SelectBeta(lars.LambdaPath()[i]);

  // ComputeError() returns the total loss, which we need to divide by the
  // number of points to get the MSE.
  const double trainMSE = lars.ComputeError(trainingData, trainingResponses) /
      trainingData.n_cols;
  const double testMSE = lars.ComputeError(testData, testResponses) /
      testData.n_cols;
  std::cout << "L1 penalty parameter: " << lars.SelectedLambda1() << std::endl;
  std::cout << "  MSE on training set: " << trainMSE << "." << std::endl;
  std::cout << "  MSE on test set:     " << testMSE << "." << std::endl;
}
``````

Train a LARS model, print predictions for a random point, and save to a file.

``````
// See https://datasets.mlpack.org/admission_predict.csv.
arma::mat data;
mlpack::data::Load("admission_predict.csv", data, true);

// Assumption: the responses are stored in a separate file with this name.
arma::rowvec responses;
mlpack::data::Load("admission_predict.responses.csv", responses, true);

// Train a LARS model with only L2 regularization.
mlpack::LARS lars(data, responses, true, true, 0.0, 0.1 /* lambda2 */);

// Predict on a random point.
arma::vec point = arma::randu<arma::vec>(data.n_rows);
const double prediction = lars.Predict(point);

std::cout << "Prediction on random point: " << prediction << "." << std::endl;

// Save the model to "lars_model.bin" with the name "lars".
mlpack::data::Save("lars_model.bin", "lars", lars, true);
``````

Load a LARS model from disk and print some information about it.

``````
// This assumes a model named "lars" has previously been saved to
// "lars_model.bin".
mlpack::LARS lars;
mlpack::data::Load("lars_model.bin", "lars", lars, true);

if (lars.BetaPath().size() == 0)
{
  std::cout << "lars_model.bin contains an untrained LARS model." << std::endl;
}
else
{
  std::cout << "Information on the LARS model in lars_model.bin:" << std::endl;

  std::cout << " - Model dimensionality: " << lars.Beta().n_elem << "."
      << std::endl;
  std::cout << " - Has intercept: "
      << (lars.FitIntercept() ? std::string("yes") : std::string("no")) << "."
      << std::endl;
  std::cout << " - Current L1 regularization penalty parameter value: "
      << lars.SelectedLambda1() << "." << std::endl;
  std::cout << " - L2 regularization penalty parameter: " << lars.Lambda2()
      << "." << std::endl;
  std::cout << " - Number of nonzero elements in model: "
      << lars.ActiveSet().size() << "." << std::endl;
  std::cout << " - Number of models in LARS path: " << lars.BetaPath().size()
      << "." << std::endl;
  std::cout << " - Model weight for dimension 0: " << lars.Beta()[0] << "."
      << std::endl;

  if (lars.FitIntercept())
  {
    std::cout << " - Intercept value: " << lars.Intercept() << "." << std::endl;
  }
}
``````

Train several models with different L2 regularization penalty parameters, using a precomputed Gram matrix.

``````
// See https://datasets.mlpack.org/admission_predict.csv.
arma::mat data;
mlpack::data::Load("admission_predict.csv", data, true);

// Assumption: the responses are stored in a separate file with this name.
arma::rowvec responses;
mlpack::data::Load("admission_predict.responses.csv", responses, true);

// Precompute Gram matrix.
arma::mat gramMatrix = data * data.t();

std::vector<double> lambda2Values = { 0.01, 0.1, 1.0, 10.0, 100.0 };
for (double lambda2 : lambda2Values)
{
  // Build a LARS model using the precomputed Gram matrix.  We did not normalize
  // or center the data before computing the Gram matrix, so we have to set
  // fitIntercept and normalizeData accordingly.
  mlpack::LARS lars(data, responses, true, true, gramMatrix, 0.01, lambda2,
      1e-16, false, false);

  std::cout << "MSE with L2 penalty " << lambda2 << ": "
      << (lars.ComputeError(data, responses) / data.n_cols) << "." << std::endl;
}
``````

Advanced Functionality: Different Element Types

The `LARS` class has one template parameter that can be used to control the element type of the model. The full signature of the class is:

``````
LARS<ModelMatType>
``````

`ModelMatType` specifies the type of matrix used for the internal representation of model parameters. Any matrix type that implements the Armadillo API can be used.

Note that the `Train()` and `Predict()` functions themselves are templatized and can allow any matrix type that has the same element type. So, for instance, a `LARS<arma::sp_mat>` can accept an `arma::mat` for training.

The example below trains a LARS model on 32-bit precision data, using `arma::sp_fmat` to store the model parameters.

``````
// Create random 1000-dimensional data (5000 points), stored densely.
arma::fmat dataset(1000, 5000, arma::fill::randu);

// Generate noisy responses from random data.
arma::fvec trueWeights(1000, arma::fill::randu);
arma::frowvec responses = trueWeights.t() * dataset +
    0.01 * arma::randu<arma::frowvec>(5000) /* noise term */;

mlpack::LARS<arma::sp_fmat> lars;
lars.Lambda1() = 0.1;
lars.Lambda2() = 0.01;

lars.Train(dataset, responses);

// Compute the MSE on the training set and a random test set.
arma::fmat testDataset(1000, 2500, arma::fill::randu);
arma::frowvec testResponses = trueWeights.t() * testDataset +
    0.01 * arma::randu<arma::frowvec>(2500) /* noise term */;

const float trainMSE = lars.ComputeError(dataset, responses) / dataset.n_cols;
const float testMSE = lars.ComputeError(testDataset, testResponses) /
    testDataset.n_cols;

std::cout << "MSE on training set: " << trainMSE << "." << std::endl;
std::cout << "MSE on test set:     " << testMSE << "." << std::endl;
``````

Note: it is generally only more efficient to use a sparse type (e.g. `arma::sp_mat`) for `ModelMatType` when the L1 regularization parameter is set such that a highly sparse model is produced.