mlpack_local_coordinate_coding - local coordinate coding


mlpack_local_coordinate_coding [-k int] [-i string] [-m unknown] [-l double] [-n int] [-N bool] [-s int] [-T string] [-o double] [-t string] [-V bool] [-c string] [-d string] [-M unknown] [-h -v]


An implementation of Local Coordinate Coding (LCC), which codes data that approximately lives on a manifold using a variation of l1-norm regularized sparse coding. Given a dense data matrix X with n points and d dimensions, LCC seeks to find a dense dictionary matrix D with k atoms in d dimensions, and a coding matrix Z with n points in k dimensions. Because of the regularization method used, the atoms in D should lie close to the manifold on which the data points lie.
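For reference, the objective usually associated with LCC in the literature can be written as below. This is a sketch under assumptions: the symbols x_i (column i of X), z_i (column i of Z), d_j (atom j of D), and z_{ji} (entry j of z_i) are notation introduced here, and this page does not confirm that the program minimizes exactly this form. The weight \lambda corresponds to the ’--lambda (-l)’ parameter.

```latex
\min_{D,\,Z} \; \sum_{i=1}^{n} \left( \lVert x_i - D z_i \rVert_2^2
  + \lambda \sum_{j=1}^{k} \lvert z_{ji} \rvert \, \lVert d_j - x_i \rVert_2^2 \right)
```

The second term penalizes a coefficient more heavily when its atom is far from the point being coded, which is why the atoms tend to stay near the data manifold.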

The original data matrix X can then be reconstructed as D * Z. Therefore, this program finds a representation of each point in X as a sparse linear combination of atoms in the dictionary D.
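As a small illustration of the reconstruction D * Z, the following Python sketch rebuilds two points from a three-atom dictionary. The values and the helper name reconstruct are made up for this example and are not part of mlpack; D is stored as a list of atoms and Z as a list of per-point codes (that is, by columns).

```python
def reconstruct(D, z):
    """Return D * z: the linear combination of atoms weighted by the code z."""
    d = len(D[0])
    return [sum(z[j] * D[j][r] for j in range(len(D))) for r in range(d)]

# k = 3 atoms in d = 2 dimensions (each inner list is one atom, a column of D).
D = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]

# n = 2 points, each coded with k = 3 coefficients (each inner list is a column of Z).
# Both codes are sparse: only one coefficient per point is nonzero.
Z = [[0.5, 0.0, 0.0],
     [0.0, 0.0, 2.0]]

# Column i of the reconstructed matrix is D * z_i.
X_hat = [reconstruct(D, z) for z in Z]
print(X_hat)
```

Each reconstructed point uses only the atoms whose coefficients are nonzero, which is what makes the representation sparse.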

The coding is found with an algorithm which alternates between a dictionary step, which updates the dictionary D, and a coding step, which updates the coding matrix Z.
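The alternation can be sketched in plain Python as follows. This is an illustrative toy and not mlpack's implementation: it assumes the locality-weighted l1 objective from the LCC literature, uses proximal gradient (ISTA) for the coding step, and uses a backtracking gradient step for the dictionary step, both chosen here for brevity. All function names are hypothetical.

```python
import math
import random

def reconstruct(D, z):
    """Return D * z, where D is a list of k atoms, each of length d."""
    return [sum(z[j] * D[j][r] for j in range(len(D))) for r in range(len(D[0]))]

def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def objective(X, D, Z, lam):
    """Reconstruction error plus the locality-weighted l1 penalty (assumed form)."""
    total = 0.0
    for x, z in zip(X, Z):
        total += sq_dist(x, reconstruct(D, z))
        total += lam * sum(abs(z[j]) * sq_dist(D[j], x) for j in range(len(D)))
    return total

def coding_step(X, D, Z, lam, inner_iters=50):
    """Update every code by ISTA with the dictionary D held fixed."""
    lipschitz = 2.0 * sum(v * v for atom in D for v in atom)  # 2 * ||D||_F^2
    if lipschitz == 0.0:
        return Z
    step = 1.0 / lipschitz
    new_Z = []
    for x, z in zip(X, Z):
        z = list(z)
        weights = [sq_dist(atom, x) for atom in D]
        for _ in range(inner_iters):
            residual = [r - v for r, v in zip(reconstruct(D, z), x)]  # D z - x
            for j, atom in enumerate(D):
                grad = 2.0 * sum(a * r for a, r in zip(atom, residual))
                u = z[j] - step * grad
                thresh = step * lam * weights[j]
                # Soft-thresholding: the proximal operator of the weighted l1 term.
                z[j] = max(abs(u) - thresh, 0.0) * (1.0 if u > 0 else -1.0)
        new_Z.append(z)
    return new_Z

def dictionary_step(X, D, Z, lam, step=1.0):
    """Take one backtracking gradient step on D with the codes Z held fixed."""
    k, d = len(D), len(D[0])
    grad = [[0.0] * d for _ in range(k)]
    for x, z in zip(X, Z):
        residual = [r - v for r, v in zip(reconstruct(D, z), x)]  # D z - x
        for j in range(k):
            for r in range(d):
                grad[j][r] += 2.0 * residual[r] * z[j]
                grad[j][r] += 2.0 * lam * abs(z[j]) * (D[j][r] - x[r])
    current = objective(X, D, Z, lam)
    while step > 1e-12:
        trial = [[D[j][r] - step * grad[j][r] for r in range(d)] for j in range(k)]
        if objective(X, trial, Z, lam) <= current:
            return trial  # accept only steps that do not increase the objective
        step *= 0.5
    return D

def lcc_sketch(X, k, lam, max_iterations=20, seed=0):
    """Alternate coding and dictionary steps; return D, Z and the objective history."""
    rng = random.Random(seed)
    D = [list(x) for x in rng.sample(X, k)]  # initialize atoms from data points
    Z = [[0.0] * k for _ in X]
    history = [objective(X, D, Z, lam)]
    for _ in range(max_iterations):
        Z = coding_step(X, D, Z, lam)
        D = dictionary_step(X, D, Z, lam)
        history.append(objective(X, D, Z, lam))
    return D, Z, history

def circle_points(n):
    """Toy data: n points on the unit circle, a 1-D manifold in 2-D."""
    return [[math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)]
            for i in range(n)]
```

For example, lcc_sketch(circle_points(20), k=4, lam=0.1, max_iterations=5) returns a four-atom dictionary, one code per point, and the objective value after each iteration. Because both steps only accept updates that do not increase the objective, the recorded values are non-increasing, which is the property a tolerance-based stopping rule relies on.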

To run this program, the input matrix X must be specified with the ’--training_file (-t)’ parameter, along with the number of atoms in the dictionary, ’--atoms (-k)’. An initial dictionary may also be specified with the ’--initial_dictionary_file (-i)’ parameter. The l1-norm regularization parameter is specified with the ’--lambda (-l)’ parameter. For example, to run LCC on the dataset ’data.csv’ using 200 atoms and an l1-regularization parameter of 0.1, saving the dictionary into ’--dictionary_file (-d)’ and the codes into ’--codes_file (-c)’, use

$ mlpack_local_coordinate_coding --training_file data.csv --atoms 200 --lambda 0.1 --dictionary_file dict.csv --codes_file codes.csv

The maximum number of iterations may be specified with the ’--max_iterations (-n)’ parameter. Optionally, the input data matrix X can be normalized before coding with the ’--normalize (-N)’ parameter.

An LCC model may be saved using the ’--output_model_file (-M)’ output parameter. Then, to encode new points from the dataset ’points.csv’ with the previously saved model ’lcc_model.bin’, saving the new codes to ’new_codes.csv’, the following command can be used:

$ mlpack_local_coordinate_coding --input_model_file lcc_model.bin --test_file points.csv --codes_file new_codes.csv


--atoms (-k) [int]

Number of atoms in the dictionary. Default value 0.

--help (-h) [bool]

Default help info.

--info [string]

Get help on a specific module or option. Default value ’’.

--initial_dictionary_file (-i) [string]

Optional initial dictionary. Default value ’’.

--input_model_file (-m) [unknown]

Input LCC model. Default value ’’.

--lambda (-l) [double]

Weighted l1-norm regularization parameter. Default value 0.

--max_iterations (-n) [int]

Maximum number of iterations for LCC (0 indicates no limit). Default value 0.

--normalize (-N) [bool]

If set, the input data matrix will be normalized before coding.

--seed (-s) [int]

Random seed. If 0, ’std::time(NULL)’ is used. Default value 0.

--test_file (-T) [string]

Test points to encode. Default value ’’.

--tolerance (-o) [double]

Tolerance for objective function. Default value 0.01.

--training_file (-t) [string]

Matrix of training data (X). Default value ’’.

--verbose (-v) [bool]

Display informational messages and the full list of parameters and timers at the end of execution.

--version (-V) [bool]

Display the version of mlpack.


--codes_file (-c) [string]

Output codes matrix. Default value ’’.

--dictionary_file (-d) [string]

Output dictionary matrix. Default value ’’.

--output_model_file (-M) [unknown]

Output for trained LCC model. Default value ’’.


For further information, including relevant papers, citations, and theory, consult the documentation on the mlpack website or included with your distribution of mlpack.