mlpack_local_coordinate_coding - local coordinate coding


mlpack_local_coordinate_coding [-h] [-v]


An implementation of Local Coordinate Coding (LCC), which codes data that approximately lives on a manifold using a variation of l1-norm regularized sparse coding. Given a dense data matrix X with n points and d dimensions, LCC seeks to find a dense dictionary matrix D with k atoms in d dimensions, and a coding matrix Z with n points in k dimensions. Because of the regularization method used, the atoms in D should lie close to the manifold on which the data points lie.

The original data matrix X can then be approximately reconstructed as D * Z, where each column of Z holds the code for one point. Therefore, this program finds a representation of each point in X as a sparse linear combination of atoms in the dictionary D.
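The reconstruction can be illustrated with a small sketch in plain Python. The matrices below are made-up toy values, not output of this program, and the column-per-point layout (D is d x k, Z is k x n) is an assumption chosen so that D * Z is well defined.

```python
# Toy illustration of reconstructing points as D * Z.
# All values are invented for the example; nothing here calls mlpack.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# d = 2 dimensions, k = 3 atoms (one atom per column).
D = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]

# n = 2 points (one code per column); each code uses only two of the three atoms.
Z = [[0.5, 0.0],
     [0.0, 2.0],
     [0.5, 1.0]]

# Each column of X_hat is a sparse linear combination of the atoms in D.
X_hat = matmul(D, Z)
print(X_hat)  # [[1.0, 1.0], [0.5, 3.0]]
```

Here the first reconstructed point is (1.0, 0.5) and the second is (1.0, 3.0); the zero entries in Z are what makes each code sparse.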

The coding is found with an algorithm which alternates between a dictionary step, which updates the dictionary D, and a coding step, which updates the coding matrix Z.
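A minimal sketch of the kind of objective such an alternation tries to reduce is given below. It assumes the usual LCC formulation from the literature: squared reconstruction error plus an l1 penalty in which each coefficient is weighted by the squared distance between the point and the atom. That formulation, the function name lcc_objective, and the toy values are assumptions for illustration and are not taken from this page.

```python
# Sketch of a locality-weighted sparse coding objective (assumed form):
#   sum_i ||x_i - D z_i||^2 + lam * sum_i sum_j |z_ji| * ||d_j - x_i||^2
# X is d x n, D is d x k, Z is k x n, all stored as lists of rows.

def lcc_objective(X, D, Z, lam):
    d, n, k = len(X), len(X[0]), len(D[0])
    total = 0.0
    for i in range(n):
        # Squared reconstruction error for point i.
        for r in range(d):
            recon = sum(D[r][j] * Z[j][i] for j in range(k))
            total += (X[r][i] - recon) ** 2
        # l1 penalty, weighted by squared distance from atom j to point i.
        for j in range(k):
            dist2 = sum((D[r][j] - X[r][i]) ** 2 for r in range(d))
            total += lam * abs(Z[j][i]) * dist2
    return total

# One 1-D point at 1.0 and two atoms at 1.0 and 2.0.
X = [[1.0]]
D = [[1.0, 2.0]]
Z_near = [[1.0], [0.0]]  # uses the atom that coincides with the point
Z_far = [[0.0], [0.5]]   # uses the farther atom; also reconstructs 1.0

print(lcc_objective(X, D, Z_near, 0.5))  # 0.0
print(lcc_objective(X, D, Z_far, 0.5))   # 0.25
```

Both codes reconstruct the point exactly, but the code that relies on the distant atom is penalized. Under this assumed objective, that is why atoms tend to settle close to the data.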

To run this program, the input matrix X must be specified with the ’--training_file (-t)’ parameter, along with the number of atoms in the dictionary, given by the ’--atoms (-k)’ parameter. An initial dictionary may also be specified with the ’--initial_dictionary_file (-i)’ parameter. The l1-norm regularization parameter is specified with the ’--lambda (-l)’ parameter. For example, to run LCC on the dataset ’data.csv’ using 200 atoms and an l1-regularization parameter of 0.1, saving the dictionary to ’dict.csv’ with ’--dictionary_file (-d)’ and the codes to ’codes.csv’ with ’--codes_file (-c)’, use

$ mlpack_local_coordinate_coding --training_file data.csv --atoms 200 --lambda 0.1 --dictionary_file dict.csv --codes_file codes.csv

The maximum number of iterations may be specified with the ’--max_iterations (-n)’ parameter. Optionally, the input data matrix X can be normalized before coding with the ’--normalize (-N)’ parameter.

An LCC model may be saved using the ’--output_model_file (-M)’ output parameter. Then, to encode new points from the dataset ’points.csv’ with the previously saved model ’lcc_model.bin’, saving the new codes to ’new_codes.csv’, the following command can be used:

$ mlpack_local_coordinate_coding --input_model_file lcc_model.bin --test_file points.csv --codes_file new_codes.csv
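When the train-then-encode workflow above is driven from a script, the argument lists can be assembled programmatically. The following is a sketch only: build_lcc_command is a hypothetical helper, not part of mlpack, and the file names are the illustrative ones used above. It uses only options listed on this page and does not execute anything.

```python
import shlex

def build_lcc_command(options, program="mlpack_local_coordinate_coding"):
    """Turn {'option_name': value} into an argument list (hypothetical helper).

    A value of True emits a bare flag; None or False skips the option.
    """
    args = [program]
    for name, value in options.items():
        if value is None or value is False:
            continue
        args.append("--" + name)
        if value is not True:
            args.append(str(value))
    return args

# Step 1: train on data.csv and save the model.
train = build_lcc_command({
    "training_file": "data.csv",
    "atoms": 200,
    "lambda": 0.1,
    "normalize": True,
    "output_model_file": "lcc_model.bin",
})

# Step 2: encode new points with the saved model.
encode = build_lcc_command({
    "input_model_file": "lcc_model.bin",
    "test_file": "points.csv",
    "codes_file": "new_codes.csv",
})

# Print the shell-quoted commands for inspection.
print(shlex.join(train))
print(shlex.join(encode))
```

Either list could then be passed to subprocess.run(..., check=True), assuming the program is installed and on the PATH.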


--atoms (-k) [int]

Number of atoms in the dictionary. Default value 0.

--help (-h) [bool]

Default help info.

--info [string]

Get help on a specific module or option. Default value ’’.

--initial_dictionary_file (-i) [string]

Optional initial dictionary. Default value ’’.

--input_model_file (-m) [unknown]

Input LCC model. Default value ’’.

--lambda (-l) [double]

Weighted l1-norm regularization parameter. Default value 0.

--max_iterations (-n) [int]

Maximum number of iterations for LCC (0 indicates no limit). Default value 0.

--normalize (-N) [bool]

If set, the input data matrix will be normalized before coding.

--seed (-s) [int]

Random seed. If 0, ’std::time(NULL)’ is used. Default value 0.

--test_file (-T) [string]

Test points to encode. Default value ’’.

--tolerance (-o) [double]

Tolerance for objective function. Default value 0.01.

--training_file (-t) [string]

Matrix of training data (X). Default value ’’.

--verbose (-v) [bool]

Display informational messages and the full list of parameters and timers at the end of execution.

--version (-V) [bool]

Display the version of mlpack.


--codes_file (-c) [string]

Output codes matrix. Default value ’’.

--dictionary_file (-d) [string]

Output dictionary matrix. Default value ’’.

--output_model_file (-M) [unknown]

Output for trained LCC model. Default value ’’.



For further information, including relevant papers, citations, and theory, consult the documentation found at or included with your distribution of mlpack.