Introduction
Alternating Matrix Factorization
Alternating matrix factorization decomposes a matrix V in the form V ≈ W * H, where W is called the basis matrix and H is called the encoding matrix. V is taken to be of size n x m; the obtained W is n x r and H is r x m. The size r is called the rank of the factorization. The factorization is computed by alternately updating W and H, each time holding the other matrix constant.
mlpack provides:
 a simple C++ interface to perform Alternating Matrix Factorization
The 'AMF' class
The AMF class is templatized with 3 parameters: the first is the policy used to determine when the algorithm has converged, the second is the initialization rule for the W and H matrices, and the last is the update rule applied during each iteration. This templatization allows the user to try various update rules, initialization rules, and termination policies (including ones not supplied with mlpack) for factorization.
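The three-policy design described above can be illustrated with a toy that depends only on the C++ standard library. This is a minimal sketch and not mlpack code: RankOneAMFSketch, OnesInitialization, RankOneALSUpdate and FixedIterationTermination are hypothetical names, the factorization is restricted to rank 1 (w and h are vectors), and V is assumed to be non-empty and non-zero.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // Row-major storage: V[i][j].

// Hypothetical initialization rule: start w and h as all-ones vectors.
struct OnesInitialization
{
  static void Initialize(const Mat& V, Vec& w, Vec& h)
  {
    w.assign(V.size(), 1.0);
    h.assign(V[0].size(), 1.0);
  }
};

// Hypothetical update rule: rank-1 alternating least squares.
struct RankOneALSUpdate
{
  // Least-squares solution for w while h is held constant.
  static void WUpdate(const Mat& V, Vec& w, const Vec& h)
  {
    double hh = 0.0;
    for (double x : h) hh += x * x;
    for (std::size_t i = 0; i < w.size(); ++i)
    {
      double dot = 0.0;
      for (std::size_t j = 0; j < h.size(); ++j) dot += V[i][j] * h[j];
      w[i] = dot / hh;
    }
  }

  // Least-squares solution for h while w is held constant.
  static void HUpdate(const Mat& V, const Vec& w, Vec& h)
  {
    double ww = 0.0;
    for (double x : w) ww += x * x;
    for (std::size_t j = 0; j < h.size(); ++j)
    {
      double dot = 0.0;
      for (std::size_t i = 0; i < w.size(); ++i) dot += V[i][j] * w[i];
      h[j] = dot / ww;
    }
  }
};

// Hypothetical termination policy: stop after a fixed number of iterations.
struct FixedIterationTermination
{
  std::size_t maxIterations;
  std::size_t iteration;

  explicit FixedIterationTermination(std::size_t maxIt)
      : maxIterations(maxIt), iteration(0) {}

  bool IsConverged(const Vec& /* w */, const Vec& /* h */)
  {
    return iteration++ >= maxIterations;
  }
};

// The factorizer only orchestrates; all behavior comes from the policies.
template<typename TerminationPolicy,
         typename InitializationRule,
         typename UpdateRule>
class RankOneAMFSketch
{
 public:
  explicit RankOneAMFSketch(const TerminationPolicy& t) : termination(t) {}

  // Factorizes V into w * h^T; returns the Frobenius norm of the error.
  double Apply(const Mat& V, Vec& w, Vec& h)
  {
    InitializationRule::Initialize(V, w, h);
    while (!termination.IsConverged(w, h))
    {
      UpdateRule::WUpdate(V, w, h);
      UpdateRule::HUpdate(V, w, h);
    }

    double squared = 0.0;
    for (std::size_t i = 0; i < V.size(); ++i)
      for (std::size_t j = 0; j < V[i].size(); ++j)
      {
        const double diff = V[i][j] - w[i] * h[j];
        squared += diff * diff;
      }
    return std::sqrt(squared);
  }

 private:
  TerminationPolicy termination;
};
```

Swapping any one of the three template arguments changes the behavior without touching the orchestration loop, which is the point of the templatized design.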
The class provides the Apply() method, which performs the factorization of V for a given rank r, stores the result in W and H, and returns the residue.
Using different termination policies
The AMF implementation comes with several termination policies to support the implemented algorithms. Every termination policy implements an IsConverged() method, which returns the status of convergence.
The available termination policies are:
 mlpack::amf::SimpleResidueTermination
 mlpack::amf::SimpleToleranceTermination
 mlpack::amf::ValidationRMSETermination
In SimpleResidueTermination, the termination decision depends on two factors: the value of the residue and the number of iterations. If the current residue drops below its threshold, or the number of iterations exceeds its threshold, a positive termination signal is passed to AMF.
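The two-factor rule can be written down in a few lines. The following is a hedged sketch of the logic only, not mlpack's implementation: ResidueTerminationSketch and its member names are hypothetical, and the residue is passed in directly instead of being computed from W and H.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in that mirrors the rule described above.
struct ResidueTerminationSketch
{
  double minResidue;          // Stop once the residue falls below this value.
  std::size_t maxIterations;  // Stop once this many iterations are exceeded.
  std::size_t iteration;

  ResidueTerminationSketch(double minRes, std::size_t maxIt)
      : minResidue(minRes), maxIterations(maxIt), iteration(0) {}

  // Called once per iteration with the current residue.
  bool IsConverged(double residue)
  {
    ++iteration;
    return residue < minResidue || iteration > maxIterations;
  }
};
```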
In SimpleToleranceTermination, the termination criterion is met when the improvement in the residue from one iteration to the next drops below the given tolerance. To accommodate spikes, a certain number of successive residue drops are accepted. A secondary termination criterion stops the algorithm when the iteration count exceeds the threshold.
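One way to read this rule is sketched below. It is an interpretation, not mlpack's implementation: the criterion is taken to be the relative improvement of the residue, the policy signals convergence only after that improvement has stayed below the tolerance for reverseStepTolerance successive iterations, and all names (ToleranceTerminationSketch, reverseStepTolerance, and so on) are assumptions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a tolerance-based termination rule.
struct ToleranceTerminationSketch
{
  double tolerance;                  // Minimum relative improvement.
  std::size_t maxIterations;         // Secondary criterion.
  std::size_t reverseStepTolerance;  // Allowed successive small steps.
  std::size_t iteration;
  std::size_t reverseStepCount;
  double previousResidue;
  bool hasPrevious;

  ToleranceTerminationSketch(double tol, std::size_t maxIt, std::size_t steps)
      : tolerance(tol), maxIterations(maxIt), reverseStepTolerance(steps),
        iteration(0), reverseStepCount(0), previousResidue(0.0),
        hasPrevious(false) {}

  // Called once per iteration with the current residue.
  bool IsConverged(double residue)
  {
    ++iteration;
    if (hasPrevious)
    {
      // Relative improvement over the previous iteration.
      const double improvement = (previousResidue > 0.0)
          ? (previousResidue - residue) / previousResidue
          : 0.0;
      if (improvement < tolerance)
        ++reverseStepCount;    // Another small (or negative) step in a row.
      else
        reverseStepCount = 0;  // Real progress: reset the counter.
    }
    previousResidue = residue;
    hasPrevious = true;

    return reverseStepCount >= reverseStepTolerance ||
           iteration > maxIterations;
  }
};
```

A single slow step does not stop the algorithm here; only a run of them does, which is how isolated spikes are tolerated in this reading.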
ValidationRMSETermination divides the data into 2 sets, a training set and a validation set. Entries of the validation set are nullified in the input matrix. The termination criterion is met when the improvement in the validation set RMSE drops below the given tolerance. To accommodate spikes, a certain number of successive validation RMSE drops are accepted. This upper limit on successive drops can be adjusted with reverseStepCount. A secondary termination criterion stops the algorithm when the iteration count exceeds the threshold. Although this policy is a better measure of convergence than the two policies above, it may add some performance overhead.
mlpack::amf::CompleteIncrementalTermination and mlpack::amf::IncompleteIncrementalTermination, on the other hand, are just wrapper classes for other termination policies. They are used when AMF is applied with SVDCompleteIncrementalLearning and SVDIncompleteIncrementalLearning, respectively.
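The quantity being monitored can be illustrated as follows. This is a sketch under stated assumptions, not mlpack's implementation: HeldOutEntry and ValidationRMSE are hypothetical names, the held-out entries are assumed to have been recorded before being nullified in V, and the list is assumed to be non-empty.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Mat = std::vector<std::vector<double>>;  // Row-major storage.

// One held-out entry of V: its position and its original value.
struct HeldOutEntry
{
  std::size_t row;
  std::size_t col;
  double value;
};

// RMSE between the held-out values and the matching entries of W * H.
double ValidationRMSE(const std::vector<HeldOutEntry>& heldOut,
                      const Mat& W,
                      const Mat& H)
{
  double squared = 0.0;
  for (const HeldOutEntry& e : heldOut)
  {
    // Entry (row, col) of W * H is the dot product of a row of W
    // with a column of H.
    double predicted = 0.0;
    for (std::size_t k = 0; k < H.size(); ++k)
      predicted += W[e.row][k] * H[k][e.col];
    const double diff = e.value - predicted;
    squared += diff * diff;
  }
  return std::sqrt(squared / static_cast<double>(heldOut.size()));
}
```

Only the held-out entries are evaluated, so the cost per check grows with the size of the validation set, not with the size of V.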
Using different initialization policies
The AMF class comes with 2 initialization policies:
 RandomInitialization
 RandomAcolInitialization
RandomInitialization initializes the matrices W and H with values drawn from a uniform random distribution, while RandomAcolInitialization initializes the W matrix by averaging p randomly chosen columns of V. In the case of RandomAcolInitialization, p is a template parameter.
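The column-averaging idea can be sketched as below. This is an illustration rather than mlpack's implementation: AcolInitializeW is a hypothetical function, columns are sampled with replacement (an assumption), and only W is produced; V is assumed to be non-empty.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

using Mat = std::vector<std::vector<double>>;  // Row-major storage.

// Builds an n x r matrix W in which every column is the average of
// P randomly chosen columns of V. P is a template parameter.
template<std::size_t P>
Mat AcolInitializeW(const Mat& V, std::size_t r, std::mt19937& rng)
{
  static_assert(P > 0, "P must be positive");
  const std::size_t n = V.size();
  const std::size_t m = V[0].size();
  std::uniform_int_distribution<std::size_t> pickColumn(0, m - 1);

  Mat W(n, std::vector<double>(r, 0.0));
  for (std::size_t c = 0; c < r; ++c)
  {
    // Accumulate P randomly chosen columns of V into column c of W.
    for (std::size_t k = 0; k < P; ++k)
    {
      const std::size_t j = pickColumn(rng);
      for (std::size_t i = 0; i < n; ++i)
        W[i][c] += V[i][j];
    }
    // Turn the sum into an average.
    for (std::size_t i = 0; i < n; ++i)
      W[i][c] /= static_cast<double>(P);
  }
  return W;
}
```

Because each column of W is an average of columns of V, the initial basis already lies within the range of the data, which is the motivation for this kind of initialization.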
To implement their own initialization policy, users need to define an Initialize() function in their class that fills W and H for a given matrix V and rank r.
Using different update rules
AMF supports the following update rules:
 NMFALSUpdate
 NMFMultiplicativeDistanceUpdate
 NMFMultiplicativeDivergenceUpdate
 SVDBatchLearning
 SVDIncompleteIncrementalLearning
 SVDCompleteIncrementalLearning
Non-negative matrix factorization can be achieved with NMFALSUpdate, NMFMultiplicativeDistanceUpdate or NMFMultiplicativeDivergenceUpdate. NMFALSUpdate implements simple alternating least squares optimization, while the other two rules implement the algorithms given in the paper 'Algorithms for Nonnegative Matrix Factorization'.
The remaining update rules perform a Singular Value Decomposition of the matrix V. This SVD factorization is optimized for use in Collaborative Filtering. The use of SVD factorizers for Collaborative Filtering is described in the paper 'A Guide to Singular Value Decomposition' by Chih-Chao Ma. For further details about the algorithms, refer to the respective class documentation.
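For reference, the multiplicative rule for the Euclidean distance objective from that paper updates H element-wise by (W^T V) / (W^T W H) and W element-wise by (V H^T) / (W H H^T). The sketch below implements these two formulas with the standard library only. It is not mlpack code: all function names are hypothetical, and the small eps in the denominator is an added safeguard against division by zero.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Mat = std::vector<std::vector<double>>;  // Row-major storage.

Mat Multiply(const Mat& A, const Mat& B)
{
  Mat C(A.size(), std::vector<double>(B[0].size(), 0.0));
  for (std::size_t i = 0; i < A.size(); ++i)
    for (std::size_t k = 0; k < B.size(); ++k)
      for (std::size_t j = 0; j < B[0].size(); ++j)
        C[i][j] += A[i][k] * B[k][j];
  return C;
}

Mat Transpose(const Mat& A)
{
  Mat T(A[0].size(), std::vector<double>(A.size(), 0.0));
  for (std::size_t i = 0; i < A.size(); ++i)
    for (std::size_t j = 0; j < A[0].size(); ++j)
      T[j][i] = A[i][j];
  return T;
}

// Multiplies X element-wise by num / den.
void ScaleByRatio(Mat& X, const Mat& num, const Mat& den)
{
  const double eps = 1e-12;  // Guards against division by zero.
  for (std::size_t i = 0; i < X.size(); ++i)
    for (std::size_t j = 0; j < X[0].size(); ++j)
      X[i][j] *= num[i][j] / (den[i][j] + eps);
}

// W <- W .* (V H^T) ./ (W H H^T), with H held constant.
void MultiplicativeWUpdate(const Mat& V, Mat& W, const Mat& H)
{
  const Mat Ht = Transpose(H);
  ScaleByRatio(W, Multiply(V, Ht), Multiply(W, Multiply(H, Ht)));
}

// H <- H .* (W^T V) ./ (W^T W H), with W held constant.
void MultiplicativeHUpdate(const Mat& V, const Mat& W, Mat& H)
{
  const Mat Wt = Transpose(W);
  ScaleByRatio(H, Multiply(Wt, V), Multiply(Multiply(Wt, W), H));
}

// Frobenius norm of V - W * H.
double FrobeniusResidue(const Mat& V, const Mat& W, const Mat& H)
{
  const Mat WH = Multiply(W, H);
  double squared = 0.0;
  for (std::size_t i = 0; i < V.size(); ++i)
    for (std::size_t j = 0; j < V[0].size(); ++j)
      squared += (V[i][j] - WH[i][j]) * (V[i][j] - WH[i][j]);
  return std::sqrt(squared);
}
```

Since every factor in the update is non-negative, W and H stay non-negative as long as V and the initial W and H are non-negative, so no explicit projection step is needed.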
Using NonNegative Matrix Factorization with AMF
Using AMF for non-negative matrix factorization is simple. The AMF module defines NMFALSFactorizer, which can be used directly without knowing the internal structure of AMF. For example, given a matrix V and a rank r, calling Apply(V, r, W, H) on an NMFALSFactorizer object fills W and H with the factors.
NMFALSFactorizer uses SimpleResidueTermination, which is the preferred termination policy for non-negative matrix factorizers. The initialization of W and H in NMFALSFactorizer is random. The Apply function returns the residue obtained by comparing the constructed matrix W * H with the original matrix V.
Using Singular Value Decomposition with AMF
The AMF implementation supports the following SVD factorizers:
 SVDBatchFactorizer
 SparseSVDBatchFactorizer
 SVDIncompleteIncrementalFactorizer
 SparseSVDIncompleteIncrementalFactorizer
 SVDCompleteIncrementalFactorizer
 SparseSVDCompleteIncrementalFactorizer
The sparse versions of the factorizers can be used with Armadillo's sparse matrix support. These specialized implementations improve runtime performance when the matrix to be factorized is relatively sparse.
Further documentation
For further documentation on the AMF class, consult the complete API documentation.