## ðŸ”—`NMF`

The `NMF` class implements non-negative matrix factorization, a technique to decompose a large (potentially sparse) matrix `V` into two smaller matrices `W` and `H`, such that `V ~= W * H`, and `W` and `H` only contain nonnegative elements. This technique may be used for dimensionality reduction, or as part of a recommender system.

The `NMF` class allows fully configurable behavior via template parameters. For more general matrix factorization strategies, see the `AMF` (alternating matrix factorization) class documentation.

#### Simple usage example:

``````// Create a random sparse matrix (V) of size 10x100, with 15% nonzeros.
arma::sp_mat V;
V.sprandu(100, 100, 0.15);

// W and H will be low-rank matrices of size 100x10 and 10x100.
arma::mat W, H;

mlpack::NMF nmf;                         // Step 1: create object.
double residue = nmf.Apply(V, 10, W, H); // Step 2: apply NMF to decompose V.

// Now print some information about the factorized matrices.
std::cout << "W has size: " << W.n_rows << " x " << W.n_cols << "."
<< std::endl;
std::cout << "H has size: " << H.n_rows << " x " << H.n_cols << "."
<< std::endl;
std::cout << "RMSE of reconstructed matrix: "
<< arma::norm(V - W * H, "fro") / std::sqrt(V.n_elem) << "." << std::endl;
``````

More examples...

### ðŸ”— Constructors

• `nmf = NMF()`
• Create an `NMF` object.
• The rank of the decomposition is specified in the call to `Apply()`.

• `nmf = NMF(SimpleResidueTermination(minResidue=1e-5, maxIterations=10000))`
• Create an NMF object with custom termination parameters.
• `minResidue` (a `double`) specifies the minimum difference of the norm of `W * H` between iterations for termination.
• `maxIterations` specifies the maximum number of iterations before decomposition terminates.

### ðŸ”— Applying Decompositions

• `double residue = nmf.Apply(V, rank, W, H)`
• Decompose the matrix `V` into two non-negative matrices `W` and `H` with rank `rank`.
• `W` will be set to size `V.n_rows` x `rank`.
• `H` will be set to size `rank` x `V.n_cols`.
• `W` and `H` are initialized randomly using the Acol initialization strategy; i.e., each column of `W` is an average of 5 random columns of `V`, and `H` is initialized uniformly randomly.
• The residue (change in the norm of `W * H` between iterations) is returned.

Notes:

• Low values of `rank` will give smaller matrices `W` and `H`, but the decomposition will be less accurate. Larger values of `rank` will give more accurate decompositions, but will take longer to compute. Every problem is different, so `rank` must be specified manually.

• The expression `W * H` can be used to reconstruct the matrix `V`.

• Custom behavior, such as custom initialization of `W` and `H`, different or custom termination rules, and different update rules are discussed in the advanced functionality section.

#### `Apply()` Parameters:

name type description
`V` `arma::sp_mat` or `arma::mat` Input matrix to be factorized.
`rank` `size_t` Rank of decomposition; lower is smaller, higher is more accurate.
`W` `arma::mat` Output matrix in which `W` will be stored.
`H` `arma::mat` Output matrix in which `H` will be stored.

Note: Matrices with different element types can be used for `V`, `W`, and `H`; e.g., `arma::fmat`. While `V` can be sparse or dense, `W` and `H` must be dense matrices.

### ðŸ”— Simple Examples

See also the simple usage example for a trivial use of `NMF`.

Decompose a dense matrix with custom termination parameters.

``````// Create a low-rank V matrix by multiplying together two random matrices.
arma::mat V = arma::randu<arma::mat>(500, 25) *
arma::randn<arma::mat>(25, 5000);

// Create the NMF object with a looser tolerance of 1e-3 and a maximum of 100
// iterations only.
mlpack::NMF nmf(mlpack::SimpleResidueTermination(1e-3, 500));

arma::mat W, H;

// Decompose with a rank of 25.
// W will have size 500 x 25, and H will have size 25 x 5000.
const double residue = nmf.Apply(V, 25, W, H);

std::cout << "Residue of decomposition: " << residue << "." << std::endl;

// Compute RMSE of decomposition.
const double rmse = arma::norm(V - W * H, "fro") / std::sqrt(V.n_elem);
std::cout << "RMSE of decomposition: " << rmse << "." << std::endl;
``````

Decompose the sparse MovieLens dataset using a rank-12 decomposition and `float` element type.

``````// See https://datasets.mlpack.org/movielens-100k.csv.
arma::sp_fmat V;

// Create the NMF object.
mlpack::NMF nmf;

arma::fmat W, H;

// Decompose the Movielens dataset with rank 12.
const double residue = nmf.Apply(V, 12, W, H);

std::cout << "Residue of MovieLens decomposition: " << residue << "."
<< std::endl;

// Compute RMSE of decomposition.
const double rmse = arma::norm(V - W * H, "fro") / std::sqrt(V.n_elem);
std::cout << "RMSE of decomposition: " << rmse << "." << std::endl;
``````

Compare quality of decompositions of MovieLens with different ranks.

``````// See https://datasets.mlpack.org/movielens-100k.csv
arma::sp_mat V;

// Create the NMF object.
mlpack::NMF nmf;
arma::mat W, H;

for (size_t rank = 10; rank <= 100; rank += 15)
{
// Decompose with the given rank.
const double residue = nmf.Apply(V, rank, W, H);

const double rmse = arma::norm(V - W * H, "fro") / std::sqrt(V.n_elem);
std::cout << "RMSE for rank-" << rank << " decomposition: " << rmse << "."
<< std::endl;
}
``````

### ðŸ”— Advanced Functionality: Template Parameters

The `NMF` class has three template parameters that can be used for custom behavior. The full signature of the class is:

``````NMF<TerminationPolicyType, InitializationRuleType, UpdateRuleType>
``````
• `TerminationPolicyType`: the strategy used to choose when to terminate NMF.
• `InitializationRuleType`: the strategy used to choose the initial `W` and `H` matrices.
• `UpdateRuleType`: the update rules used for NMF:
• `NMFMultiplicativeDistanceUpdate`: update rule that ensure the Frobenius norm of the reconstruction error is decreasing at each iteration.
• `NMFMultiplicativeDivergenceUpdate`: update rules that ensure Kullback-Leibler divergence is decreasing at each iteration.
• `NMFALSUpdate`: alternating least-squares projections for `W` and `H`.
• For custom update rules, use the more general `AMF` class.

### ðŸ”— Advanced Functionality: `TerminationPolicyType`

• Specifies the strategy to use to choose when to stop the NMF algorithm.
• An instantiated `TerminationPolicyType` can be passed to the NMF constructor.
• The following choices are available for drop-in usage:

#### `SimpleResidueTermination` (default):

• Terminates when a maximum number of iterations is reached, or when the residue (change in norm of `W * H` between iterations) is sufficiently small.
• Constructor: `SimpleResidueTermination(minResidue=1e-5, maxIterations=10000)`
• `minResidue` (a `double`) specifies the sufficiently small residue for termination.
• `maxIterations` (a `size_t`) specifies the maximum number of iterations.
• `nmf.Apply()` will return the residue of the last iteration.

#### `MaxIterationTermination`:

• Terminates when the maximum number of iterations is reached.
• No other condition is checked.
• Constructor: `MaxIterationTermination(maxIterations=1000)`
• `nmf.Apply()` will return the number of iterations performed.

#### `SimpleToleranceTermination<MatType, WHMatType>`:

• Terminates when the nonzero residual decreases a sufficiently small relative amount between iterations (e.g. `(lastNonzeroResidual - nonzeroResidual) / lastNonzeroResidual` is below a threshold), or when the maximum number of iterations is reached.
• The residual must remain below the threshold for a specified number of iterations.
• The nonzero residual is defined as the root of the sum of squared elements in the reconstruction error matrix `(V - WH)`, limited to locations where `V` is nonzero.
• Constructor: `SimpleToleranceTermination<MatType, WHMatType>(tol=1e-5, maxIter=10000, extraSteps=3)`
• `MatType` should be set to the type of `V` (see `Apply()` Parameters).
• `WHMatType` (default `arma::mat`) should be set to the type of `W` and `H` (see `Apply()` Parameters).
• `tol` (a `double`) specifies the relative nonzero residual tolerance for convergence.
• `maxIter` (a `size_t`) specifies the maximum number of iterations before termination.
• `extraSteps` (a `size_t`) specifies the number of iterations where the relative nonzero residual must be below the tolerance for convergence.
• The best `W` and `H` matrices (according to the nonzero residual) from the final `extraSteps` iterations are returned by `nmf.Apply()`.
• `nmf.Apply()` will return the nonzero residue of the iteration corresponding to the best `W` and `H` matrices.

#### `ValidationRMSETermination<MatType>`:

• Holds out a validation set of nonzero elements from `V`, and terminates when the RMSE (root mean squared error) on this validation set is sufficiently small between iterations.
• The validation RMSE must remain below the threshold for a specified number of iterations.
• `MatType` should be set to the type of `V` (see `Apply()` Parameters).
• Constructor: `ValidationRMSETermination<MatType>(V, numValPoints, tol=1e-5, maxIter=10000, extraSteps=3)`
• `V` is the matrix to be decomposed by `Apply()`. This will be modified (validation elements will be removed).
• `numValPoints` (a `size_t`) specifies number of test points from `V` to be held out.
• `tol` (a `double`) specifies the relative tolerance for the validation RMSE for termination.
• `maxIter` (a `size_t`) specifies the maximum number of iterations before termination.
• `extraSteps` (a `size_t`) specifies the number of iterations where the validation RMSE must be below the tolerance for convergence.
• The best `W` and `H` matrices (according to the validation RMSE) from the final `extraSteps` iterations are returned by `nmf.Apply()`.
• `nmf.Apply()` will return the best validation RMSE.

#### Custom policies:

• A custom class for termination behavior must implement the following functions.
``````// You can use this as a starting point for implementation.
class CustomTerminationPolicy
{
public:
// Initialize the termination policy for the given matrix V.  (It is okay to
// do nothing.)  This function is called at the beginning of Apply().
//
// If the termination policy requires V to compute convergence, store a
// reference or pointer to it in this function.
template<typename MatType>
void Initialize(const MatType& V);

// Check if convergence has occurred for the given W and H matrices.  Return
// `true` if so.
//
// Note that W and H may have different types than V (i.e. V may be sparse,
// and W and H must be dense.)
template<typename WHMatType>
bool IsConverged(const MatType& H, const MatType& W);

// Return the value that should be returned for the `nmf.Apply()` function
// when convergence has been reached.  This is called at the end of
// `nmf.Apply()`.
const double Index();

// Return the number of iterations that have been completed.  This is called
// at the end of `nmf.Apply()`.
const size_t Iteration();
};
``````

### ðŸ”— Advanced Functionality: `InitializationRuleType`

• Specifies the strategy to use to initialize `W` and `H` at the beginning of the NMF algorithm.
• An initialization `InitializationRuleType` can be passed to the following constructor:
• `nmf = NMF(terminationPolicy, initializationRule)`
• The following choices are available for drop-in usage:

#### `RandomAcolInitialization<N>` (default):

• Initialize `W` by averaging `N` randomly chosen columns of `V`.
• Initialize `H` as uniform random in the range `[0, 1]`.
• The default value for `N` is 5.

#### `NoInitialization`:

• When `nmf.Apply(V, rank, W, H)`, the existing values of `W` and `H` will be used.
• If `W` is not of size `V.n_rows` x `rank`, or if `H` is not of size `rank` x `V.n_cols`, a `std::invalid_argument` exception will be thrown.

#### `GivenInitialization<MatType>`:

• Set `W` and/or `H` to the given matrices when `Apply()` is called.
• `MatType` should be set to the type of `W` or `H` (default `arma::mat`); see `Apply()` Parameters.
• Constructors:
• `GivenInitialization<MatType>(W, H)`
• Specify both initial `W` and `H` matrices.
• `GivenInitialization<MatType>(M, isW=true)`
• If `isW` is `true`, then set initial `W` to `M`.
• If `isW` is `false`, then set initial `H` to `M`.
• This constructor is meant to only be used with `MergeInitialization` (below).

#### `RandomInit`:

• Initialize `W` and `H` as uniform random in the range `[0, 1]`.

#### `AverageInitialization`:

• Initialize each element of `W` and `H` to the square root of the average value of `V`, adding uniform random noise in the range `[0, 1]`.

#### `MergeInitialization<WRule, HRule>`:

• Use two different initialization rules, one for `W` (`WRule`) and one for `H` (`HRule`).
• Constructors:
• `MergeInitialization<WRule, HRule>()`
• Create the merge initialization with default-constructed rules for `W` and `H`.
• `MergeInitialization<WRule, HRule>(wRule, hRule)`
• Create the merge initialization with instantiated rules for `W` and `H`.
• `wRule` and `hRule` will be copied.
• Any `WRule` and `HRule` classes must implement the `InitializeOne()` function.

#### Custom rules:

• A custom class for initializing `W` and `H` must implement the following functions.
``````// You can use this as a starting point for implementation.
class CustomInitialization
{
public:
// Initialize the W and H matrices, given V and the rank of the decomposition.
// This is called at the start of `Apply()`.
//
// Note that `MatType` may be different from `WHMatType`; e.g., `V` could be
// sparse, but `W` and `H` must be dense.
template<typename MatType, typename WHMatType>
void Initialize(const MatType& V,
const size_t rank,
WHMatType& W,
WHMatType& H);

// Initialize one of the W or H matrices, given V and the rank of the
// decomposition.
//
// If `isW` is `true`, then `M` should be treated as though it is `W`;
// if `isW` is `false`, then `M` should be treated as thought it is `H`.
//
// This function only needs to be implemented if it is intended to use the
// custom initialization strategy with `MergeInitialization`.
template<typename MatType, typename WHMatType>
void InitializeOne(const MatType& V,
const size_t rank,
WHMatType& M,
const bool isW);
};
``````

Use a pre-specified initialization for `W` and `H`.

``````// See https://datasets.mlpack.org/movielens-100k.csv.
arma::sp_mat V;

arma::mat W, H;

// Pre-initialize W and H.
// W will be filled with random values from a normal distribution.
// H will be filled with 1s.
W.randn(V.n_rows, 15);
H.set_size(15, V.n_cols);
H.fill(0.2);

mlpack::NMF<mlpack::SimpleResidueTermination, mlpack::NoInitialization> nmf;
const double residue = nmf.Apply(V, 15, W, H);
const double rmse = arma::norm(V - W * H, "fro") / std::sqrt(V.n_elem);

std::cout << "RMSE of NMF decomposition with pre-specified W and H: " << rmse
<< "." << std::endl;
``````

Use `ValidationRMSETermination` to decompose the MovieLens dataset until the RMSE of the held-out validation set is sufficiently low.

``````// See https://datasets.mlpack.org/movielens-100k.csv.
arma::sp_mat V;

arma::mat W, H;

// Create a ValidationRMSETermination class that will hold out 3k points from V.
// This will remove 3000 nonzero entries from V.
mlpack::ValidationRMSETermination<arma::sp_mat> t(V, 3000);

// Create the NMF object with the instantiated termination policy.
mlpack::NMF<mlpack::ValidationRMSETermination<arma::sp_mat>> nmf(t);

// Perform NMF with a rank of 20.
// Note the RMSE returned here is the RMSE on the validation set.
const double rmse = nmf.Apply(V, 20, W, H);
const double rmseTrain = arma::norm(V - W * H, "fro") / std::sqrt(V.n_elem);

std::cout << "Training RMSE:   " << rmseTrain << "." << std::endl;
std::cout << "Validation RMSE: " << rmse << "." << std::endl;
``````

Use all three sets of NMF update rules and compare the RMSE on a held-out validation set.

``````// See https://datasets.mlpack.org/movielens-100k.csv.
arma::sp_mat V;

arma::mat W1, W2, W3;
arma::mat H1, H2, H3;

// Create a ValidationRMSETermination class that will hold out 3k points from V.
// This will remove 3000 nonzero entries from V.
mlpack::ValidationRMSETermination<arma::sp_mat> t(V, 3000);

// Multiplicative distance update rule.
mlpack::NMF<mlpack::ValidationRMSETermination<arma::sp_mat>,
mlpack::RandomAcolInitialization<5>,
mlpack::NMFMultiplicativeDistanceUpdate> nmf1(t);

// Multiplicative divergence update rule.
mlpack::NMF<mlpack::ValidationRMSETermination<arma::sp_mat>,
mlpack::RandomAcolInitialization<5>,
mlpack::NMFMultiplicativeDivergenceUpdate> nmf2(t);

// Alternating least squares update rule.
mlpack::NMF<mlpack::ValidationRMSETermination<arma::sp_mat>,
mlpack::RandomAcolInitialization<5>,
mlpack::NMFALSUpdate> nmf3(t);

const double rmse1 = nmf1.Apply(V, 15, W1, H1);
const double rmse2 = nmf2.Apply(V, 15, W2, H2);
const double rmse3 = nmf3.Apply(V, 15, W3, H3);

// Print the RMSEs.
std::cout << "Mult. dist. update RMSE: " << rmse1 << "." << std::endl;
std::cout << "Mult. div. update RMSE:  " << rmse2 << "." << std::endl;
std::cout << "ALS update RMSE:         " << rmse3 << "." << std::endl;
``````

Use a custom termination policy that sets a limit on how long NMF is allowed to take. First, we define the termination policy:

``````class CustomTimeTermination
{
public:
CustomTimeTermination(const double totalAllowedTime) :
totalAllowedTime(totalAllowedTime) { }

template<typename MatType>
void Initialize(const MatType& /* V */)
{
totalTime = 0.0;
iteration = 0;
c.tic();
}

template<typename WHMatType>
bool IsConverged(const WHMatType& /* W */, const WHMatType& /* H */)
{
totalTime += c.toc();
c.tic();
++iteration;
return (totalTime > totalAllowedTime);
}

const size_t Iteration() const { return iteration; }

private:
double totalAllowedTime;
double totalTime;
size_t iteration;
arma::wall_clock c; // used for convenient timing
};
``````

Then we can use it in the test program:

``````// See https://datasets.mlpack.org/movielens-100k.csv.
arma::sp_fmat V;