mlpack
blog

Table of Contents
A list of all recent posts contained on this page.
 Proximal Policy Optimization - Summary
 String Processing Utilities - Summary
 Application of ANN Algorithms Implemented in mlpack - Summary
 Application of ANN Algorithms Implemented in mlpack - Week 12
 NEAT & Multiobjective Optimization - Summary
 Implementing Improved Training Techniques for GANs - Week 11
 Implementing Essential Deep Learning Modules - Week 12
 Application of ANN Algorithms Implemented in mlpack - Week 11
 Implementing Essential Deep Learning Modules - Week 11
 String Processing Utilities - Week 11
 Implementing Improved Training Techniques for GANs - Week 10
 Proximal Policy Optimization - Week 9
Proximal Policy Optimization - Summary
Time flies: the summer is coming to an end, and we have reached the final week of GSoC. This post summarizes my GSoC project, the implementation of one of the most promising deep reinforcement learning methods, Proximal Policy Optimization. During this project I implemented the policy optimization method and one classical continuous task, the lunar lander, to test my implementation. My pull request for prioritized experience replay was also merged into master.
Introduction
My work is mainly located in the methods/reinforcement_learning, methods/ann/loss_functions, and methods/ann/dists folders:
 ppo.hpp: the main entry point for proximal policy optimization.
 ppo_impl.hpp: the main implementation of proximal policy optimization.
 lunarlander.hpp: the implementation of the continuous task.
 prioritized_replay.hpp: the implementation of prioritized experience replay.
 sumtree.hpp: the implementation of the segment tree structure for prioritized experience replay.
 environment: the implementation of two classical control problems, i.e. mountain car and cart pole.
 normal_distribution.hpp: the implementation of a normal distribution which accepts a mean and a variance to construct the distribution.
 empty_loss.hpp: the implementation of an empty loss used in the proximal policy optimization class; we calculate the loss outside the model declaration, so the loss function does nothing except pass the gradient backward.
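To illustrate the idea behind sumtree.hpp, here is a minimal, self-contained sketch of a sum tree (not mlpack's actual class, and it assumes the capacity is a power of two): leaves hold priorities, internal nodes hold sums, and sampling proportional to priority walks down the tree in O(log n).

```cpp
#include <cstddef>
#include <vector>

// Minimal sum tree: leaves store priorities, internal nodes store sums.
// Illustrative sketch only; assumes `capacity` is a power of two.
class SumTree {
 public:
  explicit SumTree(size_t capacity)
      : capacity(capacity), tree(2 * capacity, 0.0) {}

  // Set the priority of leaf `idx` and update all ancestors.
  void Set(size_t idx, double priority) {
    size_t i = idx + capacity;
    tree[i] = priority;
    for (i /= 2; i >= 1; i /= 2)
      tree[i] = tree[2 * i] + tree[2 * i + 1];
  }

  // Sum of all priorities (the root).
  double Total() const { return tree[1]; }

  // Find the leaf whose prefix-sum interval contains `mass`,
  // where 0 <= mass <= Total().
  size_t Find(double mass) const {
    size_t i = 1;
    while (i < capacity) {
      if (mass <= tree[2 * i]) { i = 2 * i; }
      else { mass -= tree[2 * i]; i = 2 * i + 1; }
    }
    return i - capacity;
  }

 private:
  size_t capacity;
  std::vector<double> tree;
};
```

Drawing a uniform `mass` in [0, Total()) and calling `Find(mass)` then samples leaves proportionally to their priorities, which is exactly what prioritized experience replay needs.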
In total, I contributed the following PRs; most of the implementation is combined into the single proximal policy optimization pull request.
Change the pendulum action type to double
Fix typo in Bernoulli distribution
remove move() statement in hoeffding_tree_test.cpp
Highlights
The most challenging parts are:
 One of the most challenging parts of the work was calculating the surrogate loss for updating the actor network; it differs from the critic network update, which can be optimized by regression on the mean squared error. The actor network is optimized by maximizing the PPO-clip objective, which is difficult to express as a standard loss function that is computed from target and prediction parameters. So I calculate it outside the model, and the model declaration is passed into the empty loss function. Everything in the backward pass except the model part is calculated by my implementation, such as the derivatives with respect to the normal distribution's mean and variance.
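The PPO-clip objective mentioned above can be written as a tiny standalone function (a sketch of the math only, not mlpack's loss-function API):

```cpp
#include <algorithm>

// Clipped surrogate objective for a single (ratio, advantage) pair,
// where `ratio` is pi_new(a|s) / pi_old(a|s) and `advantage` is the
// estimated advantage. PPO maximizes the mean of this over a batch.
double PPOClipObjective(double ratio, double advantage, double epsilon = 0.2) {
  double clipped = std::min(std::max(ratio, 1.0 - epsilon), 1.0 + epsilon);
  return std::min(ratio * advantage, clipped * advantage);
}
```

Clipping removes the incentive to move the new policy far from the old one: once the ratio leaves [1 - epsilon, 1 + epsilon], improving it further no longer increases the objective.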
 Another challenging part was implementing proximal policy optimization in a continuous environment, where the action differs from the discrete case. In a discrete environment the agent outputs just one dimension of data to represent the action, while in a continuous environment action prediction is more complicated: the common approach is to predict a normal distribution and then sample an action from it.
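That sampling step can be sketched with the standard library (illustrative only; the actual implementation uses the custom normal distribution in methods/ann/dists):

```cpp
#include <random>

// Sample a continuous action from a Gaussian policy: the network
// predicts a mean and a standard deviation per action dimension.
double SampleAction(double mean, double stddev, std::mt19937& rng) {
  std::normal_distribution<double> dist(mean, stddev);
  return dist(rng);
}
```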
 There were other challenging parts as well, such as tuning the neural network to make the agent work. This is blocking me now, so I need to tune more parameters to pass the unit test. It is also a time-consuming process.
Future Work
The pull request for proximal policy optimization is still under development because of parameter tuning for the unit test, but it will be finished soon.
PPO can be used for environments with either discrete or continuous action spaces, so another piece of future work is to support discrete action spaces, even though they are easier than the continuous case.
In some continuous environment tasks the action has more than one dimension; we need to handle this situation as well.
Acknowledgment
A great thanks to Marcus for his mentorship during the project and his detailed code reviews. Marcus is helpful and often told me not to hesitate to ask questions; he gave me great help whenever something blocked me. I also want to thank Ryan for responding on IRC, even at midnight. The community has been kind since our first meeting, and we talked about many things across different areas. Finally, I appreciate the generous funding from Google. It was a really good project to sharpen my skills, and I will continue to contribute to mlpack to make the library easier to use and more powerful.
Thanks for this unforgettable summer session.
String Processing Utilities - Summary
This post summarizes my work for GSoC 2019.
Overview
The proposal for String Processing Utilities involved implementing basic functions that would be helpful for processing and encoding text, and then later implementing machine learning algorithms on top of them.
Implementation
String Cleaning (PR #1904)
 The implementation started with the string cleaning functions. A class-based approach was used, and the following functions were implemented:
 RemovePunctuation(): allows you to pass a string, known as punctuation, containing all the punctuation characters to be removed.
 RemoveChar(): allows you to pass a function pointer, function object, or lambda which returns a bool; if the return value is true, the character is removed.
 LowerCase(): converts the text to lower case.
 UpperCase(): converts the text to upper case.
 RemoveStopWords(): accepts a set of stopwords and removes all of those words from the corpus.
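To give a rough idea of what two of these functions do, here is a plain C++ sketch (hypothetical helpers, not mlpack's actual class methods):

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Remove every character that appears in `punctuation`; a sketch of
// the RemovePunctuation() behavior described above.
std::string RemovePunctuation(std::string text, const std::string& punctuation) {
  text.erase(std::remove_if(text.begin(), text.end(),
      [&](char c) { return punctuation.find(c) != std::string::npos; }),
      text.end());
  return text;
}

// Lower-case the text, as LowerCase() does.
std::string LowerCase(std::string text) {
  std::transform(text.begin(), text.end(), text.begin(),
      [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return text;
}
```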
 After implementing the class, I started implementing the CLI and Python bindings. Since mlpack uses Armadillo to load matrices, I had to write a function that could read data from a file using basic input/output streams. The file types are limited to .txt and .csv. The binding has different parameters to set, and it works as required based on the parameters passed.
String Encoding (PR #1960)
 The initial plan was to implement a different class for each encoding method, such as bag-of-words encoding, dictionary encoding, or tf-idf encoding, but we found that the classes had a lot of redundant code, so we decided to implement a policy-based design with a different policy for each encoding type.
 We implemented a StringEncoding class, which has the function for encoding the corpus (it accepts a vector as input) and outputs the encoded data based on the policy and output type (vector or arma::mat). It also provides an option to add or avoid padding, depending on the encoding policy.
 We also designed a helper class, StringEncodingDictionary, which maintains a dictionary mapping from each token to its label. The class is templated on the token type, which can be string_view or an int type. We arrived at this helper class based on the speed profiling done by lozhnikov: he gathered some results, and based on them we decided to implement the helper class.
Policies for String Encoding (PR #1969)
 We decided to implement three encoding policies, namely:
 Dictionary Encoding: encodes the corpus by assigning a positive integer to each unique token and treats the dataset as categorical; it supports both padded and non-padded output.
 Bag of Words Encoding: creates a vector over all unique tokens and assigns 1 if the token is present in the document and 0 if it is not.
 Tf-Idf Encoding: assigns a tf-idf value to each unique token.
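The dictionary encoding policy, for example, boils down to something like this sketch (illustrative only, not the mlpack implementation):

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Dictionary encoding: each unique token gets the next positive
// integer label, in order of first appearance.
std::vector<size_t> DictionaryEncode(const std::vector<std::string>& tokens) {
  std::map<std::string, size_t> labels;
  std::vector<size_t> encoded;
  for (const auto& token : tokens) {
    auto it = labels.find(token);
    if (it == labels.end())
      it = labels.emplace(token, labels.size() + 1).first;  // New label.
    encoded.push_back(it->second);
  }
  return encoded;
}
```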
Tokenizer (PR #1960)
 To support all the string processing and encoding algorithms, we often needed to tokenize strings, so we implemented two tokenizers in mlpack. The two tokenizers are as follows:
 CharExtract: splits a string into characters.
 SplitByAnyOf: tokenizes a string using a set of delimiters.
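A sketch of the SplitByAnyOf behavior (not the actual mlpack implementation):

```cpp
#include <string>
#include <vector>

// Split `text` at any character found in `delimiters`, skipping
// empty pieces between consecutive delimiters.
std::vector<std::string> SplitByAnyOf(const std::string& text,
                                      const std::string& delimiters) {
  std::vector<std::string> tokens;
  size_t start = text.find_first_not_of(delimiters);
  while (start != std::string::npos) {
    size_t end = text.find_first_of(delimiters, start);
    tokens.push_back(text.substr(start, end - start));
    start = text.find_first_not_of(delimiters, end);
  }
  return tokens;
}
```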
After implementing all the encoding policies and tokenizers, I implemented CLI and Python bindings (PR #1980) for string encoding. The string encoding and string cleaning functions share a lot of common code, so we decided to share a common file, string_processing_util.hpp, between the two bindings.
My proposal also included an implementation of Word2Vec, but we decided to opt out after finding that Google patented it.
Post GSoC
A lot of the code I implemented is sketchy, since I used boost::string_view and other Boost algorithms, so we need to do a speed check and find the bottlenecks, if any. My plan is also to implement a substitute for word2vec, such as GloVe or another word embedding algorithm. I had implemented a function for one-hot encoding, which I thought could be useful for word2vec, but we found that it was buggy to a small extent, so I have to find a way around that and also implement some overloaded functionality.
Lastly, and most importantly, I have to write tutorials for all the functionality provided, so that anyone can understand how to drop these functions into their codebase. I am also excited to do some machine learning on text datasets using mlpack.
Acknowledgement
A big thanks to lozhnikov, Rcurtin, Zoq, and the whole mlpack community. This was my first attempt at GSoC, and I am happy that I was successful. I fell in love with the open-source world, and it was a wonderful experience. I gathered a lot of knowledge in these past 3 months, and I will stay in touch with the mlpack community and seek to contribute more to the project in the future.
Also, I think it's time to order some mlpack stickers :)
Thanks :)
Application of ANN Algorithms Implemented in mlpack - Summary
Work
All GSoC contributions can be summarized by the following.
Contributions to mlpack/models
Pull Requests
 VGG19 on Imagenet dataset. #32
 Added LSTM Sentiment Analysis. #31
 Added LSTM Univariate Time series analysis #30 Merged
 Added LSTM for multivariate time series. #29 Merged
 Added VGG19 Model for MNIST Dataset. #28
Contributions to mlpack/mlpack
Merged Pull Requests
 Added support for Loading image #1903
 Rectified imports in Python documentation #1820.
 Added cellState as output params in LSTM. #1800.
 Added quoted_strings to regex in LoadCSV #1756.
 Added additional check to LoadARFF #1793.
 Make models more accessible in Python #1771.
 Rectified ann.txt (mlpack/doc) #1731.
 Added .pyc in .gitignore #1721.
Issues Created
 LoadARFF gets stuck if file is not found #1791.
 Exposing Cell and hidden state in LSTM #1782.
 Cannot load text in CSV files #1754.
Contributions to zoq/gym_tcp_api
Merged Pull Requests
Loading Images
The image utilities support loading and saving of images.
Loading supports the file types jpg, png, tga, bmp, psd, gif, hdr, pic, and pnm; saving supports jpg, png, tga, bmp, and hdr.
The associated data type is unsigned char, to support RGB values in the range 0-255. To feed data into a network, a typecast to arma::Mat
may be required. Images are stored in the matrix as (width * height * channels, NumberOfImages), so imageMatrix.col(0) is the first image if images are loaded into imageMatrix.
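Assuming the interleaved-channel, row-major pixel order produced by common image loaders such as stb_image (an assumption worth verifying against the actual loader), the row index of pixel (x, y), channel c, within one image column works out to:

```cpp
#include <cstddef>

// Row index of pixel (x, y), channel c, inside a matrix column that
// stores one image as width * height * channels values in
// interleaved-channel, row-major order.
size_t PixelIndex(size_t x, size_t y, size_t c,
                  size_t width, size_t channels) {
  return (y * width + x) * channels + c;
}
```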
Loading a test image also fills in the ImageInfo class object. Saving images works similarly.
VGG19
VGG19 is a convolutional neural network that is trained on more than a million images from the ImageNet database. The network is 19 layers deep and can classify images into 1000 object categories. Details about the network architecture can be found in the following arXiv paper:
Tiny Imagenet
The Tiny ImageNet Challenge is the default course project for Stanford CS231N. It runs similarly to the ImageNet challenge (ILSVRC). The goal of the challenge is to do as well as possible on the image classification problem. The model uses VGG19 to classify the images into 200 classes.
MNIST handwritten digits
The VGG19 model is used for classification; it creates a sequential layer that encompasses the various layers of VGG19.
Sentiment Analysis
We build a classifier on the IMDB movie dataset using a deep learning technique called an RNN, which can be implemented using the Long Short Term Memory (LSTM) architecture. The encoded IMDB dataset contains a vocab file along with sentences encoded as sequences. A sample data point looks like [1, 14, 22, 16, 43, 530, ..., 973, 1622, 1385, 65]: this sentence contains the 1st word, the 14th word, and so on from the vocabulary.
A vectorized input has to be fed into the LSTM to exploit the RNN architecture. To vectorize the sequence, dictionary encoding is used. The sample shown would be transformed to [[1, 0, 0, ..., 0], [0, ..., 1, 0, ...], ...], where the first list has a 1 in the 1st position and 0 elsewhere, and the second list has a 1 as the 14th element and 0 elsewhere. Each list has a size equal to the number of words in the vocabulary.
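That one-hot transformation can be sketched as follows (illustrative only; vocabulary indices are 1-based, as in the sample above):

```cpp
#include <vector>

// One-hot encode a sequence of 1-based vocabulary indices:
// token k becomes a row with a 1 at position k - 1 and 0 elsewhere.
std::vector<std::vector<int>> OneHot(const std::vector<int>& sequence,
                                     int vocabSize) {
  std::vector<std::vector<int>> result;
  for (int token : sequence) {
    std::vector<int> row(vocabSize, 0);
    row[token - 1] = 1;
    result.push_back(row);
  }
  return result;
}
```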
Accuracy Plots
Time Series Analysis
Multivariate
We want to use the power of the LSTM for Google stock prediction on a time series, using mlpack and a Recurrent Neural Network (RNN).
MSE Plot
Univariate
An example of using a Recurrent Neural Network (RNN) to make forecasts on a time series of electricity usage (in kWh), which we aim to solve using a recurrent neural network with LSTM.
MSE Plot
Results on other datasets - International Airline Passengers
This is a problem where, given a year and a month, the task is to predict the number of international airline passengers in units of 1,000. The data ranges from January 1949 to December 1960, or 12 years, with 144 observations.
We will create a dataset where X is the number of passengers at a given time (t) and Y is the number of passengers at the next time (t + 1) over the period of 'rho' frames.
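That dataset construction can be sketched as follows (the function name and shapes are mine, chosen to match the description):

```cpp
#include <cstddef>
#include <vector>

// Build (X, Y) training pairs from a series: X is the window of values
// starting at time t, and Y is the same window shifted by one step,
// each of length rho.
void MakeWindows(const std::vector<double>& series, size_t rho,
                 std::vector<std::vector<double>>& X,
                 std::vector<std::vector<double>>& Y) {
  for (size_t t = 0; t + rho < series.size(); ++t) {
    X.push_back(std::vector<double>(series.begin() + t,
                                    series.begin() + t + rho));
    Y.push_back(std::vector<double>(series.begin() + t + 1,
                                    series.begin() + t + 1 + rho));
  }
}
```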
Mean squared error up to 10 iterations during training:
1 - MeanSquaredError := 0.146075
2 - MeanSquaredError := 0.144882
3 - MeanSquaredError := 0.09501
4 - MeanSquaredError := 0.0875479
5 - MeanSquaredError := 0.0836975
6 - MeanSquaredError := 0.0796567
7 - MeanSquaredError := 0.0804368
8 - MeanSquaredError := 0.0803483
9 - MeanSquaredError := 0.0809061
10 - MeanSquaredError := 0.076797
...
MSE Plot
Future Work
The tutorials associated with the implemented models are not yet published on the mlpack webpage, and the blog posts need to be linked from a common place for users. VGG19 is being trained on the tiny-imagenet dataset, and the results will be added.
Acknowledgement
I am sincerely grateful to the whole mlpack community, especially Ryan Curtin, Marcus Edel, Sumedh Ghaisas, and Shikhar Jaiswal, for the support I received. It was an awesome experience.
NEAT & Multiobjective Optimization - Summary
Overview
The aim of my project for Google Summer of Code 2019 was to implement NeuroEvolution of Augmenting Topologies (NEAT) in mlpack based on Kenneth Stanley's paper Evolving Neural Networks through Augmenting Topologies. I would also implement support for "phased searching", a searching scheme devised by Colin Green to prevent genome bloat when training NEAT on certain complex tasks.
Besides this, my project aimed to create a framework for multi-objective optimization within mlpack's optimization library, ensmallen. This involved the implementation of several test functions and indicators, as well as an optimizer, NSGA-III.
Implementation
NEAT
NeuroEvolution of Augmenting Topologies (NEAT) is a genetic algorithm that can evolve networks of unbounded complexity by starting from simple networks and "complexifying" through different genetic operators. It has been used to train agents to play Super Mario World and generate "genetic art".
I implemented NEAT in PR #1908. The PR includes the entire implementation including phased searching, associated tests and documentation. NEAT was tested on:
 The XOR test, where its challenge was to create a neural network emulating a two-input XOR gate. NEAT was able to solve this within 150 generations with an error of less than 0.1.
 Multiple reinforcement learning environments implemented in mlpack.
 The pole balancing task in OpenAI Gym. This was done using the Gym TCP API implemented by my mentor, Marcus Edel. A short video of the trained agent can be seen here.
 The double pole balancing test. I implemented this as an addition to the existing reinforcement learning codebase. NEAT performed well on both the Markovian and non-Markovian versions of the environment.
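For reference, the XOR task scores a candidate network by its total error over the four input patterns; a minimal fitness function might look like this (illustrative sketch only; NEAT evaluates evolved genomes, not lambdas):

```cpp
#include <cmath>
#include <functional>

// Fitness for the XOR task: sum of absolute errors over the four
// input patterns, for any callable two-input network.
double XorError(const std::function<double(double, double)>& net) {
  const double inputs[4][2] = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
  const double targets[4] = {0, 1, 1, 0};
  double error = 0.0;
  for (int i = 0; i < 4; ++i)
    error += std::fabs(net(inputs[i][0], inputs[i][1]) - targets[i]);
  return error;
}
```

A genome "solves" XOR once this error drops below a threshold such as the 0.1 mentioned above.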
The pull request is rather large and is still under review.
Multiobjective optimization
Multi-objective optimization is an area of multiple-criteria decision making concerned with mathematical optimization problems involving more than one objective function to be optimized simultaneously. NSGA-III (Non-dominated Sorting Genetic Algorithm III) is an extension of the popular NSGA-II algorithm, which optimizes multiple objectives by associating members of the population with a reference set of optimal points.
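The building block of all NSGA-style algorithms is the Pareto dominance test, which can be sketched as follows (minimization assumed):

```cpp
#include <cstddef>
#include <vector>

// Pareto dominance for minimized objectives: `a` dominates `b` when it
// is no worse in every objective and strictly better in at least one.
bool Dominates(const std::vector<double>& a, const std::vector<double>& b) {
  bool strictlyBetter = false;
  for (size_t i = 0; i < a.size(); ++i) {
    if (a[i] > b[i]) return false;       // Worse in some objective.
    if (a[i] < b[i]) strictlyBetter = true;
  }
  return strictlyBetter;
}
```

Non-dominated sorting repeatedly extracts the set of solutions that no other solution dominates, producing the fronts that NSGA-II and NSGA-III rank the population by.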
I implemented support for multi-objective optimization in PR #120. This PR includes:
 An implementation of NSGA-III. This code is still being debugged and tested.
 An implementation of the DTLZ test functions.
 Implementations of multiple indicators, including the epsilon indicator and the Inverse Generational Distance Plus (IGD+) indicator.
 Associated documentation.
Other work
Besides the work explicitly stated in my project, I also made some smaller changes and additions in the reinforcement learning codebase. This includes:
 Implementation of both a Markovian and a non-Markovian (velocity information not provided) version of the double pole balancing environment (see #1901 and #1951).
 Fixed issues with consistency in mlpack's reinforcement learning API, where certain functions and variables were missing from some environments. Besides this, the environments now have an option to terminate after a given number of steps. See #1941.
 Added Q-learning tests wherever necessary to prevent consistency issues in the future. See #1951.
Future Work
 Get the NEAT PR merged.
 Finish testing and debugging the NSGA-III optimizer.
 Write more indicators for multi-objective optimization.
Final Comments
I would like to thank my mentors, Marcus Edel and Manish Kumar for all their help and advice, as well as bearing with me through the multiple technical difficulties I faced during this summer. I'd also like to thank all the other mentors and participants for their support. I hope to continue to contribute to this community in the future, and look forward to the same.
If anyone is interested in seeing my weekly progress through the program, please see my weekly blog.
Implementing Improved Training Techniques for GANs - Week 11
This week, after Toshal's PR adding support for more than 50 layers was merged, a lot of my work was ready to be merged. Specifically, the padding layer has now been completed and merged. The mini-batch discrimination layer is also complete and the build is finally passing, so we should be able to merge that soon as well!
Other than that, I have continued my work on the spectral norm layer. One difficulty with the implementation is that the layer uses a power iteration method during the forward pass to compute the spectral norm of the weight matrix. I am not completely sure how we would compute the gradient for this approximation. I have been trying to do the manual derivation, but it is very tedious and I have not been successful so far. Hopefully I can get it to work in the coming week; otherwise I hope to continue the work post-GSoC.
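For context, the power iteration in question estimates the largest singular value of W by alternately multiplying by W and its transpose; a minimal sketch (not the layer's code):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Power iteration for the spectral norm (largest singular value) of W,
// as used in spectral normalization: alternately update estimates of
// the leading left (u) and right (v) singular vectors.
double SpectralNorm(const std::vector<std::vector<double>>& W,
                    int iters = 100) {
  size_t m = W.size(), n = W[0].size();
  std::vector<double> v(n, 1.0), u(m, 0.0);
  for (int it = 0; it < iters; ++it) {
    // u = normalize(W v).
    double norm = 0.0;
    for (size_t i = 0; i < m; ++i) {
      u[i] = 0.0;
      for (size_t j = 0; j < n; ++j) u[i] += W[i][j] * v[j];
      norm += u[i] * u[i];
    }
    norm = std::sqrt(norm);
    for (size_t i = 0; i < m; ++i) u[i] /= norm;
    // v = normalize(W^T u).
    norm = 0.0;
    for (size_t j = 0; j < n; ++j) {
      v[j] = 0.0;
      for (size_t i = 0; i < m; ++i) v[j] += W[i][j] * u[i];
      norm += v[j] * v[j];
    }
    norm = std::sqrt(norm);
    for (size_t j = 0; j < n; ++j) v[j] /= norm;
  }
  // sigma = u^T W v.
  double sigma = 0.0;
  for (size_t i = 0; i < m; ++i)
    for (size_t j = 0; j < n; ++j) sigma += u[i] * W[i][j] * v[j];
  return sigma;
}
```

In spectral normalization papers the gradient is usually taken by treating u and v as constants, so the layer backpropagates through W / sigma with sigma = u^T W v fixed per step; whether that approximation is acceptable here is exactly the open question above.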
Implementing Essential Deep Learning Modules - Week 12
Well, my Bias visitor PR is merged. My exact-objective PR is also merged. I also added a workaround to enable adding more layers to the ANN; it looks like we can have as many layers as we want, since we can always develop a tree-like structure for adding more layers. The most interesting thing about the workaround PR was that its branch could be deleted just two days after its creation.
As soon as the workaround for adding more layers got merged, my WeightNormalization layer was ready to merge. My Inception layer can also be merged now, and I will start working on it soon.
In the upcoming week, I am thinking of adding a small test for LSGAN so that it can get merged; its actual image testing will need some time. Most of the tests available online use a dual optimizer, so deciding the parameters for training will be challenging. If possible, I will also try to finish my work on the Inception layer.
I am also currently testing my Dual Optimizer PR. It has been running on savannah for a long time (approximately 50 days). Hopefully I will see good results after it completes :)
Application of ANN Algorithms Implemented in mlpack - Week 11
Tasks
 Completed the tutorials on univariate LSTM.
 Added tutorials for VGG19 on the MNIST dataset.
 Completed the PR on VGG19 for the tiny-imagenet dataset.
 Added tutorials for VGG19 trained on imagenet.
Next Week Goals
 Linking the tutorials to the mlpack page.
Thanks for reading. Have a nice day :smile:
Implementing Essential Deep Learning Modules - Week 11
In the last two weeks I have completed the exact-objective PR. My Weight Norm layer is also quite complete and ready to merge. Both PRs required some time, but they are now ready.
My radical_test PR also got merged. It was quite weird that the memory check was timing out even after 15 hours of building, but after merging, the build passes without any issues. Maybe there was a glitch somewhere in the system.
I have also started to implement LSGAN. It's almost complete; I will need to add some validation to ensure that the correct loss function is used for LSGAN. Testing it on savannah will also take some time.
In the upcoming week I am aiming to complete LSGAN and the Inception layer module. If possible, I will also start working on BiGAN.
String Processing Utilities - Week 11
This week started with extending lozhnikov's PR #1960. I added the bag-of-words encoding policy, added tf-idf with different variants, namely raw count, binary, sublinear tf, and term frequency, and added tests for both encoding policies. I think we are almost done with the encodings; maybe some minor fixups still need to be done.
Coming to the string cleaning PR, we are done with that too. I made some minor fixups last week, added some tests for the CLI binding, and updated the documentation. Again, some minor fixups remain, but apart from that everything is done.
For the coming week, my priority is to complete the Word2Vec algorithm; maybe I can get the initial API done by the coming week and then complete the full-fledged API by the week after.
Also, post-GSoC, I will write tutorials both on how to drop the string encoding API into your code and on how to use the matrix scaling API, so stay tuned for both of those :)
Thank you :)
Implementing Improved Training Techniques for GANs - Week 10
This week has been productive. I was able to finish the regularizers PR, and it has been merged. I completed the work for CGAN<> after implementing a Concat<> visitor, and I have also written a test for it. Once we can successfully produce results from the CGAN, I think that will also be ready for merge. The major remaining task is checking the gradient issue in GAN, as pointed out by Toshal. I have also kept my other PRs, on the Padding<> and MinibatchDiscrimination layers, updated. The work there is complete but blocked, as we are unable to support more than 50 layer types at the moment. Hopefully we can find a solution in the coming weeks.
Proximal Policy Optimization - Week 8
This week, I fixed the memory access violation bug and some other bugs that had troubled me for a long time. I found more problems than I expected. This is my first time writing a model whose loss is calculated outside the model. I was familiar with PyTorch and TensorFlow, so I wrote code with that stereotype in mind; for example, I thought the normal distribution would accept a mean and a variance as parameters, when in fact it accepts a mean and a covariance. I am wondering whether to rewrite the distribution to make it consistent with the PyTorch framework. With my mentor's kind reminder, I realized that I am a little behind schedule. Yes, I was too optimistic about the workload, and I think I need to devote more time to speed up progress.
Thanks for reading :).
Proximal Policy Optimization - Week 9
This week, I rewrote the normal distribution on my own, so I can predict the normal distribution's parameters, mean and variance, to construct the distribution. Then the agent can sample an action using the distribution. After reading more carefully, the diagonal Gaussian distribution may offer the same functionality, so maybe I can switch to using that.
This week, I was stuck on how to backpropagate the gradient through the network. With the help of members of the mlpack community, I have a clearer idea of it now. Maybe next week I can solve this problem.
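For the univariate case, part of that gradient has a closed form: for log N(x; mu, sigma^2), the derivative with respect to the mean is (x - mu) / sigma^2. A sketch with a finite-difference check (my own illustration, not the mlpack code):

```cpp
#include <cmath>

// Log-density of a univariate normal N(mean, var) at x.
double LogProb(double x, double mean, double var) {
  const double kPi = 3.14159265358979323846;
  return -0.5 * std::log(2.0 * kPi * var)
         - (x - mean) * (x - mean) / (2.0 * var);
}

// Closed-form gradient of the log-density w.r.t. the mean:
// d logp / d mean = (x - mean) / var.
double DLogProbDMean(double x, double mean, double var) {
  return (x - mean) / var;
}
```

Backpropagating through the policy then amounts to multiplying this term by the upstream gradient of the surrogate objective, which is what has to flow into the network's output layer.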
Thanks for reading :).
4 August 2019
Proximal Policy Optimization - Week 10
This week, I finally completed the backward pass of the network. The problem was a bit of a challenge for me in the beginning; the key to solving it was a clear understanding of how the network graph is built, so that we can propagate the error backward through the graph. If I come across the same problem with a more complicated graph, I think I can solve it on my own. I want to thank everyone who gave me so much help in practice.
Thanks for reading :).
11 August 2019