This summer I got an opportunity to work with
mlpack organisation. My proposal for
Implementing Essential Deep Learning Modules got selected, and under the mentorship of
Shikhar Jaiswal and
Marcus Edel I was working on it for the last three months.
The main aim of my project was to enhance the existing
Generative Adversarial Network (GANS) framework present in
mlpack. The target was to add some more
GANs and to implement some new algorithms to
mlpack such that performance of
GANs is improved. The project also focussed on adding some new algorithms so that testing of
GANs becomes feasible.
Improving Serialization of GANs
The first challenge in the existing
GAN framework was to enable loading and saving the
GAN model correctly. In
mlpack the loading and saving of models was done with the help of
Boost Serialization. The most important responsibility was to have complete consistency so that all the functionalities of the model are working perfectly fine after saving and loading the model. The Pull Request #1770 focussed on this and it will get merged soon.
Providing Option to Calculate Exact Objective
In order to make Dual Optimizer functionality efficient it was required to remove the overhead of calculating the objective over the entire data after optimization. In order to do so I opened #109 in mlpack's
ensmallen repository. It is merged and now ensmallen's
Decomposable Optimizer's provide an option to user that weather he wish to calculate exact objective after optimization or not.
Dual Optimizer for GANs
Various research papers published related to
GANs used two seperate Optimizer's in their experiments. However mlpack
GAN framework had only one optimizer. Due to this testing of
GAN was quite tedious. So the main aim of the #1888 Pull Request was to add
Dual Optimizer functonality in GANs. The implementation of
Dual Optimizer is quite complete and currently it's testing is going on.
Label Smoothing in GANs
One sided label smoothing mentioned in the Improved GAN paper was seen to give better result while training
GAN model. Also in order to add LSGAN in mlpack label smoothing was required. It's implementation and testing is quite complete in #1915, however to make label smoothing work perfectly some commits from serialization PR were required. So, it will get merged once #1770 gets merged.
In order to prevent normalizing the
Bias parameter in Weight Norm Layer a bias visitor was required to set the
bias parameters of a layer. The first step was to add getter method
Bias() in some layers. Afterwards these getter methods were used to set the weights. The visitor is merged with the help of #1956 PR.
Work Around to add more layers
Boost::variant method is able to handle only 50 types. So inorder to add more layers to mlpack's ANN module a work around was required. After digging somewhat about the error online I found a work around. The
Boost::variant method provides an implicit conversion which enables adding as many layers as we can with the help of tree like structure. The #1978 PR was one of the fastest to get merged. I just completed it in two days such that the Weight Norm layer gets merged.
Weight Normalization Layer
Weight Normalization is a technique similar to
Batch Normalization which normalize the
weights of a layer rather than the activation. In this layer only
Non-bias weights are normalized. Due to normalization the gradients are projected away from the weight vector, thus testing the
gradients got tedious. The layer is implemented as a wrapper around another layer in #1920.
In order to complete my Frechet Inception Distance PR GoogleNet was required. In order to do that
Inception Layer is required. There are various versions of the
Inception Layer. The first version of the layer is quite complete. However #1950 will be merged after implementing it's all three versions. The
Inception Layer is basically just a
wrapper around a
Frechet Inception Distance
Frechet Inception Distance is used for testing the performance of the
GAN model. It uses the concept of
Frechet Distance which compares two
Gaussian Distribution as well as the parameters of the
Inception model. In order to get the parameters of Inception model #1950 will be required to merge first. The
Frechet Distance is currently implemented in #1939 and it will be integrated with
Inception Model once it is merged.
Fix failing radical test
While working on this PR I learned how important and tough testing is. The
RadicalMainTest was failing about 20 times in 100000 iterations. After quite a lot of digging it was found that the reason was that random values were being used for testing. With this PR I learned about
Whitening of a matrix and many other important concepts. The #1924 PR provided a fix matrix for the test.
Serialization of glimpse, meanPooling and maxPooling layers
While working on Gan Serialization, I found that the
maxPooling are not serialized properly. I fixed the serialization in #1905 PR. Finding the error was one of the patience testing job but it felt quite satisfied after fixing it.
Generator Update of GANS.
GANs I found one error in the update mechanism of it's
Generator. The issue is being discussed with the mentors, however the error seems ambiguous. Hence, #1933 will get merged after arriving at the conclusion on the issue.
Least Squares GAN uses
Least Squares error along with
smoothed labels for training. It's implementation is quite complete and #1972 will get merged once LSGANs testing will get completed.
Pull Request Summary
Merged Pull Requests
- Pull Request ensmallen/#109 - Providing option to calculate exact objective
- Pull Request #1905 - Serialization of glimpse, meanPooling and maxPooling layer
- Pull Request #1920 - Weight Normalization Layer
- Pull Request #1924 - Fix failing radical test
- Pull Request #1956 - Bias Visitor
- Pull Request #1978 - Work Around to add more layers
Open Pull Requests
- Pull Request #1770 - Improving Serialization of GANs
- Pull Request #1888 - Dual Optimizer for GANs
- Pull Request #1915 - Label Smoothing in GANs
- Pull Request #1933 - Error in Generator Update of GANs
- Pull Request #1939 - Frechet Inception Distance
- Pull Request #1950 - Inception Layer
- Pull Request #1972 - LSGAN
- Completion of Open Pull Requests.
- Addition of Stacked-GAN in
- Command Line Interface for training
- Command Line Interface for testing
I learned quite a lot while working with mlpack till now. When I joined mlpack I was quite a beginner in
Machine Learning and in the past months I have learned quite a lot. I also learned quite a lot how
Object Oriented Programming helps in
Developing a big project. Also my patience got tested while
debugging the code I have written. Overall it was quite good learning and fun.
I will keep contributing and will ensure that all of my Open PR's get merged.
I would also like to thank my mentors
rcurtin for their constant support and guidance throughout the summer. I learned quite a lot of things from them. I would also like to thank Google to give me an opportunity to work with such highly experienced people.
Generated by 1.8.13