mlpack  blog
String Processing Utilities - Week 01

Jeffin Sam, 09 June 2019

Apologies for being late again; I was waiting for the weekend to write this post.

It has been a successful first week of GSoC at mlpack. During this week I tried implementing functions that are useful for string processing: removing stopwords, removing punctuation, and converting strings to lowercase or uppercase. Initially I thought of implementing them as standalone functions, but after a short discussion we agreed on a class-based implementation. I have opened a work-in-progress PR #1904, and the implementation still needs a lot of refactoring. We also need to settle on a concrete plan for implementing the bindings for these functions. I am a little behind schedule, but I expect to catch up.
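To give a rough idea of what a class-based design for these utilities could look like, here is a minimal sketch. The class name `StringCleaner` and its method names are made up for illustration and are not the API from the PR; the sketch also assumes ASCII text and whitespace-separated words.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical class: the name and methods are illustrative only and do not
// reflect the actual mlpack implementation.
class StringCleaner
{
 public:
  // Convert every string in `input` to lowercase, in place.
  void LowerCase(std::vector<std::string>& input) const
  {
    for (std::string& s : input)
    {
      std::transform(s.begin(), s.end(), s.begin(),
          [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    }
  }

  // Convert every string in `input` to uppercase, in place.
  void UpperCase(std::vector<std::string>& input) const
  {
    for (std::string& s : input)
    {
      std::transform(s.begin(), s.end(), s.begin(),
          [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    }
  }

  // Erase every character that std::ispunct classifies as punctuation.
  void RemovePunctuation(std::vector<std::string>& input) const
  {
    for (std::string& s : input)
    {
      s.erase(std::remove_if(s.begin(), s.end(),
          [](unsigned char c) { return std::ispunct(c) != 0; }), s.end());
    }
  }

  // Drop every whitespace-separated word found in `stopwords`. The words
  // that remain are joined back together with single spaces.
  void RemoveStopWords(std::vector<std::string>& input,
                       const std::unordered_set<std::string>& stopwords) const
  {
    for (std::string& s : input)
    {
      std::istringstream stream(s);
      std::string word, result;
      while (stream >> word)
      {
        if (stopwords.count(word) > 0)
          continue;
        if (!result.empty())
          result += ' ';
        result += word;
      }
      s = result;
    }
  }
};
```

With this sketch, a caller would typically lowercase first, then strip punctuation, then remove stopwords, so that the stopword lookup sees normalized words. Grouping the operations in one class also leaves room for shared state later, such as a stored stopword set.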

Now for the updates on my previous PR #1814. To make the functionality efficient in terms of both space and time, we introduced boost::string_view to avoid making many copies. That also brought a number of small issues that had to be resolved, and as a result I had to overload the copy constructor and the assignment operator. Serialization was also added to PR #1814 and to PR #1876.
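One common reason a class holding string views needs a hand-written copy constructor and assignment operator is that a view does not own its characters, so a default copy would keep pointing into the original object's strings. The sketch below illustrates that general problem; I am assuming this is the kind of issue involved here. It uses `std::string_view` from C++17 as a stand-in for `boost::string_view`, and the class `TokenDictionary` and its members are hypothetical, not code from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <string_view>
#include <unordered_map>

// Hypothetical class: maps each distinct token to an integer label. The map
// keys are views into strings owned by `tokens`, so each token's characters
// are stored only once.
class TokenDictionary
{
 public:
  TokenDictionary() = default;

  // A default copy would copy views that still point into `other.tokens`.
  // Copy the strings first, then rebuild the views over our own storage.
  TokenDictionary(const TokenDictionary& other) : tokens(other.tokens)
  {
    RebuildViews();
  }

  TokenDictionary& operator=(const TokenDictionary& other)
  {
    if (this != &other)
    {
      labels.clear();  // Drop the old views before their strings go away.
      tokens = other.tokens;
      RebuildViews();
    }
    return *this;
  }

  // Return the label of `token`, adding the token if it has not been seen.
  std::size_t Add(std::string_view token)
  {
    const auto it = labels.find(token);
    if (it != labels.end())
      return it->second;

    // std::deque::emplace_back keeps references to existing elements valid,
    // so the views already stored in `labels` are not disturbed.
    tokens.emplace_back(token);
    const std::size_t label = tokens.size() - 1;
    labels.emplace(std::string_view(tokens.back()), label);
    return label;
  }

  bool Contains(std::string_view token) const
  {
    return labels.count(token) > 0;
  }

  // Return the token stored under `label`.
  const std::string& Token(std::size_t label) const
  {
    assert(label < tokens.size());
    return tokens[label];
  }

  std::size_t Size() const { return tokens.size(); }

 private:
  // Point every key in `labels` at the strings owned by this object.
  void RebuildViews()
  {
    labels.clear();
    for (std::size_t i = 0; i < tokens.size(); ++i)
      labels.emplace(std::string_view(tokens[i]), i);
  }

  std::deque<std::string> tokens;
  std::unordered_map<std::string_view, std::size_t> labels;
};
```

In this sketch a copy stays valid after the original goes out of scope, because its views refer only to its own `tokens`. The `std::deque` is a deliberate choice: a `std::vector` could reallocate while growing and move its strings, which would invalidate the stored views.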

I will try to be more punctual in updating you all next time. Thank you :)

Note
Since the world is full of memes, here is a particular one which I would like to share with you all.