mlpack IRC logs, 2018-06-26

Logs for the day 2018-06-26 (starts at 0:00 UTC) are shown below.

--- Log opened Tue Jun 26 00:00:20 2018
02:05 -!- manish7294 [~yaaic@2405:205:22a0:b2d3:b9ac:5350:e446:dba9] has joined #mlpack
02:07 < manish7294> rcurtin: Are you up?
02:09 < manish7294> rcurtin: Shall we start working on boostmetric as well? What do you think?
02:22 < rcurtin> manish7294: I am awake but I'm sorry, it's been a super busy day. it will have to be tomorrow until I can respond, sorry about that
02:22 < rcurtin> if you'd like to start on boostmetric that's ok, but I still have to open lots of LMNN optimization tickets, so we should still take some time on that
02:23 < manish7294> rcurtin: No problem, I was thinking to start as I have some time now.
02:23 < rcurtin> right, if you have nothing to do now, I see no reason not to get started
02:24 < manish7294> great, I will try looking into that, bye :)
02:31 -!- manish7294 [~yaaic@2405:205:22a0:b2d3:b9ac:5350:e446:dba9] has quit [Quit: Yaaic - Yet another Android IRC client -]
11:06 < jenkins-mlpack> Yippee, build fixed!
11:06 < jenkins-mlpack> Project docker mlpack nightly build build #361: FIXED in 3 hr 52 min:
11:32 -!- witness_ [uid10044@gateway/web/] has quit [Quit: Connection closed for inactivity]
13:14 < rcurtin> hi everyone, I wanted to let everyone know that I've quit my job at Symantec and will start a new one at a startup that focuses on in-database machine learning
13:14 < rcurtin> the reason I mention it here is that it will affect our build server masterblaster (and the other systems), which are Symantec property
13:14 < rcurtin> we can still use them for now, but I'm not sure for how long---it may be two weeks, it may be much longer
13:15 < rcurtin> I'll be transitioning Jenkins to run on the domain name, and physically the Jenkins master will be the same system that runs
13:15 < rcurtin> so, Jenkins will still run, but I suspect that there may be fewer build executors in the future to do things like the crazy nightly build and weekly builds
13:16 < rcurtin> though to be honest, maybe it is a good thing for the environment if we do fewer builds :)
13:16 < rcurtin> I'll make these changes over the next week
13:20 < zoq> rcurtin: Best of luck at the new position, and let me know if you need any help.
13:21 < rcurtin> sounds good, and I am looking forward to the new job... I think it will be a better fit where I can focus more on fast machine learning :)
13:21 < rcurtin> I may (but I am not sure yet) be writing Julia bindings for mlpack as a part of my first work there
13:22 < rcurtin> and I guess it goes without saying but I won't be disappearing from this community or anything :)
13:27 < zoq> rcurtin: Perhaps we could look into jenkins blue ocean :)
13:32 < rcurtin> that looks fine to me, do I understand correctly that it's mostly just a set of nice plugins for Jenkins?
13:33 < zoq> Yeah, that provides a modern interface.
13:34 < rcurtin> ah, ok, well once I move the Jenkins config we can overhaul it with the blue ocean plugins
14:04 < sumedhghaisas> rcurtin: Best of luck for the new position Ryan :)
14:04 < sumedhghaisas> although, what is in-database machine learning?
14:05 < ShikharJ> rcurtin: Congratulations on the new job!
14:20 -!- ImQ009 [~ImQ009@unaffiliated/imq009] has joined #mlpack
14:23 -!- yaswagnere [4283a544@gateway/web/freenode/ip.] has joined #mlpack
14:26 -!- manish7294 [8ba77436@gateway/web/freenode/ip.] has joined #mlpack
14:32 < manish7294> rcurtin: Congrats! 👍
14:37 < rcurtin> thanks everyone!
14:37 < rcurtin> so the idea is this...
14:38 < rcurtin> normally you have all your data in some big set of SQL tables or whatever
14:38 < rcurtin> then you do some costly join operation across a bunch of tables to get your data matrix
14:39 < rcurtin> if you can do machine learning directly in the database you may be able to avoid some joins
14:40 < rcurtin> and this can result in large speedups
14:40 < rcurtin> anyway I don't know too much more quite yet
14:40 < rcurtin> but I do like accelerating things so it sounds good to me :)
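One way to picture the "avoid some joins" idea above is an aggregate that can be computed from per-table summaries instead of from the joined rows. The sketch below is only an illustration under assumptions: the `Table` type and both function names are hypothetical, and it is not a description of any particular in-database system. It computes the sum of x * y over an equi-join of two tables, first by materializing the join and then from per-key sums.

```cpp
#include <map>
#include <utility>
#include <vector>

// Hypothetical table layout: each row is (join key, value).
using Table = std::vector<std::pair<int, double>>;

// Baseline: materialize every joined row, then aggregate x * y.
double SumViaJoin(const Table& a, const Table& b)
{
  std::vector<std::pair<double, double>> joined;
  for (const auto& ra : a)
    for (const auto& rb : b)
      if (ra.first == rb.first)
        joined.emplace_back(ra.second, rb.second);

  double sum = 0.0;
  for (const auto& row : joined)
    sum += row.first * row.second;
  return sum;
}

// Same aggregate without building the join: for each key, the sum over all
// joined pairs of x * y equals (sum of x for that key) * (sum of y for it).
double SumWithoutJoin(const Table& a, const Table& b)
{
  std::map<int, double> sumA, sumB;
  for (const auto& r : a)
    sumA[r.first] += r.second;
  for (const auto& r : b)
    sumB[r.first] += r.second;

  double sum = 0.0;
  for (const auto& kv : sumA)
  {
    const auto it = sumB.find(kv.first);
    if (it != sumB.end())
      sum += kv.second * it->second;
  }
  return sum;
}
```

Both functions return the same value, but the second one never stores the joined rows, which is where the potential saving comes from when a key matches many rows on both sides.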
15:58 -!- manish7294 [8ba77436@gateway/web/freenode/ip.] has quit [Ping timeout: 260 seconds]
15:59 -!- zoq [] has quit [Remote host closed the connection]
16:01 -!- zoq [] has joined #mlpack
16:51 < ShikharJ> zoq: Are you there?
17:03 < zoq> ShikharJ: I'm here now.
17:05 < ShikharJ> zoq: I think we can reuse the code for GAN by implementing just the specific Evaluate and Gradient routines, for different variants.
17:06 < ShikharJ> zoq: For GAN and DCGAN, the two policies are the same. For WGAN, weight clipping only requires a couple of lines of change inside the Gradient function.
17:07 < ShikharJ> zoq: For WGAN with Gradient Penalty, we would need to change both the Evaluate and Gradient (as far as I can think).
17:08 < zoq> ShikharJ: Sounds like a good idea to me, always nice to avoid code duplication, somehow.
17:08 < ShikharJ> zoq: We can keep the core structure similar, and create a separate folder for GAN Evaluate/Gradient policies.
17:09 < ShikharJ> zoq: Then, the task of implementing other variants of GAN would be a lot easier for new contributors, and implementing GAN variants can become a secondary priority for mlpack (and we can take up other avenues of work).
17:09 < zoq> ShikharJ: Agreed, that would allow us to provide an alias for the different models, so you could directly use GAN, DCGAN, etc.
17:10 < zoq> ShikharJ: I like the idea.
17:11 < ShikharJ> zoq: Alright, I'll refactor the codebase and push in a few hours then. I'll be off in a second (I have to go for dinner :) ).
17:12 < zoq> ShikharJ: Okay, I don't think we have to rush here, as you already said, once the code is refactored, it's a lot easier to add new models.
17:13 < ShikharJ> zoq: A few variants may still need a new structure (for example StackGAN), but then this would take care of a lot of variants :)
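The refactoring discussed above (a shared core with per-variant Evaluate/Gradient policies, plus aliases such as GAN, DCGAN and WGAN) could look roughly like the sketch below. Everything in it is an assumption made for illustration: `GANSketch`, `StandardGANPolicy`, `WGANPolicy`, `PostGradient`, the scalar scores and the clipping bound are hypothetical and are not mlpack's actual classes or signatures.

```cpp
#include <algorithm>
#include <cmath>
#include <utility>
#include <vector>

// Hypothetical policy: standard GAN discriminator loss for scores in (0, 1),
// -log(D(x)) - log(1 - D(G(z))).
struct StandardGANPolicy
{
  static double Evaluate(const double realScore, const double fakeScore)
  {
    return -std::log(realScore) - std::log(1.0 - fakeScore);
  }

  // The standard GAN needs no extra step after the gradient computation.
  static void PostGradient(std::vector<double>& /* weights */) { }
};

// Hypothetical policy: WGAN critic loss, D(G(z)) - D(x), with weight clipping.
struct WGANPolicy
{
  static double Evaluate(const double realScore, const double fakeScore)
  {
    return fakeScore - realScore;
  }

  // Clip every weight into [-c, c]; c = 0.01 is an assumed bound.
  static void PostGradient(std::vector<double>& weights)
  {
    const double c = 0.01;
    for (double& w : weights)
      w = std::max(-c, std::min(c, w));
  }
};

// Shared core: only the variant-specific pieces are delegated to the policy.
template<typename PolicyType>
class GANSketch
{
 public:
  explicit GANSketch(std::vector<double> weights) :
      weights(std::move(weights)) { }

  double Evaluate(const double realScore, const double fakeScore) const
  {
    return PolicyType::Evaluate(realScore, fakeScore);
  }

  void Gradient()
  {
    // The gradient code common to all variants would go here.
    PolicyType::PostGradient(weights);
  }

  const std::vector<double>& Weights() const { return weights; }

 private:
  std::vector<double> weights;
};

// Aliases so that users can name a variant directly.
using GAN = GANSketch<StandardGANPolicy>;
using DCGAN = GANSketch<StandardGANPolicy>;
using WGAN = GANSketch<WGANPolicy>;
```

With this layout, adding a variant means writing one small policy struct and one alias. Variants that need a different overall structure would still need their own class.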
19:27 < ShikharJ> zoq: I've pushed in the changes, but I need a little help with debugging the issues. Could you help me out?
19:35 < zoq> ShikharJ: Sure, looks like it's not able to access some parameter.
19:36 < ShikharJ> zoq: I thought that declaring the policy classes as friend classes would take care of that problem.
19:36 < ShikharJ> Friends of gan.hpp class
19:37 < zoq> ShikharJ: Yes, let me comment on the PR.
19:47 < ShikharJ> zoq: How would the code-base have to change when someone wishes to add another policy to the GAN? Am I missing something?
19:47 < ShikharJ> zoq: I'm not sure I follow.
19:48 < zoq> ShikharJ: I would have to make the policy a friend of the GAN class.
19:49 < zoq> ShikharJ: So I can't just define my own policy and link against mlpack.
19:50 < ShikharJ> zoq: I see, this would come up while using mlpack as an external library right?
19:50 < zoq> ShikharJ: right
19:51 < zoq> ShikharJ: We could follow the current design, but I think it makes sense to keep that in mind.
19:52 < ShikharJ> zoq: So what solution would you propose as an alternative to that? It seems like a good feature to have.
19:54 < ShikharJ> zoq: We can try creating the Policy object and passing that directly, but I don't prefer that very much. Is there another way?
19:57 < zoq> ShikharJ: We could extend the Evaluate/Gradient function and pass the extra parameter, or as I said on the PR, we could pass the object, and access the rest via function.Reset().
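The "pass the object" option mentioned above can be sketched as follows. This is a minimal illustration under assumptions: `ModelSketch`, `Responses()` and `SumPolicy` are hypothetical names, and only the idea of a public `Reset()` accessor comes from the discussion. The point is that a policy written outside the library compiles without any friend declaration, because it only touches public members.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the GAN class: instead of befriending each
// policy, it exposes the state a policy needs through public accessors.
template<typename PolicyType>
class ModelSketch
{
 public:
  explicit ModelSketch(std::vector<double> responses) :
      responses(std::move(responses)), reset(false) { }

  // The model hands itself to the policy.
  double Evaluate(const std::size_t i, const std::size_t batchSize)
  {
    return PolicyType::Evaluate(*this, i, batchSize);
  }

  // Public accessors take the place of friend declarations.
  bool& Reset() { return reset; }
  const std::vector<double>& Responses() const { return responses; }

 private:
  std::vector<double> responses;
  bool reset;
};

// A policy that a user could define in their own code.
struct SumPolicy
{
  template<typename ModelType>
  static double Evaluate(ModelType& model,
                         const std::size_t i,
                         const std::size_t batchSize)
  {
    // Assumed one-time setup before the first evaluation.
    if (!model.Reset())
      model.Reset() = true;

    // Placeholder objective: sum the responses in the requested batch.
    double sum = 0.0;
    for (std::size_t j = i;
         j < i + batchSize && j < model.Responses().size(); ++j)
      sum += model.Responses()[j];
    return sum;
  }
};
```

The trade-off is that every member a policy might need must have an accessor, which widens the public interface of the class.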
20:01 -!- ImQ009 [~ImQ009@unaffiliated/imq009] has quit [Quit: Leaving]
20:10 < ShikharJ> zoq: I'm not too positive about changing the existing API as well. What do you think about maybe creating individual classes and making the evaluate and gradient functions pure virtual?
20:11 < ShikharJ> zoq: Else we can also go with the current implementation.
20:13 < zoq> ShikharJ: Not really a fan of virtual, also not sure the modifications are that significant.
20:13 < zoq> ShikharJ: I don't mind using the current design.
20:14 < ShikharJ> zoq: Do we provide that kind of flexibility anywhere else in the codebase? If there is such a pipeline, I have no issues in replicating it.
20:16 < zoq> ShikharJ: This is what the optimizer framework does, we just make sure every policy implements Update and pass the parameters, even if they are not used like in this case:
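The pattern described above, where every policy implements Update with the same parameters even if it ignores some of them, can be sketched like this. The class and policy names (`SGDSketch`, `VanillaUpdate`, `NoDecay`, `HalvingDecay`) and the exact signatures are assumptions for illustration, not the optimizer framework's real API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical update policy: a plain gradient descent step.
struct VanillaUpdate
{
  void Update(std::vector<double>& iterate,
              const double stepSize,
              const std::vector<double>& gradient)
  {
    for (std::size_t k = 0; k < iterate.size(); ++k)
      iterate[k] -= stepSize * gradient[k];
  }
};

// Hypothetical decay policy that does nothing: it still accepts the full
// parameter list so that the optimizer can call every policy the same way.
struct NoDecay
{
  void Update(const std::vector<double>& /* iterate */,
              double& /* stepSize */,
              const std::vector<double>& /* gradient */) { }
};

// Hypothetical decay policy that only uses the step size.
struct HalvingDecay
{
  void Update(const std::vector<double>& /* iterate */,
              double& stepSize,
              const std::vector<double>& /* gradient */)
  {
    stepSize *= 0.5;
  }
};

// The optimizer passes the same arguments to whichever policies it is given.
template<typename UpdatePolicy, typename DecayPolicy>
class SGDSketch
{
 public:
  explicit SGDSketch(const double stepSize) : stepSize(stepSize) { }

  void Step(std::vector<double>& iterate, const std::vector<double>& gradient)
  {
    updatePolicy.Update(iterate, stepSize, gradient);
    decayPolicy.Update(iterate, stepSize, gradient);
  }

  double StepSize() const { return stepSize; }

 private:
  double stepSize;
  UpdatePolicy updatePolicy;
  DecayPolicy decayPolicy;
};
```

Because the calls are resolved at compile time, an empty Update such as the one in `NoDecay` can be inlined away, at the cost of a signature that carries arguments some policies never read.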
20:19 < ShikharJ> zoq: Isn't that the purpose of `virtual`? Why don't we use that? My guess is that would also lead to better compiler optimizations.
20:21 < Atharva> Sorry I haven't posted last week's updates. I will do it first thing tomorrow morning. From the next week I will post it on time.
20:22 < zoq> ShikharJ: I don't think it would be faster:
20:24 < ShikharJ> zoq: Ah, I see, with templates it is probably not a good idea.
20:31 < ShikharJ> zoq: What do you have in mind regarding `extending the Evaluate/Gradient function and passing the extra parameter` as you mentioned?
20:32 < ShikharJ> Something similar to the cyclical_decay example above?
20:33 < zoq> ShikharJ: Yes, currently Evaluate takes: parameters, i, batchSize, but inside the policy we need reset and probably some other parameters, so we could do something like policy.Evaluate(parameters, i, batchSize, reset, ...), or we pass this, and provide some getter.
20:38 < ShikharJ> zoq: In that case, I'd vote to stick with the current implementation. If we get a user request for this feature, we can implement it in the future; for now, it seems really unappealing to extend the Evaluate and Gradient functions in such a manner. Your call?
20:38 < zoq> ShikharJ: Agreed, let's focus on the main part for now.
20:39 < zoq> ShikharJ: I'll see if I can put some time into the refactoring and if it turns out to be useful, we can merge the ideas.
20:41 < ShikharJ> zoq: I'm saying this because even if we modularise the getter into a single object and just extend the parameters by a single getter object, that getter would still lead to a performance decrease, as a number of parameters are required. And I'm not much in favour of putting all the parameters directly into the function signature.
20:42 < zoq> ShikharJ: Good point, will write a simple benchmark to test the performance.
20:42 < ShikharJ> zoq: Evaluate and Gradient tend to access every member element inside the class.
20:43 < zoq> ShikharJ: Right, the signature would be huge, so probably not the best idea either.
20:44 < ShikharJ> zoq: Inheritance would be pretty tempting in such a scenario, but I see the concern regarding runtimes as well.
20:46 < zoq> ShikharJ: Perhaps it's an option here, since the major part is the matrix multiplications.
21:17 -!- yaswagnere [4283a544@gateway/web/freenode/ip.] has quit [Quit: Page closed]
--- Log closed Wed Jun 27 00:00:21 2018