mlpack IRC logs, 2018-07-09
Logs for the day 2018-07-09 (starts at 0:00 UTC) are shown below.
--- Log opened Mon Jul 09 00:00:38 2018
01:06 -!- vivekp [~vivek@unaffiliated/vivekp] has quit [Ping timeout: 264 seconds]
01:15 -!- vivekp [~vivek@unaffiliated/vivekp] has joined #mlpack
08:04 -!- prakhar_code[m] [prakharcod@gateway/shell/matrix.org/x-hbrppspmyceawepb] has quit [Ping timeout: 240 seconds]
08:29 -!- prakhar_code[m] [prakharcod@gateway/shell/matrix.org/x-yfjvlwptvwatptdj] has joined #mlpack
10:49 -!- Atharva [uid288001@gateway/web/irccloud.com/x-puwhstfztvlkxzlu] has joined #mlpack
13:20 < Atharva> About the parallelization of the ann module, what has been planned?
13:38 < zoq> Atharva: We should implement EvaluateWithGradient for the FFN class as well, and we should run some benchmarks against OpenBLAS.
13:38 < zoq> Shikhar already tested the GAN implementation with OpenBLAS but couldn't see huge performance improvements; perhaps there is some expression that needs to be rewritten.
13:38 < zoq> Besides that, there is definitely potential to improve/parallelize the conv operation.
13:48 < ShikharJ> zoq: I can implement the EvaluateWithGradient function for the FFN class for now. The thought hadn't occurred to me, but this would be the fastest way to see a performance improvement.
13:49 < zoq> ShikharJ: That would be great, but don't feel obligated.
13:49 < Atharva> zoq: ShikharJ: Just to confirm, this is for multi-core cpu code execution, right?
13:51 < ShikharJ> Atharva: Yes. But the EvaluateWithGradient function is just for a performance improvement of the optimizer update call.
13:52 < Atharva> Okay, this seems really interesting.
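Editorial sketch of the idea behind EvaluateWithGradient discussed above: when the loss and the gradient share the same forward computation, returning both from one call avoids doing that work twice per optimizer update. This is a toy in plain Python and is not mlpack code; the class `LeastSquares`, its methods, and the `forward_passes` counter are hypothetical names used only for illustration.

```python
class LeastSquares:
    """Toy objective f(w) = sum_i (w * x_i - y_i)^2 for a scalar weight w."""

    def __init__(self, xs, ys):
        self.xs = xs
        self.ys = ys
        self.forward_passes = 0  # counts how often residuals are recomputed

    def _residuals(self, w):
        # Stand-in for the expensive forward pass of a network.
        self.forward_passes += 1
        return [w * x - y for x, y in zip(self.xs, self.ys)]

    def evaluate(self, w):
        return sum(r * r for r in self._residuals(w))

    def gradient(self, w):
        return sum(2.0 * r * x for r, x in zip(self._residuals(w), self.xs))

    def evaluate_with_gradient(self, w):
        # One forward pass, reused for both the loss and the gradient.
        res = self._residuals(w)
        loss = sum(r * r for r in res)
        grad = sum(2.0 * r * x for r, x in zip(res, self.xs))
        return loss, grad
```

Calling `evaluate` and `gradient` separately costs two forward passes per step, while `evaluate_with_gradient` costs one; the saving in a real network depends on how expensive the shared forward pass is.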
13:52 < zoq> Right, if you'd like to check out the GPU performance, an easy solution is to switch to NVBLAS. Bandicoot might work as well, but NVBLAS might be easier.
13:53 < Atharva> zoq: Okay, I will check it out. Have you tried training models on mlpack using NVBLAS or Bandicoot?
13:55 < zoq> Atharva: I used NVBLAS; for some models you get some pretty good speedups.
13:55 < Atharva> zoq: That's great! What gpu did you use?
13:58 < zoq> Atharva: GTX 1080 and GTX 960
13:59 < Atharva> zoq: Okay, I just have a GTX 1050ti, but I think it will still give at least some speedup.
14:00 < zoq> Atharva: Yeah, if I remember right you need CUDA 6.0 support.
14:02 < Atharva> zoq: Yeah, 1050ti is supported luckily. I did try training some models on tensorflow-gpu.
14:03 -!- ImQ009 [~ImQ009@unaffiliated/imq009] has joined #mlpack
14:19 < Atharva> zoq: I was training a VAE model on MNIST, and I have observed that my model is totally dependent on the step size. If I use a step size higher or lower than an optimum value, the loss saturates at a fairly high value. I can understand that happening for a high learning rate, but why does it happen for a learning rate lower than the optimum one?
14:25 < zoq> Atharva: What's the optimum? Two ideas: take a look at the initial weights, perhaps start with small values; the other idea is to check the gradients at the end, maybe some aren't correct.
14:27 < Atharva> Manually training it on small models, I found out that at a value of 0.003 it gives very low error compared to step sizes higher or lower than that.
14:28 < Atharva> How do I check if gradients are correct at the end of training?
14:30 < zoq> Atharva: You could print the gradient at the end of the Gradient call either in the FFN class or in the optimizer class.
14:31 < Atharva> zoq: Yeah I understand, but how do I check if they are correct? Sorry if this seems obvious to you, maybe I am forgetting something very basic.
14:35 < zoq> Atharva: Ahh, good question. I would just check if they look "reasonable"; if one of them is MAXDOUBLE or a huge part is 0, that is a good indicator that something is wrong. From that point we could take a look at a specific layer.
14:35 -!- cjlcarvalho [~firstname.lastname@example.org] has joined #mlpack
14:37 < Atharva> zoq: Okay, thanks. I will try and see what's happening.
14:37 < zoq> Atharva: Okay, let me know if you need any help.
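Editorial sketch of the "does the gradient look reasonable" check zoq describes: flag non-finite or extremely large entries and a large fraction of exact zeros. It is written in plain Python on a flat list of numbers and is not mlpack code; the function names and the thresholds `huge` and `max_zero_fraction` are assumptions chosen for illustration, not values from the discussion.

```python
import math


def summarize_gradient(grad, huge=1e30):
    """Return simple health statistics for a flat list of gradient values."""
    n = len(grad)
    non_finite = sum(1 for g in grad if not math.isfinite(g))
    zeros = sum(1 for g in grad if g == 0.0)
    finite_abs = [abs(g) for g in grad if math.isfinite(g)]
    max_abs = max(finite_abs) if finite_abs else 0.0
    return {
        "size": n,
        "non_finite": non_finite,              # NaN or +/-inf entries
        "zero_fraction": zeros / n if n else 0.0,
        "max_abs": max_abs,
        "has_huge": max_abs >= huge,           # e.g. values near the double maximum
    }


def looks_reasonable(stats, max_zero_fraction=0.5):
    """Heuristic only: passing this check does not prove the gradient is correct."""
    return (
        stats["non_finite"] == 0
        and not stats["has_huge"]
        and stats["zero_fraction"] <= max_zero_fraction
    )
```

A gradient that fails the check points at a layer worth inspecting; one that passes can still be wrong, so this only narrows the search.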
15:49 -!- caiojcarvalho [~email@example.com] has joined #mlpack
15:51 -!- cjlcarvalho [~firstname.lastname@example.org] has quit [Ping timeout: 248 seconds]
16:17 < ShikharJ> zoq: Are you there?
16:20 < zoq> ShikharJ: yes
16:23 < ShikharJ> zoq: I was wondering whether it would be better to merge the RBM PR as it is now, and provide the tests and batch support in subsequent PRs? I'm saying this because the current set of changes is huge (~1800 lines), and with the additional tests and BatchSupport, it might grow even further. What do you think?
16:24 < ShikharJ> zoq: It would take a lot of effort to review them all at once. Maybe we can initiate a review on the PR now, and I can, meanwhile, work on a couple of different branches regarding the additional features.
16:25 < zoq> ShikharJ: I'm not sure the current code is correct, and I think a test would be helpful. We could split the batch support from the PR, but I don't like the idea of splitting off the tests as well.
16:27 < ShikharJ> zoq: Hmm, I see. We should implement the tests first then.
17:31 -!- cjlcarvalho [~email@example.com] has joined #mlpack
17:31 -!- caiojcarvalho [~firstname.lastname@example.org] has quit [Ping timeout: 244 seconds]
18:14 < zoq> rcurtin: If I remember right, I had to explicitly set the trigger phrase in the job configuration; not sure if you are encountering the same issue.
18:16 -!- jenkins-mlpack2 [~PircBotx@knife.lugatgt.org] has quit [Ping timeout: 260 seconds]
18:18 < rcurtin> it seems like it is triggering, but the Jenkins logs indicate there is a problem:
18:18 < rcurtin> 'Request doesn't contain a signature. Check that github has a secret that should be attached to the hook'
18:18 < rcurtin> so I am digging into that now
18:19 -!- jenkins-mlpack2 [~PircBotx@knife.lugatgt.org] has joined #mlpack
18:23 < rcurtin> ah, ok, I think I got it. I just had to ensure that the shared secret in the GitHub webhook setup and on Jenkins were the same
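Editorial sketch of why the shared secret has to match on both sides: the sender computes an HMAC of the request body with the secret and attaches it as a signature, and the receiver recomputes it with its own copy of the secret and compares. This is a generic illustration in Python, not the Jenkins plugin's code; the `sha256=<hex>` format follows GitHub's documented `X-Hub-Signature-256` header, and the function names and example secret are hypothetical.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, payload: bytes) -> str:
    """Compute the signature a sender would attach to the request body."""
    digest = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify_signature(secret: bytes, payload: bytes, header_value) -> bool:
    """Return True only if the header matches the HMAC of the payload."""
    if not header_value:
        # No signature attached at all, e.g. no secret configured on the sender.
        return False
    expected = sign_payload(secret, payload)
    # Constant-time comparison to avoid leaking information through timing.
    return hmac.compare_digest(expected.encode(), header_value.encode())
```

If the two sides hold different secrets, or the sender has none configured, verification fails for every delivery.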
18:26 -!- jenkins-mlpack2 [~PircBotx@knife.lugatgt.org] has quit [Ping timeout: 240 seconds]
18:26 < zoq> nice
18:26 -!- jenkins-mlpack2 [~PircBotx@knife.lugatgt.org] has joined #mlpack
18:26 < rcurtin> anyone want to push a commit to an open PR?
18:26 < rcurtin> I can't easily find the right kind of message to redeliver to Jenkins to see if it's working
18:27 < rcurtin> too many comments recently, all the messages I could redeliver that I see aren't commits :)
18:45 < rcurtin> ok, I think I have everything working mostly properly... I'll check again in the next couple of days
18:45 < rcurtin> Atharva: sorry for the noise on #1441, it was the one I chose for testing Jenkins :)
18:46 -!- witness_ [uid10044@gateway/web/irccloud.com/x-hjtdyzklusfjcbps] has quit [Quit: Connection closed for inactivity]
18:46 < Atharva> rcurtin: Absolutely no problem :)
18:52 < rcurtin> I would have gotten everything done faster, but I became distracted by this: https://www.youtube.com/watch?v=RQGa0DPwes0
19:06 < rcurtin> zoq: thanks for pushing the commit, looks like everything is working right
19:08 < zoq> rcurtin: Really cool idea, and the rollercoaster looks insane :)
19:10 < rcurtin> yeah, I wonder if the person who made that will find a way to adapt it into a more generic ALU :)
20:19 -!- travis-ci [~email@example.com] has joined #mlpack
20:19 < travis-ci> manish7294/mlpack#58 (impBounds - a63ade3 : Manish): The build failed.
20:19 < travis-ci> Change view : https://github.com/manish7294/mlpack/commit/a63ade3d479d
20:19 < travis-ci> Build details : https://travis-ci.com/manish7294/mlpack/builds/78481724
20:19 -!- travis-ci [~firstname.lastname@example.org] has left #mlpack 
20:35 -!- ImQ009 [~ImQ009@unaffiliated/imq009] has quit [Quit: Leaving]
21:37 -!- vivekp [~vivek@unaffiliated/vivekp] has quit [Ping timeout: 256 seconds]
--- Log closed Tue Jul 10 00:00:40 2018