mlpack IRC logs, 2018-07-22
Logs for the day 2018-07-22 (starts at 0:00 UTC) are shown below.
--- Log opened Sun Jul 22 00:00:57 2018
02:36 -!- navdeep [324d5171@gateway/web/freenode/ip.126.96.36.199] has quit [Ping timeout: 252 seconds]
07:00 -!- ImQ009 [~ImQ009@unaffiliated/imq009] has joined #mlpack
08:55 -!- travis-ci [~firstname.lastname@example.org] has joined #mlpack
08:55 < travis-ci> manish7294/mlpack#76 (impBounds - 9edda26 : Manish): The build has errored.
08:55 < travis-ci> Change view : https://github.com/manish7294/mlpack/compare/b7e25aba5e6d...9edda268ffa1
08:55 < travis-ci> Build details : https://travis-ci.com/manish7294/mlpack/builds/79656096
08:55 -!- travis-ci [~email@example.com] has left #mlpack 
10:45 -!- travis-ci [~firstname.lastname@example.org] has joined #mlpack
10:45 < travis-ci> manish7294/mlpack#77 (impBounds - a81e387 : Manish): The build failed.
10:45 < travis-ci> Change view : https://github.com/manish7294/mlpack/compare/9edda268ffa1...a81e387e6453
10:45 < travis-ci> Build details : https://travis-ci.com/manish7294/mlpack/builds/79656852
10:45 -!- travis-ci [~email@example.com] has left #mlpack 
10:45 -!- travis-ci [~firstname.lastname@example.org] has joined #mlpack
10:46 < travis-ci> manish7294/mlpack#12 (impBounds - a81e387 : Manish): The build failed.
10:46 < travis-ci> Change view : https://github.com/manish7294/mlpack/compare/9edda268ffa1...a81e387e6453
10:46 < travis-ci> Build details : https://travis-ci.org/manish7294/mlpack/builds/406779252
10:46 -!- travis-ci [~email@example.com] has left #mlpack 
12:17 < zoq> ShikharJ: Great results, as you said in the blog post, we can merge the code in the next few days.
12:19 < zoq> ShikharJ: I'm curious, what's the meaning of the last word?
12:19 < Atharva> zoq: I left a message here yesterday about the sequential layer, can you please check
12:19 < ShikharJ> zoq: I'll get to optimizing the code and the additional SSRBM test in another PR for now.
12:20 < zoq> ShikharJ: Great!
12:20 < zoq> Atharva: Let's see.
12:20 < ShikharJ> zoq: Actually, I end my blog posts with a farewell message in different languages, that is a Sanskrit word for "see you again".
12:21 < zoq> ShikharJ: Neat :)
12:22 < zoq> Atharva: hm, what layer did you use before the seq layer?
12:23 < Atharva> the reparametrization layer
12:25 < Atharva> but I don't think it matters which layer we use before it. If the first layer inside the seq layer is any layer that does not have the outputWidth() function, and the second layer is convolutional, then inputWidth and inputHeight get set to 0
12:27 < zoq> Atharva: I see, I'll create a simple example to reproduce the issue.
12:27 < Atharva> Okay
13:01 < zoq> The issue I see is that right now we do a single Forward step to get the output width/height for the next layer.
13:01 < zoq> But for the seq layer there is no single forward step, since it will call the Forward function for all layers in a row.
13:03 < zoq> This would only affect a cascade of layers that implement the width/height functions; but you said your case looks like embedded -> Seq (Linear -> Conv) -> ...
13:03 < zoq> so, the first conv layer should receive the correct width/height
13:12 < Atharva> Yes, so what do you suggest?
13:13 < zoq> Not sure yet.
13:14 < zoq> Does the seq layer hold multiple conv layers in your case?
13:14 < Atharva> Yes it does.
13:14 < zoq> okay
13:15 < Atharva> I don't think that matters; even a single convolutional layer after the linear layer creates the same problem
13:16 < zoq> So for now, we could comment out the reset and manually set the width/height.
13:17 < zoq> Does that work for you?
13:17 < Atharva> zoq: Okay, so that means we don't hard set it to true
13:17 < zoq> Right
13:17 < Atharva> Yes, it does. That's what I am doing locally to get it running
13:18 < zoq> but that should only work if width/height doesn't change
13:19 < Atharva> Sorry I didn't get you completely, width/height doesn't change with what?
13:19 < zoq> Ignore the last point, it does work.
13:20 < Atharva> Okay then, I will make a commit to my latest PR
13:20 < zoq> Actually, is there any reason to force a reset at all..
13:21 < Atharva> Yeah, I couldn't understand why it was hardcoded to true either
16:36 < zoq> Atharva: For your vae model did you test another optimizer e.g. Adam?
16:39 < Atharva> hmm, I tried sgd with Adam update
16:39 < Atharva> Why exactly? Will it be better if I try different optimizers?
16:40 < zoq> ahh, right, I was talking about AdaMax or RMSProp
16:40 < zoq> Adam is just an alias for SGD<AdamUpdate>, so yes it's the same
16:41 < Atharva> No, I haven't tried anything other than Adam. Should I try some other optimizer for the next model I will be training on binary MNIST?
16:42 < zoq> if you can start an experiment sure, I can also test it out
16:42 < zoq> just wanted to ask first
16:43 < Atharva> I can; there are some more models Sumedh and I have planned to experiment with. Is there any specific reason to try out different optimizers? What aspect of the models do you think they can affect?
16:45 < zoq> Generally, to escape from a poor local minimum.
16:46 < Atharva> Oh!
16:46 < Atharva> I will try RMSProp and AdaMax for the basic vae model then and see if the loss goes down further.
16:47 < zoq> Adam should work for the VAE model, but I think it's easy to test it.
16:47 < Atharva> Sumedh did seem to think that it should be lower than ~120
16:47 < Atharva> yeah
16:48 < Atharva> BTW, do you have any idea why the ReconstructionLoss with a normal distribution isn't working on normal MNIST (the code that I mailed you)?
16:49 < zoq> Atharva: As soon as I find something I'll let you know.
16:50 < Atharva> zoq: Sure, sorry for the frequent reminders
16:51 < zoq> Atharva: No worries :)
18:38 -!- ImQ009 [~ImQ009@unaffiliated/imq009] has quit [Ping timeout: 276 seconds]