mlpack IRC logs, 2018-06-14

Logs for the day 2018-06-14 (starts at 0:00 UTC) are shown below.

June 2018
--- Log opened Thu Jun 14 00:00:03 2018
01:49 -!- vivekp [~vivek@unaffiliated/vivekp] has quit [Ping timeout: 248 seconds]
01:50 -!- vivekp [~vivek@unaffiliated/vivekp] has joined #mlpack
02:31 -!- yaswagner [4283a544@gateway/web/freenode/ip.] has quit [Ping timeout: 260 seconds]
04:46 -!- travis-ci [] has joined #mlpack
04:46 < travis-ci> manish7294/mlpack#28 (master - 0128ef7 : Marcus Edel): The build passed.
04:46 < travis-ci> Change view :
04:46 < travis-ci> Build details :
04:46 -!- travis-ci [] has left #mlpack []
05:12 -!- vivekp [~vivek@unaffiliated/vivekp] has quit [Ping timeout: 245 seconds]
05:15 -!- vivekp [~vivek@unaffiliated/vivekp] has joined #mlpack
07:04 < ShikharJ> zoq: I'm sorry I couldn't reach out at that time, could you tell me what timezone you are in so that I can plan accordingly?
07:58 < ShikharJ> zoq: Pardon me if I'm misunderstanding, but isn't inSize parameter intended to serve as the parameter for the number of channels in an input image?
08:34 < zoq> ShikharJ: No worries, I got distracted so I couldn't take a look at the issue at the time you posted it, I'm in UTC + 2.
08:34 < zoq> ShikharJ: About inSize you are right, and that would affect the number of weights, so we have to modify the conv layer; take a look at the inputTemp parameter, currently it only accounts for the first sample.
08:52 < ShikharJ> zoq: I'll let you know what I find by today evening.
08:57 < zoq> ShikharJ: Okay, hopefully the changes are minor.
09:48 < jenkins-mlpack> Project docker mlpack nightly build build #349: STILL UNSTABLE in 2 hr 34 min:
10:12 -!- sulan_ [] has joined #mlpack
10:20 -!- sulan_ [] has quit [Quit: Leaving]
10:58 -!- manish7294 [8ba7bab0@gateway/web/freenode/ip.] has joined #mlpack
11:00 < manish7294> zoq: How do we make some new dataset additions to
11:02 < manish7294> And, can we add categorical datasets, mainly having categorical labels?
11:03 < zoq> You can open an issue with the link to the dataset; I'll upload the dataset afterwards or just add the dataset to the list; or you can post the link here.
11:04 < zoq> sure
11:05 < manish7294>
11:05 < manish7294>
11:05 < manish7294> These two additions will be good enough.
11:14 < manish7294> zoq: Can benchmarking systems automatically handle categorical datasets?
11:15 < zoq>
11:15 < zoq>
11:15 < manish7294> zoq: great :)
11:16 < zoq> manish7294: That depends on the lib; in the case of mlpack all we do is forward the dataset (filename + path).
11:17 < zoq> manish7294: I guess if you'd like to benchmark against matlab you might have to adjust the benchmark script since it's probably using dlmread.
11:18 < zoq> manish7294: If mlpack can't handle the dataset right away we can write a simple preprocess step in python and pass the modified dataset.
11:21 < manish7294> zoq: That sounds about right. Can we modify the method too? I remember Haritha doing something like that for the decision tree to support categorical data, though I am not sure.
11:22 < zoq> manish7294: To handle categorical data?
11:22 < manish7294> right
11:22 < manish7294> I am not sure :)
11:23 < manish7294> It was a long time ago
11:25 < manish7294> zoq: maybe this is what I was thinking about.
11:27 < zoq> manish7294: yes, that's the one.
11:27 < zoq> manish7294: Should be straightforward, if you follow the PR.
11:28 < manish7294> zoq: I will follow it then :)
12:37 < manish7294> rcurtin: Here are some results on the letters dataset (20000 instances, 16 attributes) - lbfgs, k = 3, total time - 3 mins 49.5 secs, initial accu - 96.285, final - 96.905; amsgrad, total time - 4 mins 53.6 secs, final - 97.335;
13:19 < Atharva> sumedhghaisas: You there?
13:42 < sumedhghaisas> Atharva: Hi Atharva
13:42 < sumedhghaisas> here now :)
14:11 < Atharva> Will you be available in 2 hours?
14:25 -!- ImQ009 [~ImQ009@unaffiliated/imq009] has joined #mlpack
14:29 < manish7294> zoq: rcurtin: Currently I am able to successfully execute LMNN benchmarks on my local system by copying mlpack_lmnn to libraries/bin/mlpack_lmnn. Can you please guide me how I can run the same on benchmarks system using my lmnn branch as lmnn is not yet merged?
14:30 < rcurtin> manish7294: hey there... I slept in a little this morning
14:30 < rcurtin> so, there's a script in libraries/ called ''
14:30 < manish7294> rcurtin: :)
14:31 < rcurtin> basically, all that does is build mlpack in the libraries/mlpack/ directory
14:31 < sumedhghaisas> Atharva: Ahh sorry missed your msg
14:31 < rcurtin> so, you could, after setting up all the other libraries, manually build your branch of mlpack in libraries/mlpack/
14:31 < rcurtin> the CMake configuration used is:
14:31 < rcurtin> cmake -DCMAKE_INSTALL_PREFIX=../../ -DBUILD_TESTS=OFF ../
14:31 < rcurtin> and you would also 'make install'
14:31 < sumedhghaisas> What time are you referring to?
14:32 < rcurtin> does that make sense?
14:33 < manish7294> Right, I got it.
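The manual build rcurtin describes might look like this as a shell session (a sketch under assumptions: paths are relative to a benchmarks checkout, and the branch name `lmnn` is illustrative):

```shell
# Build a custom mlpack branch inside the benchmarks tree and install it
# to libraries/bin and libraries/lib (the prefix ../../ resolves to
# libraries/ from the build directory).
cd libraries/mlpack
git checkout lmnn            # branch under test (illustrative)
mkdir -p build && cd build
cmake -DCMAKE_INSTALL_PREFIX=../../ -DBUILD_TESTS=OFF ../
make && make install
```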
14:33 < Atharva> sumedhghaisas: 1 hour 40 minutes from now
14:33 < Atharva> 9:45 ist
14:34 < manish7294> All I have to do is the same: copy the lmnn branch bin folder to libraries/bin
14:34 < rcurtin> it might also be a good idea to, in your local benchmarks repo, comment out the call from the script, in case you accidentally type 'make setup'
14:34 < rcurtin> manish7294: no, I don't think that will work, because that may depend on parts of that aren't there
14:35 < rcurtin> so I think it's better to build the whole library in libraries/mlpack/ with the CMake configuration above, then 'make install' to put it correctly in libraries/bin and libraries/lib
14:35 < manish7294> Okay, I will take care
14:35 < manish7294> And should I do this on slake itself
14:35 < rcurtin> yeah, I think it's fine to do it on slake
14:36 < rcurtin> if you do all the 'make run' runs there, they'll be comparable to each other
14:37 < manish7294> I will be taking covtype as the limiting (in terms of maximum size) dataset
14:38 < rcurtin> right, that sounds good
14:38 < manish7294> rcurtin: And should I create a pull request on benchmarks repo or should I wait till lmnn merge?
14:39 < rcurtin> if it is taking like 13 hours to run, you might want to start with a 5k or 50k subset
14:39 < rcurtin> you can open the PR now if you like, but we should wait to merge it until LMNN is merged
14:39 < rcurtin> (which should be fairly soon I think)
14:40 < manish7294> rcurtin: Are you randomly selecting 5k points of covertype to make the 5k covertype?
14:40 < manish7294> or are they the first 5k points
14:40 < rcurtin> I took them randomly
14:43 < manish7294> Can you share it with me, if it is an independent file? Or I will make a more substantial one for me as I was earlier using a subset of covertype-small, which was missing some classes.
14:43 < sumedhghaisas> Atharva: Sure. I think I will be free.
14:45 < manish7294> rcurtin: Hoping I am not ruining your vacation (it's a once-in-a-while chance) :)
14:46 < rcurtin> no, it's no problem at all, it is a vacation from Symantec not everything :)
14:46 < rcurtin> I am just staying at home this week anyway
14:46 < rcurtin> soon I will go see if my new brakes work well (hopefully they do so I will come back)
14:46 < rcurtin> which I guess means it is important that I get you the datasets now, it could be the last chance :)
14:46 < rcurtin>
14:46 < rcurtin>
14:46 < rcurtin>
14:46 < manish7294> rcurtin: great :)
14:46 < rcurtin>
14:47 < rcurtin> ok, I'll be back later. in the worst case I might have to use the emergency brake but I think everything will be fine :)
14:49 < manish7294> Thanks!
15:07 -!- sulan_ [] has joined #mlpack
15:20 < sumedhghaisas> Atharva: Hi Atharva, I have to sync up with someone at work at 17 BST. Could we discuss at 18:00 BST?
15:22 < rcurtin> perfect, brakes work great :)
15:28 < manish7294> rcurtin: That was a quick test drive. So, the emergency brakes didn't get a chance, haha :)
15:29 -!- manish72942 [~yaaic@2405:205:2499:a5b:c5a6:d72b:2822:b845] has joined #mlpack
15:33 -!- manish7294 [8ba7bab0@gateway/web/freenode/ip.] has quit [Ping timeout: 260 seconds]
16:00 < Atharva> sumedhghaisas: Yes sure!
16:38 [Users #mlpack]
16:38 [ Atharva ] [ jenkins-mlpack] [ manish72942 ] [ ShikharJ ] [ wiking ]
16:38 [ gtank ] [ K4k ] [ petris ] [ sulan_ ] [ witness_]
16:38 [ Guest63658] [ killer_bee[m] ] [ prakhar_code[m]] [ sumedhghaisas] [ xa0 ]
16:38 [ ImQ009 ] [ lozhnikov ] [ rcurtin ] [ vivekp ] [ zoq ]
16:38 -!- Irssi: #mlpack: Total of 20 nicks [0 ops, 0 halfops, 0 voices, 20 normal]
17:01 < Atharva> sumedhghaisas: You there?
17:05 -!- manish72942 [~yaaic@2405:205:2499:a5b:c5a6:d72b:2822:b845] has quit [Ping timeout: 255 seconds]
17:15 -!- vivekp [~vivek@unaffiliated/vivekp] has quit [Ping timeout: 255 seconds]
17:30 < sumedhghaisas> Atharva: So sorry. The meeting stretched for long.
17:30 < sumedhghaisas> I am here now :)
17:31 < Atharva> sumedhghaisas: Meetings always do :)
17:31 < sumedhghaisas> Atharva: So true :)
17:31 < sumedhghaisas> so whats up?
17:32 < Atharva> I had some questions about the normaldist class
17:32 < Atharva> Do we need the Train functions which the GaussianDistribution class has?
17:34 < sumedhghaisas> Ohh... ummm not really. Let me think.
17:35 < sumedhghaisas> In any case we can implement it later
17:38 < Atharva> Yeah, so not now.
17:38 < sumedhghaisas> Sure.
17:38 < Atharva> Another question is if we should allow vectors and cubes or just matrices?
17:38 < Atharva> I think we should think about RNN support as well, so cubes should be allowed I guess
17:38 < sumedhghaisas> ohh cubes are necessary as the output might be conv
17:38 < Atharva> Yup
17:38 < sumedhghaisas> so I would say vector, matrices and cubes
17:38 < Atharva> So, I will define multiple constructors
17:38 < sumedhghaisas> or maybe just templatize it?
17:38 < Atharva> Yes, but still, we need to take different number of parameters for each data type
17:38 < Atharva> in the constructor
17:38 < Atharva> for the size
17:38 < sumedhghaisas> I am not sure I understand that
17:38 < sumedhghaisas> why exactly you need size?
17:39 < sumedhghaisas> also you could infer size from the matrix itself
17:39 < Atharva> Hmm, in the constructor NormalDistribution(), when we are making a new set of distributions, if it's a matrix, then we need something like NormalDistribution(n_rows, n_cols)
17:40 < Atharva> Because in Gaussian, it only supports vector and it's multivariate, so they take GaussianDistribution(dimension)
17:41 < sumedhghaisas> What about NormalDistribution(arma::mat mean, arma::mat variance)?
17:41 < Atharva> This creates standard normals
17:41 < sumedhghaisas> ahh I mean templatize it properly
17:41 < Atharva> That will be there, this one is for standard normal of a given size
17:41 < Atharva> We don't need it, but it's good to have that I think
17:42 < sumedhghaisas> hmm... I am contemplating if we should define these distributions inside the ANN framework or not
17:42 < sumedhghaisas> although I would say lets not have the size constructor
17:43 < Atharva> Okayy
17:43 < sumedhghaisas> in case the user wants, he can create a dist by generating a constant matrix
17:43 < sumedhghaisas> of required size
17:43 < Atharva> Yes that's easy
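The constructor shape they converge on (mean and variance passed in, size inferred from them) might look roughly like this in C++ (a sketch only; the class and method names are illustrative, not mlpack's actual API, and a `std::vector<double>` stands in for the Armadillo types to stay self-contained):

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Minimal sketch of a templatized element-wise normal distribution.
// In mlpack, DataType could be arma::vec, arma::mat, or arma::cube.
template<typename DataType>
class NormalDistribution
{
 public:
  // Size is inferred from the mean/variance objects themselves, so no
  // separate size constructor is needed.
  NormalDistribution(DataType mean, DataType variance) :
      mean(std::move(mean)), variance(std::move(variance)) { }

  // Element-wise Gaussian log-density, summed over all elements.
  double LogProbability(const DataType& x) const
  {
    const double pi = 3.141592653589793;
    double logProb = 0.0;
    for (size_t i = 0; i < x.size(); ++i)
    {
      logProb += -0.5 * std::log(2.0 * pi * variance[i])
          - (x[i] - mean[i]) * (x[i] - mean[i]) / (2.0 * variance[i]);
    }
    return logProb;
  }

 private:
  DataType mean, variance;
};
```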
17:44 < Atharva> So, as Ryan said, we shouldn't make layers output distributions if layers after them expect matrices
17:44 < Atharva> So, this is just for the final layer, right?
17:45 < sumedhghaisas> Ahh no, so as Ryan suggested, what we do instead is accept the distribution in the layer it's being used in, rather than the layer outputting it
17:45 < sumedhghaisas> So VAE would output a matrix, and loss layer will define a dist over it and use it
17:46 < Atharva> Got it, so the network isn't affected, everything happens in one layer
17:46 < sumedhghaisas> everything happened in one layer? sorry, didn't get that
17:47 < Atharva> Sorry, I meant that the dist objects are just used in the layer which needs it as input, rest of the network operates normally on matrices
17:48 < sumedhghaisas> ahh yes. :)
17:48 < Atharva> You said that the logprob will give us the reconstruction loss, but what if the user wants to use some other loss for reconstruction?
17:49 < sumedhghaisas> Usually in VAEs, reconstruction loss is defined over some distribution only
17:49 < sumedhghaisas> in any case, if the user wants some other loss, he could replace the ReconstructionLoss layer and use his own
17:50 < Atharva> Okay, so after the distribution, the next task is to implement a ReconstructionLoss layer, right?
17:50 < sumedhghaisas> yes
17:50 < sumedhghaisas> ReconstructionLoss will take a distribution to define the loss
17:51 < Atharva> So, you are saying that in VAEs we shouldn't sample from the output distribution, say an image, and then use some simple loss such as mean squared between the output image and training image
17:51 < Atharva> Will the loss always be taken with the output distribution?
17:55 -!- travis-ci [] has joined #mlpack
17:55 < travis-ci> mlpack/mlpack#5079 (master - e08e761 : Marcus Edel): The build has errored.
17:55 < travis-ci> Change view :
17:55 < travis-ci> Build details :
17:55 -!- travis-ci [] has left #mlpack []
17:58 < Atharva> Also, I think we should discuss implementation details of the ReconstructionLoss layer now, because the dist won't take long now. I will try to complete the layer and test it before the week ends
17:59 < sumedhghaisas> Atharva: got distracted with something
17:59 < sumedhghaisas> Sure.
17:59 < sumedhghaisas> I still need to merge the Repar layer
18:00 < Atharva> Can you please read my message before the last one?
18:00 < sumedhghaisas> Sorry for being pedantic, but could you rebase the PR on master rather than merging? Merging just creates a complicated history
18:01 < sumedhghaisas> Mean squares loss is basically a log_prob loss with normal distribution
18:01 < Atharva> Okay, if that makes thing easier
18:02 < sumedhghaisas> and thats what we will provide as default, although with binary MNIST we will have to use bernoulli dist log_prob, thus distributions will make this easier
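sumedhghaisas's point that mean squared loss is a Gaussian log-prob in disguise can be checked numerically: with a fixed unit variance, the negative log-likelihood is 0.5 times the summed squared error plus a constant that does not depend on the prediction, so both yield the same gradients (a sketch; function names are illustrative):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// -log N(target | pred, 1), summed element-wise.
double NegLogLikelihood(const std::vector<double>& pred,
                        const std::vector<double>& target)
{
  const double pi = 3.141592653589793;
  double nll = 0.0;
  for (size_t i = 0; i < pred.size(); ++i)
  {
    nll += 0.5 * std::log(2.0 * pi)
        + 0.5 * (target[i] - pred[i]) * (target[i] - pred[i]);
  }
  return nll;
}

// Plain summed squared error between prediction and target.
double SquaredError(const std::vector<double>& pred,
                    const std::vector<double>& target)
{
  double se = 0.0;
  for (size_t i = 0; i < pred.size(); ++i)
    se += (target[i] - pred[i]) * (target[i] - pred[i]);
  return se;
}
```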
18:02 < sumedhghaisas> Now for sampling, what the user can do is define a distribution over the FFN.Predict function
18:02 < sumedhghaisas> and sample from it
18:03 < Atharva> Yeah
18:04 < Atharva> The dists will help, we just need to make some adjustments in FFN class
18:05 < sumedhghaisas> Not really, now that we no longer output a dist, the current framework will work; we will need to implement ReconstructionLoss, that's it
18:07 < Atharva> Okay, but we do need to change the Predict function for dists, right?
18:07 < sumedhghaisas> umm... No. Predict will output a matrix.
18:07 < sumedhghaisas> While sampling, we will define a Dist over the Predict output
18:08 < Atharva> I think we will need to make one detailed tutorial because we will be doing a lot of things externally and not in some class
18:09 < sumedhghaisas> A tutorial and a simple MNIST model in models :)
18:09 < Atharva> Yes, after that I hope we get some time for RNNs :)
18:09 < sumedhghaisas> Although this style of sampling is common across other frameworks as well
18:10 < Atharva> According to the planned timeline, the next week was for testing the VAE class, but we don't have that now; so by then everything else should be ready so that we can play with some VAE models after that
18:10 < sumedhghaisas> For RNNs we will need to find a dataset as well
18:11 < Atharva> Yeah, what do you say about a music dataset?
18:11 < sumedhghaisas> for generation?
18:11 < Atharva> Yeah, in RNNs
18:12 < sumedhghaisas> Thats a very hard task for VAEs
18:12 < sumedhghaisas> to be honest
18:12 < Atharva> Oh, okay, maybe something else then, we will decide later
18:13 < sumedhghaisas> yeah, maybe we could play around with the Reber grammar that exists currently
18:13 < sumedhghaisas> Let's see if we could generate a correct grammar with VAEs
18:13 < sumedhghaisas> that's a very interesting experiment...
18:14 < Atharva> That's interesting, working with models is going to be so much fun!
18:14 < sumedhghaisas> Although shouldn't be difficult
18:14 < Atharva> We also have to reproduce results from the papers
18:15 < sumedhghaisas> That would be the first task as soon as we get MNIST working
18:15 < sumedhghaisas> does the paper mention MNIST or Binary MNIST?
18:16 < Atharva> Yeah, about the ReconstructionLoss layer, do you have something specific that I should keep in mind while implementing
18:16 < Atharva> I will check
18:17 < sumedhghaisas> Not really. Have you understood the role that it plays?
18:17 < Atharva> Yes I have
18:17 < Atharva> Its forward function will return a double just like the other loss layers that we have
18:20 < Atharva> It will take in a matrix and then use the dist to define a distribution over it
18:20 < Atharva> The dist object will then have logprob and logprob backwards which the layer will use for the forward and backward functions
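The forward pass described above can be sketched like this (a hypothetical shape only, not mlpack's actual ReconstructionLoss API; the `UnitNormal` stand-in exists just to exercise the template):

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Tiny stand-in distribution with unit variance, just for the example;
// it is not mlpack's distribution class.
struct UnitNormal
{
  std::vector<double> mean;
  explicit UnitNormal(std::vector<double> m) : mean(std::move(m)) { }

  double LogProbability(const std::vector<double>& x) const
  {
    const double pi = 3.141592653589793;
    double lp = 0.0;
    for (size_t i = 0; i < x.size(); ++i)
    {
      lp += -0.5 * std::log(2.0 * pi)
          - 0.5 * (x[i] - mean[i]) * (x[i] - mean[i]);
    }
    return lp;
  }
};

// Take a plain matrix from the previous layer, define a distribution over
// it, and return -log p(target) as the loss (a double). The function name
// is illustrative.
template<typename DistType, typename MatType>
double ReconstructionLossForward(const MatType& input, const MatType& target)
{
  DistType dist(input);
  return -dist.LogProbability(target);
}
```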
18:22 < sumedhghaisas> Yup, yup and yup
18:22 < Atharva> We also need support for Bernoulli dist
18:23 < sumedhghaisas> that's for later :)
18:23 < Atharva> Okay
18:23 < Atharva> I will get on this then
18:23 < sumedhghaisas> Let's go with pure MNIST now
18:23 < Atharva> Yeah
19:58 -!- ImQ009 [~ImQ009@unaffiliated/imq009] has quit [Quit: Leaving]
19:59 < ShikharJ> zoq: Are you there?
20:00 < zoq> ShikharJ: yes
20:00 -!- sulan_ [] has quit [Quit: Leaving]
20:02 < ShikharJ> zoq: I had a theoretical doubt. Let's say we have 4 3x3 input points. So its shape becomes 3x3x4; let it convolve with a 3x3x1 kernel, so that the output now is 1x1x4, for the 4 inputs.
20:03 < ShikharJ> zoq: Now when I'm computing the gradients for the above operation in the Gradient method, should I compute 3x3x4 gradients pertaining to the 4 inputs or something else has to be done?
20:04 < ShikharJ> zoq: Because our kernel size is just 3x3x1?
20:09 < zoq> The gradient has to be calculated for each input separately and you take the sum over the gradients at the end, so theoretically you could just write a for loop around everything and take the sum at the end; but since this is slow, we should see if we can vectorize the operation.
20:09 < zoq> Here is a really simple example:
20:11 < ShikharJ> zoq: Ah, so I calculate the 3x3x4 sized gradients (for 4 inputs) and then reduce it to 3x3x1 by summing, is that right?
20:11 < zoq> correct
20:11 < ShikharJ> zoq: Summing along slices?
20:12 < zoq> yeah, if we use the cube representation
20:13 < ShikharJ> zoq: Ah, that cleared a lot on how convolutions actually work, for me. Thanks for the help!
20:13 < zoq> ShikharJ: Here to help.
20:17 < zoq> A simple test could check if the output is the same for two separate runs (two inputs) and a single run with the two inputs combined.
20:18 < ShikharJ> zoq: Ah, yes, we can try that out as well.
20:19 < ShikharJ> zoq: The implementation for Batch Support on Convolutional Layers is nearly complete, we can test after I push the code.
20:20 < zoq> ShikharJ: awesome
21:57 -!- witness_ [uid10044@gateway/web/] has quit [Quit: Connection closed for inactivity]
22:26 < ShikharJ> zoq: I think I also found a bug in the Gradients method of convolution_impl.cpp, though I'll be needing you to review it, pushing the code for now.
22:39 < ShikharJ> zoq: I also posted the results for DCGAN MNIST test on the full dataset on the PR!
23:00 < zoq> ShikharJ: Are you going to open another PR or do we use the DCGAN PR?
23:02 < ShikharJ> zoq: For the bug, I'll push to the BatchSupport PR. DCGAN should be good to go for MNIST, but I still need to get it running for CelebA, which I think should benefit from the BatchSupport PR improvements.
23:06 < zoq> ShikharJ: Okay, sounds fine for me.
23:19 < ShikharJ> zoq: Pushed in the changes.
23:26 < zoq> ShikharJ: Nice, does this one incorporate the fix for the gradient function? Not sure I see the issue.
--- Log closed Fri Jun 15 00:00:04 2018