mlpack IRC logs, 2018-07-17

Logs for the day 2018-07-17 (starts at 0:00 UTC) are shown below.

July 2018
--- Log opened Tue Jul 17 00:00:50 2018
00:25 -!- robertohueso [~roberto@] has left #mlpack []
01:17 -!- vivekp [~vivek@unaffiliated/vivekp] has quit [Ping timeout: 248 seconds]
01:20 -!- vivekp [~vivek@unaffiliated/vivekp] has joined #mlpack
02:40 -!- caiojcarvalho [~caio@2804:18:7809:3bd0:899f:fafb:ba6d:b404] has joined #mlpack
03:37 -!- travis-ci [] has joined #mlpack
03:37 < travis-ci> manish7294/mlpack#2 (evalBounds - 457980e : Manish): The build passed.
03:37 -!- travis-ci [] has left #mlpack []
05:13 -!- cjlcarvalho [~caio@] has joined #mlpack
05:14 -!- caiojcarvalho [~caio@2804:18:7809:3bd0:899f:fafb:ba6d:b404] has quit [Ping timeout: 276 seconds]
05:23 -!- caiojcarvalho [~caio@2804:18:7002:3604:5779:df2a:51ba:848f] has joined #mlpack
05:23 -!- cjlcarvalho [~caio@] has quit [Ping timeout: 260 seconds]
05:49 -!- vivekp [~vivek@unaffiliated/vivekp] has quit [Ping timeout: 240 seconds]
05:50 -!- vivekp [~vivek@unaffiliated/vivekp] has joined #mlpack
07:16 -!- lozhnikov [] has quit [Ping timeout: 240 seconds]
07:32 -!- lozhnikov [] has joined #mlpack
07:50 -!- lozhnikov [] has quit [Ping timeout: 276 seconds]
07:52 -!- cjlcarvalho [~caio@] has joined #mlpack
07:53 -!- caiojcarvalho [~caio@2804:18:7002:3604:5779:df2a:51ba:848f] has quit [Ping timeout: 256 seconds]
07:58 -!- lozhnikov [] has joined #mlpack
08:17 -!- cjlcarvalho [~caio@] has quit [Ping timeout: 240 seconds]
08:17 -!- cjlcarvalho [~caio@] has joined #mlpack
08:21 -!- lozhnikov [] has quit [Ping timeout: 260 seconds]
08:46 -!- vivekp [~vivek@unaffiliated/vivekp] has quit [Ping timeout: 244 seconds]
08:48 -!- manish7294 [~yaaic@2405:205:2217:af0f:19f8:8fe8:69dc:ae3b] has joined #mlpack
08:48 -!- vivekp [~vivek@unaffiliated/vivekp] has joined #mlpack
08:50 -!- manish7294 [~yaaic@2405:205:2217:af0f:19f8:8fe8:69dc:ae3b] has quit [Client Quit]
08:51 -!- manish7294 [8ba71174@gateway/web/freenode/ip.] has joined #mlpack
08:55 < manish7294> rcurtin: Got a copy of your mail from the mailing list, looks like you're already up. I want to discuss the structure of BoostMetric. Do you think it is a good time for that?
08:56 < rcurtin> manish7294: I am in talks all day, so I think maybe it would be best to use email for that
08:56 < manish7294> sure :)
08:56 < rcurtin> I won't really be able to devote much time at the moment, only quick responses, etc.
08:57 < manish7294> no problem
08:57 < rcurtin> I'm waking up roughly 7 UTC this week so our awake times overlap much more than usual :)
08:59 < manish7294> Just if you have a few seconds to spare: do you think we would be able to use the existing optimizers for BoostMetric?
08:59 < manish7294> Looking at the algorithm, I think we have to build it from scratch.
09:00 < rcurtin> I'll take a look when I have a second and respond then, thanks for the direct PDF link :)
09:00 < manish7294> sure
09:01 < rcurtin> do you mean for Algorithm 2?
09:03 -!- manish72942 [~yaaic@2405:205:2315:3c87:19f8:8fe8:69dc:ae3b] has joined #mlpack
09:04 < manish72942> Right, sorry for the delay --- lost the connection
09:04 -!- manish7294 [8ba71174@gateway/web/freenode/ip.] has quit [Ping timeout: 252 seconds]
09:05 < rcurtin> this will take me a little time to think about
09:05 < rcurtin> I will try and have an answer later today
09:06 < manish72942> no need to hurry :)
09:24 < rcurtin> manish72942: I haven't had time to really give it a good look, but my instinct is, the BoostMetric paper claims both speedup and accuracy boost over LMNN
09:24 < rcurtin> so, if you want to devote time now (if it will not take too long), you could implement it by itself and we could see if both of those claims are true
09:24 < rcurtin> I am not sure if speedup will still be obtained given our optimized implementation (which still has some further optimization to go)
09:26 < manish72942> like a rough implementation as given in the algorithm, without any optimizer or anything, right?
09:27 < manish72942> If that's so, I am on my way.
09:28 < rcurtin> right, I think that is reasonable, but the more important thing here will be trying to reproduce the results of their paper
09:29 < rcurtin> I want to make sure that in the time we have left, we get something interesting
09:29 < rcurtin> so really the situation I want to try to avoid is an incompletely optimized LMNN and then we find out that BoostMetric does not consistently give speedup or improved accuracy over LMNN
09:30 < rcurtin> fully optimized LMNN is interesting by itself, and also interesting is fast BoostMetric with some of the LMNN optimizations
09:30 < rcurtin> I need to read the BoostMetric paper in full (I have not had a chance to do that, sorry; I have been focused on LMNN)
10:16 < rcurtin> manish72942: more about the runtime results in BoostMetric:
10:16 < rcurtin> (1) the paper claims that for each iteration of LMNN, a projection of M back onto the PSD cone is needed, which costs an O(d^3) eigendecomposition
10:17 < rcurtin> however, in our implementation, since we are optimizing L (where M = L^T L) directly, M is always guaranteed to be PSD so we do not ever need to take that step
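The L^T L trick rcurtin describes can be sanity-checked numerically. A minimal pure-Python sketch (illustrative only, not mlpack code): for any matrix L and vector x, x^T (L^T L) x = (Lx)^T (Lx) = ||Lx||^2 >= 0, so M = L^T L is positive semidefinite by construction and the O(d^3) eigendecomposition-based projection back onto the PSD cone is never needed.

```python
import random

def matvec(A, x):
    # Multiply matrix A (list of rows) by vector x.
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

random.seed(42)
d = 4
# An arbitrary (even non-symmetric, non-square-root-looking) L.
L = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(d)]

for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(d)]
    Lx = matvec(L, x)
    quad = dot(Lx, Lx)   # equals x^T (L^T L) x
    assert quad >= 0.0   # never negative: M = L^T L is PSD
```

The same identity is why gradient steps taken on L (rather than on M) can never leave the PSD cone, no matter what the optimizer does.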
10:17 < rcurtin> (2) the paper points out that their implementation was in MATLAB, and that further speedup could be seen in C/C++
10:18 < rcurtin> to me, this almost guarantees they used the MATLAB implementation of LMNN, which we already know to be inefficient since it computes the full distance matrix
10:18 < rcurtin> so, an "efficient" implementation of BoostMetric may behave entirely differently than their results (with respect to speed at least)
10:18 < manish72942> Ya, they have even referenced it in their implementation
10:18 < rcurtin> so I don't mean to say BoostMetric is bad or anything, of course---I just mean that we can't be sure of exactly what we will encounter with respect to speed
10:19 < rcurtin> do you think you would rather implement BoostMetric or keep working on the LMNN optimizations? (or perhaps you feel that you can do both in parallel?)
10:21 < manish72942> from your above comments it looks like we have already done all these optimizations, so we shouldn't be expecting much from this. But still, let's give it a shot; maybe today itself I will try to work it out.
10:22 < rcurtin> I think it's still completely possible that all of our LMNN optimizations will apply to BoostMetric
10:22 < rcurtin> and if it doesn't fit exactly into the optimizer API, that's ok---after all, our existing AdaBoost implementation does not either
10:23 < rcurtin> but as far as any paper goes, we can say, e.g., "we have provided order-of-magnitude+ speedups to LMNN and expect that these would be applicable to LMNN derivatives such as BoostMetric, PSMetric, etc."
10:23 < manish72942> I will try to make a rough implementation today and will see, whether it's any good continuing working on boost metric.
10:24 < manish72942> agreed, we are at least in position to claim that
10:27 < rcurtin> but I don't think we can persuasively say, e.g., "we got a little bit of speedup for LMNN and also implemented BoostMetric but roughly only see the same results as the BoostMetric paper" :)
10:27 < rcurtin> anyway, yeah, that sounds good, let's see what the rough implementation does
10:31 < manish72942> :)
10:56 -!- lozhnikov [] has joined #mlpack
11:02 -!- lozhnikov [] has quit [Ping timeout: 264 seconds]
11:22 -!- lozhnikov [] has joined #mlpack
11:31 -!- lozhnikov_ [] has joined #mlpack
11:32 -!- lozhnikov [] has quit [Ping timeout: 240 seconds]
11:33 -!- lozhnikov_ [] has quit [Client Quit]
11:34 < ShikharJ> lozhnikov: zoq: I have tried debugging the RBM PR to an extent, but I'm unable to get the test accuracy up. Could you guys take a look at the code please?
11:35 -!- lozhnikov [] has joined #mlpack
11:52 -!- lozhnikov [] has quit [Ping timeout: 256 seconds]
12:17 < zoq> ShikharJ: Can you narrow down the issue to some part of the code? I'll take a look at the code later today, but I think it would be helpful to get some additional information, maybe you can tell us what you already tried?
13:47 -!- cjlcarvalho [~caio@] has quit [Ping timeout: 248 seconds]
13:59 -!- ImQ009 [~ImQ009@unaffiliated/imq009] has joined #mlpack
14:06 < sumedhghaisas> Atharva: Hi Atharva
14:06 < sumedhghaisas> Hows it going?
14:07 < sumedhghaisas> Did the model work with MeanSquaredError?
14:08 < ShikharJ> zoq: I have added the support for mini-batches, but I'm doubtful of the usefulness of the design (I'd refactor the entire PR to use SFINAE + enable_if<>). Plus I'm not sure where the FreeEnergy function of SSRBM originates from. That, and a number of issues arose while working with mini-batch inputs, since most of the code was designed with a single input in mind, but the tests make use of mini-batches.
14:20 < ShikharJ> zoq: Most of the rest of the code is correct, but these issues are likely to be the cause of trouble. More specifically, the ones relating to the updating of gradients.
14:45 -!- cjlcarvalho [] has joined #mlpack
14:59 < zoq> ShikharJ: I guess it would make sense to switch back to the single-input case if that might cause some issues; I don't think training over mini-batches is that important, at least at this point.
15:00 < zoq> Also, this sounds like we should start with the free energy function.
15:01 < ShikharJ> zoq: I tried augmenting the test-cases for single inputs, but even there the accuracy is not good, so there's probably a problem with our Evaluate-Gradient routines.
15:03 < zoq> ShikharJ: Okay, we should check the gradients for some steps, perhaps we see some strange values (inf, zeros).
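The kind of gradient check zoq suggests can be done generically with central differences, flagging the "strange values" (inf, NaN, all-zeros) that often explain poor accuracy. A hypothetical pure-Python sketch (the function names here are made up, not mlpack's test API):

```python
import math

def numerical_gradient(f, x, h=1e-5):
    # Central-difference estimate of df/dx_i at x.
    grad = []
    for i in range(len(x)):
        xp = list(x); xp[i] += h
        xm = list(x); xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad

def gradients_ok(analytic, numeric, tol=1e-4):
    for a, n in zip(analytic, numeric):
        if math.isinf(a) or math.isnan(a):
            return False   # strange value: inf/NaN
        if abs(a - n) > tol * max(1.0, abs(a) + abs(n)):
            return False   # analytic and numeric disagree
    if all(a == 0.0 for a in analytic):
        return False       # an all-zero gradient is usually a bug too
    return True

# Example: f(x) = x0^2 + 3*x1 has analytic gradient [2*x0, 3].
f = lambda x: x[0] ** 2 + 3 * x[1]
x = [1.5, -2.0]
assert gradients_ok([2 * x[0], 3.0], numerical_gradient(f, x))
```

Running a check like this on a few Evaluate/Gradient steps of the RBM objective would quickly separate "wrong gradient math" from "wrong batch handling".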
15:06 -!- caiojcarvalho [] has joined #mlpack
15:10 -!- cjlcarvalho [] has quit [Ping timeout: 268 seconds]
15:23 -!- jenkins-mlpack [~PircBotx@] has quit [Ping timeout: 256 seconds]
16:00 -!- manish7294 [8ba70051@gateway/web/freenode/ip.] has joined #mlpack
16:03 < manish7294> rcurtin: Here's a rough implementation, but it seems the binary search part takes an indefinite amount of time (probably something is wrong); if you get some time, please have a look at it -
16:37 < manish7294> hmm, it looks like the terminating condition for bisection given in the paper and in the implementation differ by an extra condition (abs(lhs) < EPS). Now, after adding that condition, it seems BoostMetric is superfast :)
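The extra stopping condition manish7294 mentions can be illustrated with a generic bisection sketch (hypothetical code, not the actual BoostMetric implementation; it assumes the bracketed function is increasing on [lo, hi]). Terminating on either a small bracket or a small residual |lhs| returns as soon as the current point is already good enough, instead of grinding the interval all the way down:

```python
EPS = 1e-10

def bisect(f, lo, hi, eps=EPS, max_iter=200):
    # Find a root of f on [lo, hi], assuming f(lo) < 0 < f(hi).
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        lhs = f(mid)
        if abs(lhs) < eps:   # the extra terminating condition
            return mid
        if hi - lo < eps:    # bracket has collapsed
            return mid
        if lhs > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

root = bisect(lambda w: w * w - 2.0, 0.0, 2.0)   # converges to sqrt(2)
```

In BoostMetric the bisection solves a scalar equation for the step weight at each boosting iteration, so a loop that never triggers its exit test would look exactly like the "indefinite time" behaviour described above.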
16:38 < manish7294> The main reason I could think of is --- it doesn't recalculate impostors at every iteration.
16:49 -!- vivekp [~vivek@unaffiliated/vivekp] has quit [Ping timeout: 264 seconds]
16:50 -!- vivekp [~vivek@unaffiliated/vivekp] has joined #mlpack
17:01 -!- xa0 [~zeta@unaffiliated/uoy] has quit [Ping timeout: 256 seconds]
17:02 -!- manish7294 [8ba70051@gateway/web/freenode/ip.] has quit [Ping timeout: 252 seconds]
17:06 -!- manish72942 [~yaaic@2405:205:2315:3c87:19f8:8fe8:69dc:ae3b] has quit [Ping timeout: 265 seconds]
17:07 -!- xa0 [~zeta@unaffiliated/uoy] has joined #mlpack
17:19 -!- xa0 [~zeta@unaffiliated/uoy] has quit [Ping timeout: 244 seconds]
17:26 -!- robertohueso [~roberto@] has joined #mlpack
17:26 -!- xa0 [~zeta@unaffiliated/uoy] has joined #mlpack
17:31 < rcurtin> manish7294: great to hear it's fast, can you get some timings/accuracy reports on different datasets?
17:32 < rcurtin> if the issue is that it's not calculating impostors, we could also have a variant of LMNN where we don't recalculate impostors, and see what the performance there is
17:32 < rcurtin> it may also be implicit in their paper that impostors need to be recalculated, so maybe their implementation recalculated impostors but the paper didn't make it clear that needed to be done
17:35 -!- manish7294 [8ba7fddd@gateway/web/freenode/ip.] has joined #mlpack
17:37 < manish7294> rcurtin: Here's the original implementation, and I don't think they ever recalculate impostors (they just do it once when calculating knn_triplets)
17:41 < rcurtin> I don't really have time to look into the implementation, I am just offering possibilities for the speedup
17:41 < rcurtin> it will be interesting to see the accuracy results, and then we should also compare with LMNN where we never recalculate impostors
17:42 < manish7294> no worries, I will post them soon :)
17:43 < rcurtin> sure, sounds good
18:25 < manish7294> rcurtin: Here are some simulations :
18:26 -!- manish7294 [8ba7fddd@gateway/web/freenode/ip.] has quit [Quit: Page closed]
19:26 < Atharva> zoq: You there?
19:35 < zoq> Atharva: I'm here now.
19:38 < Atharva> zoq: I have realised that serialising parameters of the Sequential layer does not work: as the Sequential layer is just a container, its parameter object is empty.
19:39 < Atharva> Instead, I propose a different solution to access the encoder and decoder of a network separately, which I also think might be useful in other cases.
19:41 < Atharva> What do you think about a ForwardPartial() function in the FFN class which takes the input and output matrices and the indices of the starting and ending layers within the network to forward pass through?
19:41 < Atharva> I implemented it locally and it saved a lot of effort when working with the decoder and encoder separately
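The idea Atharva proposes can be sketched generically: a forward pass that only runs a slice of the network's layers, so an encoder or decoder embedded in one network can be used on its own. This is a hypothetical Python sketch with made-up toy layers, not mlpack's FFN API:

```python
class Scale:
    """Toy layer: multiply every element by a constant."""
    def __init__(self, s): self.s = s
    def forward(self, x): return [self.s * v for v in x]

class Shift:
    """Toy layer: add a constant to every element."""
    def __init__(self, b): self.b = b
    def forward(self, x): return [v + self.b for v in x]

class Network:
    def __init__(self, layers):
        self.layers = layers

    def forward(self, x, begin=0, end=None):
        # Run only layers [begin, end] (inclusive); defaults to the
        # whole network, so the full pass is just a special case.
        if end is None:
            end = len(self.layers) - 1
        for layer in self.layers[begin:end + 1]:
            x = layer.forward(x)
        return x

net = Network([Scale(2.0), Shift(1.0), Scale(3.0)])
full = net.forward([1.0])                          # ((1*2)+1)*3 = [9.0]
encoder_only = net.forward([1.0], begin=0, end=1)  # (1*2)+1 = [3.0]
```

Making the full pass the default of the same function (rather than a separate ForwardPartial) matches zoq's suggestion of simply overloading Forward.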
19:42 < zoq> Atharva: You are right, but using 'ar & BOOST_SERIALIZATION_NVP(network);' should call the serialize function of each layer; at the end we still have to implement the reset function, though.
19:43 < zoq> Atharva: hm, that is an interesting idea, do you think we could provide another Forward function that does the same?
19:43 < Atharva> zoq: Okay, so do you want this function to be called just Forward instead of ForwardPartial ?
19:44 < zoq> Atharva: If you think that is reasonable, I think it looks cleaner.
19:45 < Atharva> zoq: Yes, it will call the serialize function of each layer, but then no layer serializes its parameters in the serialize function. So the trained parameters never get saved individually.
19:45 < Atharva> zoq: Okay! I will create a new PR then.
19:45 < zoq> Atharva: Right, you still have to collect the parameters in the reset function; your solution sounds much simpler.
20:23 -!- ImQ009 [~ImQ009@unaffiliated/imq009] has quit [Quit: Leaving]
21:15 < zoq> ShikharJ: Do you use 'RBMNetworkTest/ssRBMClassificationTest' for testing?
21:20 -!- lozhnikov [] has joined #mlpack
21:27 < Atharva> zoq: I just posted a blog post, but the website isn't getting updated.
21:50 < zoq> Atharva: hm, I wonder if changing the date from 2018-07-10 to 2018-07-17 will fix the issue.
21:54 < Atharva> zoq: Oh sorry, I didn't change the date when I copied it.
22:06 < zoq> okay, that doesn't fix the issue, I'll look into it tomorrow.
22:16 < ShikharJ> zoq: Yes.
22:19 < zoq> ShikharJ: I get the following error:
22:19 < zoq> error: as_scalar(): expression doesn't evaluate to exactly one element
22:19 < zoq> unknown location:0: fatal error: in "RBMNetworkTest/ssRBMClassificationTest": std::logic_error: as_scalar(): expression doesn't evaluate to exactly one element
22:20 < ShikharJ> zoq: Ah, the ssRBM needs to be changed a little for batch support I guess.
22:21 < ShikharJ> zoq: I'll push in a few changes in an hour or so, probably that would fix this as well.
22:21 < zoq> ShikharJ: Okay, thanks!
22:21 < ShikharJ> zoq: The bigger issue lies with BinaryRBM code.
22:23 < ShikharJ> zoq: On my system, SSRBM is still giving about 74% accuracy, while BinaryRBM is just a notch above 65%.
22:24 < zoq> For the binary test I get: error: addition: incompatible matrix dimensions: 100x10 and 100x1, sounds like some batch size issue.
22:26 < ShikharJ> zoq: Have you pulled in the latest code from the branch?
22:28 < ShikharJ> zoq: Because these issues were there in previous versions of the code, which at least built and ran fine for now? What CMake configuration are you using?
22:28 < zoq> last commit is 98b5fc04d, which I think is the latest version
22:30 < zoq> travis ends up with the same error
22:30 < ShikharJ> zoq: Ok, the commit seems to be fine, I'll look into this as well. Thanks for letting me know.
22:59 -!- lozhnikov [] has quit [Ping timeout: 240 seconds]
--- Log closed Wed Jul 18 00:00:51 2018