mlpack IRC logs, 2018-05-11

Logs for the day 2018-05-11 (starts at 0:00 UTC) are shown below.

--- Log opened Fri May 11 00:00:14 2018
03:44 -!- govg [~govg@unaffiliated/govg] has joined #mlpack
05:52 -!- travis-ci [] has joined #mlpack
05:52 < travis-ci> ShikharJ/mlpack#132 (AtrousConv - acf54f4 : aarushgupta): The build has errored.
05:52 < travis-ci> Change view :^...acf54f4614e1
05:52 < travis-ci> Build details :
05:52 -!- travis-ci [] has left #mlpack []
09:18 < ShikharJ> rcurtin: The Jenkins build for the Layer Normalization PR is seemingly stuck; when can we expect the queue to clear up?
10:16 -!- Atharva [uid288001@gateway/web/] has joined #mlpack
10:31 < zoq> let's stop/restart the monthly matrix build
12:26 -!- Atharva [uid288001@gateway/web/] has quit [Quit: Connection closed for inactivity]
13:03 < rcurtin> sure, let me kill it
13:04 < rcurtin> ah, actually, looks like you already did it :)
14:13 -!- ImQ009 [~ImQ009@unaffiliated/imq009] has joined #mlpack
16:21 -!- govg [~govg@unaffiliated/govg] has quit [Ping timeout: 246 seconds]
16:27 -!- vivekp [~vivek@unaffiliated/vivekp] has quit [Ping timeout: 240 seconds]
16:28 -!- vivekp [~vivek@unaffiliated/vivekp] has joined #mlpack
17:45 -!- manish7294 [9d25307f@gateway/web/cgi-irc/] has joined #mlpack
17:56 < manish7294> rcurtin: I gave another shot to both the single-pass KNN and the multi-pass KNN (BoostMetric implementation) constraint generation implementations. Here's a quick insight: data points: 5000. Single pass: k=3 -> 272.695 secs, k=40 -> 299.084 secs. Multi pass: k=3 -> 7.2961 secs, k=40 -> like forever (it has already been more than 3 hours and it is still running with high CPU usage; I will probably have to kill the process)
17:58 < rcurtin> manish7294: I haven't had a chance to look into your response yet but it sounds like the MATLAB implementation is horrifyingly slow
17:58 < rcurtin> not sure what is up with the multipass algorithm you are using but I virtually guarantee a good implementation will be way faster than singlepass KNN with k = N - 1
18:00 < rcurtin> the dimensionality of the data you are using will make a difference, so if you are working in 10k dimensions, brute force will typically be faster than trees
18:00 < rcurtin> but I don't think that is the regime you are working in
18:02 < manish7294> rcurtin: On the other hand, if you look at it like this: e.g., say we have 3 classes with 20, 2, 10 instances; then in multipass we will be calculating KNN for k = 30, 22, 12, whereas we can reduce the single-pass KNN to just run for k = 30
18:05 < manish7294> let the smallest number of instances in a class be 'X'; then we just need to run KNN for N - X + K, which is similar to at least one of the passes of multi KNN
18:06 -!- manish7294 [9d25307f@gateway/web/cgi-irc/] has left #mlpack []
18:07 -!- manish7294 [9d25307f@gateway/web/freenode/ip.] has joined #mlpack
18:16 < rcurtin> manish7294: sorry, I think there is a misunderstanding. if you run multipass KNN (if that's what we should call it), each run you set k = k, not k = n - 1
18:16 < rcurtin> where 'k' is however many impostors or same-labeled neighbors you are looking for
18:16 < rcurtin> also, keep in mind, the interesting regime for BoostMetric will not be 100-point datasets... it will be 100k or 1M or 10M point datasets
18:17 < rcurtin> basically, we should be focusing on seeing how much we can make it scale on a single system with mlpack
18:21 < manish7294> rcurtin: Ahhh! My bad. Tomorrow, I will try implementing your idea. Hopefully we will get better results. Thanks!
18:26 < rcurtin> sure, let me know if there is anything else I can clarify :)
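
The multipass idea rcurtin describes above (one search per class, each with k = k rather than k = n - 1) can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `knn` and `multipass_constraints` are hypothetical, the search is plain brute force in pure Python rather than mlpack's tree-based neighbor search, and it is not the code either participant was running. For each class, the points of that class are queried once against the same-class points (target neighbors) and once against all differently-labeled points (impostors).

```python
import math
from collections import defaultdict


def knn(queries, references, k):
    """Brute-force search: for each query, return the indices (into
    `references`) of its k nearest references by Euclidean distance."""
    result = []
    for q in queries:
        order = sorted(range(len(references)),
                       key=lambda j: math.dist(q, references[j]))
        result.append(order[:k])
    return result


def multipass_constraints(points, labels, k):
    """Hypothetical helper: one pair of KNN searches per class.

    Returns two dicts keyed by point index:
      targets[i]   -> up to k nearest points with the same label as i
      impostors[i] -> up to k nearest points with a different label
    """
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    targets, impostors = {}, {}
    for y, same in by_class.items():
        diff = [i for i in range(len(points)) if labels[i] != y]
        same_pts = [points[i] for i in same]
        diff_pts = [points[i] for i in diff]

        # Ask for k + 1 same-class neighbors because each query point
        # is also in the reference set and is dropped afterwards.
        same_nn = knn(same_pts, same_pts, k + 1)
        # Impostor search: k stays k, it never grows toward n - 1.
        diff_nn = knn(same_pts, diff_pts, k)

        for pos, i in enumerate(same):
            # Map local positions back to indices in `points`.
            targets[i] = [same[j] for j in same_nn[pos] if same[j] != i][:k]
            impostors[i] = [diff[j] for j in diff_nn[pos]]
    return targets, impostors


if __name__ == "__main__":
    # Toy one-dimensional data with two classes (made up for illustration).
    pts = [(0.0,), (1.0,), (3.0,), (10.0,), (11.0,), (13.0,)]
    lbl = [0, 0, 0, 1, 1, 1]
    t, imp = multipass_constraints(pts, lbl, k=1)
    print("targets:", t)
    print("impostors:", imp)
```

The number of searches grows with the number of classes, but each one only asks for k neighbors, which is the property the discussion above relies on. At the 100k to 10M point scale mentioned above, the brute-force `knn` here would presumably be replaced by a tree-based search.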
18:35 -!- manish7294 [9d25307f@gateway/web/freenode/ip.] has quit [Ping timeout: 260 seconds]
21:21 -!- ImQ009 [~ImQ009@unaffiliated/imq009] has quit [Quit: Leaving]
--- Log closed Sat May 12 00:00:15 2018