So rather than a regression it's more like the "line simplification" problem in graphics: (https://bost.ocks.org/mike/simplify/ https://en.wikipedia.org/wiki/Ramer%E2%80%93Douglas%E2%80%93...)
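For anyone who hasn't seen it, here is a minimal sketch of Ramer-Douglas-Peucker in plain Python. The names `rdp`, `perpendicular_distance` and `epsilon` are my own, and it measures distance to the infinite line through the endpoints rather than to the segment, which is a common simplification.

```python
import math

def perpendicular_distance(p, a, b):
    """Distance from point p to the infinite line through a and b."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0.0:
        # a and b coincide, so fall back to point-to-point distance
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / length

def rdp(points, epsilon):
    """Keep only the points needed to stay within epsilon of the original curve."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= epsilon:
        # Everything in between is close enough to the chord: drop it
        return [first, last]
    # Otherwise keep the farthest point and simplify both halves
    left = rdp(points[: index + 1], epsilon)
    right = rdp(points[index:], epsilon)
    return left[:-1] + right

points = [(0, 0), (1, 0.05), (2, 0), (3, 3), (4, 6)]
print(rdp(points, 0.5))  # [(0, 0), (2, 0), (4, 6)]
```

The kept points are the candidate split locations; `epsilon` plays the role of the error tolerance.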
This solution seems like a bit of overkill to me. Surely you can pick some error metric over the splits to optimize instead?
In my group at Arm there's a solid expectation that we'll see neural networks integrated into every part of a running application, and whether they execute on special NN processors or the general-purpose CPU will largely depend on where the data is needed.
I cut a very long comment short and wrote the rest of this up here: http://yieldthought.com/post/170830096265/when-are-neural-ne...
DP[i][j] = min over k of (DP[i][k] + (cost of splitting at k) + (linear regression error of points from kth to jth))
This should run in O(n^3), which will be fine for the author's requirement of ~100 points. But this isn't a complete solution, since it's not obvious how to choose the cost of splitting (which is needed, because otherwise it will just split everything into 1- or 2-point segments).
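A rough sketch of that recurrence in plain Python, under a few assumptions of mine: it uses a one-dimensional table `best[j]` (cheapest segmentation of the first j points) instead of DP[i][j], the error is the sum of squared residuals of a least-squares line, and `split_cost` is a hand-picked constant. The names `fit_error` and `segment` are made up for illustration.

```python
def fit_error(xs, ys):
    """Sum of squared residuals of the least-squares line through the points."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return sum((y - my - slope * (x - mx)) ** 2 for x, y in zip(xs, ys))

def segment(xs, ys, split_cost):
    """Return (start, end) index pairs, end exclusive, of the cheapest segmentation."""
    n = len(xs)
    best = [0.0] + [float("inf")] * n  # best[j]: min cost of the first j points
    prev = [0] * (n + 1)               # prev[j]: where the last segment starts
    for j in range(1, n + 1):
        for k in range(j):
            # Last segment covers points k..j-1; O(n) fit inside O(n^2) pairs
            cost = best[k] + split_cost + fit_error(xs[k:j], ys[k:j])
            if cost < best[j]:
                best[j], prev[j] = cost, k
    # Walk the back-pointers to recover the segments
    bounds, j = [], n
    while j > 0:
        bounds.append((prev[j], j))
        j = prev[j]
    return bounds[::-1]

xs = list(range(10))
ys = [0, 1, 2, 3, 4, 4, 3, 2, 1, 0]  # rises, then falls
print(segment(xs, ys, split_cost=1.0))    # [(0, 5), (5, 10)]
print(segment(xs, ys, split_cost=100.0))  # [(0, 10)]
```

The two calls show the point about the splitting cost: the same data gives one segment or two depending purely on how expensive a split is.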
I think that putting more thought into this and explicitly designing the cost function is still better than labeling a bunch of data until the machine learning algorithm can reverse engineer the cost function from your head. Then you can be confident of what your code is doing and why, and know that it won't randomly output potato.
For example, if you were building a model that spits out 0 for no split and 1 for split, you could easily make a simple cost-sensitive linear model that takes previous decisions into account (something like an HMM).
The Viterbi algorithm would be the DP step.
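To make that concrete, here is a minimal Viterbi sketch in plain Python for the two-state (0 = no split, 1 = split) case. It works with costs (think negative log-probabilities) rather than probabilities, and the cost numbers in the example are invented purely for illustration; in a real model they would come from the fitted HMM or linear model.

```python
def viterbi(emit, trans):
    """Cheapest state sequence.

    emit[t][s]:  cost of being in state s at step t
    trans[p][s]: cost of moving from state p to state s
    """
    n_states = len(trans)
    cost = list(emit[0])
    back = []  # back[t-1][s]: best predecessor of state s at step t
    for t in range(1, len(emit)):
        new_cost, pointers = [], []
        for s in range(n_states):
            p = min(range(n_states), key=lambda q: cost[q] + trans[q][s])
            new_cost.append(cost[p] + trans[p][s] + emit[t][s])
            pointers.append(p)
        cost = new_cost
        back.append(pointers)
    # Start from the cheapest final state and follow the back-pointers
    state = min(range(n_states), key=lambda q: cost[q])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Made-up per-step costs: steps 1 and 2 both look a bit like splits
emit = [[0.0, 2.0], [1.0, 0.0], [0.5, 0.0], [0.0, 2.0]]
no_penalty = [[0.0, 0.0], [0.0, 0.0]]
penalize_double_split = [[0.0, 0.0], [0.0, 3.0]]  # split right after a split costs 3

print(viterbi(emit, no_penalty))             # [0, 1, 1, 0]
print(viterbi(emit, penalize_double_split))  # [0, 1, 0, 0]
```

The transition costs are where "takes into account previous decisions" lives: penalizing back-to-back splits changes the decoded sequence even though the per-step costs are the same.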
For some reason unknown to me, a CNN performs as well as, if not better than, a DP-based HMM.
Maybe the author of the app had other uses for CNNs in mind for other features in the future.
So if you account for that, why have individual analytical solutions when you can solve a whole bunch of problems with one cognitive approach?
> I think of convolution as code reuse for neural networks. A typical fully-connected layer has no concept of space and time. By using convolutions, you’re telling the neural network it can reuse what it learned across certain dimensions.
The diagram is great too: https://attardi.org/pytorch-and-coreml#convolution
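A tiny plain-Python illustration of that "code reuse" idea, assuming a one-dimensional signal and a hypothetical two-weight kernel (strictly this is cross-correlation, which is what deep learning frameworks usually call convolution):

```python
def conv1d(signal, kernel):
    """Slide the same kernel weights over every position of the signal."""
    k = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, signal[i:i + k]))
        for i in range(len(signal) - k + 1)
    ]

edge_detector = [-1, 1]  # two weights, reused at every position

print(conv1d([0, 0, 5, 5, 0], edge_detector))  # [0, 5, 0, -5]
print(conv1d([0, 5, 5, 0, 0], edge_detector))  # [5, 0, -5, 0]
```

The same two weights find the step wherever it occurs, and shifting the input just shifts the output. A fully-connected layer producing those four outputs from five inputs would need 20 separate weights and would have to learn the pattern again at each position.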
I’ve been tracking the performance of my mechanical watch myself for over a year now. After some experimentation I’ve settled on taking a burst of pictures of the watch hands at the exact minute with my iPhone camera and reading the EXIF data for exact timing. This solves quite a few logistical problems with the measurements.
From my point of view, spending time designing an automatic ML solution for something that is caused by the watch owner and can be easily identified is less worthwhile than, for instance, automating the measurements themselves as described above.
If the author is interested in moving in that direction, I’d be happy to share my experience directly.
Otherwise, good luck and keep us posted.
What would be the challenges in using the camera to identify the time on the watch face?
Just a few things: in the general case it's better not to use MSE after a sigmoid, due to slow convergence.
Also, the "logits" variable doesn't actually hold logits; it holds probabilities. Logits are what you have before applying the sigmoid activation.
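A small plain-Python sketch of why, using the standard gradients with respect to the logit z for a single example with target y. The function names are mine, and this is an illustration of the math rather than the article's code.

```python
import math

def sigmoid(z):
    """Turn a logit z into a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def mse_grad(z, y):
    """d/dz of (sigmoid(z) - y)^2: carries an extra p * (1 - p) factor."""
    p = sigmoid(z)
    return 2.0 * (p - y) * p * (1.0 - p)

def bce_grad(z, y):
    """d/dz of binary cross-entropy on sigmoid(z): simply p - y."""
    return sigmoid(z) - y

z, y = -6.0, 1.0  # logit says "confidently 0", but the target is 1
print("probability:", sigmoid(z))
print("MSE gradient:", mse_grad(z, y))
print("BCE gradient:", bce_grad(z, y))
```

When the model is confidently wrong, p * (1 - p) is close to zero, so the MSE gradient nearly vanishes exactly when a large correction is needed, while the cross-entropy gradient stays close to -1. Here `z` is the logit and `sigmoid(z)` is the probability, which is the distinction about the variable name.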
Doesn’t sound like using OpenCL on iOS will be realistic any time soon. Am I wrong about that?
Check out this response for a different take on this: http://yieldthought.com/post/170830096265/when-are-neural-ne...