<i>We will say that a video segment loops well when its first and last video frames are very similar.</i><p>It's a reasonable approximation. However, truly perfect loops need not only matching object positions but matching velocities too. I would compute the optical flow at the first and the last frames, compare the two, and add that difference as a term in the distance function.<p>One example: this algorithm would consider half a period of a pendulum a perfect loop (the bob passes through the lowest point again, but moving in the opposite direction). Taking optical flow into account would fix this.
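To make the pendulum point concrete, here is a rough sketch. It does not compute real optical flow; as a stand-in it uses the plain difference between consecutive frames as a crude motion estimate. The 1-D "video", the names `render`, `motion`, `loop_distance`, and the `weight` parameter are all made up for illustration.

```python
import math

WIDTH, PERIOD = 64, 40  # hypothetical image width and frames per swing period

def render(t):
    """Frame t: a 1-D row of pixels with a soft blob at the pendulum bob."""
    center = WIDTH / 2 + 20 * math.sin(2 * math.pi * t / PERIOD)
    return [math.exp(-((x - center) ** 2) / 8.0) for x in range(WIDTH)]

def l1(a, b):
    """Summed absolute difference between two equal-length pixel lists."""
    return sum(abs(p - q) for p, q in zip(a, b))

def motion(frames, t):
    """Crude motion estimate: frame t+1 minus frame t (NOT real optical flow)."""
    return [q - p for p, q in zip(frames[t], frames[t + 1])]

def loop_distance(frames, i, j, weight=1.0):
    """Appearance difference plus a weighted motion difference."""
    appearance = l1(frames[i], frames[j])
    movement = l1(motion(frames, i), motion(frames, j))
    return appearance + weight * movement

frames = [render(t) for t in range(2 * PERIOD + 2)]
half = PERIOD // 2

appearance_only_half = l1(frames[0], frames[half])    # looks like a perfect loop
with_motion_half = loop_distance(frames, 0, half)     # motion term exposes it
with_motion_full = loop_distance(frames, 0, PERIOD)   # a genuine loop point

print(appearance_only_half, with_motion_half, with_motion_full)
```

With frames alone, the half-period candidate is indistinguishable from the full-period one; the motion term separates them. A real version would presumably swap `motion` for a dense optical flow field and would need some care in choosing `weight`.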
some prior research on this:<p>---<p><i>Loop Findr</i> by Collin Burger<p><a href="http://loopfindr.tumblr.com/" rel="nofollow">http://loopfindr.tumblr.com/</a><p><a href="http://golancourses.net/2014/collin/05/12/loop-findr/" rel="nofollow">http://golancourses.net/2014/collin/05/12/loop-findr/</a><p><a href="https://github.com/cyburgee/loopFindr" rel="nofollow">https://github.com/cyburgee/loopFindr</a><p>---<p>Microsoft Research: <i>Automated video looping with progressive dynamism</i> (2013)<p><a href="http://research.microsoft.com/en-us/um/people/hoppe/proj/videoloops/" rel="nofollow">http://research.microsoft.com/en-us/um/people/hoppe/proj/vid...</a><p><a href="http://research.microsoft.com/pubs/196137/videoloops.pdf" rel="nofollow">http://research.microsoft.com/pubs/196137/videoloops.pdf</a><p><a href="http://research.microsoft.com/en-us/downloads/d02f3198-7896-45eb-89e8-5a75859b67c8/" rel="nofollow">http://research.microsoft.com/en-us/downloads/d02f3198-7896-...</a>
Very cool concept and article.<p>You may also notice that many of these looping gifs are made in a much simpler way: by playing part of a video forwards and then playing it in reverse.
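For reference, the forward-then-reverse trick is roughly a one-liner on a list of frames. This is a minimal sketch; `ping_pong` is a made-up name, and the frames here are just integers standing in for images.

```python
def ping_pong(frames):
    """Return frames played forwards, then backwards.

    The first and last frames are not repeated on the way back,
    so the result loops without a stutter at either end.
    """
    return frames + frames[-2:0:-1]

clip = [0, 1, 2, 3]
print(ping_pong(clip))
```

It always loops seamlessly in terms of position, but the motion reverses at each end, which is why it only looks natural for some kinds of footage.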
I assume something has broken, but this article makes no sense to me. It's full of sentences like:<p>"If is very similar to , and is different from , then we do not need to compute to know that and are also very different."
With a better, faster algorithm, probably with some help from a GPU, this could become a new app. There's already "<a href="http://loopc.am/" rel="nofollow">http://loopc.am/</a>", but it's not as good. Then cash out by selling to Instagram. It could also be used to jazz up real estate ads.
Isn't this just k-NN? I mean: one can reduce this problem to k-NN by first loading the database with all video frames, and then performing queries using frame 0, frame 1, etc.<p>There are good tree algorithms[0] and implementations[1][2] for executing k-NN queries. These implementations also exploit the properties of the triangle inequality.<p>[0] <a href="http://en.wikipedia.org/wiki/Ball_tree" rel="nofollow">http://en.wikipedia.org/wiki/Ball_tree</a><p>[1] <a href="http://mlpack.org/doxygen.php?doc=nstutorial.html" rel="nofollow">http://mlpack.org/doxygen.php?doc=nstutorial.html</a><p>[2] <a href="http://scikit-learn.org/stable/modules/neighbors.html#neighbors" rel="nofollow">http://scikit-learn.org/stable/modules/neighbors.html#neighb...</a>
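A rough sketch of the reduction, in plain Python rather than with mlpack or scikit-learn. To be clear, this is not a ball tree: it is a much simpler single-pivot search that only shows how the triangle inequality lets you skip distance computations. The random "frames", the `min_gap` parameter (to stop a frame from matching its immediate neighbours in time), and all function names are assumptions for illustration.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two frames given as flat lists of pixel values."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def nearest_frame(frames, q, pivot_d, min_gap=10):
    """Find the frame closest to frames[q] that is at least min_gap frames away.

    pivot_d[j] holds the precomputed distance from frame j to one pivot frame.
    Returns (index, distance, number of full distances actually computed).
    """
    best, best_d, computed = None, math.inf, 0
    for j in range(len(frames)):
        if abs(j - q) < min_gap:
            continue
        # Triangle inequality: d(q, j) >= |d(q, pivot) - d(j, pivot)|.
        # If that lower bound cannot beat the best so far, skip frame j.
        if abs(pivot_d[q] - pivot_d[j]) >= best_d:
            continue
        d = dist(frames[q], frames[j])
        computed += 1
        if d < best_d:
            best, best_d = j, d
    return best, best_d, computed

rng = random.Random(0)
frames = [[rng.random() for _ in range(16)] for _ in range(300)]  # fake video
pivot_d = [dist(f, frames[0]) for f in frames]  # frame 0 is the pivot

match, d, computed = nearest_frame(frames, 150, pivot_d)
print(match, round(d, 3), computed, "of", len(frames))
```

How much gets pruned depends heavily on the data and on the choice of pivot; tree structures such as ball trees apply the same bound hierarchically.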
It's slightly misleading to say that the summed colour value difference between frames is a distance analogous to a geometric distance. Geometric distance measures on data with a high number of dimensions become useless: as the dimensionality increases, the relative difference between 'near' and 'far' tends towards zero.
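The effect is easy to see on synthetic data. A small sketch, assuming uniformly random points (real video frames are far more structured, so the effect there may well be weaker); `relative_contrast` is a made-up helper name.

```python
import math
import random

def relative_contrast(dim, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from one random query point."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.sqrt(sum((p - q) ** 2 for p, q in zip(point, query))))
    return (max(dists) - min(dists)) / min(dists)

# The contrast between the nearest and farthest point shrinks as dim grows.
for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(dim), 3))
```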