I helped build the cover song alignment pipeline for "Infinite Bad Guy" (<a href="http://billie.withyoutube.com/" rel="nofollow">http://billie.withyoutube.com/</a>), an interactive music video that brings together thousands of YouTube covers of "Bad Guy" by Billie Eilish. We just published a writeup on how we built the alignment, along with some simple tricks for estimating video similarity.<p>We used a bidirectional LSTM that takes CQT (constant-Q transform) and chroma features as input and predicts, for every beat in the cover, the corresponding beat in the original. There's a lot of existing work on this from Furkan Yesiler, Dan Ellis and others, and we're super grateful to them for advising.<p>Very open to feedback and critique here or in the post.
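<p>To make the "original beat for every cover beat" output concrete, here is a minimal sketch of how such a per-beat prediction could be turned into a time mapping from a cover to the original. It is an illustration under assumptions, not the pipeline's actual code: the function names, the linear interpolation between beats, and the toy beat grids are all hypothetical.

```python
from bisect import bisect_right


def interpolate(xs, ys, x):
    """Piecewise-linear interpolation of ys over strictly increasing xs.

    Values of x outside [xs[0], xs[-1]] are clamped to the end points.
    """
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1
    frac = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + frac * (ys[i + 1] - ys[i])


def cover_to_original_time(t, cover_beat_times, predicted_beats, original_beat_times):
    """Map a cover timestamp (seconds) to a timestamp in the original.

    cover_beat_times:    beat times detected in the cover, in seconds
    predicted_beats:     hypothetical model output, one original beat index
                         per cover beat
    original_beat_times: beat times of the original, in seconds
    """
    # Fractional original beat index at cover time t.
    beat = interpolate(cover_beat_times, predicted_beats, t)
    # Convert that beat index to seconds on the original's beat grid.
    beat_indices = list(range(len(original_beat_times)))
    return interpolate(beat_indices, original_beat_times, beat)


# Toy data (assumed): the original at 120 BPM, a slower cover starting at 1.0 s.
original_beat_times = [0.0, 0.5, 1.0, 1.5, 2.0]
cover_beat_times = [1.0, 1.6, 2.2, 2.8]
predicted_beats = [0, 1, 2, 3]

mapped = cover_to_original_time(
    1.9, cover_beat_times, predicted_beats, original_beat_times
)
```

In this toy case, 1.9 s in the cover falls halfway between its second and third beats, so it maps to halfway between the same two beats of the original. Times before the first or after the last cover beat are clamped to the nearest aligned beat.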