It's refreshing to hear generated piano music that is neither strictly metrical nor entirely freeform, with patches where you get a fairly natural sense of rubato and sensitive dynamic shaping. It's convincingly improvisatory, in a way. The constantly shifting harmonic idiom is disorienting in a not very pleasant way (the worst kind of Chopin + Ligeti mashup), especially when you raise the temperature. It would be interesting to use period/style-specific training sets.<p>To my ears the 5:00 clip does have a larger structure: there are clearly extended passages that build up to and ebb away from large climaxes, where you get a real sense of sustained intensification. But of course, if you follow the detail, everything is built up from lots of fleeting and unrelated ideas.
It seems that this model does not have any notion of "cadence" (the punctuation in musical grammar, given by harmony and tonality). The "expressivity" must be correlated with the harmonic grammar, otherwise it does not make sense. Unfortunately the samples in the article do not sound very good to me, and I am pretty sure that this is the reason.
This is stunning! Great stuff.<p>Since the input and prediction are a single sequence, did you experiment with beam search or stochastic beam search decoding (maybe with additional diversity criteria)?<p>I found that even simple models (Markov chains) got a big diversity boost from a stochastic beam search. It might avoid the low-temperature repetition problems that can happen in a standard beam search. However, my music models are much, much, (much) worse than this, so my relative improvement might be related to that.<p>Similarly, I am finding really nice results in text (RNN-VAE) with scheduled sampling, so it might be worth experimenting with.<p>I am amazed at how good this next-step sampled output is. The above ideas might just hurt the result; I am having a hard time imagining how it could be better.<p>What soundfont/MIDI rendering package is used for this? The piano sound is really rich.<p>Looking forward to hearing what creative things users will do with this model.
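For anyone unfamiliar with the stochastic beam search idea mentioned above, here is a minimal sketch over a toy Markov chain. Everything in it is an assumption made for illustration: the `TRANSITIONS` table, the note names, and the function name are hypothetical, and the sampling rule (draw the surviving beams without replacement, weighted by a temperature-scaled softmax of their log-probabilities) is just one simple variant, not what the article's model does.

```python
import math
import random

# Hypothetical first-order Markov chain over note names (made-up probabilities).
TRANSITIONS = {
    "C": {"D": 0.5, "E": 0.3, "G": 0.2},
    "D": {"C": 0.3, "E": 0.5, "G": 0.2},
    "E": {"C": 0.2, "D": 0.3, "G": 0.5},
    "G": {"C": 0.6, "D": 0.1, "E": 0.3},
}


def stochastic_beam_search(start, length, beam_width=3, temperature=1.0, rng=None):
    """Return up to beam_width (sequence, log_prob) pairs, best first.

    A standard beam search keeps the top-k candidates at every step.
    This variant instead samples k candidates without replacement,
    weighted by exp(log_prob / temperature), so lower-scoring
    continuations can survive and the beams stay more diverse.
    """
    if rng is None:
        rng = random.Random()
    beams = [([start], 0.0)]
    for _ in range(length - 1):
        # Expand every beam by every possible next note.
        pool = []
        for seq, logp in beams:
            for nxt, p in TRANSITIONS[seq[-1]].items():
                pool.append((seq + [nxt], logp + math.log(p)))
        # Sample the survivors one at a time, removing each pick from the pool.
        beams = []
        while pool and len(beams) < beam_width:
            top = max(lp for _, lp in pool)  # subtract max for numerical stability
            weights = [math.exp((lp - top) / temperature) for _, lp in pool]
            idx = rng.choices(range(len(pool)), weights=weights, k=1)[0]
            beams.append(pool.pop(idx))
    return sorted(beams, key=lambda b: b[1], reverse=True)


if __name__ == "__main__":
    for seq, logp in stochastic_beam_search("C", 8, rng=random.Random(0)):
        print(" ".join(seq), round(logp, 3))
```

As the temperature approaches zero the weights concentrate on the highest-scoring candidate, so the behaviour approaches an ordinary beam search; raising it trades score for diversity.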
that first example is jaw dropping. it's just like what good musicians do when they are noodling. damn. well done! probably the best results i've ever heard for this type of effort.
This can generate elevator music that will never repeat. I am up for that! (Just getting into ML with Udacity and Coursera courses. This is just fascinating.)