> Q5: In the case of Markov chains, etc., I understand that the generation may not be very conformant to the learnt style unless one uses a high-order Markov model, but then with the risk of recopying entire sequences from the corpus, and thus plagiarism. But in the case of an RNN-based architecture [9], what is the rationale?-------------------- A5: As mentioned before, the RNN does similar work to the LSTM in our work. But without including a discriminator, it only learns transition probabilities between adjacent notes; it does not guarantee that generated sequences look like real ones.<p>Come on, guys, that's just not true. You do <i>not</i> need an adversarial loss to get good-quality melodies. Look at Sturm's char-RNN on ABC notation, or OpenAI's MuseNet, or a bunch of Project Magenta work, or my own GPT-2 ABC music (MIDI in progress): <a href="https://www.gwern.net/GPT-2-music" rel="nofollow">https://www.gwern.net/GPT-2-music</a> Or, for that matter, any generative model trained with a non-adversarial loss (anything using GPT-2, for example).<p>In fact, everyone generally <i>avoids</i> GANs for sequence generation because they work so badly compared to regular likelihood training... (Just at a skim, their 'baseline' is pretty suspicious. I'd expect an ablation for the GAN, not a comparison of their 400-unit LSTM to... a 100-unit LSTM <a href="https://www.aclweb.org/anthology/N19-4015.pdf" rel="nofollow">https://www.aclweb.org/anthology/N19-4015.pdf</a>? Really?)
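The order tradeoff Q5 alludes to is easy to see in a toy sketch (the corpus of MIDI pitches below is my own illustration, not from the paper): a low-order model has many continuations per context and wanders, while at high order nearly every context occurs once in the corpus, so generation can only replay it verbatim.

```python
import random

def train_markov(sequence, order):
    """Count next-note continuations for each length-`order` context."""
    model = {}
    for i in range(len(sequence) - order):
        context = tuple(sequence[i:i + order])
        model.setdefault(context, []).append(sequence[i + order])
    return model

def generate(model, seed, length, rng):
    """Sample a continuation note-by-note; stop early on an unseen context."""
    out = list(seed)
    order = len(seed)
    for _ in range(length):
        choices = model.get(tuple(out[-order:]))
        if not choices:
            break
        out.append(rng.choice(choices))
    return out

# Hypothetical corpus: a short melody as MIDI pitch numbers.
corpus = [60, 62, 64, 62, 60, 62, 64, 65, 64, 62, 60]
rng = random.Random(0)

low = train_markov(corpus, order=1)    # generic transitions, wanders freely
high = train_markov(corpus, order=4)   # every 4-note context is unique here,
                                       # so it can only recopy the corpus
melody = generate(low, corpus[:1], 8, rng)
```

With this corpus, every value list in `high` has length 1: the high-order model is deterministic, which is exactly the plagiarism risk the question describes.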
Oh no, they're ruining popular music as we know it. Anybody can just push some buttons and generate the next hit song. All you need is artificial lyrics, an artificial melody, and artificial vocals (Yamaha Vocaloid), on top of a beat bought on the net.
Judging by the four provided melodies: 1) the notes have very little rhythmic variation. 2) the melodies don't seem to have any concept of metre or metric accent.
Music, like weaving, was an early domain for using algorithms to do work. For hundreds of years, music theorists have been codifying existing algorithms, or creating new ones, that produce good music (for particular definitions of good). Like other code instantiated in a network, this code is more tailored to specific prior states and less amenable to detailed analysis than those earlier efforts.
I always wanted a "demo engine" whereby one feeds in a melody and chord names, and then a style (or styles). The AI would then use pattern matching to make a fuller score in the chosen style(s). The output could be MIDI and/or an audio file (such as .WAV). Bonus points for vocals if given lyrics. I could make Elvis diet parodies: "Ain't nothing but a round dog..." Band-in-a-Box sort of does this, but lacks realism in my opinion.
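The core of that pipeline, before any style is applied, is just mapping chord symbols to pitches and laying them onto a grid. A minimal sketch (the chord vocabulary and the flat one-chord-per-beat "style" are my own placeholders, not Band-in-a-Box's format):

```python
# Map note names and chord qualities to MIDI pitches.
NOTE = {'C': 60, 'D': 62, 'E': 64, 'F': 65, 'G': 67, 'A': 69, 'B': 71}
QUALITY = {'': (0, 4, 7), 'm': (0, 3, 7), '7': (0, 4, 7, 10), 'm7': (0, 3, 7, 10)}

def voice_chord(symbol):
    """'Am7' -> [69, 72, 76, 79]: root pitch plus the quality's intervals."""
    root, quality = symbol[0], symbol[1:]
    return [NOTE[root] + interval for interval in QUALITY[quality]]

def comp_pattern(chords, beats_per_chord=4):
    """A flat comping 'style': one block chord per beat, as (beat, pitches)
    events that a renderer could later turn into MIDI or audio."""
    events = []
    for bar, symbol in enumerate(chords):
        pitches = voice_chord(symbol)
        for beat in range(beats_per_chord):
            events.append((bar * beats_per_chord + beat, pitches))
    return events

events = comp_pattern(['C', 'Am', 'F', 'G7'])
```

A real style engine would swap `comp_pattern` for rhythmic templates per genre; the point is that the chord-to-voicing step is simple enough that the hard part is entirely in the stylistic rendering.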