While some of the details will be disputed for years to come, the capabilities and potential of image generation models and large language models appear to be beyond dispute. But the noise (sic) heralding ML models that generate music is comparatively low. I find this both interesting and counter-intuitive, as the quality of machine learning models is often tied to the availability of a large volume (can't help it...) of curated data.

Very few domains readily offer data as well curated as the corpus made available by many centuries of music. Music generation models shouldn't be far behind LLMs in terms of sophistication.

So what has happened in this area? Might it simply be that the field is overlooked because it's harder to demonstrate the state of the art in print and on social media? Or are there deeper technical or legal reasons (copyright, for one)?
Cultural and legal reasons.

People are only interested in music that sounds like music they've already heard. [1][2] Look at 20 Beatles songs and you see 4 writing credits; look at 20 Taylor Swift songs and you see 20. There is now so much music that any new song will bear some similarity to an old one, so Taylor hands out writing credits to make peace and avoid lawsuits.

Most people are inarticulate about music, so prompts like "It has an intro like ...", "The structure is like ...", or "I really like this chord from ..." are the user interface most people would actually want, but that's just asking for a lawsuit.

[1] Think of the 'too old to rock and roll' bands from the 1970s who found after 1980 that fans weren't interested in hearing anything from their newest albums at concerts. (Though I did go to a 38 Special/Foghat concert last summer where they made peace with this by bringing in new members who had played in bands with their own classic songs that pleased the crowd, and by adding a few new songs carefully designed to fit their sets.)

[2] Look particularly at classical music.