Something that’s been on my mind: the training sets for many LLMs only run up to a year or two ago, and this year saw an explosion in AI/LLM-generated content, which will surely continue for the foreseeable future.

What happens when, say in 5–10 years, these models are all trained on data from 2024–2030 that was itself AI-generated?

Are we heading toward some dystopian feedback loop where AI teaches AI and we lose all creativity and originality?

I hope I’m wrong and that smart people are thinking about how to mitigate this!