I think they stopped getting new online data because they saw the potential of their model for creating more content, and they don't want to train the model on its own output. Am I wrong?
You're not wrong. I hear a lot about fears of AI taking jobs, fears of a chatbot singularity, fears of evil men exploiting humans with AI... all reasonable fears. Those don't worry me; every technological advance since the industrial revolution has changed the job market, changed society, and provided more power to unscrupulous men! It's an age-old story, not ideal, but not a new threat.<p>But you've identified what makes generative AI unique: it hijacks both the consumption and production sides of our collective efforts to document human knowledge. There's a non-zero chance this causes a feedback loop whereby e.g. ChatGPT 27 is trained primarily on content generated by previous ChatGPTs, amplifying its errors and biases on every training run. Any subculture of human-generated information that is not represented in this loop from the start might effectively be erased and replaced by machine hallucinations.<p>As we seem eager to let this happen, it's easy to imagine this feedback loop leading to a full-blown epistemological crisis, essentially wiping out any social value or trust in digital media. Without a basis for accurately recording history and facts, a complex society can't sustain itself. That's the real AI threat.
They could keep hashes of every sentence they output and then check any input text for the presence of those hashes. Too many hits -> contaminated.<p>Vendors could share those hashes with each other. That still leaves models run by parties outside the big entities to deal with.
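A minimal sketch of that idea (names and the naive sentence split are my own; a real system would normalize text properly and use a scalable structure like a Bloom filter rather than an in-memory set):

```python
import hashlib

def sentence_hashes(text):
    """Hash each sentence of generated output (naive split on '.')."""
    return {
        hashlib.sha256(s.strip().lower().encode()).hexdigest()
        for s in text.split(".")
        if s.strip()
    }

def is_contaminated(candidate, known_hashes, threshold=0.5):
    """Flag candidate training text if too many of its sentences
    match hashes of previously generated model output."""
    hashes = sentence_hashes(candidate)
    if not hashes:
        return False
    hits = len(hashes & known_hashes)
    return hits / len(hashes) >= threshold
```

So a vendor would add every generated sentence's hash to `known_hashes`, and a crawler would drop any page where the hit ratio crosses the threshold.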
I don't think so, but also, with the recent browser plugin, I think it's kind of enough for the model to have a general understanding; it doesn't need to know all the facts at all times.