If every problem is bounded by a set of data points in a multi-dimensional space, and some sort of clustering is done on them that reflects the problem domain, then isn't running through the samples during training equivalent to memorizing them and associating them, with a small margin of error making it look like the model is able to generalize?
In short, doesn't it just boil down to how good our samples are, and to setting the whole process up to memorize those samples?
As Yann LeCun said, “in high dimension, there is no such thing as interpolation.
In high dimension, everything is extrapolation.”<p>It’s trivial to come up with a prompt that doesn’t exist in the dataset. To generalize, the model cannot memorize.
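The LeCun quote has a concrete geometric reading: a model "interpolates" a query only if the query lies inside the convex hull of its training points, and in high dimension that essentially never happens. A minimal sketch (my own illustration, not from the thread, assuming Gaussian data and using a linear-programming feasibility test for hull membership):

```python
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(points, query):
    """True iff `query` is a convex combination of the rows of `points`.

    Feasibility LP: find lambda >= 0 with sum(lambda) = 1 and
    points.T @ lambda = query.
    """
    n, _ = points.shape
    A_eq = np.vstack([points.T, np.ones(n)])
    b_eq = np.concatenate([query, [1.0]])
    res = linprog(c=np.zeros(n), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * n, method="highs")
    return res.success

rng = np.random.default_rng(0)
for d in (2, 50):
    pts = rng.standard_normal((1000, d))
    hits = sum(in_convex_hull(pts, rng.standard_normal(d)) for _ in range(20))
    print(f"d={d}: {hits}/20 fresh points inside the hull")
```

With 1000 samples in 2 dimensions, most fresh points land inside the hull (interpolation); in 50 dimensions, essentially none do, so every answer is extrapolation, which pure memorization cannot supply.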
(Current) AI is a glorified stochastic parrot[0]. Randomness != Reason.<p>[0] <a href="https://en.wikipedia.org/wiki/Stochastic_parrot" rel="nofollow noreferrer">https://en.wikipedia.org/wiki/Stochastic_parrot</a>
TB of original material condensed into a few GB of neural-network weights. If this is memorization, I'd like to see it implemented at archive.org.
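The compression argument can be made quantitative with a back-of-envelope calculation (the figures below are illustrative assumptions, not numbers from the thread):

```python
# Hypothetical sizes for a mid-sized model and corpus:
dataset_bytes = 10e12          # ~10 TB of training text (assumption)
params = 7e9                   # ~7B parameters (assumption)
bytes_per_param = 2            # fp16 storage
model_bytes = params * bytes_per_param

ratio = dataset_bytes / model_bytes
print(f"model is ~{ratio:.0f}x smaller than its training data")
```

At a ratio of several hundred to one, the weights simply cannot hold the corpus verbatim; whatever they retain has to be a lossy, generalizing summary rather than rote storage.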