I have a developing idea that AI can be thought of as "banking" (or sometimes laundering) human labelling. Neural networks don't work at all on things they haven't seen before (out-of-distribution), but they can interpolate nicely within what they have seen (see the toy sketch below). Concretely, when ChatGPT gives a seemingly clever answer, a data labeller overseas somewhere has already manually given a similar answer to a similar question (similar in the eyes of the model). Depending on the expected distribution of your data, this can work really well for automation. With long-tailed data, it either becomes a game of whack-a-mole or (as with modern LLMs) you just go really big. I haven't seen anything suggesting we can actually escape the long tail, though; my suspicion is that at the margins neural networks will always fail, and the only way to improve them is to launder in more manpower.

There's an old (2020) a16z article on this that's still relevant, "Taming the Tail": https://a16z.com/2020/08/12/taming-the-tail-adventures-in-improving-ai-economics/
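
To make the interpolation-vs-extrapolation point concrete, here's a minimal toy sketch (mine, not from the article; the model choice and hyperparameters are illustrative): a small MLP fit on y = sin(x) over [-3, 3] predicts well inside that range and falls apart outside it.

```python
# Toy sketch: a small MLP interpolates well in-distribution but fails
# out-of-distribution. Hyperparameters are arbitrary illustrative choices.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
x_train = rng.uniform(-3, 3, size=(2000, 1))  # training distribution: [-3, 3]
y_train = np.sin(x_train).ravel()

model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                     random_state=0).fit(x_train, y_train)

x_in = np.linspace(-3, 3, 200).reshape(-1, 1)   # inside the training range
x_out = np.linspace(6, 12, 200).reshape(-1, 1)  # outside the training range

def rmse(x):
    return np.sqrt(np.mean((model.predict(x) - np.sin(x).ravel()) ** 2))

print(f"in-distribution RMSE:     {rmse(x_in):.3f}")   # small: interpolation works
print(f"out-of-distribution RMSE: {rmse(x_out):.3f}")  # large: extrapolation fails
```

A ReLU network extrapolates roughly linearly beyond its training range, so the out-of-distribution error blows up while the in-distribution error stays small; nothing in the weights "knows" what sin(x) does where no labels existed.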