I'm not sure I believe anyone in the field has ever said "AI token". I also don't buy that the term "training data" implies the existence of labeled input/output pairs. Unlabeled data is still training data.
Seems useful at first glance, but after doing some reading it looks like a lot of the list is proprietary technologies, platforms, and companies rather than helpful definitions.
I find it difficult to treat a site as a serious authority on anything when its header implores me to "Sign up for the bi-weelky [sic] newsletter"
I noticed an odd typo in the HTML for this web page. Inside the head element, in this line:<p><pre><code> <meta http-equiv="content-type" content="text/html; charset=uft-8 " />
</code></pre>
The `uft-8` should be `utf-8`.<p>(How'd I notice this? I have a little HN reader app I maintain at <a href="https://www.thnr.net/" rel="nofollow">https://www.thnr.net/</a>, and I got some error messages in my logs when my word-count function (which estimates how long it would take a person to read the article) was processing this web page's HTML. Part of this function examines which text encodings the web server and the web page each self-report the page as having. The HTTP headers correctly said "UTF-8", fwiw.)
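For anyone curious, here is a rough sketch of that kind of check in Python. It is not the actual code from the reader app: the function names and the regex are hypothetical, made up for illustration, and a real page would be better served by a proper HTML parser. It pulls the charset out of the Content-Type header and out of the meta tag, then asks Python's codec registry whether each label is a known encoding.

```python
import codecs
import re
from email.message import Message

# Hypothetical, simplified pattern: finds charset=... inside a <meta> tag.
# This covers both <meta charset="..."> and the http-equiv form.
META_CHARSET_RE = re.compile(
    r'<meta[^>]+charset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)', re.IGNORECASE
)


def header_charset(content_type):
    """Return the lowercased charset from a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def meta_charset(html):
    """Return the lowercased charset a page declares in a <meta> tag, or None."""
    match = META_CHARSET_RE.search(html)
    return match.group(1).lower() if match else None


def is_known_encoding(name):
    """True if Python's codec registry recognizes the encoding label."""
    if not name:
        return False
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


if __name__ == "__main__":
    page = '<meta http-equiv="content-type" content="text/html; charset=uft-8 " />'
    for source, label in (
        ("header", header_charset("text/html; charset=UTF-8")),
        ("meta", meta_charset(page)),
    ):
        status = "ok" if is_known_encoding(label) else "unknown encoding"
        print(f"{source}: {label!r} ({status})")
```

Run against the line quoted above, the meta label fails the lookup while the header label passes, which is the kind of mismatch that would end up in a log.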
See also <a href="https://developers.google.com/machine-learning/glossary" rel="nofollow">https://developers.google.com/machine-learning/glossary</a>