Fingerprinting is one of those things where there's been a real slippery slope, and we've slid further and further down it over the last decade. Back when I worked at an ad-tech startup (almost 15 years ago) I ran an experiment with our data to see if a simple hash of IP, user agent, and maybe a couple of other signals from our logs (I don't recall exactly which) would correlate with the cookies we already had through cookie matching from other sources. The answer was: yes, about 95% of the time. Reliable enough to do basic retargeting without worrying about excessive false matches.<p>But at the time it was considered a big <i>do not touch</i> -- just don't do this. Not so much for ethical reasons as for optics in the industry. (I wasn't proposing we do it; I was just curious.)<p>In the meantime, though, this seems to have become standard practice, only <i>way more sophisticated</i> and with way higher accuracy, as this article touches on.<p>What was not acceptable a decade ago is now "ok." Not just at sketchy ad startups, but at major players.<p>This whole mess ties back to one of the things that worries me most about the propagation of LLM-type ML into the general industry. It's only a matter of time before ad targeting takes on an extra dimension of creepiness through it (and I'm sure it's already happening in some form inside Google and Meta).<p>In the past, in ad tech, search, etc., people could say things like: <i>"Yes, it's highly targeted. Yes, we've correlated an absolutely huge quantity of data to fingerprint you exactly and retarget you. But it's anonymized. No humans saw your personal data. It's just statistics."</i> I'm not saying whether that argument has merit; I'm just repeating it.<p>But now here we are, where <i>"just statistics"</i> is a far more intricate learning model.
One which is capable not just of correlating your purchases and browsing activity, but of "understanding" you, and which -- while not an AGI -- is pretty damn smart.<p>At what point does "a computer scanned your browsing for patterns and recommended this TV set" become ethically the <i>same</i> as "a human read your logs and would like to talk to you about television sets"?<p>Having worked in ad tech before (and at Google, in ads and other things), I do <i>not</i> trust the people in that industry to make the right decisions here.
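<p>For concreteness, the naive fingerprint I described at the top was nothing fancier than a hash over a few request signals. A minimal sketch of the idea (the specific signals and field names here are illustrative, not what we actually used):

```python
import hashlib

def fingerprint(ip: str, user_agent: str, *extra_signals: str) -> str:
    """Combine a few request signals into a stable pseudo-identifier.

    Any one of these signals is weak on its own, but hashed together
    they are often stable enough to re-identify the same browser
    across requests -- which is the whole (creepy) point.
    """
    material = "|".join((ip, user_agent, *extra_signals))
    return hashlib.sha256(material.encode("utf-8")).hexdigest()

# Same signals produce the same fingerprint; change one and it diverges.
a = fingerprint("203.0.113.7", "Mozilla/5.0 (X11; Linux x86_64)", "en-US")
b = fingerprint("203.0.113.7", "Mozilla/5.0 (X11; Linux x86_64)", "en-US")
c = fingerprint("203.0.113.8", "Mozilla/5.0 (X11; Linux x86_64)", "en-US")
```

The experiment was then just joining these hashes against the cookie-matched IDs we already had and counting agreement. Modern fingerprinting folds in far more entropy (canvas, fonts, screen metrics, etc.), but the shape of it is the same.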