I feel inclined to say "... well yeah, obviously".<p>Not in the "obvious in retrospect" way, but because browsers have been progressively blocking history-sniffing tactics for <i>years</i> precisely because advertisers were using it to identify visitors.<p>Did this research... establish better numbers around it or something?
That's hardly surprising. I mean, browsers willingly hand out plenty of information that could be used for pretty accurate identification. Just scrolling through my scores on amiunique[1], many of the parameters put me in the 0.01% category.<p>[1] <a href="https://amiunique.org/fp" rel="nofollow">https://amiunique.org/fp</a>
To that end, a friend and I started sketching a VPN/HTTP proxy that would have a set of, say, 100 outgoing IPs, look at the domains being connected to, and distribute request destinations over those IPs.<p>So e.g. Google would always see the same IP, which would be different from the one Facebook sees.<p>While cross-referencing access times to identify users is still theoretically possible, it should be an entirely different game.<p>Would anyone else reading this be interested in working on this or joining in? I'm not thinking of making it a startup or business per se, but 1) reliable IPs are a bit too expensive to make sense for just 1 person and 2) there's anonymity in numbers.<p>I'm thinking the ideal would be something FOSS and easy to self-host and replicate, so you can pool together a group of friends for a shared VPN among semi-trusted parties (at least, the user should trust the operator not to index requests and sell the data, and the operator should trust users not to run botnets).
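A minimal sketch of the domain-to-IP mapping described above, assuming a fixed pool and a keyed hash so the same destination always gets the same outgoing address. The names `EGRESS_IPS` and `egress_ip_for`, the secret, and the addresses (taken from the 203.0.113.0/24 documentation range) are all made up for illustration; a real proxy would also need to group subdomains under one registrable domain, which this does not attempt.

```python
import hashlib

# Hypothetical pool of 100 outgoing addresses (documentation range, not real IPs).
EGRESS_IPS = [f"203.0.113.{i}" for i in range(1, 101)]


def egress_ip_for(domain, pool=EGRESS_IPS, secret=b"per-deployment-secret"):
    """Return the same outgoing IP every time for a given destination domain."""
    # Normalise so "GOOGLE.com." and "google.com" map to the same address.
    key = domain.strip().lower().rstrip(".").encode("utf-8")
    # Mixing in a secret means outsiders cannot precompute the mapping.
    digest = hashlib.sha256(secret + key).digest()
    index = int.from_bytes(digest[:8], "big") % len(pool)
    return pool[index]


if __name__ == "__main__":
    for name in ("google.com", "facebook.com", "google.com"):
        print(name, "->", egress_ip_for(name))
```

With only 100 addresses, two unrelated domains will sometimes share an IP; the point is only that any one destination sees a stable address that differs from what most other destinations see.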
Here in the UK, date of birth and postcode are enough to identify something like 95% of people. Anonymised data sets are not really possible once you have more than a few variables. Most people don't know this.
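To make the quasi-identifier point concrete, here is a rough sketch that measures what fraction of records become unique once a few fields are combined. The records are invented toy data and `unique_fraction` is a hypothetical helper; it says nothing about the actual UK figure.

```python
from collections import Counter


def unique_fraction(records, keys):
    """Fraction of records whose combination of `keys` appears exactly once."""
    if not records:
        return 0.0
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)


# Made-up toy records, for illustration only.
people = [
    {"dob": "1980-01-01", "postcode": "AB1 2CD"},
    {"dob": "1980-01-01", "postcode": "EF3 4GH"},
    {"dob": "1975-06-15", "postcode": "AB1 2CD"},
    {"dob": "1975-06-15", "postcode": "AB1 2CD"},
]

if __name__ == "__main__":
    print(unique_fraction(people, ["dob"]))
    print(unique_fraction(people, ["postcode"]))
    print(unique_fraction(people, ["dob", "postcode"]))
```

In this toy set nobody is unique on date of birth alone, but half the records are unique once the postcode is added, which is the same effect at a much smaller scale.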
Intuitively, there are tons of things we do on our computers that uniquely identify us. I am sure the adware companies know a ton more that they don't make public, too. Strict privacy-preserving tech is needed across the whole stack.
By looking at all the data available to untrusted sites (as seen in <a href="https://amiunique.org/fp" rel="nofollow">https://amiunique.org/fp</a>) you can tell that the Web is many, many years away from being privacy conscious. List of fonts, canvas fingerprinting, timezone, OS, user agent... the list goes on and on. Those of us who are tech-literate know better than to create tech like this today, but there's just too much momentum (and shady interests) to hot-swap the Web for something else.
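A back-of-the-envelope way to see why a handful of such attributes is enough: convert each attribute's population share into bits of identifying information and add them up. This is only a sketch; the helper names are hypothetical, the example shares are invented, and summing the bits assumes the attributes are independent, which real fingerprint attributes are not, so the total is an upper bound.

```python
import math


def surprisal_bits(share):
    """Bits of identifying information from a value seen in `share` of browsers."""
    return -math.log2(share)


def combined_bits(shares):
    """Total bits, ASSUMING the attributes are statistically independent."""
    return sum(surprisal_bits(s) for s in shares)


if __name__ == "__main__":
    # A value shared by 0.01% of browsers is worth about 13.3 bits on its own.
    print(round(surprisal_bits(0.0001), 1))
    # Invented shares for fonts, canvas and timezone, for illustration only.
    print(round(combined_bits([0.0001, 0.001, 0.05]), 1))
```

For scale, singling out one person among roughly 8 billion takes about 33 bits, so two or three rare attribute values already get most of the way there.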
I think this is as stupid as it sounds from the paper - <a href="https://www.usenix.org/conference/soups2020/presentation/bird" rel="nofollow">https://www.usenix.org/conference/soups2020/presentation/bir...</a><p>Why not "Mozilla research: We asked users for their name and address and the ones telling the truth we could identify"<p>Tor is fighting identification of users from the screen size of their window when maximised.<p>Here's the original paper, which is more about how you can access browser histories - <a href="https://www.petsymposium.org/2012/papers/hotpets12-4-johnny.pdf" rel="nofollow">https://www.petsymposium.org/2012/papers/hotpets12-4-johnny....</a><p>Can you still access browser histories? I'd have to guess no way without a zero-day. The original site is down. <a href="http://www.wtikay.com/" rel="nofollow">http://www.wtikay.com/</a> Firefox fixed it - <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=147777" rel="nofollow">https://bugzilla.mozilla.org/show_bug.cgi?id=147777</a>
Wasn't it shown by AOL researchers 20 years ago that search histories are uniquely identifying? If so, this seems hardly surprising, as browser history should be a superset of search history.
As a counterstrategy, you can use tools like <a href="http://trackmenot.io/" rel="nofollow">http://trackmenot.io/</a><p>"TrackMeNot runs as a low-priority background process that periodically issues randomized search-queries to popular search engines, e.g., AOL, Yahoo!, Google, and Bing. It hides users' actual search trails in a cloud of 'ghost' queries, significantly increasing the difficulty of aggregating such data into accurate or identifying user profiles."
The <i>Evercookie</i> (hard-to-delete cookie-like system in JavaScript) and <i>Panopticlick</i> (browser fingerprinting) projects may also be of interest:<p><a href="https://en.wikipedia.org/wiki/Evercookie" rel="nofollow">https://en.wikipedia.org/wiki/Evercookie</a><p><a href="https://panopticlick.eff.org/" rel="nofollow">https://panopticlick.eff.org/</a>
I suspect privacy would be better served by taking the approach of the security domain, with responsible disclosure to vendors and a concerted effort to attack the problem holistically. Until then we’re just giving privacy attackers a heads up, and by the time this issue is mitigated they’re onto the next avenue for bypassing privacy.
If the study establishes that for all practical purposes, online anonymity is impossible to maintain for average users, what are the implications (a) for the average user; (b) for the economy; and (c) for society?