This is a nice write-up.<p>That said, I hope this is less of a surprise to people now: I coauthored one of the first pieces of work pointing out basically these same issues back in 2011 - almost 5 years ago:<p><a href="http://anonymity-in-bitcoin.blogspot.ie/2011/07/bitcoin-is-not-anonymous.html" rel="nofollow">http://anonymity-in-bitcoin.blogspot.ie/2011/07/bitcoin-is-n...</a><p>It's interesting to see what perceptions have changed. That there's still confusion shows how hard it is to disseminate information about encryption and privacy; maybe this is the same reason e2e email encryption seems so difficult to get adopted, even decades after PGP: it's just hard to communicate about the bounds of privacy.<p>One point: the 'clusterisation' mentioned in the linked article isn't 'magic': most of the techniques people are using are actually very simple heuristics, based either on properties of the Bitcoin protocol (transaction input linking, which we demonstrated) or on assumptions about transaction 'change' (prone to false positives).<p>It's worth noting that there are more sophisticated tools that could be applied: machine learning or stats methods - but I've not seen them yet. Possibly because it's hard to come up with good training datasets (unless you are a retailer or wallet?) and not worth investing in when simple methods show so much.
But it's worth bearing in mind that more complex analysis is possible.<p>The overall conclusion, IMO, is that if you want privacy, it's probably easier to design it in from the start than to retrofit it by patching holes in a leaky system against progressively better attacks. The latter is very hard to get to the point where it works solidly, for human reasons as much as technical ones; I think Bitcoin privacy seems destined to be an example of this.
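<p>To give a feel for how simple the input-linking heuristic is, here is a rough sketch (not the code from our analysis). It assumes each transaction has already been reduced to a list of the addresses that funded its inputs, and that all inputs of one transaction are controlled by the same party; the function name and data layout are made up for illustration.

```python
from collections import defaultdict


def cluster_by_shared_inputs(transactions):
    """Group addresses that co-occur as inputs of the same transaction.

    `transactions` is a hypothetical simplified form: each item is the list
    of addresses funding one transaction's inputs.
    """
    parent = {}

    def find(addr):
        # Union-find lookup with path halving.
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]
            addr = parent[addr]
        return addr

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for input_addresses in transactions:
        if not input_addresses:
            continue
        first = input_addresses[0]
        find(first)  # register single-input transactions too
        for other in input_addresses[1:]:
            union(first, other)

    clusters = defaultdict(set)
    for addr in list(parent):
        clusters[find(addr)].add(addr)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    txs = [["A", "B"], ["B", "C"], ["D"], ["E", "F"]]
    print(cluster_by_shared_inputs(txs))
```

Linking is transitive: A and C never appear in the same transaction above, but both appear with B, so all three end up in one cluster. The 'change' heuristic is not shown, since it depends on guesses about which output returns to the sender.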