I don't like differential privacy very much.

Take GPS data, for example: NYC has released a taxicab dataset showing the "anonymized" location of every pickup and dropoff.

This is bad for privacy. One attack: if you know when and where someone got into a cab (perhaps because you were with them when they got in), you can find out whether they told you the truth about where they were going. If the dataset contains no trip from the starting location you know to the ending location they claimed, then they didn't go where they said they did.

Differential privacy researchers claim to fix problems like this by making the data less granular, so that you can't unmask specific riders: blurring the datapoints so that each location is only recorded at a city block's resolution, say. But that doesn't help here. If no one near the starting location you know went to the claimed destination, blurring does nothing to close the information leak. You didn't *need* to unmask a specific rider to disprove a claim about where a trip ended; a short sketch of the check is at the end of this comment.

I think flaws like these mean we should simply call GPS trip data "un-de-identifiable". I suspect the same is true of many other kinds of data. For example, Y chromosomes are inherited along the same paternal line that surnames usually follow, so given a genetic ancestry database of the kind companies are rapidly building, you can make a good guess at the surname behind a "deidentified" DNA sequence and then unmask its owner from a candidate pool.
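
To make the taxi attack concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the column names, coordinates, and time window are not the real NYC schema, and "blurring" is modeled crudely by rounding coordinates to roughly block resolution. The point is only that the check needs no individual rider to be identified; an empty result is enough to refute the claimed destination.

    import pandas as pd

    # Hypothetical trips table; the real NYC release uses different column names.
    trips = pd.DataFrame({
        "pickup_time": pd.to_datetime(["2013-07-04 22:05", "2013-07-04 22:07"]),
        "pickup_lat":  [40.7527, 40.7530],
        "pickup_lon":  [-73.9772, -73.9768],
        "dropoff_lat": [40.7308, 40.8075],
        "dropoff_lon": [-73.9973, -73.9626],
    })

    BLOCK = 0.001  # ~100 m of latitude: a crude stand-in for "city block" blurring

    def blur(coord):
        # Round a coordinate to block resolution, as a coarsened release might.
        return round(coord / BLOCK) * BLOCK

    def claim_is_consistent(trips, known_pickup, claimed_dropoff, when, window="15min"):
        # True iff some blurred trip near the known pickup (in space and time)
        # ends near the claimed dropoff. False refutes the claim, even though
        # no individual rider was ever singled out.
        t0, t1 = when - pd.Timedelta(window), when + pd.Timedelta(window)
        nearby = trips[
            trips["pickup_time"].between(t0, t1)
            & (trips["pickup_lat"].apply(blur) == blur(known_pickup[0]))
            & (trips["pickup_lon"].apply(blur) == blur(known_pickup[1]))
        ]
        return (
            (nearby["dropoff_lat"].apply(blur) == blur(claimed_dropoff[0]))
            & (nearby["dropoff_lon"].apply(blur) == blur(claimed_dropoff[1]))
        ).any()

    # You saw the pickup near Grand Central at 22:05; they claimed they went to
    # the Village. No blurred trip matches, so the claim is disproved.
    print(claim_is_consistent(
        trips,
        known_pickup=(40.7527, -73.9772),
        claimed_dropoff=(40.7336, -74.0027),
        when=pd.Timestamp("2013-07-04 22:05"),
    ))

Note that making BLOCK coarser only widens the candidate set; as long as no candidate trip ends anywhere near the claimed destination, the leak is the same.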