Tell-all telephone – Six months of phone metadata visualized

568 points by danielhunt almost 12 years ago

18 comments

drpancake almost 12 years ago

You can infer some amazing things from simple metadata. I spent six months in an R&D team at a large mobile telco, with the task of trying to infer as much as possible from anonymous customer data just like this.

Figuring out where you live and work, to a reasonable accuracy, is quite easy; you simply look at where the most outgoing calls/SMS originate from at certain hours of the day over an extended period.

We built up our own social graph. You treat calls and text messages as directed edges and phone numbers as nodes. These were fascinating to look at.

You can even try to guess when someone gets off a plane. When a plane lands you'll suddenly see lots of incoming undelivered text messages as people turn their phones back on. If a node was last seen in a far-away cell, but then reappears in this group, you can cross-correlate with arrival times and make a reasonable guess.
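
To make the home/work and social-graph ideas above concrete, here is a minimal sketch. It is not the telco team's actual method: the record layout `(caller, callee, hour_of_day, cell_id)`, the sample data, and the hour windows are all assumptions made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical call-detail records: (caller, callee, hour_of_day, cell_id).
records = [
    ("A", "B", 23, "cell_12"),
    ("A", "C", 2, "cell_12"),
    ("A", "B", 10, "cell_40"),
    ("A", "D", 14, "cell_40"),
    ("A", "B", 15, "cell_40"),
    ("B", "A", 22, "cell_07"),
]

# Assumed time windows; a real analysis would tune these per population.
NIGHT_HOURS = set(range(21, 24)) | set(range(0, 7))
WORK_HOURS = set(range(9, 18))


def most_common_cell(records, number, hours):
    """Return the cell where `number` originated the most events during `hours`."""
    counts = Counter(
        cell
        for caller, _, hour, cell in records
        if caller == number and hour in hours
    )
    return counts.most_common(1)[0][0] if counts else None


def build_graph(records):
    """Directed graph: caller -> Counter of callees (edge weight = event count)."""
    graph = defaultdict(Counter)
    for caller, callee, _, _ in records:
        graph[caller][callee] += 1
    return graph


home_guess = most_common_cell(records, "A", NIGHT_HOURS)
work_guess = most_common_cell(records, "A", WORK_HOURS)
graph = build_graph(records)
```

With this toy data, `home_guess` is the cell seen most at night and `work_guess` the one seen most during office hours; `graph["A"]` holds A's outgoing edges with their counts.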

rayiner almost 12 years ago

The argument isn't that meta-data can't be used to get a lot of information about someone. The argument is that in the U.S., meta-data isn't protected information. Call meta-data is not your information, but information the telephone company keeps about you. In the U.S., the 4th amendment does not protect those sorts of records: http://en.wikipedia.org/wiki/Smith_v._Maryland. Your cell phone, which you use voluntarily, gives the phone company tremendous information about you, and under U.S. law nothing keeps the government from getting that information from the phone company.

Does call meta-data give the government a lot of information? Yes. Does it give the government too much information? Quite possibly. But arguing shrilly about how collecting call meta-data is "illegal" is counter-productive. Maybe it should be illegal, but you can't start the process of making it so by proceeding from an incorrect premise. And you can't dismiss the goal of making it illegal by arguing that the government is already ignoring the law, with reference to activity where the government is clearly attempting to stay within the law, even if it is pushing the boundaries as much as it can.

skwirl almost 12 years ago

"Metadata doesn't matter" to me seems to be a really poor strawman. Maybe a small minority of people think that, but I'm pretty sure most people are smart enough to realize that if it "didn't matter" the NSA wouldn't be collecting it to begin with.

Also, I don't believe that it has been shown that location information has been collected. That claim is conjecture only. We've seen a lot of conjecture related to these leaks that has been taken for fact. Sometimes it is hard to tell them apart.

mtgx almost 12 years ago

And that's just from the phone metadata. Imagine how much more they can do with all your online info from all the services you're using, all the blogs you're commenting on, and so on.

The same person being talked about above wrote this article in the NYTimes yesterday: http://www.nytimes.com/2013/06/30/opinion/sunday/germans-loved-obama-now-we-dont-trust-him.html

grey-area almost 12 years ago

What a remarkable visualisation - this is a clear demonstration of just how intrusive these metadata records can be. If they're not controlled by law, they should be.

blackdogie almost 12 years ago

Malte Spitz (the guy whose data you see) is a German Green Party politician and did a TED presentation in 2012: http://www.ted.com/talks/malte_spitz_your_phone_company_is_watching.html

lazyjones almost 12 years ago

Let's not forget that combined metadata from millions of people allows much greater detail than this (who you meet, talk to regularly, share interests with, are likely to run into ...).

moreentropy almost 12 years ago

I'm afraid the actual definition of "meta-data" is up to interpretation in the context of IP communication.

What if the NSA considers not only IP source & destination as "metadata" but also anything down to the application layer that is not strictly content? Like the HTTP GET line or HTTP headers.

qwerta almost 12 years ago

What do you think graph databases with trillions of connections are used for? The real fun will start after someone leaks a couple of terabytes of tracking data.

Bosence almost 12 years ago

Of course it matters, otherwise they wouldn't collect it.

tripzilch almost 12 years ago

Well, if location data is considered part of this "metadata", then I don't see how anyone could argue against the dangers of this.

My physical location in the real world I consider way more private in matters of wide-scale tracking than what I write or say.

For instance, I hardly ever let my browser determine my location and send it to some site. It's none of their business where I am, and if I want the local weather they can get the name of the city I'm at.

But I was hoping this article would be about another, way more dangerous, because way more information-rich, type of "metadata": social graphs and contact lists. The problem with this is, humans underestimate the depth of this kind of data because we're not really well-equipped to reason about it.

If you have a table that consists of (time, location) records, it's pretty easy to envision what sort of information could be extracted from this data. Add a few more fields, and it becomes harder, maybe you need some creativity and statistics, but it's all basic detective work.

A free-form directed graph (such as a social graph or collection of contact lists) doesn't look like a table at all (well, you can represent it as a table, but that won't make you much wiser). It's in fact a very high-dimensional object.

The older generation out here may remember when they first encountered the WWW, when you could only navigate it by clicking links. I got this sense of vastness, perhaps even helplessness. They don't call it hypertext for nothing. The sense of vastness comes because clicking and navigating those links gives an idea of moving through a space. Except this space is in some sense "larger" than our usual 3D space. Every door (link) can open into every room, regardless of whether it would be possible in a physical space.

This is why those "graph of (part of) the Internet" pictures you sometimes see are almost always a tangled clutter of strings, usually vaguely ball-shaped. This is because there is no sensible representation of this type of inter-connected data. You can't make a hierarchy or a map, at least not in the general case (and the thing you want to reason about is the general case; most of those graphs are exponential small-world graphs, highly inter-connected).

Same thing for social / contact list graphs. Except they usually don't have web-rings or directories (you can sometimes make them like FB does, but they aren't generally available, again the general case).

So okay, we're not really good at keeping large graph networks of "friends of friends of friends" and other relationships in our heads and reasoning about them. We're really not. What you think you can reason about those graphs is just scratching the surface.

Computers, however, and Big Data Machine Learning algorithms in particular, have no problems at all with this type of data. An algorithm never lived in a 3D space; it doesn't care if a dataset makes no sense as a physical configuration of nodes in order to navigate it and extract information from it.

Another important distinction is, people tend to think of these social graphs as labeled nodes with edges between them. Which is correct, in a sense. But it gives the impression that the labels are more important than they actually are. This may sound weird, but in the building/room analogy, if you have millions of rooms, and every room is directly connected to 50-200 other rooms, somehow the shape of the paths between the nodes and the way they are connected becomes a vastly more information-rich data source than the actual values of the labels of the nodes themselves.

They don't need your name or your photo; the local shape of your social graph is a highly unique fingerprint of whoever you are.

And you can delete Facebook, but on the next social network you sign up for (or any of the other social graphs you're generating, email/IM contact lists, etc.), this fingerprint will echo, and in many cases be similar enough to clearly indicate this is the exact same person. No names necessary. (This may be a bit harder if you have a strictly separate business persona and social persona, but there are still some unexpected artifacts to pick up for an ML algo even in these cases.)

If you're not on a network at all, your presence can be extrapolated from the "hole" in the graph you left (all your friends are there, with their particular local graph shapes, but one node is missing). That is, even if you have nothing to hide, you will be leaking info about those who do.
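
A toy sketch of the "local shape as fingerprint" idea above. The signature used here (a node's degree plus the sorted degrees of its neighbours) is an assumption chosen for brevity, not what any real de-anonymisation system is known to use; real graphs would need much richer structural features. The two networks and all node names are made up.

```python
def make_undirected(edges):
    """Build an adjacency map {node: set(neighbours)} from an edge list."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def neighbour_degree_signature(graph, node):
    """Label-free signature: own degree plus the sorted degrees of neighbours."""
    neighbours = graph[node]
    return (len(neighbours), tuple(sorted(len(graph[n]) for n in neighbours)))


# Hypothetical network with real names...
network_one = make_undirected([
    ("alice", "bob"), ("alice", "carol"), ("alice", "dave"),
    ("bob", "carol"), ("dave", "erin"),
])
# ...and a second network with the same shape but opaque user IDs.
network_two = make_undirected([
    ("u1", "u2"), ("u1", "u3"), ("u1", "u4"),
    ("u2", "u3"), ("u4", "u5"),
])

target = neighbour_degree_signature(network_one, "alice")
matches = [
    node for node in network_two
    if neighbour_degree_signature(network_two, node) == target
]
```

Here `matches` contains only the one anonymous ID whose neighbourhood has the same shape as alice's, even though no labels are shared between the two networks.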

mikecane almost 12 years ago

Given the remarkable intel that can be gathered, I'm surprised the NSA/CIA/FBI aren't giving away smartphones to targets as anonymous presents or under the pretense of winning a contest.

lifeisstillgood almost 12 years ago

Eventually, all the social and location graphs will be mapped for all of humankind - and we shall find out that everyone, on the whole planet, is exactly 42 feet from Kevin Bacon.

elgenesys almost 12 years ago

If some agency like the NSA wants to know about you in great detail, clearly they have the data, and will be able to very quickly put it all together.

The other side of this coin is that commercial parties like Facebook have the same potential detail and insight about anyone.

There is also a very high probability that similar data is being put together by entities somewhere between the NSA and Facebook, for purposes that are much more starkly not in your best interests, e.g. fraud.

Bottom line: anyone is an open book on the internet.

binarymax almost 12 years ago

Does anyone know if these work as advertised? http://www.ebay.com/sch/items/?_nkw=cell+phone+signal+block&_sacat=&_ex_kw=&_mPrRngCbx=1&_udlo=&_udhi=&_sop=12&_fpos=&_fspt=1&_sadis=&LH_CAds=

I rarely receive calls on my mobile - and only really carry one just in case I need to make a call.

sfaruque almost 12 years ago

Slightly off-topic question: I want to collect my own metadata at this level (for just calls and SMS).

From what I can tell I need to:

- Collect a list of all incoming and outgoing calls and SMS
- Get my location data and match it to the timestamp (?) of the calls and SMSs
- Display this on a map

Any suggestions on how to do this?
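
One possible way to do the "match location to timestamp" step, as a rough sketch only. It assumes you already have two exports: a time-sorted location log and a call/SMS log. The tuple layouts, the sample values, and the helper name `nearest_fix` are all hypothetical; the map display is left out.

```python
from bisect import bisect_left

# Hypothetical location log, sorted by time: (unix_time, latitude, longitude).
location_log = [
    (1000, 52.52, 13.40),
    (2000, 52.53, 13.41),
    (3000, 52.50, 13.45),
]

# Hypothetical call/SMS log: (unix_time, kind).
events = [(1100, "call_out"), (2600, "sms_in")]


def nearest_fix(location_log, ts):
    """Return the location fix whose timestamp is closest to `ts`.

    Assumes `location_log` is non-empty and sorted by timestamp.
    """
    times = [t for t, _, _ in location_log]
    i = bisect_left(times, ts)
    # Only the fix just before and just after `ts` can be the nearest.
    candidates = location_log[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda fix: abs(fix[0] - ts))


# Attach the nearest (latitude, longitude) to every call/SMS event.
located_events = [
    (ts, kind, *nearest_fix(location_log, ts)[1:]) for ts, kind in events
]
```

Each entry of `located_events` is `(unix_time, kind, latitude, longitude)`, which is the shape most mapping tools can plot directly. In practice you would probably also discard matches where the nearest fix is too far away in time.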

teeja almost 12 years ago

People might think that (apart from GPS) signals to one tower only are unlocalizable. Add the variable of signal strength (with fairly uniform xmit pwr) to that single vector and it gets more interesting.
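
A rough illustration of why signal strength narrows things down: under the textbook log-distance path-loss model, a received power reading maps to an approximate distance from the tower, so a single tower gives a ring rather than "somewhere in the cell". The reference power, reference distance, and path-loss exponent below are made-up assumptions, not values for any real network, and real-world propagation is far noisier than this.

```python
def estimate_distance_m(rssi_dbm, ref_rssi_dbm=-40.0, ref_distance_m=1.0,
                        path_loss_exponent=3.0):
    """Estimate distance from received signal strength.

    Log-distance path-loss model:
        rssi = ref_rssi - 10 * n * log10(d / d0)
    solved for d. All default parameters are illustrative assumptions.
    """
    exponent = (ref_rssi_dbm - rssi_dbm) / (10.0 * path_loss_exponent)
    return ref_distance_m * 10.0 ** exponent


near = estimate_distance_m(-70.0)
far = estimate_distance_m(-100.0)
```

A weaker reading gives a larger estimated radius; combine that ring with the antenna sector or a second tower and the possible area shrinks quickly.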

SourApples almost 12 years ago

Is it just me, or did anyone else just throw up a little bit? Almost overwhelming.