This is exactly the argument I've been making to people when we discuss PRISM.<p>Consider the heterogeneity of the data, its lack of structure, and the unpredictable way it's generated. Frankly, I doubt the NSA is monitoring phone chatter on a mass scale, probably not because they <i>can't</i> collect it, but because if they did there would be no way in hell to parse, store, process and evaluate the data generated.<p>We (the scientific/big data community) can barely get recommendation engines working well - engines which operate on a single dataset (what you watched) and perform a single task (suggesting what else you might want to watch). Unless the NSA is <i>decades</i> ahead in a number of fields (data warehousing, statistical analysis of massive datasets, machine learning), how are they extracting useful information in a systematic way, given the pressure from the data firehose involved?<p>My guess is they're probably not - instead the data are collected and then used in conjunction with traditional approaches. E.g. little Johnny buys some fertilizer and a one-way plane ticket - so who has he been talking to, what has he been saying, etc.<p>Honestly, how the NSA is using/dealing with/storing/accessing these data is an incredibly interesting question from an academic/systems perspective.
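To make the scale contrast concrete, here's a toy sketch of the "one dataset, one task" recommender described above - a minimal item-based collaborative filter. All the data and names here are invented for illustration; real recommenders are far more elaborate, which is the point:

```python
# Minimal illustrative sketch (invented toy data): an item-based
# recommender over a single, clean, well-structured dataset
# ("who watched what") performing a single task (suggest more items).
from math import sqrt

# Toy watch history: user -> set of items watched.
watched = {
    "alice": {"A", "B", "C"},
    "bob":   {"B", "C", "D"},
    "carol": {"A", "C", "D"},
}

def item_similarity(x, y):
    """Cosine similarity between two items, based on shared viewers."""
    viewers_x = {u for u, items in watched.items() if x in items}
    viewers_y = {u for u, items in watched.items() if y in items}
    if not viewers_x or not viewers_y:
        return 0.0
    return len(viewers_x & viewers_y) / sqrt(len(viewers_x) * len(viewers_y))

def recommend(user, k=1):
    """Score items the user hasn't seen by similarity to what they have."""
    seen = watched[user]
    candidates = {i for items in watched.values() for i in items} - seen
    scores = {c: sum(item_similarity(c, s) for s in seen) for c in candidates}
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend("alice"))  # ['D'] for this toy data
```

Even this trivially small problem assumes homogeneous, structured, predictable input - exactly the properties mass-intercepted chatter lacks.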