Big Data vs Intelligent Data (and what Startups can do with it)

33 points by bialecki over 12 years ago

3 comments

equark over 12 years ago
Another key fact is that "big data" is actually not that common, especially when it gets to the analysis stage.

The median job size at Microsoft and Yahoo is only 15GB. And 90% of Hadoop jobs at Facebook are under 100GB. Clearly you want to be able to crunch large log files, but in terms of day-to-day analysis the files are much smaller than that. (cite: http://research.microsoft.com/pubs/163083/hotcbp12%20final.pdf).

At Sense (http://www.senseplatform.com) most of the clients we work with are struggling not with the size of their data but with tricky modeling problems that don't fit into standard black boxes and with integrating analytics into actual production systems. Adopting something like Hadoop for these tasks is not very productive.
wookietrader over 12 years ago
From a data analyst's perspective, let's go through what he says.

First he states something along the lines of "More data does not always help." This is right from a theoretical perspective. But it never hurts. This is also right from a theoretical perspective; it's a result from probability theory: additional observations will always lead to less or equal variance in your estimates. There is no data like more data. There is no downside to more data.

I am not sure in what way (2) and (3) relate to big data. I'd even say that (3) is pro big data.

Then there is this term "intelligent data". Actually, I can't emphasize enough how badly chosen this term is. Intelligence is related to the quality of the actions someone takes. Data does not take actions; it just "is". Data cannot be intelligent, just as a stone cannot be intelligent. He also thinks that data measurements should be repeatable. Guess what: in all interesting cases data measurements are *not* repeatable, due to randomness in the source itself. One of the main challenges of data analysis is to still get robust results. He also thinks that data should be concise, i.e. that the data set at hand should be as small as possible while still leading to the same actions. This sounds like a chicken-and-egg problem. How would you be able to even assess this without trying it out?
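The variance claim above can be checked in its simplest case with a small simulation. This is only a sketch under stated assumptions: the observations are independent draws from a standard normal distribution, the estimator is the sample mean, and the function name `variance_of_sample_mean` and its parameters are made up for illustration. In that setting the variance of the estimate is expected to shrink roughly like 1/n.

```python
import random
import statistics


def variance_of_sample_mean(n, trials=1000, seed=0):
    """Estimate the variance of the mean of n i.i.d. N(0, 1) draws.

    Hypothetical helper for illustration: repeats the experiment
    `trials` times and measures how much the sample mean varies.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    means = [
        statistics.fmean(rng.gauss(0.0, 1.0) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.pvariance(means)


# Assumed sample sizes; theory predicts a variance of about 1/n for each.
results = {n: variance_of_sample_mean(n) for n in (10, 100, 1000)}

for n, var in results.items():
    print(f"n={n:5d}  variance of the mean ~ {var:.5f}  (theory: {1 / n:.5f})")
```

The sketch only covers independent, identically distributed observations and an unbiased estimator; it says nothing about data that is biased or measured differently from the rest.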
photorized over 12 years ago
Data can't be intelligent.