There's a lot of buzz around Hadoop and Big Data, but I really wonder: how big is Big Data? How do big data startups measure this?<p>In terms of total storage, is it a few terabytes for most enterprises?
I've had a similar question before. I've heard of as few as 60,000 observations being considered "big data" [1], yet at my company we generate about 60 million pharmacy claims every 3 months, and no one here calls it big data. In terms of storage, it's on the order of a couple hundred terabytes for all our data. This is considered small enough that we can query it with traditional SQL.<p>Big Data, the experts say, is more about the novel way you analyze data, relative to the difficulty of the problem. A speaker at PyCon who was talking about algorithms and data structures for handling genetic data had a term that I like quite a bit better: "Data of Unusual Size" (C. Titus Brown at MSU was the speaker).<p>"Big Data" is a really big buzzword right now, but the term is overused and often does not convey the meaning it's supposed to. "Big" is a relative term. What makes data "big" is the novelty of how much is being used compared to how much used to be used (in the sumo case from [1], they had never handled so much data before).<p>As for Hadoop, you'd want to use it when it's no longer feasible to keep your data stored in an RDBMS, when speed becomes an issue, or when you want your schema to be more flexible than an RDBMS allows. If you are not concerned with strict reliability guarantees (RDBMSs make the safety of the data a paramount priority--read the Wikipedia page on ACID [2] to see these guarantees), there are plenty of reasons for choosing Hadoop [3].<p>[1]: <a href="http://www.wired.com/wiredenterprise/2013/03/big-data/" rel="nofollow">http://www.wired.com/wiredenterprise/2013/03/big-data/</a>
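To illustrate the point that data at this scale is still "traditional SQL" territory, here's a minimal sketch using Python's built-in sqlite3 module. The claims schema and values are hypothetical stand-ins, not the commenter's actual data; any RDBMS would handle the same aggregate query.

```python
import sqlite3

# Hypothetical pharmacy-claims schema (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE claims (
        claim_id INTEGER PRIMARY KEY,
        drug     TEXT,
        paid     REAL
    )
""")

# Toy rows standing in for ~60M quarterly claims.
conn.executemany(
    "INSERT INTO claims (drug, paid) VALUES (?, ?)",
    [("aspirin", 4.50), ("aspirin", 5.25), ("statin", 22.00)],
)

# A plain SQL aggregate -- no MapReduce job required.
for drug, total in conn.execute(
    "SELECT drug, SUM(paid) FROM claims GROUP BY drug ORDER BY drug"
):
    print(drug, total)
```

The same GROUP BY runs fine on hundreds of millions of rows in a production RDBMS with sensible indexes; the decision point for Hadoop is when even a well-modeled query no longer fits one machine.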
[2]: <a href="http://en.wikipedia.org/wiki/ACID" rel="nofollow">http://en.wikipedia.org/wiki/ACID</a>
[3]: <a href="http://hortonworks.com/blog/4-reasons-to-use-hadoop-for-data-science/" rel="nofollow">http://hortonworks.com/blog/4-reasons-to-use-hadoop-for-data...</a>
<a href="http://www.youtube.com/watch?v=B27SpLOOhWw" rel="nofollow">http://www.youtube.com/watch?v=B27SpLOOhWw</a><p>Look at the above it will give you a good idea. I'm not going to go into examples as the above video really give some good examples. A Terabytes of data is nothing these days, I see Terabytes databases within companies on a regular bases within my job. The size does not mater, but the three big components of "Big Data" are: multiple sources of information in multiple formats, volume of data and rate of ingest/rate of new incoming data. The basic idea is to be able to process all of the incoming data within your company and get some kind of intelligent information out of all this data that you can use.
In my experience, and I have worked in this field since 2001, "Big Data" is the answer to the problem of poorly implemented reporting models. The excuse for bad performance has always been "the size of the dataset," not the real culprit: lack of technical knowledge on how to build an appropriate and performant model. The size of the dataset is almost irrelevant when it's done right, but when it's done wrong, even a small dataset (100,000 records) can be sold as "the problem."<p>Business people love to chase a holy grail, this being yet another one.
<i>"If a program manipulates a large amount of data, it does so in a small number of ways."</i> Alan Perlis<p>"Big Data" has an operational definition: it's relative to current technology. Less than two decades ago, a terabyte was big enough that Microsoft created TerraServer as a technology demo [<a href="http://en.wikipedia.org/wiki/TerraServer-USA" rel="nofollow">http://en.wikipedia.org/wiki/TerraServer-USA</a>]. TerraServer would dwarf the big data of the era when Perlis wrote Epigram 4. Today, TerraServer is dwarfed by YouTube.
I don't know if I'm alone in this, but I have this question.<p>What is Big Data? I hear it a lot from non-technical people, so my first thought was that it is a company. After doing a little research, I am starting to think it is a concept. Sometimes it sounds like a marketing term. What is it?
I think of it in more human terms -- it has more to do with <i>time to process</i> than actual size. I think data becomes <i>big</i> when it takes longer than I'd like to answer the questions I want answered. As a corollary, the longer it takes, the "bigger" it becomes.