In Media, Big Data Is Booming but Big Results Are Lacking

29 points by carlosgg, nearly 12 years ago

5 comments

ryanlchan, nearly 12 years ago
I think a major hurdle we have to overcome with big data is separating causation from correlation. As the data set scales, we gain ever-increasing confidence in the correlation, but an ever more complex set of causations.

Take their House of Cards example. Netflix saw a strong correlation between David Fincher, political thrillers, and Kevin Spacey. Fantastic. But why? What did people like about these things? Why did this 'work'?

Let's try to replicate this decision: take great directors (Wachowski siblings), a strong cast (Emile Hirsch, John Goodman, Susan Sarandon), and a nearly unlimited budget ($200m) to reboot an existing, well-received franchise. Should be a hit, right? Wrong - it's a complete and utter failure known as 2008's Speed Racer.

When we say we want to be data-driven, we actually mean we want to be insights-driven. We want to understand the "Why?" from the data's "What"; it's the "Why" that lets us know how to react next. It's easy to confuse the data's specificity with insight's certainty, but they are distinctly not the same: we can pinpoint conversions down to 6 digits of significance without having a clue why they occur.

What we really need is Big Insight, but that's a significantly harder problem, not because we don't have the technology to create a solution, but because we don't even know what the right questions are.

I'm optimistic about the possibilities of a system like IBM's Watson in helping solve this, but as it stands, Big Data's utility is giving us 99.755% certainty that we have no idea what is going on.
karterk, nearly 12 years ago
Engineers are better equipped to collect and store massive amounts of data than to analyze it, draw meaningful conclusions from it, or use it to drive decisions. It takes business acumen to ask the right questions of the data.

The way "big data" is stored today also poses problems with respect to queryability. NoSQL systems are great at storing huge amounts of data efficiently but don't help us slice and dice the data easily. Having to write MapReduce jobs for every "query" is painful and time-consuming. Tools like Hive, Pig and Cascading help in writing MR jobs succinctly but are still very slow when someone wants to quickly filter and explore the data.
stfu, nearly 12 years ago
I am just hoping to hear one day the "real" story of how "House of Cards" came about. Somehow my gut tells me that using this as the poster child of big data is only a half-truth. I for one believe that it was just a lucky guess, i.e. it is not really surprising that some quality stories do work with a paying audience. Something along the lines of Boston Legal would probably have worked just as well for the professional, 30-plus, male, metro-area audience.

Unless we now see a consistent streak of at least 5-10 original series that become hits, the whole big data thing is based on lucky guessing (original programming, not some recycled material; after all, House of Cards was already a success in the UK, and next in line, Arrested Development, already has a hardcore fan base).

Btw, does anyone have numbers on how "House of Cards" is performing in comparison to original HBO programming?
kaa2102, nearly 12 years ago
Data is data. The most important data is the data that impacts the strategic objectives of your business or organization. There is zero value in boiling the ocean.
tytung2020, nearly 12 years ago
Is anyone using evolutionary algorithms to sort this data? I am no expert in this, but I did some research in AL (Artificial Life) back in undergrad. I think evolutionary algorithms will beat any AI at sorting and finding connections.