Data Science of the Facebook World

206 points by mh_ about 12 years ago

13 comments

taliesinb about 12 years ago
I did the analysis and worked with Stephen on the science side of it.

If anyone would like to ask questions about what we did, I'd be happy to answer them.

There's still lots more interesting stuff to do, but it was enough for a blog post. Suggest away if you think we missed something obvious!
kevinalexbrown about 12 years ago
I found this interesting. What I would love to have seen, however, is a probe into the dynamics. You did a nice abstraction over time as you measured property X as age was varied. I would have loved to have seen the manner in which topics and ideas spread over your network.

For instance: If an event occurred in New York, say, how long would it have taken to spread to San Francisco? If there were no progression, topic times would center around the same time. This would indicate that people were getting their information from national, not local sources (e.g. the evening news), then talking about it on Facebook. On the other hand, if a local topic was spread on Facebook alone, we should see some sort of progression.

It's possible that this progression could take more interesting forms besides geolocation, but that might require a more extensive network. A simple experiment would work like this: A few thousand people who are not friends but have a similar interest (say an interest in Elizabeth Warren) independently post a video of her. This particular esoteric interest is unlikely to be valued a priori by their friends, but perhaps they are compelled to repost the information. What's the threshold of "esotericness" such that it won't "go viral?" Is there a way to predict virality as a function of how popular it is to begin with? Is there no actual progression across the network, but rather a small bump in topic expression, until it is picked up by larger media sources, at which point the entire network is inundated with people reposting Elizabeth Warren recaps from HuffPo et al?

The reason this is interesting is that it sheds light on the role of social networks: are we fundamentally disposed toward central sources like the NYTimes, or is Facebook a fundamental <i>sharing</i> mechanism? That is, do I post on Facebook just to have my views expressed, validated, and challenged, so that they might change the world over a few years? Or do I post on Facebook to have my views <i>propagate</i> across the world much more quickly?

Finally, a question: How did you estimate the power law? I know how difficult it is to do this (e.g. not linear regression on a log-log scale). Did you compare the power law fit to other, similar distributions, like lognormal? Preferential attachment is indeed a beautiful theoretical result, because it implies the existence of power law degree distributions. Unfortunately, many networks are not as well represented by power laws as by alternative distributions, which casts doubt on the preferential attachment hypothesis as is. (Also, many sampling methods give rise to fictive power laws.) That said, a fat tail can still be interesting.

In any case, this is a beautiful piece of work.
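
To make the estimation point above concrete, here is a minimal sketch of the usual alternative to log-log regression: a maximum-likelihood estimate of the power law exponent for the tail above a cutoff. It is not the method used in the blog post, which the thread does not describe. The function names, the synthetic data, and the choice of a known `x_min` are all assumptions for illustration; friend counts are integers, so treating them as continuous is a further approximation.

```python
import math
import random


def sample_power_law(n, alpha, x_min, rng):
    """Draw n values from a continuous power law p(x) ~ x**(-alpha), x >= x_min."""
    # Inverse-transform sampling; 1 - u lies in (0, 1], so the power is defined.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def fit_power_law_alpha(samples, x_min):
    """Maximum-likelihood exponent for the tail x >= x_min (continuous case).

    Returns (alpha_hat, approximate standard error, tail size).
    x_min is assumed to be known here; in practice it has to be chosen too.
    """
    tail = [x for x in samples if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if not tail or log_sum <= 0.0:
        raise ValueError("need tail values strictly above x_min")
    n = len(tail)
    alpha_hat = 1.0 + n / log_sum
    std_err = (alpha_hat - 1.0) / math.sqrt(n)
    return alpha_hat, std_err, n


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    data = sample_power_law(20000, alpha=2.5, x_min=1.0, rng=rng)
    alpha_hat, std_err, n = fit_power_law_alpha(data, x_min=1.0)
    print(f"alpha_hat={alpha_hat:.3f} +/- {std_err:.3f} on {n} tail points")
```

A fit like this only says which exponent is most likely if the data are a power law. The comparison against lognormal that the comment asks about would need a separate step, for example a likelihood-ratio test between the two fitted models, which is not shown here.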
greiskul about 12 years ago
I wonder if the higher friends count of Brazilian users is caused by the previous use of Orkut, where it was popular to try to have as many friends as possible.
geekam about 12 years ago
How is it possible that Facebook, which owns the data, does not offer tools like these, while others build them on top of its data?
austinl about 12 years ago
I've been doing Facebook network visualization for a while now with Gephi. Here are some of the graphs I came up with: http://visualizingpolitics.wordpress.com/2012/05/02/facebook-network-visualizations/
sskates about 12 years ago
Wow, this is awesome! It's really cool how people's friend distribution by age is a convolution of their age and the age of the general Facebook population. It's also scary in a way to see a snapshot of how I'm likely to change in the future with regard to my clusters of friends, my relationship status, and what I'll talk about.
xk_id about 12 years ago
The traditional way to plot the assortativity by age is using a scatter plot / heatmap. This is similar to what they did for country homophily on p12 of the Facebook anatomy paper. The result would be a plot with a prominent diagonal, illustrating that "same attracts same".

That aside, imo, Facebook is an incredibly idiosyncratic "app", which makes almost no sense. And yet, it gave us so many opportunities for interesting discussions, like the insights in this blog post. Nice job.
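
As an illustration of the heatmap idea in this comment, the sketch below builds the counts such a plot would show: a symmetric matrix of friendships binned by the ages of both friends, plus the share of weight on the diagonal. It is not taken from the blog post or the Facebook paper; the function names, the 10-year bin width, and the toy friendship list are assumptions.

```python
from collections import Counter


def age_bin(age, width=10):
    """Lower edge of the age bin, e.g. 27 -> 20 for 10-year bins."""
    return (age // width) * width


def age_mixing_matrix(friend_pairs, width=10):
    """Count friendships per (age bin, age bin) cell.

    friend_pairs is an iterable of (age_a, age_b) tuples, one per friendship.
    Friendship is undirected, so each pair is counted in both orientations,
    which keeps the matrix symmetric.
    """
    counts = Counter()
    for age_a, age_b in friend_pairs:
        bin_a, bin_b = age_bin(age_a, width), age_bin(age_b, width)
        counts[(bin_a, bin_b)] += 1
        counts[(bin_b, bin_a)] += 1
    return counts


def diagonal_fraction(counts):
    """Share of the matrix weight where both friends fall in the same bin."""
    total = sum(counts.values())
    same = sum(c for (a, b), c in counts.items() if a == b)
    return same / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical friendships, given as the ages of the two friends.
    pairs = [(21, 24), (25, 29), (33, 52), (41, 47)]
    matrix = age_mixing_matrix(pairs)
    for cell in sorted(matrix):
        print(cell, matrix[cell])
    print("diagonal fraction:", diagonal_fraction(matrix))
```

Feeding the resulting counts to any heatmap routine would give the plot described; a diagonal fraction close to 1 corresponds to the "prominent diagonal".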
photorized about 12 years ago
One thing that bugs me is how comments are linked to "interest". There are many topics that interest people (passive consumption), that do not necessarily translate into engaging in a conversation with others publicly.

As a marketing term - sure, that would be a good indicator of interest. Since this article is more scientific than marketing-oriented, I would clarify what some of the metrics mean (or don't mean).

Excellent, fantastic visualizations though!
pbnjay about 12 years ago
How much of the friends with zero friends is simply because that information is blocked? If my friends "donated" their data, I would show as having 0 friends if I've blocked that information to apps.
CurtMonash about 12 years ago
Introducing "data science for Facebook" in 2013 is ... odd.

All the more so because Jeff Hammerbacher is often credited with coining the term "data science", and he started doing it at -- that's right -- Facebook.
brown9-2 about 12 years ago
Very nice looking graphs, but running "Wolfram Alpha Personal Analytics for Facebook" for my own profile comes with a rather nerve-wracking warning:

<i>Wolfram Connection would like to access your public profile, friend list, email address, custom friends lists, News Feed, relationships, birthday, status updates, checkins, education history, hometown, current city, photos, religious and political views, videos, likes and your friends' relationships, birthdays, education histories, hometowns, current cities, photos, religious and political views and videos.</i>
jonpeda about 12 years ago
The Mathematica system makes some beautiful, informative graphs, and presumably users can make those graphs with a minimum of fuss and bother. It's technically very nice.

Yet, in the entire blog post, is there one insight that wasn't a priori obvious? Maybe the bits about migration.

I don't see the "art and science" in this analysis, I see "stamp collecting" (http://en.wikiquote.org/wiki/Ernest_Rutherford)
jonpeda about 12 years ago
People <i>donate</i> their data to support Wolfram's closed-source, paid-license, for-profit program?