If I had the resources Zuckerberg has, and the access to information and data that he does, my posts would be optimized to show asynchronously by region, and to show first in the region most likely to respond positively to my post. That way it sets a tone of positivity for the rest of the commenters. At least in theory.
Nice write-up! I would assume that in a large distributed back end like Facebook's, notifications are not sent immediately and definitely not sent to everyone at the same time. This probably means that it is impossible to rely on them to get the first comment on a post.
While the analysis was interesting, I can't help but wonder why so many people write pointless comments to what is probably some marketing person impersonating Mark.
I would probably go with an auto-reply as soon as he posts something, receive the text message, and then edit the reply to something meaningful. Fun read! :)
This is why I love this industry. He started with a somewhat "bad" idea: even an automated reply would have a really hard time being the first comment. But he was able to experiment and tweak the idea to do something pretty cool with his work while learning. Well done!
There is an amazing amount of Interesting™ stuff you can do with social media. I've been actively looking for and following people in the "social media influencer" game to see just how they pull it off (things like getting apps to the top of the app store, building gigantic and legit Twitter and Instagram accounts, etc.)<p>On this particular story, doesn't Facebook let you create notifications in any way? I get instant notifications from certain best friends for some reason. Maybe you could create a burner FB account and make Zuckerberg the only friend. I'd also consider trying this on someone like Robert Scoble or a tech journalist - someone who has a gigantic following, but relatively low comment velocity.
Love seeing someone using Selenium. I had no idea you could use anything other than the Chrome driver. What's a headless browser?! I automated whole sections of my old Sales Rep job with Selenium, and it was so awesome.
I'm really curious why browsers can't handle 18 MB of comment data.<p>I've observed the problem myself when trying to load up old FB conversations to search for some detail I knew was there.<p>What's 18 MB when my machine has 12 GB of RAM?
I can't tell if saying 'Mark Zuckerberg – the Bill Gates of our time' was a joke or not. I mean, isn't Gates very much still going strong? I get the analogy I suppose, but doesn't that imply Gates is all washed up?
I thought it seemed odd that Python supported the AND operator on lists, but it doesn't.<p>You do need to convert to a set first, e.g.<p><pre><code> [1,2,3] & [1,2]
</code></pre>
gives a TypeError, while<p><pre><code> set([1,2,3]) & set([1,2])
</code></pre>
gives set([1,2]) as expected.
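<p>To make the difference concrete, here is a minimal sketch of both behaviours. The helper `common_items` is a hypothetical name, not something from the article; it is only one way to intersect two lists while keeping order and duplicates, which a plain set conversion does not do.

```python
def common_items(a, b):
    """Return the items of `a` that also appear in `b`, keeping `a`'s order.

    Hypothetical helper for illustration; assumes the items are hashable.
    """
    b_set = set(b)  # set membership tests are O(1) on average
    return [x for x in a if x in b_set]


# Lists do not implement the & operator, so this raises TypeError.
try:
    [1, 2, 3] & [1, 2]
except TypeError as exc:
    print("lists do not support &:", exc)

# Sets do implement it. Python 3 prints the result as {1, 2};
# Python 2 printed it as set([1, 2]).
print(set([1, 2, 3]) & set([1, 2]))

# Order and duplicates from the first list survive here.
print(common_items([3, 1, 2, 1], [1, 2]))
```

The set version is the shortest, but it drops ordering and repeated items, so which one fits depends on what the list represents.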
Can you figure out a posting schedule for this Mark Zuckerberg guy? Maybe there are certain times he is more likely to post. Then you could increase your rate of monitoring during those times, without such a high chance of being labelled a spammer by Facebook.
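A rough sketch of that idea, assuming you already have a list of timestamps of past posts (the sample dates in the usage note are made up, and `hourly_counts`, `poll_interval`, `fast`, and `slow` are hypothetical names and defaults): count posts per hour of day, then check more often in the hours where posts have historically clustered.

```python
from collections import Counter
from datetime import datetime


def hourly_counts(timestamps):
    """Count how many past posts fell in each hour of the day (0-23)."""
    return Counter(ts.hour for ts in timestamps)


def poll_interval(hour, counts, fast=30, slow=600):
    """Seconds to wait between checks during `hour`.

    Interpolates linearly between `slow` (hours with no past posts) and
    `fast` (the busiest hour). Both defaults are arbitrary assumptions.
    """
    if not counts:
        return slow
    share = counts.get(hour, 0) / max(counts.values())
    return round(slow - (slow - fast) * share)
```

With made-up history such as `[datetime(2016, 1, 1, 9), datetime(2016, 1, 2, 9), datetime(2016, 1, 3, 14)]`, the busiest hour (9) gets the `fast` interval and hours with no history get the `slow` one. Time zones and day-of-week effects are ignored here.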
Looking at the graph of comment text, it's remarkable how many people are convinced that Mark Zuckerberg should be giving them money for some reason.
I agree that it was difficult parsing data on a post about a specific topic, Christmas.<p>However, most comments will be in reply to a post which usually has a central theme.
Enjoyed the article. For once, this article showed some flaws in the original process/idea, and showed very nicely how an original seed idea turned into something different, and more involved/interesting.<p>Clearly a lot of work and sweat went into getting the results you did, and the final output looks very polished.<p>Congrats for having the courage to post this.
Does Mark respond to any of the posts? If so, what kind? Do some types of posts always generate more "buzz" among other commenters? Those are some of the questions that would be interesting to answer.<p>In this case the process was far more interesting than the findings. Thanks for citing the book and videos.
Lovely write-up, made me smile. I'll be reading your other posts on related subjects.<p>Great to read a walk through of a "directed trial and error" approach. Out of curiosity, how did you select NLTK and a graphing approach? Did you consider other techniques for ploughing through the data?
It would be much easier to use the Facebook Graph API: there is an official Python module, it is well documented, and it would be less likely to hit rate limits or other blocks. Ironically, avoiding those was one of the reasons the author used scraping instead of the API.
Cool experiment!<p>Btw, is there any similarity measure that can be calculated with less complexity (e.g. without the need to compare every pair of comments)?
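<p>One common way to avoid scoring every pair, sketched here under my own assumptions and not as the article's method: build an inverted index from words to comments and only score pairs that share at least one reasonably rare word. The names `candidate_pairs`, `jaccard`, and the `max_bucket` cutoff are hypothetical, and the worst case is still quadratic if all comments share the same words.

```python
from collections import defaultdict
from itertools import combinations


def tokens(text):
    """Lowercased whitespace-separated words of a comment, as a set."""
    return set(text.lower().split())


def candidate_pairs(comments, max_bucket=50):
    """Index pairs (i, j), i < j, of comments sharing at least one word.

    Words that occur in more than `max_bucket` comments are skipped,
    since they would pair up nearly everything. The cutoff is an
    arbitrary assumption.
    """
    index = defaultdict(list)
    for i, text in enumerate(comments):
        for tok in tokens(text):
            index[tok].append(i)
    pairs = set()
    for ids in index.values():
        if len(ids) <= max_bucket:
            pairs.update(combinations(ids, 2))
    return pairs


def jaccard(a, b):
    """Jaccard similarity of the word sets of two comments."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)
```

You would then call `jaccard` only on the pairs returned by `candidate_pairs`. Comments with no words in common are never compared at all.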
This is all so sad. WTF!<p><a href="http://knowyourmeme.com/memes/i-hope-senpai-will-notice-me" rel="nofollow">http://knowyourmeme.com/memes/i-hope-senpai-will-notice-me</a>