Some Fresh Twitter Stats (as of July 2012, Dataset Included)

54 points by ryannielsen almost 13 years ago

5 comments

citricsquid almost 13 years ago
I would be really interested to see some analysis of the latest ~50m accounts, because I've noticed an insane number of bot accounts recently. They all follow the same pattern: never tweet (or tweet once), follow ~2000 people, are followed by ~3, and have a bio + avatar. They're used by the Twitter follower selling companies and there's an absolute metric fucktonne of these accounts. Some examples:

http://twitter.com/Yahairayqlcd
http://twitter.com/Kenyetta992
http://twitter.com/Jade_482
http://twitter.com/Mozella_nxi
http://twitter.com/Alane508

It's becoming a really big problem. Every day I come across accounts that have bought followers* (a few belonging to HN users) and it's... disappointing. Inevitable, but disappointing. I started putting together a website for checking whether someone had bought their followers but haven't found the desire to finish it yet. So many people do it, purchasing anywhere from 4,000 to 80,000 followers, which can be done for ~$300.

I've been reporting the people doing this to Twitter but nothing has been done about it. Ultimately it looks good for Twitter's metrics, so I doubt they care that there are people with 80k fake followers; that's an extra 80k users for their investors to salivate over.

As a side note, it's fun to pick out a random account (like those listed above) and see who it's following. Some people you'd not expect to be purchasing followers are. An alternate theory is that Twitter is responsible (after all, how can they not detect these obvious bots?) but I can't see why... well I can, but I don't think they would.

*I'm probably alone in this, but people who buy followers really irk me because I like numbers to be accurate. The site I started making had a directory of people who bought followers. I'm wondering if I should finish it up; I figure it would be interesting for people to see that a large number of "media personalities" are just buying up their followers.
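The pattern described in this comment could be written as a simple filter. This is only a sketch: the `Profile` record, its field names, and the numeric thresholds are assumptions chosen to bracket "~2000 following" and "~3 followers", not values tuned on real data.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    # Hypothetical minimal profile record; field names are illustrative.
    tweets: int
    following: int
    followers: int
    has_bio: bool
    has_avatar: bool


def looks_like_sold_follower(p: Profile) -> bool:
    """Return True if the profile matches the pattern from the comment.

    Thresholds are guesses around "~2000 following" and "~3 followers".
    """
    return (
        p.tweets <= 1                      # never tweets, or tweeted once
        and 1500 <= p.following <= 2500    # follows roughly 2000 accounts
        and p.followers <= 10              # followed by only a handful
        and p.has_bio
        and p.has_avatar
    )


if __name__ == "__main__":
    suspect = Profile(tweets=0, following=2000, followers=3,
                      has_bio=True, has_avatar=True)
    print(looks_like_sold_follower(suspect))
```

A rule like this would also flag some genuine lurkers, so it is better read as a way to shortlist accounts for a closer look than as proof of anything.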
simonw almost 13 years ago
There's a flaw in this bit, which leads to the 530 million account estimate:

"The highest Twitter user id when I started the experiment was around 637M (found by trial and error). I figured there would be gaps in user ids mostly because of massive deletions of spammer accounts, and a quick sample estimated the gaps to be on the order of 20%. So I generated 1.25M unique user ids in the range 0-637M, and tried to fetch the profile details for them.

[...]

After fetching the 12,500 batches I was left with 1,039,556 Twitter profiles. This means that there must exist approximately 530 million Twitter accounts: 83% of 637M."

The problem is that Twitter account IDs used to be sequential - every integer would correspond to an account, unless that account had been deleted. Then in 2011 Twitter introduced the Snowflake update (https://dev.twitter.com/docs/twitter-ids-json-and-snowflake), which changed the way IDs were generated (for scaling reasons - it's much better to have separate machines able to deal out IDs rather than rely on a single point of failure).

This means that if you were to create a pool of random IDs between 1 and 637,000,000 you'll find that the IDs below a certain number (the highest ID at the time Snowflake kicked in) almost all correspond to an account, whereas the IDs above that number have a much higher number of misses.
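For reference, the arithmetic in the quoted passage, plus one way to check whether misses really cluster in the higher ID range, might look like the sketch below. The function names and the bucketing approach are assumptions for illustration; only the three figures (1,039,556 profiles, 1.25M sampled IDs, 637M maximum ID) come from the quote.

```python
def naive_estimate(profiles_found: int, ids_sampled: int, max_id: int) -> float:
    """Scale the sample hit rate up to the whole ID range."""
    return profiles_found / ids_sampled * max_id


def hit_rate_by_bucket(sampled_ids, found_ids, max_id, buckets=10):
    """Hit rate per equal-width ID bucket, to see where misses concentrate.

    Returns one value per bucket, or None for a bucket with no samples.
    """
    width = max_id / buckets
    sampled = [0] * buckets
    found = [0] * buckets
    found_set = set(found_ids)
    for user_id in sampled_ids:
        b = min(int(user_id / width), buckets - 1)
        sampled[b] += 1
        if user_id in found_set:
            found[b] += 1
    return [f / s if s else None for f, s in zip(found, sampled)]


# Figures quoted in the comment above.
estimate = naive_estimate(1_039_556, 1_250_000, 637_000_000)

if __name__ == "__main__":
    print(f"hit rate: {1_039_556 / 1_250_000:.1%}")
    print(f"estimated accounts: {estimate / 1e6:.0f}M")
```

Running `hit_rate_by_bucket` over the sampled IDs and the IDs that returned a profile would show whether the low buckets are nearly full and the high buckets sparse, as this comment predicts.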
dewitt almost 13 years ago
By way of comparison, I ran the numbers back in 2009 using a similar sampling technique:

http://blog.unto.net/sampling-twitter.html

At the time, I estimated roughly 1,200,000 active and connected users on Twitter. This author currently estimates around 80M active users (as of mid-2012), or roughly a 67x increase over three years.

Note that both samples considered "active" to be someone who posts, which is quite a bit stricter than the (reasonable) definition of consumption that the industry has been stabilizing on.

Neither of us knew how to account for spam/fake accounts, which must represent some non-trivial part of the ecosystem (at least judging from the followers my own dormant account continues to attract: http://twitter.com/#!/dewitt/followers).

I found it interesting, though in hindsight not surprising at all, that the average length of usernames is also going up over the years.
denzil_correa almost 13 years ago
I just downloaded your data set and have one question: are you retrieving all the tweets of every user? Twitter allows you to retrieve up to 3200 tweets of a user (if public) via pagination. You could download them to understand how "active" each user is, for a much better analysis.
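The pagination idea could be sketched as below. Nothing here calls a real API: `fetch_page` is a hypothetical callable standing in for whatever client is used, the page size of 200 is an assumption, and `make_fake_fetcher` exists only so the loop can be run locally. The 3200 cap is the figure from this comment.

```python
from typing import Callable, Dict, List, Optional

MAX_TWEETS = 3200  # cap mentioned in the comment above
PAGE_SIZE = 200    # assumed page size, not taken from the thread

Tweet = Dict[str, int]
# Hypothetical signature: fetch_page(max_id, count) returns up to `count`
# tweets, newest first, with id <= max_id (or the newest if max_id is None).
FetchPage = Callable[[Optional[int], int], List[Tweet]]


def fetch_timeline(fetch_page: FetchPage) -> List[Tweet]:
    """Walk backwards through a timeline until it ends or the cap is hit."""
    tweets: List[Tweet] = []
    max_id: Optional[int] = None
    while len(tweets) < MAX_TWEETS:
        page = fetch_page(max_id, min(PAGE_SIZE, MAX_TWEETS - len(tweets)))
        if not page:
            break  # no older tweets left
        tweets.extend(page)
        max_id = page[-1]["id"] - 1  # next page starts just below the oldest seen
    return tweets


def make_fake_fetcher(total: int) -> FetchPage:
    """In-memory stand-in for a real client, for local testing only."""
    ids = list(range(total, 0, -1))  # newest first

    def fetch_page(max_id: Optional[int], count: int) -> List[Tweet]:
        eligible = [i for i in ids if max_id is None or i <= max_id]
        return [{"id": i} for i in eligible[:count]]

    return fetch_page


if __name__ == "__main__":
    print(len(fetch_timeline(make_fake_fetcher(450))))
```

With the tweets in hand, activity could then be measured from timestamps rather than from the profile's total tweet count alone.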
carleverett almost 13 years ago
Only 1 graph? Is Chrome broken for me? I thought we were going to get more visual representations like the Twitter "follow" graph with 33 billion edges!