twitter/the-algorithm

374 points by johns about 3 years ago

47 comments

johns about 3 years ago

Will Norris who works on OSS at Twitter posted this [0]: "watch this space https://github.com/twitter/the-algorithm"

[0]: https://twitter.com/willnorris/status/1518694675909013504

Irishsteve about 3 years ago

Many people have commented that it is empty. However, what they do not realize is that there has never actually been an algorithm, and that is why it is empty.

jdrc about 3 years ago

I know the algorithm I use: it ends with ORDER BY date DESC.
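
For readers who want to see what that amounts to, here is a minimal sketch using Python's built-in sqlite3 module. The `tweets` table, its columns, and the sample rows are assumptions made up for illustration; nothing here reflects Twitter's actual schema.

```python
import sqlite3


def latest_tweets(conn, limit=3):
    """Return the newest tweets first; the entire 'ranking' is ORDER BY date DESC."""
    rows = conn.execute(
        "SELECT author, body FROM tweets ORDER BY date DESC LIMIT ?", (limit,)
    )
    return rows.fetchall()


# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweets (author TEXT, body TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO tweets VALUES (?, ?, ?)",
    [
        ("alice", "first", "2022-04-25T09:00:00"),
        ("bob", "second", "2022-04-25T10:30:00"),
        ("carol", "third", "2022-04-25T12:15:00"),
    ],
)
feed = latest_tweets(conn)  # newest row first
```

The dates are stored as ISO 8601 strings, which sort correctly as plain text, so no date parsing is needed for the ordering.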

shrimpx about 3 years ago

I don't understand the concept of open-sourcing "the algorithm".

First of all, "the algorithm" is probably hundreds of thousands of lines of code, including all the tedious boilerplate like cache policies and multi-AZ logic.

And second of all, doesn't the algorithm include machine learning components, which are trained on terabytes of data? That data will likely be impossible to open source. And open sourcing the neural nets without the training data is mostly meaningless from a transparency perspective?

mrintellectual about 3 years ago

This seems to be a practical joke by a Twitter engineer as opposed to an actual release.

axg11 about 3 years ago

I've worked on very large scale recommendation systems at a FAANG. If Twitter's system is anything like ours, the concept of publishing or open sourcing "the algorithm" doesn't make sense.

Even if we were to open source all associated code and publish all related documents, it would be very difficult to make sense of the entire system. That is precisely why companies such as Twitter A/B test the hell out of everything. What most people think of as "the algorithm" is a complex system that receives many inputs (maybe hundreds) and has dependencies on many other internal Twitter services. Tweets likely pass through multiple filtering steps as well as scoring before you ever see them. Each of these steps is highly contextual, depending on location, past tweets, verification status, etc. You can attempt to predict the effect of a certain change, but you never know the actual outcome until you test it.

I think what will ultimately happen is that _some_ details will be published. Elon will parade that around as a victory for free speech as Twitter is now more "open". In reality, nothing of value will be gained as "the algorithm" isn't a simple function.
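
To make the "filtering steps, then contextual scoring" shape concrete, here is a toy sketch in Python. Every name, filter, and weight below is a hypothetical stand-in chosen for illustration; a real system would have far more stages and inputs, and nothing here is taken from Twitter's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tweet:
    author: str
    text: str
    likes: int
    author_verified: bool


@dataclass(frozen=True)
class Viewer:
    blocked: frozenset
    follows: frozenset


def not_blocked(tweet, viewer):
    # Filtering step: drop authors this viewer has blocked.
    return tweet.author not in viewer.blocked


def not_empty(tweet, viewer):
    # Filtering step: drop tweets with no visible text.
    return bool(tweet.text.strip())


def score(tweet, viewer):
    # Contextual scoring: the same tweet scores differently per viewer.
    # The weights are arbitrary assumptions.
    s = float(tweet.likes)
    if tweet.author in viewer.follows:
        s *= 2.0
    if tweet.author_verified:
        s += 5.0
    return s


def rank(candidates, viewer, filters=(not_blocked, not_empty)):
    # Apply every filter, then order the survivors by descending score.
    kept = [t for t in candidates if all(f(t, viewer) for f in filters)]
    return sorted(kept, key=lambda t: score(t, viewer), reverse=True)


tweets = [
    Tweet("alice", "hello", likes=10, author_verified=False),
    Tweet("bob", "news", likes=12, author_verified=True),
    Tweet("spammer", "buy now", likes=1000, author_verified=False),
    Tweet("carol", "   ", likes=50, author_verified=False),
]
fan = Viewer(blocked=frozenset({"spammer"}), follows=frozenset({"alice"}))
stranger = Viewer(blocked=frozenset(), follows=frozenset())
```

Even in this toy version, `rank(tweets, fan)` and `rank(tweets, stranger)` return different orderings of the same tweets, which is the point about context: reading the code alone does not tell you what any given user will see.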

transitivebs about 3 years ago

Is this supposed to be a joke? It's clearly an empty repo.

Either this is a mistake, or this is a really, really misguided attempt at a joke from Twitter.

mrkramer about 3 years ago

Imagine having something like this for Google's and YouTube's algorithms; the $100bn+ SEO industry would go bankrupt, or at least pivot to some sort of advising, but there wouldn't be the mayhem that we have today.

CincinnatiMan about 3 years ago

Makes me wonder how Twitter employees are handling the news internally. Are they celebrating or commiserating?

standyro about 3 years ago

I tried to make a pull request already, haha.

error forking repo: HTTP 403: The repository exists, but it contains no Git content. Empty repositories cannot be forked. (https://api.github.com/repos/twitter/the-algorithm/forks)

My thoughts:

- Explicit rules for temporary and permanent bans
- Edit button
- More fun and thoughtful conversations like HN
- Less thought bubble Brooklyn based reporters, less VC and side grind hustle snake oil, maybe more comedians and memes?

edouard-harris about 3 years ago

Assuming Twitter is serious about publishing their feed algorithm [1], it's possible they're merely anticipating the EU's upcoming Digital Services Act, which was finalized over the weekend. Among other things, the Act will compel large platforms to "make the working of their recommender algorithms (used for sorting content on the News Feed or suggesting TV shows on Netflix) transparent to users." [2]

Twitter's EU user base is probably [3] above the 45 million threshold that triggers the strictest transparency requirements under the Act. So perhaps they figure if they're going to be forced to disclose anyway, they might as well do it proactively.

[1] If it's even coherent to talk about their feed ranking system as a single algorithm; see the other comments in this thread.

[2] https://www.theverge.com/2022/4/23/23036976/eu-digital-services-act-finalized-algorithms-targeted-advertising

[3] https://www.statista.com/statistics/242606/number-of-active-twitter-users-in-selected-countries/

nighthawk454 about 3 years ago

Seems weird to start as a non-private repo until there's some content. Also a bit of an unusual name. Can't tell if this is internal trolling or the future.

nickysielicki about 3 years ago

Surely you guys don’t think that Twitter’s sorting algorithm is already factored out into its own repo. Of course it’s empty.

That doesn’t mean it’s a joke. I see it as a show of goodwill: that there are a handful of people inside Twitter who are excited for transparency and for a revenue model that isn’t entirely based on ads, and who are excited to get to work on this right away.

pddpro about 3 years ago

Does it remind anyone of Po and the Dragon scroll from Kung Fu Panda?

rickreynoldssf about 3 years ago

Wait until Musk finds out it's a bunch of gnarly PHP 5.4 code, much of which is a black box everyone is afraid to touch.

paxys about 3 years ago

I'm going to guess some engineers at Twitter with GitHub org permissions are having fun with the "release the algorithm" discussion.

Barrin92 about 3 years ago

Whatever shows up in this repo, I hope people realize that depending on what data you put into some algorithm you can get whatever output you want, and Twitter is never going to (and neither can nor should) publish everyone's personal information and interactions on the site.

So I'm not sure what the ultimate point of this exercise is other than producing faux-transparency.

xena about 3 years ago

I can't believe they missed the chance to make it a rick roll. Such a wasted opportunity.

NaturalPhallacy about 3 years ago

Not "the algorithm", but you can check if Twitter is silently suppressing your account here: https://taishin-miyamoto.com/ShadowBan/

unethical_ban about 3 years ago

There are elements of their algo that I think should be openly defined, and perhaps there should be some regulatory branch that reports to Congress and has full access. However, obfuscation is often necessary for countering bad actors.

newbamboo about 3 years ago

Governments at the federal, state and local levels all rely on Twitter to conduct official taxpayer-funded work. Taxpayer-funded work should not happen on proprietary systems that operate with zero oversight or public transparency.

Elon polled Twitter users about this and the response was overwhelmingly in favor of open source and transparency. Everyone on Twitter got a vote.

If you oppose transparency, as many now are, you lose your credibility. So it’s another one of Elon’s people hacks, and look at all the morons falling for it.

EMIRELADERO about 3 years ago

Kind of unrealistic, but I hope Twitter now open-sources not only the algorithm but also the Rails monolith itself. Would be kind of interesting to see how everything is done.

qgin about 3 years ago

I have literally no idea how a "twitter algorithm" could be published on github. Maybe I've been doing recommender systems wrong.

bpodgursky about 3 years ago

I'm very technical and I think it would still be valuable to have a list of all the things that weigh into the timeline view, even without the models or underlying data.

Like, there's no public admission right now of whether "shadow banning" or "ghost banning" is even officially a thing!

Some transparency seems unquestionably more powerful than none, and we can work from there.

yabones about 3 years ago

There is something vaguely threatening about this.

rvz about 3 years ago

Perhaps Twitter will be the new Mozilla if it decides to open-source 'everything' then.

Maybe that is where it is going.

holtkam2 about 3 years ago

I don't get it ¯\_(ツ)_/¯

Traster about 3 years ago

At the time of posting, Will Norris (the open source lead at Twitter, and presumably admin of their GitHub account) has posted this. The tweet has 44 retweets, 193 likes and 17 quote tweets; on GitHub the repo has 1.6k stars.

That seems... bizarre to me?

sakopov about 3 years ago

I agree that there is no such thing as "the algorithm." It is Twitter in its entirety. And with that I have a wild question. Can Musk make Twitter fully open-source on GitHub?

g105b about 3 years ago

Can someone explain this to me? All I can see from this link is an empty GitHub repository. Not sure what I'm missing here.

threeseed about 3 years ago

Anyone who actually uses Twitter already knows the algorithm:

* Chronological - reverse sort by date
* Home - for all of the followed topics, recommended topics, retweets and tweets in the past day, determine the estimated level of engagement, include the highest and reverse sort by date. This is likely to be a fairly basic ML model.

It will be uncontroversial, technically unsophisticated and of no practical use to anyone - users, developers or researchers.

This is not going to be PageRank, where some genuine new insight was discovered.
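
The two timelines described in this comment can be sketched in a few lines of Python. This is only an illustration of the commenter's guess, not Twitter's implementation: the dict-based tweet shape, the `top_n` cutoff, and the `toy_engagement` formula standing in for the "fairly basic ML model" are all assumptions.

```python
from datetime import datetime, timedelta


def chronological(tweets):
    # "Chronological": reverse sort by date.
    return sorted(tweets, key=lambda t: t["date"], reverse=True)


def home(tweets, now, estimate_engagement, top_n=2):
    # "Home": take the past day's tweets, keep those with the highest
    # estimated engagement, then reverse sort the survivors by date.
    recent = [t for t in tweets if now - t["date"] <= timedelta(days=1)]
    best = sorted(recent, key=estimate_engagement, reverse=True)[:top_n]
    return chronological(best)


def toy_engagement(tweet):
    # Hypothetical stand-in for a learned engagement model.
    return tweet["likes"] + 2 * tweet["retweets"]


now = datetime(2022, 4, 26, 12, 0)
tweets = [
    {"id": "a", "date": datetime(2022, 4, 26, 11, 0), "likes": 1, "retweets": 0},
    {"id": "b", "date": datetime(2022, 4, 26, 9, 0), "likes": 50, "retweets": 10},
    {"id": "c", "date": datetime(2022, 4, 25, 20, 0), "likes": 30, "retweets": 30},
    {"id": "d", "date": datetime(2022, 4, 24, 8, 0), "likes": 999, "retweets": 0},
]
chrono_ids = [t["id"] for t in chronological(tweets)]
home_ids = [t["id"] for t in home(tweets, now, toy_engagement)]
```

Tweet "d" has the most likes but is older than a day, so it shows up only in the chronological view. Everything interesting lives inside the engagement estimate, which is the part a code release alone would explain least.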

Synaesthesia about 3 years ago

So nobody is being shadowbanned or suppressed?

hazb about 3 years ago

"The algorithm" could mean a lot of things. Whatever it means, it probably spans hundreds or even thousands of services. That doesn't mean it cannot be made open-source.I imagine they'd probably start with documentation and white-papers that communicate "here's how we intend for it to work".It's seriously unlikely anyone in Twitter knows actually works how any non-trivial algorithm in the company works. To figure THAT out, they could decide to do a company-wide documentation and instrumentation push like they probably would've had to do for GDPR anyway, which is painful and boring and going to take a very long time.Failing that, they could just say 'the algorithm as it stands is no longer fit for purpose, given part of its core requirement has become that it needs to be transparent and publishable, and presumably legible. We need to make a new one. Publish the core algorithm. We probably won't deploy it in that exact state, it's going to span multi-services and so on, you obviously don't get the data we used to train the models, but we will work backwards from it and here's an open mechanism to measure how true-to-form it actually is'

u1tron about 3 years ago

It's already gone.

tmaly about 3 years ago

I could see GPT-3 being added in the empty space.

minroot about 3 years ago

Why do we want to know the "algorithm"?

zelon88 about 3 years ago

if ($has_blue_checkmark) show_post_to($everyone);

qudat about 3 years ago

I’ve spent the better part of a decade writing open source projects for few to see. An empty repo gets hundreds of stars immediately. It’s all a popularity contest.

drnonsense42 about 3 years ago

Apples are red. The sky is blue. Twitter shadowbans and tinkers with who sees who. I wonder what the old guard will do with the codebase over the next few months.

LugarOS about 3 years ago

It's empty.

a-dub about 3 years ago

it's probably just a ripoff of pagerank with a separate spam filtering and banning system along with an army of contractors manually fixing it up.

if twitter is a game, sinking $43bn into it is kinda like winning or losing the grand final boss level. (unclear which)

wish elon would get back to facilitating the building of useful things. we still don't have a great clean energy generation story.

TrapLord_Rhodo about 3 years ago

Musk's first order of business?

oxplot about 3 years ago

Musk has repeatedly talked about "open sourcing" Twitter's algorithm. Given that Musk is (understandably) super impatient, this repo may be his first move. I expect this to start with a bunch of README files and other high-level docs, then evolve into details and eventually code.

asd88 about 3 years ago

#drama?

4e530344963049 about 3 years ago

Nice, making it much easier to game!

arthurcolle about 3 years ago

is this performance art?

ArtWomb about 3 years ago

It was all in your head ;)