How Not To Sort By Average Rating

195 points by marcus over 16 years ago

20 comments

ntoshev over 16 years ago

Why take the lower bound? It means you systematically underestimate the items with fewer ratings. Also, this formula assumes a normal distribution.

There is another solution called 'True Bayesian Average' that is used on IMDB.com, for example. For the formula and an explanation of how it works, see here: http://answers.google.com/answers/threadview/id/507508.html
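
For readers who don't want to follow the link, the weighted-rating formula commonly quoted as the "true Bayesian estimate" can be sketched roughly as below. This is only an illustration, not IMDb's actual code: the function name `bayesian_average` and the example values for `global_mean` and `min_votes` are assumptions made up for the sketch.

```python
def bayesian_average(item_mean, num_votes, global_mean, min_votes):
    """Weighted rating WR = (v / (v + m)) * R + (m / (v + m)) * C.

    R = item_mean, v = num_votes, C = global_mean (mean across all items),
    m = min_votes (how many votes the prior is "worth"; a tunable assumption).
    """
    v, m = num_votes, min_votes
    return (v / (v + m)) * item_mean + (m / (v + m)) * global_mean


# Hypothetical numbers: site-wide mean of 3.5, prior weight of 25 votes.
few = bayesian_average(item_mean=5.0, num_votes=2, global_mean=3.5, min_votes=25)
many = bayesian_average(item_mean=4.6, num_votes=400, global_mean=3.5, min_votes=25)
print(round(few, 2), round(many, 2))
```

With few votes the score stays close to the global mean; with many votes it approaches the item's own mean.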

DavidSJ over 16 years ago

Actually, I think Amazon has it right, for several reasons:

1. The user can actually understand and predict the behavior.
2. The "problem" the OP identifies is partially self-correcting, because items with a few positive ratings get more attention as a result of their high ranking, and if they deserve more poor ratings, they'll get them.
3. As long as you tell users how many ratings there are, they can use their own judgment as to how important that is.

jim-greer over 16 years ago

On Kongregate we do the same thing that Newgrounds does: games don't display an average rating or appear in the rankings until they have a minimum number of ratings. For us that's 75 ratings.

Of course we only get a hundred or so new games a week, so it's not hard to get that many ratings. It's much harder for a site with lots of stuff, especially if they have lots of stuff from day one.
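
A minimum-ratings cutoff like this is easy to sketch. The code below is a hypothetical illustration, not Kongregate's implementation: only the threshold of 75 comes from the comment, while the data layout and the `ranked` helper are assumptions.

```python
MIN_RATINGS = 75  # the threshold quoted in the comment


def ranked(items, min_ratings=MIN_RATINGS):
    """Return only items with enough ratings, highest average first."""
    eligible = [it for it in items if len(it["ratings"]) >= min_ratings]
    return sorted(
        eligible,
        key=lambda it: sum(it["ratings"]) / len(it["ratings"]),
        reverse=True,
    )


# Made-up example data.
games = [
    {"name": "new_game", "ratings": [5, 5]},                 # too few ratings
    {"name": "solid_game", "ratings": [4] * 80},             # mean 4.0
    {"name": "great_game", "ratings": [5] * 60 + [4] * 40},  # mean 4.6
]
names = [g["name"] for g in ranked(games)]
print(names)
```

The game with two perfect ratings simply does not appear until it has collected enough votes.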

spolsky over 16 years ago

Speaking of Amazon -- I'd rather buy a book with lots of 1s and 5s than a book with straight 3s. Wouldn't you?

sachinag over 16 years ago

OK, I came to this late, but there are other solutions.

On Dawdle, we spent a ton of time thinking about this. For their Marketplace sellers, eBay does number one; Amazon does number two. What we do is not three, but something that I think is better: we just rank our users.

Look, you can't ask buyers on the site to rank any particular transaction against all others; that's crazy talk, especially as you get into large numbers and people forget about all their experiences ever on a site. But we guide all our buyers to leave feedback of 3 - not 5 or a positive. The hope is that you'll have a semi-normal distribution of all feedback. Then, and only then, do we do all sorts of stuff to that. We have bonus points for good behaviors, some of which we talk about (shipping quickly, using Delivery Confirmation, linking your Dawdle account to Facebook, MySpace, XBL, PSN, etc.) and some we don't. Then we just throw the ranks on a five-point scale.

We call this our Seller Rating, not a feedback rating. Feedback is just the beginning of the process. KFC starts with the chicken - necessary but not sufficient - then adds their 13 herbs and spices. That's what we do, and it's why we don't even allow users to see the individual feedbacks. They're designed to be useless individually, and I don't want some new eBay expat bitching about not having 100% feedback.

Does this mean that some people are going to end up with 1s on a 5-point scale? You betcha. And we don't want them anyway - they're more hassle than they're worth. They can go to eBay or Amazon.

gojomo over 16 years ago

Yelling "WRONG" at two popular options isn't much of an argument. I suspect that Amazon has a good profit-maximizing reason for their ordering.

raganwald over 16 years ago

Could it be that the business requirement for the ranking system is actually that users be given the illusion of participating in a social site, because ultimately the rankings have very little impact on overall sales?

And meanwhile, the "Users who bought X also bought Y" feature has a very strong impact on sales, so the engineers spend all of their time tweaking its algorithm?

acangiano over 16 years ago

The first example is wrong. 60 - 40 = 20 which is greater than 100 - 100 = 0.

dilap over 16 years ago

Silly that he messes up the first example, but I've wanted something like this third solution when sorting by rating on Amazon -- when a product has only a couple of positive reviews, it's almost the same as being unreviewed. (On the other hand, just a couple of negative reviews can often be quite helpful, since they can list concrete problems making the product bad -- a positive review has the much harder case to make of "there will be nothing bad about this product".)

joshu over 16 years ago

Yeah, this always drives me nuts.

Assuming that someone's rating is only an estimator of their true rating, and then clipped to an integer - the more ratings there are, the lower the maximum.

You never see something with hundreds of ratings actually score a 5.0, even if most people love it. And the fact that there ARE hundreds of ratings of that thing, and not some comparison thing, is also important.

aneesh over 16 years ago

This is a decent way to estimate the average score. But sometimes you're not really trying to estimate the average score. You're really trying to estimate the score that the user who's currently viewing the page would give it, especially for someone like Amazon (i.e., People Like You Rated this Product as 1-star).

His estimator might be a decent one at picking the average score, but in Amazon's case, it's not a great estimator of the rating I would give the product I am viewing. If you have such an extensive record of my past purchases, use it to predict how I would like certain products! Surely the 5-star ratings of SICP are more applicable to me than the 1-star ratings.

tptacek over 16 years ago

This is, I think, the bestest funniest post to hit Hacker News in a while.

timcederman over 16 years ago

I wrote a little bit about a (mathematically naive) solution to the averages problem on my blog (props to Eric Liu for his feedback) -- http://www.cederman.com/?p=116

kin over 16 years ago

I have yet to encounter a single website with a rating system this sophisticated.

It is particularly hard to find quality content on websites with a large database (e.g., YouTube). Newer videos will always have a higher 'rating' because they are 'newer'. Older videos will always have the most 'views' because they have had more time to accumulate them. If you can't reach a consensus about what the optimal rating system is, implement a nested rating system. This way, the end user can choose to sort by 'rating', then by 'date', and then by 'views', or in whatever order he/she finds best.
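
A nested sort of this kind can be sketched with a tuple sort key. Everything here is hypothetical: the field names (`rating`, `uploaded`, `views`), the sample data, and the `nested_sort` helper are assumptions for illustration, not any site's real schema.

```python
from datetime import date

# Made-up example data.
videos = [
    {"title": "a", "rating": 4.5, "uploaded": date(2008, 5, 1), "views": 1000},
    {"title": "b", "rating": 4.5, "uploaded": date(2009, 1, 10), "views": 200},
    {"title": "c", "rating": 4.8, "uploaded": date(2007, 3, 2), "views": 50000},
]


def nested_sort(items, keys):
    """Sort descending by the first key, breaking ties with each later key."""
    return sorted(items, key=lambda it: tuple(it[k] for k in keys), reverse=True)


# The user picks the key order; ties on rating fall through to upload date.
by_rating = [v["title"] for v in nested_sort(videos, ["rating", "uploaded", "views"])]
by_views = [v["title"] for v in nested_sort(videos, ["views", "rating", "uploaded"])]
print(by_rating, by_views)
```

Letting the user supply the key order pushes the "which ranking is best" decision to the person doing the browsing.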

jackowayed over 16 years ago

How Not To Sort By Average Rating

Or, "How to Make Your Host Very Happy Because You Suddenly Have to Move Up To a Server With Double the Processing Cycles"

biohacker42 over 16 years ago

Every site's crappy sorting has always bothered me; it reminds me of AltaVista (before Google, kids).

I have not checked the math on this, but damn near anything should be better than what most sites are doing today.

In fact, a better way to sort user ratings is a good idea for a startup.

lacker over 16 years ago

The simple formula is to assume a Dirichlet prior. This gives you

    score = positives / (negatives + x)

and you can fiddle with x to get something that looks reasonable.
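
To get a feel for what fiddling with x does, here is a small sketch of the formula exactly as written in the comment. The function name `score`, the default `x=5.0`, and the vote counts are assumptions for illustration. Note that as written the score is a ranking value, not a proportion, so it is not bounded to [0, 1].

```python
def score(positives, negatives, x=5.0):
    """Score as written in the comment: positives / (negatives + x)."""
    return positives / (negatives + x)


# Compare a barely-rated item (2 up, 0 down) with a well-rated one (90 up, 10 down)
# for a few hypothetical values of x.
for x in (1.0, 5.0, 20.0):
    print(x, round(score(2, 0, x), 2), round(score(90, 10, x), 2))
```

Larger values of x make it harder for an item with only a couple of positive votes to outrank one with a long track record.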

diN0bot over 16 years ago

Pedantic: the Urban Dictionary example is voting, not rating. That's what leads to the confusion of adding up all the positives and negatives.

Amazon is rating. It's a scale, and there is a clear mean, median and standard deviation.

Whether solution #2 is wrong kind of depends. Some of this is a social problem of which items get the kind of followers who are likely to rate online. In the extreme case you could imagine a product where (not) liking it was caused by an inability to use HTML forms.

swombat over 16 years ago

Excellent. Can we implement that on comments here? How hard would it be?

time_management over 16 years ago

An easier solution is to give each item a number of implied mediocre ratings (say, ten 3s) to start.

1 rating, 5.0 = 35/11 = 3.18; 90 ratings, 4.5 = 435/100 = 4.35

The downside is that, since items with few ratings get mediocre scores, if there isn't a way for them to be visible, they won't get out of that rut. So a better approach might be to feature (on a "top 10" list) seven high scorers and three "rising stars" selected purely on average score.
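
The arithmetic above can be checked with a short sketch. The function name `padded_average` and its parameter names are made up for illustration; the defaults (ten implied ratings of 3) are the ones suggested in the comment.

```python
def padded_average(rating_sum, rating_count, prior_count=10, prior_value=3.0):
    """Average after adding `prior_count` implied ratings of `prior_value`."""
    return (rating_sum + prior_count * prior_value) / (rating_count + prior_count)


one = padded_average(rating_sum=5.0, rating_count=1)            # 35 / 11
ninety = padded_average(rating_sum=90 * 4.5, rating_count=90)   # 435 / 100
print(round(one, 2), round(ninety, 2))
```

Raising `prior_count` pulls new items harder toward the mediocre starting score, which is the rut described above.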