Normalizing Ratings

51 points by Symmetry, 10 days ago

18 comments

nlh, 9 days ago
Similarly - one of my biggest complaints about almost every rating system in production is how just absolutely lazy they are. And by that, I mean everyone seems to think "the object's collective rating is an average of all the individual ratings" is good enough. It's not.

Take any given Yelp / Google / Amazon page and you'll see some distribution like this:

User 1: "5 stars. Everything was great!"

User 2: "5 stars. I'd go here again!"

User 3: "1 star. The food was delicious but the waiter was so rude!!!one11!! They forgot it was my cousin's sister's mother's birthday and they didn't kiss my hand when I sat down!! I love the food here but they need to fire that one waiter!!"

Yelp: 3.6 stars average rating.

One thing I always liked about FourSquare was that they did NOT use this lazy method. Their score was actually intelligent - it checked things like how often someone would return, how much time they spent there, etc. and weighted a review accordingly.
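To make the contrast between a plain average and a weighted one concrete, here is a minimal sketch. The weights are purely hypothetical (think of them as a count of return visits); nothing here reflects how Foursquare or any other site actually computes its score.

```python
def plain_mean(reviews):
    """Average the star values, ignoring the weights entirely."""
    return sum(stars for stars, _ in reviews) / len(reviews)


def weighted_mean(reviews):
    """Average the star values, counting each review in proportion to its weight."""
    total_weight = sum(weight for _, weight in reviews)
    return sum(stars * weight for stars, weight in reviews) / total_weight


# Each review is (stars, weight). The weight is an assumed engagement signal,
# e.g. how many times the reviewer came back; it is illustrative only.
reviews = [(5, 4), (5, 3), (1, 1)]

unweighted = plain_mean(reviews)
weighted = weighted_mean(reviews)
```

With these made-up weights the one-off angry review pulls the weighted score down far less than it pulls the plain average.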
tibbar, 9 days ago
One of my favorite algorithms for this is Expectation Maximization [0].

You would start by estimating each driver's rating as the average of their ratings - and then estimate the bias of each rider by comparing the average rating they give to the estimated score of their drivers. Then you repeat the process iteratively until you see both scores (driver rating and rider bias) converge.

[0] https://en.wikipedia.org/wiki/Expectation%E2%80%93maximization_algorithm
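The loop described in this comment can be sketched roughly as follows. This is an alternating-estimation sketch in the spirit of the comment, not a full probabilistic EM implementation. The function name, the additive model (stars ≈ driver score + rider bias), and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def fit_scores_and_biases(ratings, max_iterations=100, tolerance=1e-9):
    """Alternate between estimating driver scores and rider biases.

    `ratings` is a non-empty list of (rider, driver, stars) tuples.
    Assumes an additive model: stars ~ driver score + rider bias.
    """
    riders = {rider for rider, _, _ in ratings}
    drivers = {driver for _, driver, _ in ratings}
    bias = {rider: 0.0 for rider in riders}
    score = {driver: 0.0 for driver in drivers}

    for _ in range(max_iterations):
        # Driver score: mean of that driver's ratings after removing rider bias.
        # With all biases at zero, the first pass is just the plain average.
        totals, counts = defaultdict(float), defaultdict(int)
        for rider, driver, stars in ratings:
            totals[driver] += stars - bias[rider]
            counts[driver] += 1
        new_score = {d: totals[d] / counts[d] for d in drivers}

        # Rider bias: mean gap between the rider's ratings and the driver scores.
        totals, counts = defaultdict(float), defaultdict(int)
        for rider, driver, stars in ratings:
            totals[rider] += stars - new_score[driver]
            counts[rider] += 1
        new_bias = {r: totals[r] / counts[r] for r in riders}

        # Stop once neither set of estimates moves noticeably.
        change = max(
            max(abs(new_score[d] - score[d]) for d in drivers),
            max(abs(new_bias[r] - bias[r]) for r in riders),
        )
        score, bias = new_score, new_bias
        if change < tolerance:
            break
    return score, bias


# Toy data (hypothetical): one generous rider, one harsh rider, three drivers.
ratings = [
    ("generous", "A", 5.0),
    ("generous", "B", 4.0),
    ("harsh", "B", 2.0),
    ("harsh", "C", 3.5),
]
score, bias = fit_scores_and_biases(ratings)
```

On this toy data the plain averages put driver A (5.0) ahead of driver C (3.5); once rider bias is estimated, C comes out ahead of A. Note that scores and biases are only determined up to a shared constant offset, so the ordering and the differences are what matter.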
stevage, 9 days ago
I like rating systems from -2 to +2 for this reason.

The big rating problem I have is with sites like boardgamegeek where ratings are treated by different people as either an objective rating of how good the game is within its category, or subjectively how much they like (or approve of) the game. They're two very different things and it makes the ratings much less useful than they could be.

They also suffer a similar problem in that most games score 7 out of 10. 8 is exceptional, 6 is bad, and 5 is disastrous.
homeonthemtn, 9 days ago
I'd rather we just used a three-point rating: 1. Bad, 2. Fine, 3. Great.

2 and 4 are irrelevant and/or a wild guess or user defined/specific.

Most of the time our rating systems devolve into roughly this state anyways.

E.g.

5 is excellent, 4.x is fine, <4 is problematic.

And then there's a sub domain of the area between 4 and 5 where a 4.1 is questionable, 4.5 is fine and 4.7+ is excellent.

In the end, it's just 3 parts nested within 3 parts nested within 3 parts nested within....

Let's just do 3 stars (no decimal) and call it a day.
Retr0id, 9 days ago
> I'm genuinely mystified why its not applied anywhere I can see.

I wonder if companies are afraid of being accused of "cooking the books", especially in contexts where the individual ratings are visible.

If I saw a product with 3x 5-star reviews and 1x 3-star review, I'd be suspicious if the overall rating was still a perfect 5 stars.
mzmzmzm, 9 days ago
A problem with accounting for "above average" service is sometimes I don't want it. If a driver goes above and beyond, offering a water bottle or something else exceptional, occasionally I would rather be left alone during a quiet, impersonal ride.
parrit, 9 days ago
For Uber you don't need a rating at all. The tracking system knows if they were late, if they took a good route and if they dropped you off at the wrong location.

Anything really bad can be dealt with via a complaint system.

Anything exceptional could be asked by a free text field when giving a tip.

Who is going to read all those text fields and classify them? AI!
pbronez, 9 days ago
One formal measure of this is Inter-Rater Reliability.

https://en.wikipedia.org/wiki/Inter-rater_reliability
rossdavidh, 9 days ago
I have often had the same thought, and I have to believe the reason is that the companies' bottom line is not impacted the tiniest bit by their rating systems. It wouldn't be that hard to do better, but anything that takes a non-zero amount of attention and effort to improve has to compete with all of those other priorities. As far as I can tell, they just don't care at all about how useful their rating system is.

Alternatively, there might be some hidden reason why a broken rating system is better than a good one, but if so I don't know it.
adrmtu, 9 days ago
Isn't this basically a de-biasing problem? Treat each rider's ratings as a random variable with its own mean μᵤ and variance σᵤ², then normalize. Basically compute z = (r - μᵤ)/σᵤ, then remap z back onto a 1–5 scale so "normal" always centers around ~3. You could also add a time decay to weight recent rides higher to adapt when someone's rating habits drift.

Has anyone seen a live system (Uber, Goodreads, etc.) implement per-user z-score normalization?
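A minimal sketch of the per-rider z-score idea, under stated assumptions. The comment does not say how z should be mapped back to stars, so the linear remap `center + spread * z` clipped to [1, 5] is a guess, as are the function and parameter names. Time decay is left out.

```python
from statistics import mean, pstdev


def normalize_ratings(ratings_by_rider, center=3.0, spread=1.0):
    """Rescale each rider's ratings by that rider's own mean and std deviation.

    `ratings_by_rider` maps a rider id to a non-empty list of star ratings.
    `center` and `spread` define the (assumed) remap back onto the 1-5 scale.
    """
    normalized = {}
    for rider, ratings in ratings_by_rider.items():
        mu = mean(ratings)
        sigma = pstdev(ratings)  # population std dev of this rider's ratings
        rescaled = []
        for r in ratings:
            # A rider who always gives the same score carries no signal: z = 0.
            z = (r - mu) / sigma if sigma > 0 else 0.0
            rescaled.append(min(5.0, max(1.0, center + spread * z)))
        normalized[rider] = rescaled
    return normalized


# Hypothetical riders with very different rating habits.
normalized = normalize_ratings({
    "generous": [5, 5, 5, 4],
    "harsh": [1, 2, 3],
    "constant": [5, 5, 5],
})
```

After normalization a generous rider's rare 4 lands well below 3, a harsh rider's 3 lands well above it, and a rider who gives 5 to everyone contributes a neutral 3 each time.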
parrit, 9 days ago
https://xkcd.com/1098/

https://xkcd.com/937/
nmstoker, 9 days ago
Does anyone else get that survey rating effect where you start off thinking the company is reasonable, you give a 4 or 5, then the next page asks why you chose this, and as you think it through you realise more and more shitty things they did, so you go back to bring them down to a 2 or 3? Effectively, by asking in detail they undermine your perception of them.
enaaem, 9 days ago
Check the bad reviews. If the 1-2 star reviews are mostly about the rude owner, then you know the food is good.
lordnacho, 9 days ago
Has anyone done a forced ranking rating?

"Here's your last 5 drivers, please rank them"
xnx, 9 days ago
I don't understand why letter grades aren't more popular for rating things in the US.

"A+" "B" "C-" "F", etc. feel a lot more intuitive than how stars are used.
JSR_FDED, 9 days ago
A++++ article!
jonstewart, 9 days ago
I give five stars always because I’m not a rat.
User23, 9 days ago
Same for peer reviews. Giving anything less than a four is saying fire this person. And even too many fours is PIP territory.