TechEcho

Show HN: A client-side Bayes classifier for Hacker News

119 points by rogerbraun over 13 years ago

5 comments

moconnor about 13 years ago
I tried to train Bayesian (and other) classifiers to reliably pick the same stories to read as I would. Despite looking at a variety of things - title, poster, domain, corpus from the article, corpus of the comments - I found their accuracy was never really better than 60%.

Then I tried rating the same set of articles myself several times. My accuracy was only around 60% too.

Figures.
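For readers who want to try the title-only variant of this, here is a minimal sketch of a multinomial naive Bayes classifier over title tokens. It is not the submission's code; the class name `TitleNaiveBayes`, the `read`/`skip` labels, and the tokenizer are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(title):
    """Lowercase a title and split it into word-like tokens (hypothetical tokenizer)."""
    return re.findall(r"[a-z0-9.+#]+", title.lower())


class TitleNaiveBayes:
    """Illustrative multinomial naive Bayes with add-one smoothing."""

    LABELS = ("read", "skip")

    def __init__(self):
        self.token_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()

    def train(self, title, label):
        """Record one rated title under the given label."""
        self.doc_counts[label] += 1
        self.token_counts[label].update(tokenize(title))

    def log_score(self, title, label):
        """Return log P(label) + sum of log P(token | label), both smoothed."""
        vocab = set()
        for counts in self.token_counts.values():
            vocab.update(counts)
        total = sum(self.token_counts[label].values())
        n_docs = sum(self.doc_counts.values())
        score = math.log((self.doc_counts[label] + 1) / (n_docs + len(self.LABELS)))
        for tok in tokenize(title):
            # The extra +1 in the denominator reserves mass for unseen tokens.
            score += math.log(
                (self.token_counts[label][tok] + 1) / (total + len(vocab) + 1)
            )
        return score

    def predict(self, title):
        """Pick the label with the higher score (ties go to 'read')."""
        return max(self.LABELS, key=lambda label: self.log_score(title, label))
```

Training is a call to `train(title, "read")` or `train(title, "skip")` per rated story. If your own repeated ratings only agree with each other about 60% of the time, as described above, no classifier trained on those labels can be expected to look much better when scored against them.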
polyfractal over 13 years ago
Very cool! I've been hacking around with modifying HN's interface via JS a lot recently - this will be a welcome tool in my experiments.

One comment: The up/down votes are really "strong" visually. Perhaps make them smaller and/or lighter in color?
gauravk92 about 13 years ago
Maybe it's easier simply to classify things you wouldn't want to read and hide those as less interesting. Because of the variety of topics, training something to figure out what you like seems much more restricting on the flow.

E.g. if you rarely read things with ".js" (stupid amounts of JS library posts here), it'll be easier to say "this is uninteresting to me" than to classify everything else as interesting so that the algorithm has to infer that you find JS libraries uninteresting.

Although I'm pretty interested in Node, just not necessarily in JS libraries for APIs. Tough problem indeed.
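A rough sketch of that hide-only idea, as a simple frequency heuristic rather than a Bayes classifier: track how often each title feature shows up in stories marked uninteresting, and hide a story only when one of its features is almost always disliked. The names `HideFilter`, `min_seen`, `min_hide_ratio`, and the `*.js`-style suffix feature are all hypothetical choices for illustration.

```python
import re
from collections import Counter


def features(title):
    """Return title tokens plus a shared suffix feature such as '*.js'."""
    feats = set()
    for raw in re.findall(r"[a-z0-9.]+", title.lower()):
        tok = raw.strip(".")
        if not tok:
            continue
        feats.add(tok)
        if "." in tok:
            # "backbone.js" and "spine.js" both contribute to "*.js".
            feats.add("*." + tok.rsplit(".", 1)[1])
    return feats


class HideFilter:
    """Hide a story only when a feature is consistently marked uninteresting."""

    def __init__(self, min_seen=3, min_hide_ratio=0.8):
        self.min_seen = min_seen              # assumed minimum evidence per feature
        self.min_hide_ratio = min_hide_ratio  # assumed share of dislikes required
        self.seen = Counter()                 # titles containing each feature
        self.disliked = Counter()             # of those, titles marked uninteresting

    def observe(self, title, uninteresting):
        """Record one title; only 'uninteresting' needs an explicit signal."""
        for feat in features(title):
            self.seen[feat] += 1
            if uninteresting:
                self.disliked[feat] += 1

    def should_hide(self, title):
        """True if any feature has enough evidence and a high dislike ratio."""
        for feat in features(title):
            n = self.seen[feat]
            if n >= self.min_seen and self.disliked[feat] / n >= self.min_hide_ratio:
                return True
        return False
```

Everything not explicitly hidden stays visible, so the filter never has to learn what you like. The Node case shows the limit of this: one ".js" story you did want to read lowers the dislike ratio for the whole `*.js` feature, so the result depends heavily on where the threshold sits.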
Gring about 13 years ago
As an alternative, just trust the HN home page algorithm.

Stories seem to move up to a relevant max rank position, stay there and then move back down. Big stories stay in the top 5 for 20+ hours.

Here's what I do: If I only have time to look at 5 stories per day, I visit once per day at any point in time and look at the first 5 stories. If I have time to look at 20, I look at the first 20.

Set yourself a timeout, start reading at the top, stop when the time is up, repeat after 12 or 24 hours. Works very well for me, I get the best stories, and feel pretty well informed.
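The timeout routine can be sketched as a loop over the front page in rank order that stops starting new stories once the time budget is used up. The function name `read_from_top`, its parameters, and the injectable `clock` are assumptions for illustration; `read_story` stands in for whatever "reading" means to you.

```python
import time


def read_from_top(ranked_stories, budget_seconds, read_story, clock=time.monotonic):
    """Read stories in rank order until the time budget has been spent.

    A story that is already started is always finished; the deadline is only
    checked before starting the next one.
    """
    deadline = clock() + budget_seconds
    finished = []
    for story in ranked_stories:
        if clock() >= deadline:
            break
        read_story(story)
        finished.append(story)
    return finished
```

Passing a fake `clock` makes the behavior easy to check without waiting; with the default it measures real elapsed time.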
growt over 13 years ago
Nice work. I hope it gets more attention in the next hours. Seems like an interesting starting point for all kinds of experiments.