科技回声

1 comment

RSS readers need algorithmic feeds [1], but unfortunately everyone interested in RSS thinks algorithm = bad.

My YOShInOn reader downloads somewhere between 3,000 and 30,000 items in a cycle [2] and chooses the 300 top-scoring items out of 20 clusters. When I finish those, it runs another cycle. It has extra screens that show articles it thinks would get >10 votes or a comment/vote ratio > 0.5 on HN, as well as screens that show top-scoring articles from particular sites and feeds (arXiv, lobsters, ...).

Articles in the primary feed are shown to me one at a time, and I give each a thumbs up or a thumbs down. The ROC AUC for the classifier is about 0.78; I read that TikTok gets 0.84, so I'm pretty happy.

The problems with it:

(1) It depends on arangodb, whose license doesn't allow me to commercialize it, and I wouldn't feel OK with open sourcing it. Right now I'm writing a python-arango replacement that will get it and my image sorter running on postgres out of a single code base.

(2) The batch organization doesn't work well for certain topics like sports, where articles have a shelf life.

[1] It doesn't have to be "creepy blond girls want to follow you" or all outrage all the time; an algorithmic feed can apply any heuristic that you like.

[2] Depending on how fast I am reading; quality gets better when I am reading slowly. The system blends in a certain percentage of randomly chosen results to maintain calibration. I've been thinking about making it run at a target quality level, where it blends in more randoms if it thinks it is showing me too many good results.
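For anyone who wants to picture the batch step described above, here is a rough Python sketch. It is not YOShInOn's code. The even round-robin split across clusters, the 10% random share, and every name in it (`select_batch`, `random_fraction`, the `(item_id, cluster_id, score)` tuples) are assumptions made up for illustration; the comment only says 300 top-scoring items are chosen out of 20 clusters and that some percentage of random items is blended in.

```python
import random
from collections import defaultdict


def select_batch(items, batch_size=300, random_fraction=0.1, rng=None):
    """Pick one reading batch from scored, clustered items.

    items: list of (item_id, cluster_id, score) tuples (hypothetical layout).
    Returns a list of item_ids: top scorers taken round-robin across
    clusters, followed by a share of randomly chosen leftovers.
    """
    rng = rng or random.Random()
    n_random = int(batch_size * random_fraction)  # assumed blend share
    n_ranked = batch_size - n_random

    # Group items by cluster and sort each cluster best-first.
    by_cluster = defaultdict(list)
    for item_id, cluster_id, score in items:
        by_cluster[cluster_id].append((score, item_id))
    for members in by_cluster.values():
        members.sort(key=lambda m: m[0], reverse=True)

    # Round-robin: take each cluster's next-best item until the quota is met
    # or every cluster is exhausted.
    ranked = []
    depth = 0
    while len(ranked) < n_ranked:
        took_any = False
        for cluster_id in sorted(by_cluster):
            members = by_cluster[cluster_id]
            if depth < len(members) and len(ranked) < n_ranked:
                ranked.append(members[depth][1])
                took_any = True
        if not took_any:
            break
        depth += 1

    # Blend in random items that were not already picked, so the classifier
    # keeps seeing labels for items it would not have chosen itself.
    chosen = set(ranked)
    leftovers = [item_id for item_id, _, _ in items if item_id not in chosen]
    randoms = rng.sample(leftovers, min(n_random, len(leftovers)))
    return ranked + randoms


if __name__ == "__main__":
    demo_rng = random.Random(0)
    demo_items = [(i, i % 20, demo_rng.random()) for i in range(3000)]
    batch = select_batch(demo_items, rng=demo_rng)
    print(len(batch), "items selected")
```

The target-quality idea in [2] would amount to adjusting `random_fraction` between cycles based on the recent thumbs-up rate, which this sketch leaves out.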


RSS doesn't necessarily mean firehose

1 comment
