Ask HN: Addressing cold start problems in recommender systems?

5 points by apurva almost 15 years ago

Hi all, I have been working on a recommender engine for a while and have now run into the cold start problem. Whatever data I collect only indicates what the user likes (e.g., if browsing history is the source, the basic assumption is that people don't browse for things they don't like). In that case, how do I initially train the system for dislikes? I know the system will gradually tune itself to the user's preferences with continuous feedback, but I don't want the first run to be erratic because it chose random dislikes. Any ideas, folks? Any help is greatly appreciated.

3 comments

nostrademons almost 15 years ago

Many machine-learning systems get bootstrapped by their implementer sitting at a website clicking "Like" and "Dislike" buttons for a large, randomly chosen sample of possible data.

If this strikes you as incredibly boring, you can farm it out with Amazon Mechanical Turk or other crowdsourcing schemes. You could also do cleverer variants of this, like putting image-recognition or OCR training sets into CAPTCHAs, submitting possible links to Reddit or Digg, or hosting Internet surveys with the questions of interest.

what almost 15 years ago

I read somewhere about a recommender system for movies (I think) that forced each user to rate 5 movies as part of the registration process. The movies weren't entirely random; they were ones the designers thought were significant in identifying a user's tastes.

In your example of browsing patterns, maybe you could ask new users whether they do or do not like to read certain types of articles, e.g., are you interested in technology, sports, entertainment, random pictures of cats, etc. Then seed their profiles based on their expressed level of interest in those things (maybe including dislikes from people who claimed to have similar interests).

But I would think that dislikes are not so important in the beginning. Although I don't know how your algorithm works, if you have a rough idea of what a person likes, shouldn't you be able to recommend things that they might like just based on that? When you end up recommending something that they don't like, you'll get some dislike data and can start factoring that in.
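
To make the seeding idea concrete, here is a minimal sketch, not anyone's actual system. It assumes each article can be described by a small topic vector; the topic list, the item names, and the helpers `seed_profile` and `recommend` are all hypothetical. A new user's onboarding answers become a weight vector, and articles are ranked by cosine similarity to it, using likes only.

```python
from math import sqrt

# Hypothetical topic list, for illustration only.
TOPICS = ["technology", "sports", "entertainment", "cats"]

def seed_profile(answers):
    """Turn onboarding answers (topic -> interest in [0, 1]) into a weight vector."""
    return [float(answers.get(topic, 0.0)) for topic in TOPICS]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(profile, items, k=2):
    """Return the k item names whose topic vectors are most similar to the profile."""
    ranked = sorted(items, key=lambda name: cosine(profile, items[name]), reverse=True)
    return ranked[:k]

# Hypothetical catalogue: item name -> topic vector in the order of TOPICS.
ITEMS = {
    "gpu_review": [1.0, 0.0, 0.0, 0.0],
    "match_report": [0.0, 1.0, 0.0, 0.0],
    "cat_gadget": [0.3, 0.0, 0.0, 0.9],
    "film_news": [0.0, 0.0, 1.0, 0.0],
}

# A new user who says they like technology a lot and cats somewhat.
profile = seed_profile({"technology": 1.0, "cats": 0.5})
top = recommend(profile, ITEMS)
```

Nothing here needs dislike data: topics the user never mentioned simply score zero, and explicit dislikes can be folded in later once real feedback arrives.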

AmberShah almost 15 years ago

My startup faces a variant of this problem. It's not exactly a recommender system, but it will get more accurate as time goes on. I am compensating by starting with an initial value that is a guess, which then adjusts as time goes on. Sort of messy but necessary.

It sounds like you're saying that in your case the value will be different for each person, so you don't have a way of seeding it correctly for different people. I sort of think you have to have SOME info to go off of. Sort of like how Hunch asks you questions, maybe you could do something like that?
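
One common way to do the "initial guess that adjusts over time" is to treat the guess as a handful of imaginary observations and average them with the real ones. The sketch below is only an illustration under that assumption; the class name `SeededEstimate` and the `prior_weight` parameter are hypothetical and not taken from any particular product.

```python
class SeededEstimate:
    """Running average that starts at a guessed value and drifts toward real data."""

    def __init__(self, prior, prior_weight=5.0):
        self.prior = prior                # the initial guess
        self.prior_weight = prior_weight  # how many observations the guess is worth
        self.total = 0.0                  # sum of real observations so far
        self.count = 0                    # number of real observations so far

    def update(self, observation):
        """Record one real observation (for example 1.0 for a like, 0.0 for a dislike)."""
        self.total += observation
        self.count += 1

    @property
    def value(self):
        """Weighted blend of the guess and the observed mean."""
        return (self.prior * self.prior_weight + self.total) / (self.prior_weight + self.count)


estimate = SeededEstimate(prior=0.5, prior_weight=4.0)
before = estimate.value          # no data yet, so this is just the guess
for _ in range(4):
    estimate.update(1.0)         # four positive signals in a row
after = estimate.value           # halfway between the guess and the observed mean
```

A larger `prior_weight` keeps early recommendations stable for longer; a smaller one lets the first few pieces of feedback dominate.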
