"We could also take other data into account, like the user who submitted the article, and generate features indicating things like the karma of the user..."<p>I'm wondering how helpful could that be in prediction though? Would it actually help if I wish to predict how many upvotes my headline would get, and I add my karma as a feature? I think in fact such features would degrade generalisation performance, as they stand in like placeholders (when training), i.e. high karma users are correlated to higher probability for a "hit story".