"We can take data sets with millions and millions of data points and figure out what’s related to a given item in a few milliseconds. Most recommendations engines pre-compute stuff rather than generating the recommendations in real-time like we do"<p>I've been looking into recommendation algorithms recently (started with the excellent book "Programming Collective Intelligence"), and this sounds lightyears ahead of the way we currently do things. I suppose you could take one of the algorithms that requires pre-computing and throw resources at it, but it seems like they are talking about something new.<p>Since I'm just getting started, I'd like to find some academic (or blog-faux-academic) articles on whatever recent advances behind recommendations without the need for precomputing. Anyone know where to look?
"Directed Edge truly believes that we’re about to see a shift on the web away from search and towards recommendations."<p>The difference is somewhat arbitrary when you think about it. When I Google, I'm asking it to recommend me stuff related to what I'm looking for. Google is nothing but the world's best recommendation engine.<p>There are about 1,000 sites that could use good recommendation technology to enhance their profits though, so I like this company's monetization chances. Easy elevator pitch too: recommendations as a service.
Question:

Greg Linden, who worked on Amazon.com's recommendation engine, has referred to what he calls the "Harry Potter problem". To quote from his blog:

'...this calculation would seem to suffer from what we used to call the "Harry Potter problem", so-called because everyone who buys any book, even books like Applied Cryptography, probably also has bought Harry Potter. Not compensating for that issue almost certainly would reduce the effectiveness of the recommendations, especially since the recommendations from the two clustering methods likely also would have a tendency toward popular items.'

How did you compensate for this problem? Do you simply ignore vertices in the graph that have a high degree? Or are you using a non-linear weighting function, such as a perceptron's sigmoid function?

With regard to Wikipedia, almost everyone who has edited an article has also edited the article on Bill Clinton. So, if you are using edit-history metadata to compute recommendations, you would have to compensate for the "Bill Clinton problem".
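For anyone else wondering, one common compensation (just an illustration of the idea; I have no idea what Directed Edge actually does) is an IDF-style down-weighting, where each co-occurrence counts for less the more popular the co-occurring item is, so ubiquitous items like Harry Potter contribute roughly nothing:

    import math
    from collections import defaultdict

    def damped_cooccurrence(baskets):
        """Item-item scores where each co-occurrence is weighted by
        log(N / popularity): an item bought by everyone gets log(1) = 0,
        so it can no longer dominate every recommendation list."""
        popularity = defaultdict(int)
        for basket in baskets:
            for item in basket:
                popularity[item] += 1
        n = len(baskets)
        scores = defaultdict(float)
        for basket in baskets:
            for a in basket:
                for b in basket:
                    if a != b:
                        scores[(a, b)] += math.log(n / popularity[b])
        return scores

Dropping high-degree vertices outright, as you suggest, is the degenerate version of this where the weight simply goes to zero past a threshold.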
I have been following "recommendation algorithms as a service" for about two years now. This definitely seems interesting, but a side opportunity could be to build an aggregator/optimizer of recommendation algorithms for merchants/publishers, similar to what Rubicon Project/PubMatic does in aggregating/optimizing ad networks.

The aggregator would take the various recommendation services such as Directed Edge, Aggregate Knowledge, Loomia, Minekey, Persai (now dead), and many others, keep running tests (similar to the Netflix Prize), and give whichever performs better more airtime suggesting related products/pages for retailers/publishers.
The compensation model would be a percentage of revenue from the additional clicks/purchases driven by the suggestions.
"....we’ve gone from having a graph-store to having a proper graph database..."<p>A graph database and a "triple store" in semantic technologies are essentially the same thing. This company makes some very aggressive claims that allegrograph, Jena, Oracle (with Spatial), sesame and others (including the korean arm of my current company) have also made. Typically, such claims fail to live up to the marketing. I wonder how this solution compares to these traditional triple stores?