> This generally results in the cold start problem<p>If I'm an online store with 100 products, couldn't I just punch the products into Amazon on a fresh account, then copy the search results? Doing 100 products would take me maybe 20 minutes a day, but if you're saying there's a 4-6% lift, it seems worth it?<p>If it were 1,000 products, maybe I'd do this once a week for 200 minutes? And so on.<p>Here's what'll happen: your online store won't have most of the products on Amazon's recommended list. Isn't that the problem?<p>So no matter what, don't I eventually have to scale to Amazon size to get the value out of collaborative filtering?<p>Maybe no small business has a real supply chain like that. They are just front-running other stuff. But hey, that's their prerogative: to try to be Amazon without doing the stuff that actually makes Amazon successful.<p>> Netflix Prize<p>Netflix doesn't even use those methods anymore. And that competition was much more about how to do IT and ensemble methods than about any one particular approach, since that's how you get to #1.<p>The Netflix Prize is sort of the opposite narrative to what you're actually doing. If you want a reference that normal people recognize, just stick to talking about Amazon.<p>> Do you think our approach seems interesting, crazy, lazy or somewhere in the middle?<p>The premise, at least, doesn't add up.<p>Considering the data gathering, it seems easier to do user-product collaborative filtering.<p>Considering the math, it seems easier to do user-product collaborative filtering. You can bootstrap weights for, e.g., a non-negative matrix factorization collaborative filter from existing recommendations.<p>Is there going to be something important encoded in the image or metadata that you can relate to other things? It seems easiest to just use the keywords. 
Like, you don't need a picture of guacamole to know it goes with tortilla chips; it's in the keywords.<p>Then again, the whole point is to find serendipitous stuff from your existing user data. If you only offer 100 products, none of them will serendipitously end up in shopping carts together, because that's so few products. The catalog is already curated to such a degree that collaborative filtering will not find anything you don't already know.
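<p>To make the non-negative matrix factorization point above concrete, here is a minimal sketch, not a production recommender. It assumes you already have a list of "goes with" pairs from some existing source; the `existing_recs` data, the product names other than guacamole and tortilla chips, and the helper names (`matmul`, `nmf`, `frob_error`) are all made up for illustration. It factorizes a product-by-product 0/1 matrix with the standard multiplicative updates in plain Python.

```python
import random

# Hypothetical catalog and hypothetical "existing recommendations" (assumed data).
products = ["guacamole", "tortilla chips", "salsa", "batteries", "flashlight"]
existing_recs = {
    "guacamole": ["tortilla chips", "salsa"],
    "tortilla chips": ["guacamole", "salsa"],
    "salsa": ["tortilla chips"],
    "batteries": ["flashlight"],
    "flashlight": ["batteries"],
}

# Build the non-negative matrix V: V[i][j] = 1 if product j is recommended for product i.
index = {name: i for i, name in enumerate(products)}
v = [[0.0] * len(products) for _ in products]
for src, targets in existing_recs.items():
    for dst in targets:
        v[index[src]][index[dst]] = 1.0


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frob_error(v, w, h):
    """Squared Frobenius distance between V and the reconstruction W*H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factor V (n x m) into non-negative W (n x k) and H (k x m)
    using multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start so the updates never divide by zero.
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return w, h


w, h = nmf(v, k=2)
scores = matmul(w, h)  # scores[i][j]: reconstructed affinity of product j for product i
```

Entries of `scores` that are high where `v` is zero are the candidate "new" recommendations. With a catalog this small there are few such entries, and they mostly restate groupings you could already see by eye.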