Launch HN: Depict.ai (YC S20) – Product recommendations for any e-commerce store

126 points by antonoo, over 4 years ago
Hey there! We are Oliver and Anton, founders at Depict.ai. We help online stores challenge Amazon by building recommender systems that don't require any sales or behavioral data at all.

Today, most recommender systems are based on a class of methods commonly called 'collaborative filtering', which means that they generate recommendations based on a user's past behavior. This method is successfully used by Amazon and Netflix (see https://en.wikipedia.org/wiki/Netflix_Prize). It is also very unsuccessfully used by smaller companies that lack the critical mass of historical behavioral data required to use those models effectively. This generally results in the cold start problem (https://en.wikipedia.org/wiki/Cold_start_(recommender_systems)) and a worse customer experience. We solve this by focusing on understanding the product instead of the customer.

The way we do this is with machine learning techniques that create vector representations of products based on the products' images and descriptions, and recommend matching products using these vector representations. More specifically, we have found a way to scrape the web and then train massive neural networks on e-commerce products. This makes it possible to leverage large amounts of product metadata to make truly impressive recommendations for any e-commerce store.

One analogy we like is that just as almost no single company has enough sales or behavioral data to consistently predict, for instance, credit card fraud on its own, almost no e-commerce company has enough data to generate good recommendations based only on its own information. Stripe can make excellent fraud detection models by pooling transactions from many smaller companies, and we can do the same thing for personalizing e-commerce stores by pooling product metadata.

Through A/B tests we have shown that we can increase top-line revenue by 4-6% for almost any e-commerce store. To prove our value we offer the tests and setup 100% for free. We make money by taking a cut of the revenue uplift we generate in the A/B tests. We have also found that the sales and decision cycle gets much shorter when we are independent of the customer's user data. You can see us live at Staples Nordics and kitchentime.com, among others.

Oliver and I have several years of experience applying recommender systems within e-commerce and education respectively, and felt uneasy about a winner-takes-it-all development where the largest companies could use their data supremacy to out-personalize any smaller company. Our goal is to build a company that can offer the best personalization to any e-commerce store, not just the ones with enough data.

Do you think our approach seems interesting, crazy, lazy or somewhere in the middle? We'd love any feedback - please feel free to shoot us comments below or DM, we'll be here to answer your thoughts and gather feedback!
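
To make the "recommend from product vectors" idea in the post concrete, here is a minimal sketch in Python. It is not Depict.ai's implementation: the post does not describe their model, and the three-dimensional vectors in `EMBEDDINGS` are hypothetical hand-written stand-ins for whatever a trained network would produce from images and descriptions. It only shows the last step, ranking other products by cosine similarity to the one being viewed.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(product_id, embeddings, k=2):
    """Return the k products whose vectors are most similar to product_id's vector."""
    query = embeddings[product_id]
    scored = [
        (cosine_similarity(query, vec), other)
        for other, vec in embeddings.items()
        if other != product_id
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [other for _, other in scored[:k]]


# Hypothetical toy embeddings; a real system would learn these from product
# images and descriptions rather than write them by hand.
EMBEDDINGS = {
    "espresso_machine": [0.9, 0.1, 0.0],
    "coffee_grinder": [0.8, 0.2, 0.1],
    "green_coffee_beans": [0.7, 0.0, 0.3],
    "office_chair": [0.0, 0.9, 0.4],
}
```

Calling `recommend("espresso_machine", EMBEDDINGS, k=2)` ranks the grinder and the beans ahead of the office chair. No purchase or click data is involved, which is why this kind of approach sidesteps the cold start problem.
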

26 comments

riddlemethat, over 4 years ago
I worked with a team that built an engine that did exactly this. There are complex issues to resolve. Mostly, the market for retail analytics is very small, so even though this type of data fabric can offer retailers incredible insights into what they should be purchasing and how to bundle/market their products together, they won't pay what it costs to scrape so much data to generate the recommendations.

The product recommendation angle for eCommerce is a better angle but only works well for big companies where you have enough data at the outset to drive better recommendations. With smaller companies and lesser-known products you must make probabilistic determinations based on image analysis and context structure that will be mostly guesswork until you have real data, as you surmised with your A/B testing.

Anyway, it seems you already have some major clients under your belt and a proven track record. Hope you are able to succeed in your quest to make better recommendations work for small businesses with the data fabric you created.

Happy to chat through my experiences if you have interest. hn (at) strapr (dot) com is my email.

serendipityrecs, over 4 years ago
Cool idea. I've been working on an adjacent product (serendipityrecs.com), but mine is more targeted towards consumers as opposed to B2B. I think I gravitated towards B2C by default because, as an engineer, I don't want to deal with sales, but your product makes sense. I'll be interested in following your progress over the next few years to see how your thesis plays out.

Couple questions

- How well do your recommendations hold up against Amazon's? Since you're scraping the metadata, you should be able to generate recs for Amazon items from their own catalogue. This might be an interesting product / demo for potential customers.

- Once you hook up your system to your customer's back end, how do you learn from the behavioral data you get from them? That's straightforward for cf/mf, but can be tricky to integrate into what you already have.

- You talk about Stripe pooling the data from their customers. I think the analogue for you would be pooling the behavioral data from your customers as opposed to the metadata. Have you thought about this?

- It sounds like you're doing nearest neighbors on the vector representation. You may already know this, but LSH is a fast way to do this when you have many items.

- Do you embed all items from your different customers into the same vector space? That would be ideal from the POV of creating a pooled dataset that would be helpful for all future customers, but sounds tricky given that everyone likely has their own idiosyncratic system.

Best of luck! Lmk if you'd like to talk shop sometime, I also have several years of experience with recommender systems (my email is in my profile).

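For readers unfamiliar with the LSH suggestion above, here is a minimal sketch of one common variant, random-hyperplane LSH for cosine similarity, using only the Python standard library. The function names (`make_hyperplanes`, `signature`, `build_index`, `candidates`) are illustrative, and nothing in the thread says which nearest-neighbor method Depict.ai actually uses. Each vector is hashed to a short bit signature, and only items sharing a bucket are compared, which avoids scanning the whole catalogue.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (Gaussian normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    bits = []
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vector))
        bits.append(1 if dot >= 0 else 0)
    return tuple(bits)


def build_index(embeddings, hyperplanes):
    """Group item ids into buckets keyed by their bit signature."""
    buckets = defaultdict(list)
    for item_id, vec in embeddings.items():
        buckets[signature(vec, hyperplanes)].append(item_id)
    return buckets


def candidates(query_vec, buckets, hyperplanes):
    """Items in the same bucket as the query; rank these exactly afterwards."""
    return list(buckets.get(signature(query_vec, hyperplanes), []))
```

Vectors separated by a small angle agree on most bits, so they tend to land in the same bucket. In practice several independent hash tables are usually combined to reduce missed neighbors; this sketch uses a single table for brevity.
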
chudaka_pi, over 4 years ago
So are you a scraping (data acquisition) engine, an AI engine, a catalogue-as-a-service engine, a recommendation engine, or all of the above? No info on your model capabilities or data scope (brands/retailers/SKUs)? It seems like you are vectorizing SKUs, which is a well-understood problem. How does your platform compare to markable.ai (look beyond the Visual AI, these guys have a robust vectorization pipeline, and they are scraping continuously, which takes a team of engineers) or visenze.com (massive platform covering most of e-commerce, almost a billion SKUs, styles, occasions, lots of AI)? You guys seem knowledgeable, but it looks like multiple products all in one and a fairly small team. Good luck, regardless!

sanj, over 4 years ago
I built a recommender like this at a prior job. We carefully tested it against the original "naive" algorithm, which used direct user behavior clustering.

What was interesting is that the naive algorithm got better over time and the incremental benefit of our new code got smaller.

Why?

Because the training data for the naive algo included user behavior from the new one. As we created better recommendations, users clicked on them and that fed into the old algo!

Coming to your product: what is to prevent a customer from using it for a few weeks, copying down the results, and then using those recommendations forever?

They'll get most of the benefit for very small cost.

jonas_b, over 4 years ago
Congratulations guys. I met two of you when you visited our agency and told your story. Very impressive, and for those of you reading, these guys were still in high school when we met them, maybe still are? Stoked that you got into YC!!

an_opabinia, over 4 years ago
> This generally results in the cold start problem

If I'm an online store with 100 products, couldn't I just punch the products into Amazon on a fresh account, then copy the search results? 100 products would maybe take me 20 minutes a day to do, but if you're saying there's a 4-6% lift, seems like it's worth it?

If it was 1,000 products, maybe I do this once a week for 200 minutes? Etc. etc.

Here's what'll happen: Your online store won't have most of the products on Amazon's recommended list. Isn't that the problem?

So no matter what, don't I eventually have to scale to Amazon size to get the value out of collaborative filtering?

Maybe no small business has that real supply chain. They are just front-running other stuff. But hey, that's their prerogative - to try to be Amazon without doing the stuff that actually makes Amazon successful.

> Netflix Prize

They don't even use those methods anymore. And that competition was much more about how to do IT and ensemble methods than any one particular approach, since that's how you get to #1.

Netflix Prize is sort of the opposite narrative of what you're actually doing. If you're seeking something that normal people recognize, just stick to talking about Amazon.

> Do you think our approach seems interesting, crazy, lazy or somewhere in the middle?

At least the premise doesn't square away.

Considering the data gathering, it seems easier to do user-product collaborative filtering.

Considering the math, it seems easier to do user-product collaborative filtering. You can bootstrap weights data for, e.g., a non-negative matrix factorization collaborative filtering from existing recommendations.

Is there going to be something important encoded in the image or metadata you can relate to other things? It seems easiest to just use the keywords. Like you don't need a picture of guacamole to know it goes with tortilla chips, it's in the keywords.

Then again, the whole point is to find serendipitous stuff from your existing user data. If you only offer 100 products, none of them will serendipitously be in shopping carts together because that's so few products. It's already curated to such a degree that collaborative filtering will not find anything you don't already know.

mlthoughts2018, over 4 years ago
Taking a cut of revenue seems extreme because what you are doing is extremely commodity.

ML teams at hosting platforms like Wix or Shopify or Squarespace could offer the same as a built-in or slightly higher-tiered premium feature, paying a tiny fixed cost instead of a share of revenue uplift.

This could even be basically an intern or new grad project at tech companies like that; the technology for the model is very simple. The devil would be in the details of integrating with the data model backing those platforms' ecommerce shop products, but you could solve it once and then immediately offer it to all your customers and out of the box for new customers.

The part of your idea that makes me skeptical is the scalability of applying your recommendation approach to bespoke customers. Like, I'm sure you can do it, but with nowhere near the same reach or efficiency or price point as well-capitalized major store hosting platforms.

sheeshkebab, over 4 years ago
It's an interesting service, although your pricing model makes for a very tough sell (and is complex to even consider technically - A/B tests? Need detailed sales data too? Forget it...)

zkid18, over 4 years ago
Oliver, Anton, congrats on the launch. RecSys analyst here.

Correct me if I'm wrong, but afaiu, you have designed a black-box content-based recommender system for the e-commerce domain by scraping publicly available data. I love your business model, though I have a couple of questions:

1. A/B testing in RecSys is a tricky process in terms of further interpretation. How do you choose the control and test groups? I would love to go beyond revenue percentage uplift when considering the new model. Btw, do you have your own A/B testing environment?

2. Are you targeting one specific problem, like cold start or checkout recommendation, or do you have a general solution?

3. Are you planning to open-source your model?

4. Do you have any Wordpress/Shopify plugins?

Anyway, I really like your idea and would love to contribute.

Let's stay in touch via twitter: @kidrulit.

KaoruAoiShiho, over 4 years ago
How does it compare to recombee or AWS personalize? I'm in the market for this but I guess I'm a bit more technical than the people you're selling to and so can use stuff that's a tad lower level?

brecs, over 4 years ago
This sounds very exciting! I'm interested in the A/B tests you ran to show revenue lift from your recommendations. What is the baseline model you test against? It seems to me that the most "fair" comparison would be to set up exactly the same vector representations and neural network model for only a single company at a time, and compare the performance to demonstrate that it is really your approach of combining different companies' datasets that provides the extra value here. Is that what you guys did?

bartkappenburg, over 4 years ago
We [0] use a combination of Elastic's "More Like This" query (which uses the said vector spaces et al) and views, clicks and sales data and ML (on-site behaviour). This prevents the cold start problem and improves with more data.

How does your scraping hold up against the already pretty effective More Like This query in ES? That one is backed by years of research and gives very good results.

[0] https://www.conversify.com

thegginthesky, over 4 years ago
Hi! Congrats on the launch!

Using content-based recommendation is interesting but requires constant scraping for more data. Plus the whole cost of curating the dataset and guaranteeing data quality can be extra challenging. How are you getting around these problems?

Also, your approach with A/B tests is interesting, but how would you do it for smaller shops? Wouldn't it take too long to give appropriate results? Or are you using a Bayesian test methodology?

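As background for the Bayesian question above, here is a minimal sketch of one standard Bayesian A/B comparison: a Beta-Binomial model on conversion rate, evaluated by Monte Carlo sampling. The thread does not say what methodology Depict.ai uses, and the post measures revenue uplift rather than conversion rate, so this is only an illustration of the general technique. The function name, the uniform Beta(1, 1) prior, and the sample count are assumptions.

```python
import random


def prob_variant_beats_control(
    control_conversions,
    control_visitors,
    variant_conversions,
    variant_visitors,
    samples=20000,
    seed=0,
):
    """Estimate P(variant conversion rate > control conversion rate).

    Each arm gets a Beta(1 + conversions, 1 + non-conversions) posterior,
    i.e. a uniform Beta(1, 1) prior updated with the observed counts.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(samples):
        p_control = rng.betavariate(
            1 + control_conversions,
            1 + control_visitors - control_conversions,
        )
        p_variant = rng.betavariate(
            1 + variant_conversions,
            1 + variant_visitors - variant_conversions,
        )
        if p_variant > p_control:
            wins += 1
    return wins / samples
```

For example, `prob_variant_beats_control(100, 1000, 150, 1000)` compares a 10% control against a 15% variant. The result is a probability that can be read at any point during the test. That is one reason this style of analysis is sometimes preferred for low-traffic shops, although a small store still needs enough visitors for the posteriors to separate.
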
notdang, over 4 years ago
For example, I am into coffee roasting at home. The cheapest device to roast coffee at home is a popcorn maker; not all of them, just some brands that meet certain power requirements. When I look at those devices on Amazon, they recommend me green coffee beans, which is correct and helpful. I guess Amazon is using collaborative filtering. What would a content-based recommendation system do in this case?

Boxxed, over 4 years ago
Without the behavioral data you'll just plain miss out on correlations that simply can't be described by product metadata alone. Is this a big deal? It seems like that would be one of the big advantages of collaborative filtering. I guess that advantage is probably larger on sites with a huge product variety (e.g. Amazon) which aren't really your audience.

ssharp, over 4 years ago
What was the control in the A/B test?

I've seen lots of recommendation algorithms fail against curated recommendations. It would be really interesting to see where / what this approach beats.

Would also be really curious how stores have reacted to the revenue model. Is that a one-time fee based on the A/B test results or are you capturing a cut of the uplift in perpetuity?

Winterflow3r, over 4 years ago
Hey! Grattis! Really happy for the success of my fellow Stockholmare! Can I ask how you integrate with your customers' online stores? I've always puzzled over how third-party recommender services integrate with someone's existing shop. Is there some sort of JS widget or similar you add to the customer's existing site?

swyx, over 4 years ago
great pitch! i dont work in ecommerce so i cant attest to the appeal of it but sounds like it might work to layperson ears.

one nit - "we lift revenue by 4-6%" doesn't feel like a very impressive number (it may be within the bounds of normal noise for a smaller ecom site?). that said, im very much not an ecomm guy. is this a bigger deal than it initially reads?

i also feel like recommender systems work much better for netflix (infinite consumption) than for ecommerce (where if i already bought a shoe i normally dont want another). perhaps this tech is better applied to *media* than to ecomm?

tariqueshams, over 4 years ago
It makes sense that a recommender system gets customers to spend more money on a site, but how does this help stores compete with Amazon? Interesting revenue model.

Finbarr, over 4 years ago
Very cool! Most ecommerce stores don't have enough data to do product recommendations with the existing tools. This is much needed.

rrwright, over 4 years ago
Website link: https://depict.ai/

eries, over 4 years ago
Impressive idea - best of luck with it

i386, over 4 years ago
I'd be keen to try this - can you work with Squarespace? james@wildspiritdistilling.co

ianmchenry, over 4 years ago
How do you use the metadata associated with the images to power better recommendations?

noteanddata, over 4 years ago
for this part, "scrape the web", what is the web part? is it other e-commerce sites, including amazon and so on? do you foresee any issues/risks with that?

shayankh, over 4 years ago
why not call it a content-based recommender system? also, do you think in the future you could turn it into a hybrid system once you have user feedback?