Show HN: We made an open-source personalization engine

284 points by shutty, about 3 years ago

Hey, HN! You probably know that the ordering of products on Amazon, posts in FB, and search results in Google is personalized for each visitor, as it directly affects conversion, click rate and engagement. But not everyone can afford to hire an army of PhDs to squeeze every penny out of the ranking, and not everyone agrees on the current (im)balance between privacy and profits.

So we built Metarank, an open-source and privacy-focused personalization engine. It can rerank any type of content in real time, using only the data you allow, and optimize the metrics you define.

We built a lot of proprietary DIY personalization services for e-commerce in our past careers, and heard many complaints from other companies that were also struggling to implement personalization. It's often considered "too risky" to spend 6+ months on an in-house moonshot project to reinvent the wheel with no experienced team and no existing open-source tools. Like other people in the industry, we were tired of building everything from the bottom up each time we approached personalization. It should be easy not only for Amazon to do such magical ML tricks, but for everyone else.

A small demo of the tool with personalized recommendations: https://demo.metarank.ai

A blog post on how this demo was made: https://medium.com/metarank/personalizing-recommendations-with-metarank-f2644112536b

The project itself: https://github.com/metarank/metarank

19 comments

shutty, about 3 years ago

I'm one of the contributors to this project. The idea of the tool is to focus on typical ML feature engineering challenges. It takes a stream of business events like clicks and impressions, and computes a ton of common ML features on top:

* Parse the User-Agent field, make a GeoIP lookup
* Count the number of clicks over different items on multiple time windows, like 1-2-3-4 weeks
* Conversion and CTR rates
* Basic customer profiling, like "you clicked on a red item in the past, and this item is also red"

There is just a LambdaMART with xgboost inside, no rocket science. It won't replace an in-house highly-focused solution, but building everything from scratch may take a ton of time. With Metarank you can quickly hack a good enough solution in a day, hopefully :)
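
To make the windowed features above concrete, here is a minimal sketch of per-item click counts and CTR over trailing 1-4 week windows. The event tuple shape, the feature names (`clicks_1w`, `ctr_1w`, ...) and the `window_features` helper are assumptions made for illustration; this is not Metarank's event schema or its implementation.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical event shape: (timestamp, item_id, event_type), where
# event_type is "impression" or "click". Not Metarank's actual schema.
EVENTS = [
    (datetime(2022, 3, 1), "item-a", "impression"),
    (datetime(2022, 3, 1), "item-a", "click"),
    (datetime(2022, 3, 10), "item-a", "impression"),
    (datetime(2022, 3, 20), "item-a", "impression"),
    (datetime(2022, 3, 20), "item-a", "click"),
    (datetime(2022, 3, 21), "item-b", "impression"),
]


def window_features(events, now, weeks=(1, 2, 3, 4)):
    """Per-item click counts and CTR over trailing windows of N weeks."""
    features = defaultdict(dict)
    for n in weeks:
        start = now - timedelta(weeks=n)
        clicks = defaultdict(int)
        impressions = defaultdict(int)
        for ts, item, kind in events:
            # Window is (start, now]: the start boundary itself is excluded.
            if start < ts <= now:
                if kind == "click":
                    clicks[item] += 1
                elif kind == "impression":
                    impressions[item] += 1
        for item in impressions.keys() | clicks.keys():
            shown = impressions[item]
            features[item][f"clicks_{n}w"] = clicks[item]
            # Guard against division by zero when an item was never shown.
            features[item][f"ctr_{n}w"] = clicks[item] / shown if shown else 0.0
    return dict(features)


FEATURES = window_features(EVENTS, now=datetime(2022, 3, 22))
```

Each item ends up with a flat dictionary of numeric features, which is the kind of input a LambdaMART-style ranker consumes. A production system would update these counters incrementally from the event stream rather than rescanning all events.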

danpalmer, about 3 years ago

> Metarank is industry-agnostic and can be used in any place of your application where some content is displayed.

I'm afraid I'm skeptical.

Content ranking in small, well-defined contexts is not hard to do and doesn't require an ML approach: rule-based systems are often easier to specify, easier for both creators and users to understand, and easier to make conform to business rules.

When ML does need to be introduced, when the scale or complexity is large enough that a rule-based approach will be infeasible or worse, having a generic implementation is unlikely to return useful results. So much of the work of optimising an ML approach is engineering features out of the data that make sense and that don't introduce bias.

It's that last point that's really important, because if you do the wrong feature engineering, then the bias introduced effectively means you're back to building a rule-based system, just one that has a bunch of inaccuracy built in, and where you don't understand what rules you've specified, or even that you have specified them.

I'm not an expert here, but I've worked on basic recommender systems for products, and worked with people who were far more knowledgeable about this, all of whom seemed to have a low opinion of generic systems.

Sharma, about 3 years ago

BTW, accessing metarank.ai gives a warning. Maybe it's because it has "Meta" in its domain name, but MetaMask shows this message:

This domain is currently on the MetaMask domain warning list. This means that based on information available to us, MetaMask believes this domain could currently compromise your security and, as an added safety feature, MetaMask has restricted access to the site. To override this, please read the rest of this warning for instructions on how to continue at your own risk.

dmitrykan, about 3 years ago

Great project! Elasticsearch / OpenSearch / Solr have their own learning to rank plugins. Have you considered integrating Metarank with such systems? Or is your vision to provide a reranker layer, that can be independent of the underlying search engine architecture?

mushufasa, about 3 years ago

This is super interesting!

On the demo page, nothing is happening when I try clicking on any of the buttons. I'm in a browser with no adblocking or jsblocking. Is this just the hug of death, or am I holding it wrong?

thih9, about 3 years ago

What's a scenario or a method to apply a personalization engine that gives the lowest chance of making the overall UX worse?

I usually dislike personalized content. I prefer search results that accurately match my query, and I find it distracting to see suggestions or uncommon ordering (to the point that I search for Netflix movies via an external website to avoid going through their UI).

charcircuit, about 3 years ago

Honestly, personalization seems crazy to me. I can't believe how well it works and how fast I can get personalized stuff. I wouldn't know where to start to design a system to handle it. Sites like YouTube or Pixiv have so much content that it seems hard to rank it all for a single person.

GrumpyNl, about 3 years ago

I get cross-origin policy warnings on the demo page:

Access to XMLHttpRequest at 'https://demo-api.metarank.ai:3000/movies?user=pnsar&session=9cwo9h&tag=aliens' from origin 'https://demo.metarank.ai' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.

themgt, about 3 years ago

Could you publish the Dockerfile? When attempting to run train in Docker according to the tutorial instructions I get "Cannot load library: java.lang.UnsatisfiedLinkError: /tmp/lightgbm[123]/lib_lightgbm.so: libgomp.so.1"

I logged into the container as root and ran "apt-get update && apt-get install libgomp1" and then training worked, but it'd be nice to be able to view/tweak your existing Dockerfile if/when needed. Thanks, cool project!

sebrindom, about 3 years ago

So cool, would love to see this integrated with https://github.com/medusajs/medusa

bredren, about 3 years ago

This is cool. Not having read up on the behavior, I expected the demo to allow me to downvote films as well.

Partly, because there is a settings icon overlay. Maybe I'm missing something on mobile.

Also, it drove me nuts that I couldn't like only the first Matrix film.

czbond, about 3 years ago

Very cool - I haven't had time to peruse the offering or code, but it seems like a very needed tool for industries and small businesses which don't have the resources to make it happen.

nwsm, about 3 years ago

Hug of death on the demo app. (504 on calls to https://demo-api.metarank.ai:3000/movies)

orliesaurus, about 3 years ago

Are there any privacy implications? i.e. you're learning to show me the best results based on my experience, what happens to that learning when I leave the site?

nelsondev, about 3 years ago

Very cool! Thanks for sharing.

Rather than an offline model, why not use an online, continuously relearning model like a Multi-Armed Bandit to do the re-ranking?
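
For readers unfamiliar with the term, the following is a minimal sketch of one multi-armed bandit strategy, UCB1, choosing which of two candidate items to show and learning from click feedback after every round. It is not part of Metarank; the `UCB1` class, the `simulate` helper, the item names and the deterministic click feedback are all assumptions made for illustration.

```python
import math


class UCB1:
    """UCB1 bandit: each arm is a candidate item (or ranking) to show."""

    def __init__(self, arms):
        self.counts = {arm: 0 for arm in arms}
        self.rewards = {arm: 0.0 for arm in arms}

    def select(self):
        # Try every arm once before trusting any estimate.
        for arm, n in self.counts.items():
            if n == 0:
                return arm
        total = sum(self.counts.values())

        def score(arm):
            # Observed mean reward plus an exploration bonus that
            # shrinks as the arm accumulates more pulls.
            mean = self.rewards[arm] / self.counts[arm]
            bonus = math.sqrt(2 * math.log(total) / self.counts[arm])
            return mean + bonus

        return max(self.counts, key=score)

    def update(self, arm, reward):
        # Online update: the model changes after every single event.
        self.counts[arm] += 1
        self.rewards[arm] += reward


def simulate(rounds=100):
    # Hypothetical, deterministic feedback: "item-a" is always clicked,
    # "item-b" never. Real click feedback would be noisy.
    click = {"item-a": 1.0, "item-b": 0.0}
    bandit = UCB1(click)
    for _ in range(rounds):
        arm = bandit.select()
        bandit.update(arm, click[arm])
    return bandit.counts


COUNTS = simulate()
```

The bandit keeps occasionally re-testing the weaker item because of the exploration bonus, but most impressions go to the item that earns clicks. Unlike an offline LambdaMART model, there is no separate training step.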

gizmodo59, about 3 years ago

When I promoted The Dark Knight, it just showed all the other superhero movies, when I really like Nolan movies more than other action hero movies.

nonoesp, about 3 years ago

Congrats on the launch.

It's a bit unsettling to hit the landing page and find a typo in "personalizaton made easy."

minroot, about 3 years ago

Why do people use Scala?

nonoesp, about 3 years ago

Crazy that the GitHub repo went from almost no stars to 800+ in one day.
