Improving recommendation systems and search in the age of LLMs

408 points by 7d7n about 2 months ago

17 comments

x1xx about 2 months ago

> Spotify saw a 9% increase in exploratory intent queries, a 30% rise in maximum query length per user, and a 10% increase in average query length; this suggests the query recommendation updates helped users express more complex intents

To me it's not clear that this should be interpreted as an improvement: what I read in this summary is that users had to search more and enter longer queries to get to what they needed.

novia about 2 months ago

I started listening to this article (using a text-to-speech model) shortly after waking up.

I thought it was very heavy on jargon. Like, it was written in a way that makes the author appear very intelligent without necessarily conveying information effectively to the audience. This is something that I've often seen authors do in academic papers, and my one published research paper (not first author) is no exception.

I'm by no means an expert in the field of ML, so perhaps I am just not the intended audience. I'm curious whether other people here felt the same way when reading, though.

Hopefully this observation / opinion isn't too negative.

softwaredoug about 2 months ago

A lot of teams can do a lot with search just by putting LLMs in the loop on the query and index side, doing enrichment that used to take months-long projects. Even with smaller, self-hosted models and fairly naive prompts you can turn a search string into a more structured query, and cache the hell out of it. Or classify documents into a taxonomy. All backed by a boring old lexical or vector search engine. In fact I’d say if you’re NOT doing this you’re making a mistake.

jamesblonde about 2 months ago

It is very interesting that Eugene does this work and publishes it so soon after conferences. Traditionally this would be a literature survey by a PhD student and would take 12 months to come out in some obscure journal behind a walled garden. I wonder if it is an outlier (Eugene is good!) or a sign of things to come?

tullie about 2 months ago

The other direction that isn’t explicitly mentioned in this post is the variants of SASRec and Bert4Rec that are still trained on ID tokens but show scaling laws much like LLMs, e.g. Meta’s approach: https://arxiv.org/abs/2402.17152 (paper write-up here: https://www.shaped.ai/blog/is-this-the-chatgpt-moment-for-recommendation-systems)

anon8764352 about 2 months ago

@7d7n Eugene / others experienced in recommendation systems: for someone who is new to recommendation systems and uses variants of collaborative filtering for recommendations, what non-LLM approach would you suggest to start looking into? The cheaper the compute (ideally without using GPUs in the first place) the better, while also maximizing the performance of the system :)

thaumiel about 2 months ago

Ah, this explains why my Spotify experience has gotten worse over time.

whatever1 about 2 months ago

Why don’t we have an LLM-based search tool for our PCs / smartphones?

Especially for smartphones, all of your data is in the cloud anyway. Instead of just scraping it for advertising and the FBI, they could also do something useful for the user.

anthk about 2 months ago

Use 'Recoll' and learn to use search strings. For Windows users, older Recoll releases are standalone and have all the dependencies bundled, so you can search inside PDFs, ODT/DOCX files and tons more.

stuaxo about 2 months ago

Off topic - but I think joining recommendation systems and forums (aka all the social media that isn't bsky or fedi) has been a complete disaster for society.

anonymousDan about 2 months ago

It's interesting that none of these papers seem to be coming out of academic labs....

memhole about 2 months ago

It looks like a great overview of recommendation systems. I think my main takeaways are:

1. Latency is a major issue.
2. Fine-tuning can lead to major improvements and, I think, reduce latency, if I didn’t misread.
3. There’s some threshold, or some set of problems, where prompting or fine-tuning should be used.

a_bonobo about 2 months ago

Elicit has a nice new feature where, given a research question, it seems to give the question to an LLM with a prompt to improve the question. It's a neat trick.

As an example, I gave it 'What is the impact of LLMs on search engines?' and it suggested three alternative searches under keywords. The keyword 'Specificity' has the suggested question 'How do large language models (LLMs) impact the accuracy and relevance of search engine results compared to traditional search algorithms?'

It's a really cool trick that doesn't take much to implement.
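
A rough sketch of how such a question-refinement step might look. How Elicit actually implements it is not known from this thread, so everything here is an assumption: the prompt wording, the `Aspect: question` reply format, and `call_llm`, a hypothetical stub that returns the 'Specificity' suggestion quoted above instead of calling a real model. Only 'Specificity' is used as an aspect because the other two keywords are not named in the comment.

```python
def build_refinement_prompt(question: str, aspects: list[str]) -> str:
    """Ask for one rewritten question per aspect, one per line."""
    aspect_lines = "\n".join(f"- {aspect}" for aspect in aspects)
    return (
        "Rewrite the research question below once per aspect, "
        "making it sharper along that aspect.\n"
        "Reply with one line per aspect in the form 'Aspect: question'.\n"
        f"Aspects:\n{aspect_lines}\n"
        f"Question: {question}"
    )


def call_llm(prompt: str) -> str:
    """Hypothetical stub; a real system would send the prompt to a model."""
    return (
        "Specificity: How do large language models (LLMs) impact the "
        "accuracy and relevance of search engine results compared to "
        "traditional search algorithms?"
    )


def parse_suggestions(reply: str) -> dict[str, str]:
    """Parse 'Aspect: question' lines, skipping anything malformed."""
    suggestions = {}
    for line in reply.splitlines():
        # Split on the first colon only; the question itself may contain more.
        aspect, sep, rewritten = line.partition(":")
        if sep and rewritten.strip():
            suggestions[aspect.strip()] = rewritten.strip()
    return suggestions


prompt = build_refinement_prompt(
    "What is the impact of LLMs on search engines?", ["Specificity"]
)
suggestions = parse_suggestions(call_llm(prompt))
```

Parsing line by line and dropping malformed lines keeps the feature harmless when the model ignores the requested format: the user simply sees fewer suggestions.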

bookofjoe about 2 months ago

Perplexity Pro suggested several portable car battery chargers, which led me to search online reviews. The consensus highest-rated chargers across five or so review sites were the first two on Perplexity's recommendation list. In other words, the AI was a helpful guide to a focused, deeper search.

thorum about 2 months ago

In the age of local LLMs I’d like to see a personal recommendation system that doesn’t care about being scalable and efficient. Why can’t I write a prompt that describes exactly what I’m looking for in detail and then let my GPU run for a week until it finds something that matches?

onel about 2 months ago

Another amazing post from Eugene

anon373839 about 2 months ago

Terrific post. Just about everything Eugene writes about AI/ML is pure gold.
