TechEcho

Hi HN!

My latest side project is a knowledge graph that maps the French culinary network, using data extracted from restaurant reviews on LeFooding.com. The project uses LLMs to extract structured information from unstructured text.

Some technical aspects you may be interested in:

- Used structured generation to reliably parse unstructured text into a consistent schema
- Tested multiple models (Mistral-7B-v0.3, Llama3.2-3B, gpt4o-mini) for information extraction
- Created an interactive visualization using gephi-lite and Retina (WebGL)
- Built (with Claude) a simple Flask web app to clean and deduplicate the data
- Total cost of running inference on 2000 reviews with gpt4o-mini: less than 1€!

You can explore the visualization here: [Interactive Culinary Network](https://ouestware.gitlab.io/retina/1.0.0-beta.4/#/graph/?url=https%3A%2F%2Fgist.githubusercontent.com%2Ftheophilec%2F351f17ece36477bc48438d5ec6d14b5a%2Fraw%2Ffa85a89541c953e8f00d6774fe42f8c4bd30fa47%2Fgraph.gexf&r=x&sa=re&ca[]=t&ca[]=ra-s&st[]=u&st[]=re&ed=u)

The code for the project is available on GitHub:

- Main project: https://github.com/theophilec/foudinge
- Data cleaning tool: https://github.com/theophilec/foudinge-scrub

Happy to get feedback!
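For readers curious what the extraction-to-graph step might look like, here is a minimal sketch. It is not the project's actual code. It assumes the LLM returns one JSON object per review with hypothetical fields `restaurant` and `chefs`; the real schema lives in the foudinge repository and may well differ. The sketch validates each record, normalizes names so that trivially different spellings collapse into one node, and emits a deduplicated edge list.

```python
import json
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class Review:
    """Hypothetical target schema for one extracted review."""
    restaurant: str
    chefs: tuple[str, ...]


def normalize(name: str) -> str:
    """Lowercase, strip accents and extra whitespace so near-duplicates match."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.lower().split())


def parse_review(raw: str) -> Review:
    """Parse one LLM response; raise if it does not fit the assumed schema."""
    data = json.loads(raw)
    restaurant = data["restaurant"]
    chefs = data["chefs"]
    if not isinstance(restaurant, str) or not isinstance(chefs, list):
        raise ValueError("unexpected field types")
    if not all(isinstance(c, str) for c in chefs):
        raise ValueError("chef names must be strings")
    return Review(restaurant=restaurant, chefs=tuple(chefs))


def build_edges(reviews: list[Review]) -> set[tuple[str, str]]:
    """Return deduplicated (chef, restaurant) edges keyed on normalized names."""
    edges = set()
    for review in reviews:
        for chef in review.chefs:
            edges.add((normalize(chef), normalize(review.restaurant)))
    return edges


if __name__ == "__main__":
    # Made-up responses, for illustration only.
    raw_outputs = [
        '{"restaurant": "Le Café Exemple", "chefs": ["Élodie Martin"]}',
        '{"restaurant": "le cafe exemple", "chefs": ["Elodie  Martin", "Jean Dupont"]}',
    ]
    reviews = [parse_review(r) for r in raw_outputs]
    for chef, restaurant in sorted(build_edges(reviews)):
        print(f"{chef} -> {restaurant}")
```

Normalizing on accent-stripped, lowercased names only catches trivial duplicates; anything fuzzier (nicknames, misspellings) would still need a manual pass like the one the cleaning tool provides.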

13 comments

tantalor 3 months ago

The embedding is kind of weird. Like, there's no reason a "degree: 1" node should be so far away from its sibling.

Example: https://imgur.com/a/7Cktyzp

This makes the graph look more random/noisy/disorganized than it actually is.


arnath 3 months ago

This is a super cool idea! I've sort of mused about an idea for general web search that's very similar to this concept, where you start with a set of trusted entities and then branch out from there, but choosing how you establish trust is really important. But this is a really clever application, well done!


moandcompany 3 months ago

Very cool work.

It's worth mentioning that the graph browser using "Retina" is a project from Ouestware (https://www.ouestware.com/en/), which is also a contributor to the GraphCommons and GephiLite projects.


nluken 3 months ago

Given the structured nature of the data, how does this compare to running a specialized classification model that looks for specific words in a review and uses those to assign Chefs to Restaurants? With some fine tuning, you might get more consistent results than feeding the reviews into a generative model.
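To make the comparison concrete, a word-based baseline along these lines might look like the sketch below. The cue word, the regular expression, and the sample sentence are illustrative assumptions, not anything taken from the project, and a real classifier would be trained rather than hand-written.

```python
import re

# Hypothetical cue pattern: "chef" or "cheffe" followed by one to three
# capitalized words, which are treated as a person's name.
CHEF_PATTERN = re.compile(
    r"\b[Cc]hef(?:fe)?\s+"
    r"((?:[A-ZÀ-ÖØ-Ý][\w'-]+)(?:\s+[A-ZÀ-ÖØ-Ý][\w'-]+){0,2})"
)


def find_chefs(review_text: str) -> list[str]:
    """Return candidate chef names in order of first appearance, without repeats."""
    seen: list[str] = []
    for match in CHEF_PATTERN.finditer(review_text):
        name = match.group(1)
        if name not in seen:
            seen.append(name)
    return seen


if __name__ == "__main__":
    # Made-up sentence, for illustration only.
    text = "Ancien second de chef Jean Dupont, la cheffe Élodie Martin ouvre sa table."
    print(find_chefs(text))
```

A rule like this is cheap and predictable, but it misses any chef who is not introduced by the cue word, which is roughly the gap a generative model is being asked to fill.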


jonnycoder 3 months ago

This looks great! I was just looking for a good web knowledge graph visualizer.


bevan 3 months ago

This was inspiring, what a cool idea. Just curious: for 4o mini, isn't there a JSON mode that reliably produces structured output? Was that what you were referring to / ended up using?


holtwork 3 months ago

Great project. I propose an improvement over this conventional kind of object-style graph. Instead, every single item should be a node or an edge. The objects are needless complexities that obscure pure graph relations. Like this: https://memelang.net/03/

drabbiticus 3 months ago

Very interesting. A small tweak and it seems like this could be applied to the problem of identifying degree of separation from political dissidents or other targets with the right data source. Lots of tools already exist that do that, but it's kind of wild how accessible and scalable certain techniques have become.

nswanberg 3 months ago

Nice! How'd the local models do vs gpt4o-mini? Did you spend much time playing with datasette?


pranavm27 3 months ago

Do you think this will work as effectively with Google or social media review and rating datasets? Not every country may have a LeFooding.com.

Would like to hear everyone's thoughts.

nickthegreek 3 months ago

Graph embed does not appear to work in FF 135. Loaded in Chrome though.

Edit: Seems to be a me issue.


repsiace 3 months ago

Looks interesting, have you tried utilizing a multimodal model?

martinky24 3 months ago

What's the use case for maintaining a list of restaurants that use LLMs?


Show HN: Knowledge graph of restaurants and chefs, built using LLMs
