Show HN: Anything To JSON – a language model for structured extraction

18 points by edunteman about 1 year ago
Hey HN! Erik here from banana.dev

We’ve trained a small(ish) language model on structured extraction, and today we’re launching a playground for it at https://anythingtojson.com. Give it a try!

This model continues our work on structured generation, following last week’s launch of Fructose [1], a Python client for strongly-typed LLM responses.

There seem to be two distinct halves of the problem that Fructose and structured generation are intended to solve:

1. The reasoning ability of the model, such as performing chain of thought, creative acts, and natural language tasks. In a way, the “business logic”.

2. The structured JSON response, to make sure the receiving code doesn't break.

AnythingToJson is intended to solve the latter. No big-brain work, just find data in text and extract it. Constrained generation at inference time helps keep it on track, and our finetuning has improved the accuracy and consistency of extracted data.

There’s much more progress to be made (longer context window, hallucinations to squash, etc.), but ship early and fast.

[1] https://news.ycombinator.com/item?id=39619053
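To make the second half of the problem concrete, here is a minimal sketch of the check receiving code might run on a model's JSON reply before trusting it. It uses only the Python standard library and is not the Fructose or AnythingToJson API. The function name `parse_structured_response`, the `EXPECTED_FIELDS` schema, and the sample reply are all hypothetical.

```python
import json

# Hypothetical schema: field name -> required Python type.
EXPECTED_FIELDS = {"name": str, "age": int, "email": str}


def parse_structured_response(raw, expected_fields):
    """Parse a model's raw JSON reply and check it against a flat schema.

    Raises ValueError if the reply is not valid JSON, is not an object,
    is missing a field, or has a field of the wrong type.
    """
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")

    for key, expected_type in expected_fields.items():
        if key not in data:
            raise ValueError(f"missing field: {key}")
        value = data[key]
        # bool is a subclass of int in Python, so reject it for non-bool fields.
        if isinstance(value, bool) and expected_type is not bool:
            raise ValueError(f"field {key!r} has the wrong type")
        if not isinstance(value, expected_type):
            raise ValueError(f"field {key!r} has the wrong type")

    # Keep only the fields the caller asked for.
    return {key: data[key] for key in expected_fields}


# Made-up reply for illustration.
reply = '{"name": "Ada", "age": 36, "email": "ada@example.com", "extra": 1}'
record = parse_structured_response(reply, EXPECTED_FIELDS)
print(record)
```

Constrained generation aims to make this kind of failure rare at the source. A check like this is only a last line of defense on the receiving side.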

3 comments

yeldarb about 1 year ago
Nice, makes sense to chain models so you don't waste the attention of the big smart model on the grunt work of JSON structure.

I bet there are some good post-processing heuristics you could also apply for hallucinations: add a flag for "this should be in the text verbatim", then string-match whether the answer it output actually appears in the text.
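A rough sketch of that verbatim-match heuristic, assuming the extracted result is a flat JSON object and the caller lists which fields should appear word-for-word in the source. The function name `find_unsupported_values`, the field names, and the sample text are made up for illustration. Matching ignores case and collapses whitespace, which is one possible choice rather than the only one.

```python
import json


def _normalize(text):
    """Lowercase and collapse runs of whitespace so trivial differences don't matter."""
    return " ".join(str(text).split()).lower()


def find_unsupported_values(source_text, extracted, verbatim_fields):
    """Return the verbatim-flagged fields whose value does not occur in the source text."""
    normalized_source = _normalize(source_text)
    unsupported = []
    for field in verbatim_fields:
        value = extracted.get(field)
        if value is None:
            continue  # nothing was extracted, so there is nothing to verify
        if _normalize(value) not in normalized_source:
            unsupported.append(field)
    return unsupported


# Made-up input and model output for illustration.
text = "Invoice 1042 was issued to Acme Corp on 3 March 2024."
model_output = json.loads(
    '{"customer": "Acme Corp", "invoice_id": "1042", "contact": "Jane Doe"}'
)

flagged = find_unsupported_values(
    text, model_output, ["customer", "invoice_id", "contact"]
)
print(flagged)  # ['contact']
```

Here "Jane Doe" never appears in the source, so `contact` is flagged as a likely hallucination. Fields that the model is expected to reformat (dates, normalized numbers) would need to be left out of the verbatim list.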
Oras about 1 year ago
How does it differ from using JSON output with OpenAI?

I think it can be useful for some cases, but model size will always limit how well it can compete against large closed-source models.
edwinwee about 1 year ago
can't speak to the actual service, but can confirm they have excellent bananas (not kidding)