Hey HN! Erik here from banana.dev.

We’ve trained a small(ish) language model on structured extraction, and today we’re launching a playground for it at https://anythingtojson.com. Give it a try!

This model continues our work on structured generation, following last week’s launch of Fructose [1], a Python client for strongly-typed LLM responses.

There seem to be two distinct halves to the problem that Fructose and structured generation are meant to solve:

1. The reasoning ability of the model: performing chain of thought, creative acts, and natural language tasks. In a way, the “business logic”.

2. The structured JSON response, to make sure the receiving code doesn’t break.

AnythingToJson is intended to solve the latter. No big-brain work, just finding data in text and extracting it. Constrained generation at inference time helps keep it on track, and our finetuning has improved the accuracy and consistency of the extracted data.

There’s much more progress to be made (a longer context window, hallucinations to squash, etc.), but ship early and fast.

[1] https://news.ycombinator.com/item?id=39619053
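To give a rough picture of what this looks like from the caller's side, here's a minimal sketch of schema-driven extraction over HTTP. The endpoint URL, field names, and request/response shape are assumptions for illustration, not the actual AnythingToJson API:

    # Minimal sketch of schema-driven extraction over HTTP.
    # NOTE: the endpoint and request/response shape are hypothetical,
    # not the real anythingtojson.com API.
    import json
    import requests

    schema = {
        "type": "object",
        "properties": {
            "name":  {"type": "string"},
            "email": {"type": "string"},
            "age":   {"type": "integer"},
        },
        "required": ["name", "email"],
    }

    text = "Reach out to Jane Doe (jane@example.com), age 34, about the invoice."

    resp = requests.post(
        "https://api.example.com/extract",   # hypothetical endpoint
        json={"text": text, "schema": schema},
        timeout=30,
    )
    resp.raise_for_status()

    data = resp.json()  # e.g. {"name": "Jane Doe", "email": "jane@example.com", "age": 34}
    print(json.dumps(data, indent=2))

The point is that the caller only supplies text plus a schema; the model's job is limited to filling that schema, with constrained generation keeping the output valid JSON.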
Nice, makes sense to chain models so you don't waste the attention of the big smart model on the grunt work of JSON structure.

I bet there are some good post-processing heuristics you could also apply for hallucination: flag a field as "this should be in the text verbatim" and then string-match whether the answer it output actually appears in the source text or not.
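Something like this, maybe (a quick sketch of that verbatim check; the field-flagging convention is made up for illustration):

    # Post-processing check: fields flagged as "verbatim" must appear
    # word-for-word in the source text, otherwise treat them as likely
    # hallucinations. The VERBATIM_FIELDS convention is hypothetical.
    VERBATIM_FIELDS = {"name", "email"}

    def verbatim_check(source_text: str, extracted: dict) -> dict:
        flagged = {}
        for field, value in extracted.items():
            if field in VERBATIM_FIELDS and isinstance(value, str):
                # Simple case-insensitive substring match; fuzzier matching
                # (normalization, edit distance) could cut false negatives.
                flagged[field] = value.lower() in source_text.lower()
        return flagged  # e.g. {"name": True, "email": False}

    print(verbatim_check(
        "Reach out to Jane Doe about the invoice.",
        {"name": "Jane Doe", "email": "jane@example.com"},
    ))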
How does it differ from using JSON output with OpenAI?

I think it can be useful for some cases, but you'll always have the limitation of model size when competing against the large closed-source models.
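By "JSON output with OpenAI" I mean roughly this (model name and prompt are just placeholders; JSON mode only guarantees valid JSON, not a particular schema):

    # Rough sketch of OpenAI's JSON mode for comparison.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder model
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": "Extract name and email as JSON."},
            {"role": "user", "content": "Reach out to Jane Doe (jane@example.com)."},
        ],
    )
    print(resp.choices[0].message.content)  # a JSON string, e.g. {"name": ...}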