
Structured Outputs with Ollama

265 points by Patrick_Devine, 5 months ago

16 comments

rdescartes, 5 months ago
If anyone needs more powerful constrained outputs, llama.cpp supports GBNF grammars:

https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md
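For illustration, a minimal sketch of GBNF-constrained generation through the llama-cpp-python bindings (the model path, prompt, and grammar here are placeholder assumptions, not from the comment):

    # Sketch: constrain llama.cpp output with a GBNF grammar via the
    # llama-cpp-python bindings. Path, prompt, and grammar are illustrative.
    from llama_cpp import Llama, LlamaGrammar

    # A tiny grammar that only admits the literal strings "yes" or "no".
    GRAMMAR = r'root ::= "yes" | "no"'

    llm = Llama(model_path="./model.gguf")  # placeholder model path
    grammar = LlamaGrammar.from_string(GRAMMAR)
    out = llm("Is water wet? Answer yes or no.", grammar=grammar, max_tokens=4)
    print(out["choices"][0]["text"])  # "yes" or "no", nothing else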
chirau, 5 months ago
This is wonderful news.

I was actually scratching my head over how to structure a regular prompt to produce CSV data without extra nonsense like "Here is your data" and "Please note blah blah" at the beginning and end, so this is very welcome: I can define exactly what I want returned and then just push the structured output to CSV.
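As a sketch of that workflow (the schema, model name, and prompt below are my own assumptions, using the ollama Python client):

    # Sketch: request schema-conforming rows from Ollama, then write CSV.
    import csv
    import json
    import ollama

    schema = {
        "type": "object",
        "properties": {
            "rows": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "price": {"type": "number"},
                    },
                    "required": ["name", "price"],
                },
            }
        },
        "required": ["rows"],
    }

    resp = ollama.chat(
        model="llama3.2",
        messages=[{"role": "user", "content": "List three fruits with prices."}],
        format=schema,  # reply is constrained to this schema, no extra prose
    )
    rows = json.loads(resp.message.content)["rows"]

    with open("fruits.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["name", "price"])
        writer.writeheader()
        writer.writerows(rows)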
quaintdev, 5 months ago
Yay! It works. I used gemma2:2b and gave it the text below:

    You have spent 190 at Fresh Mart. Current balance: 5098

and it gave this output:

    {
      "amount": 190,
      "balance": 5098,
      "category": "Shopping",
      "place": "Fresh Mart"
    }
guerrilla, 5 months ago
No way. This is amazing and one of the things I actually wanted. I love ollama because it makes using an LLM feel like using any other UNIX program. It makes LLMs feel like they belong on UNIX.

Question, though: has anyone had luck running it on AMD GPUs? I've heard it's harder, but I really want to support the competition when I get cards next year.
bluechair, 5 months ago
Has anyone seen how these constraints affect the quality of the LLM's output?

In some instances, I'd rather parse Markdown or plain text if it means the quality of the output is higher.
quaintdev, 5 months ago
So I can use this with any supported model? The reason I'm asking is that I can only run 1b-3b models reliably on my hardware.
JackYoustra, 5 months ago
PRs on this have been open for something like a year! I'm a bit sad about how quiet the maintainers have been on this.
lxe, 5 months ago
I'm still running oobabooga because of its ExLlamaV2 support, which does much more efficient inference on dual 3090s.
highlanderNJ, 5 months ago
What's the value-add compared to `outlines`?

https://www.souzatharsis.com/tamingLLMs/notebooks/structured_output.html#outlines
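For context, outlines does schema-constrained generation client-side against a local model. A rough sketch under its pre-1.0 API (the model and schema are illustrative, and the API may have changed since):

    # Sketch of outlines' JSON-constrained generation (pre-1.0 API).
    from pydantic import BaseModel
    import outlines

    class Expense(BaseModel):
        amount: float
        place: str

    model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
    generator = outlines.generate.json(model, Expense)  # yields an Expense
    expense = generator("You have spent 190 at Fresh Mart.")
    print(expense.amount, expense.place)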
xnx, 5 months ago
Is there a best approach for providing structured input to LLMs? Example: feed in 100 sentences and get each one classified in different ways. It's easy to get structured data out, but my approach of prefixing line numbers seems clumsy.
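One possible pattern (an assumption on my part, not an established best practice): give each sentence an id in a JSON payload and require the model to echo that id in the output schema, so results can be joined back to inputs without fragile line numbering:

    # Sketch: id-tagged structured input paired with structured output.
    # Model name, prompt, and schema are illustrative assumptions.
    import json
    import ollama

    sentences = ["The stock fell 3%.", "A new GPU was announced."]
    payload = [{"id": i, "text": s} for i, s in enumerate(sentences)]

    schema = {
        "type": "object",
        "properties": {
            "labels": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "id": {"type": "integer"},
                        "topic": {"type": "string"},
                    },
                    "required": ["id", "topic"],
                },
            }
        },
        "required": ["labels"],
    }

    resp = ollama.chat(
        model="llama3.2",
        messages=[{"role": "user",
                   "content": "Classify each sentence by topic:\n"
                              + json.dumps(payload)}],
        format=schema,
    )
    print(resp.message.content)  # labels carry the ids back for joining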
ein0p, 5 months ago
That's very useful. To see why, try to get an LLM to _reliably_ generate JSON output without this. Sometimes it will, but sometimes it'll just YOLO and produce something you didn't ask for that can't be parsed.
rcarmo, 5 months ago
I must say it is nice to see the curl example first. As much as I like Pydantic, I still prefer to hand-code the schemas, since it makes it easier to move my prototypes to Go (or something else).
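In that spirit, a hand-coded schema sent straight to Ollama's HTTP API with no client library in between (the model name and schema are illustrative):

    # Sketch: call Ollama's /api/chat endpoint directly with a
    # hand-written JSON schema in the "format" field.
    import requests

    schema = {
        "type": "object",
        "properties": {
            "capital": {"type": "string"},
            "population": {"type": "integer"},
        },
        "required": ["capital", "population"],
    }

    resp = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": "llama3.2",
            "messages": [{"role": "user", "content": "Tell me about France."}],
            "format": schema,
            "stream": False,
        },
        timeout=120,
    )
    print(resp.json()["message"]["content"])  # schema-conforming JSON string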
seertaak, 5 months ago
Could someone explain how this is implemented? I saw on Meta's Llama page that the model has intrinsic support for structured output. My 30k ft mental model of an LLM is as a text completer, so it's not clear to me how this is accomplished.

Are llama.cpp and ollama leveraging Llama's intrinsic structured-output capability, or is this something bolted on ex post to the output? (And if the former, how is the capability guaranteed across other models?)
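As I understand it, this is the general grammar-constrained-decoding technique rather than anything intrinsic to one model (and not necessarily Ollama's exact code): the constraint lives in the sampler, which at every step masks out tokens that cannot extend a schema-valid output, which is why it works with any model. A toy sketch:

    # Toy sketch of constrained decoding: mask tokens that would leave the
    # grammar, then pick (here, greedily) among the survivors.
    def constrained_step(logits, prefix, is_valid_prefix):
        # logits: {token: score}; is_valid_prefix(text) -> bool
        allowed = {t: s for t, s in logits.items() if is_valid_prefix(prefix + t)}
        if not allowed:
            raise ValueError("no grammar-valid continuation")
        return max(allowed, key=allowed.get)

    # Toy grammar: only "yes" or "no" are valid outputs.
    valid = lambda text: any(w.startswith(text) for w in ("yes", "no"))
    tok = constrained_step({"y": 1.2, "n": 1.0, "m": 3.0}, "", valid)
    print(tok)  # "y": "m" is masked even though the raw model prefers it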
vincentpants, 5 months ago
Wow, neat! The first step toward format ambivalence! Curious to see how well this performs on the edge; our overhead is always so scarce!

Amazing work as always; looking forward to taking this for a spin!
lormayna, 5 months ago
This is fantastic news! I spent hours fine-tuning my prompt to summarize text and output JSON, and I still have issues sometimes. Is this feature also available from Go?
diimdeep, 5 months ago
Very annoying marketing, and pretending to be anything other than just a wrapper around llama.cpp.