The thing I most want from this project is a technical explanation of what it's actually doing for me and how it works.

I dug into this the other day and just about figured out how the old text-davinci-003 version works.

When it runs against a text completion model (like text-davinci-003), the trick seems to be that it breaks your overall Mustache-templated program up into a sequence of prompts.

These are executed one at a time. Some of them are open ended, but some include restrictions based on the rules you laid out.

So you might have a completion prompt that asks for a maximum of 1 token and uses the logit_bias argument to ensure that the returned value can only come from a specific set of tokens. That's how you'd satisfy a part of the program that says "the next output should be just the string 'true' or 'false'", for example.

What I don't yet understand is how it works against non-completion models. There are open issues complaining about broken examples using it with gpt-3.5-turbo, for example.

And how does it work with models other than the OpenAI ones?
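A minimal sketch of the constrained step described above (not guidance's actual code), using the legacy pre-1.0 openai SDK: a 1-token completion where logit_bias limits the answer to "true" or "false". The prompt is made up for illustration.

```python
import openai    # legacy (<1.0) SDK style
import tiktoken

enc = tiktoken.encoding_for_model("text-davinci-003")
# " true" and " false" each encode to a single token in this BPE,
# so a +100 bias on just those two ids makes them the only realistic choices.
allowed = {str(enc.encode(" true")[0]): 100,
           str(enc.encode(" false")[0]): 100}

resp = openai.Completion.create(
    model="text-davinci-003",
    prompt="Is the sky blue? Answer:",
    max_tokens=1,            # exactly one constrained token
    logit_bias=allowed,
    temperature=0,
)
print(resp["choices"][0]["text"])  # " true" or " false"
```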
Is this Microsoft's guidance? It looks like it is, and they spun it out.

I find guidance fantastic for doing complicated prompting. I haven't used the output-"controlling" feature as much as I've used it for chained prompting: ask it to come up with answers to a prompt N times, then discuss the pros and cons of each answer, then write a new answer based on the best parts of the output. Stuff like that.
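A rough sketch of what that kind of chained template looks like, from memory of the old 0.0.x Handlebars-style guidance syntax (names and arguments may not match the current API exactly; the question and counts are made up):

```python
import guidance

guidance.llm = guidance.llms.OpenAI("text-davinci-003")

program = guidance("""Question: {{question}}

{{#geneach 'drafts' num_iterations=3}}
Draft answer {{@index}}: {{gen 'this' temperature=0.8 max_tokens=80}}
{{/geneach}}

Pros and cons of each draft:
{{gen 'critique' max_tokens=200}}

Final answer, combining the best parts of the drafts:
{{gen 'final' max_tokens=120}}""")

out = program(question="How should I structure a changelog?")
print(out["final"])
```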
I've found using a JSON schema and function calling, as described in this blog post, to be just as effective as, and less opaque than, this library:

https://blog.simonfarshid.com/native-json-output-from-gpt-4

(it works perfectly with GPT-3.5 as well)
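For context, the approach is roughly this (not the post's exact code; the function name and schema here are invented for illustration, using the legacy openai SDK's function-calling parameters on the 0613 models):

```python
import json
import openai

schema = {
    "name": "record_person",  # hypothetical function name
    "parameters": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"},
        },
        "required": ["name", "age"],
    },
}

resp = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613",
    messages=[{"role": "user", "content": "Alice is 30 years old."}],
    functions=[schema],
    function_call={"name": "record_person"},  # force a "call" so output matches the schema
)
args = json.loads(resp["choices"][0]["message"]["function_call"]["arguments"])
print(args)  # e.g. {"name": "Alice", "age": 30}
```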
I've found that the template-processing approach makes large prompts hard to read as programs. The attractive part is that control flow isn't separated from the prompt the way it is in langchain, which lets you write prompts like classical programs. But the syntax remains unintuitive for large programs.
Logit-bias guidance goes a long way -- LLM structure for regex, context-free grammars, categorization, and typed construction. I'm working on a hosted and model-agnostic version of this with thiggle [0].

[0] https://thiggle.com
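A minimal sketch of the logit-masking idea behind categorization, using a local Hugging Face causal LM (the model, labels, and prompt are stand-ins): mask next-token logits so the model can only emit one of a fixed label set.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in model for illustration
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

labels = [" positive", " negative", " neutral"]
# Take the first token of each label; this works as long as those first tokens differ.
label_ids = [tok.encode(l)[0] for l in labels]

prompt = "Review: 'Great product, works perfectly.'\nSentiment:"
inputs = tok(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # next-token logits

mask = torch.full_like(logits, float("-inf"))
mask[label_ids] = 0                         # allow only the label tokens
choice = int(torch.argmax(logits + mask))
print(tok.decode([choice]))
```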
I've been trying to figure out how projects like this, Semantic Kernel (also Microsoft), and langchain add value. Is the paradigm sort of like a web framework? It reduces the boilerplate you need to write so you can focus on the business problem?

Is that needed in the LLM space yet? I'm just not convinced the abstraction pays for itself in reduced cognitive load, or at least not yet, but very happy to be convinced otherwise.
The thing that's bugging me about this ecosystem is that the library, although it augments the LLM, has to become the thing running it; I can't use guidance as a plug-in on some other LLM system.

I look forward to when we have something that can run any LLM without compatibility issues, can expose APIs, and has a robust plugin or augmentation system.
Is this alive? Last release June 21.

There are many projects like this I'm tracking, but they all kinda cool off after the initial prototype and thus have many quirks and limitations.

So far the only one I could reliably use was llama.cpp grammars, and those are fairly slow.
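For anyone who hasn't tried the grammar route: a hedged sketch via the llama-cpp-python bindings (the model path is a placeholder), constraining the output with a tiny GBNF grammar:

```python
from llama_cpp import Llama, LlamaGrammar

# GBNF grammar restricting the output to "yes" or "no"
grammar = LlamaGrammar.from_string('root ::= "yes" | "no"')

llm = Llama(model_path="model.gguf")  # placeholder path
out = llm("Is water wet? Answer: ", grammar=grammar, max_tokens=4)
print(out["choices"][0]["text"])
```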
I'm hacking on a library (https://github.com/gsuuon/ad-llama) inspired by guidance, but in TS and for the browser. I think structured inference and controlled sampling are really good ways of getting consistent responses out of LLMs. They let smaller models really punch above their weight.

I wonder what other folks are building on this sort of workflow? I've been playing around with it and trying to figure out interesting applications that weren't possible before.
I've seen this link pop up in various places now, but it seems like it's still mostly not being developed? Is there a reason it was posted today? Some new development in it?
I've been using this library a lot; it's amazing. However, I noticed a considerable degradation (in both time taken and generation quality) with versions > 0.0.58 when used with local LLMs.

I haven't taken the time to compare the different releases, but if anyone is having the same kind of issues, I recommend downgrading even if it might mean fewer features.
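If you want to try the downgrade described above, pinning the version is a one-liner (assuming pip; 0.0.58 is the release the comment mentions):

```
pip install "guidance==0.0.58"
```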