This is very neat, thanks for sharing. I was wondering about a related thing: is there a way to query llama.cpp (or another such local model) via an API from Python? In other words, I see a lot of cool applications being built with langchain + ClosedAPI, so I'm wondering whether an API call to a local model could be a drop-in replacement for the ClosedAPI call?
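
To make it concrete, here's the kind of thing I'm imagining (just a sketch, assuming llama.cpp's server example is running locally and exposes an OpenAI-compatible endpoint; the port, model name, and api_key values are placeholders I haven't verified):

```python
# Sketch: point the regular OpenAI Python client at a local llama.cpp server
# (e.g. started with something like `./llama-server -m model.gguf --port 8080`).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # local server instead of the hosted API
    api_key="not-needed",                 # placeholder; a local server may not check it
)

response = client.chat.completions.create(
    model="local-model",  # placeholder name for whatever model the server has loaded
    messages=[{"role": "user", "content": "Hello from a local model!"}],
)
print(response.choices[0].message.content)
```

If that works, then in principle anything built on the ClosedAPI client (langchain included) could be pointed at the local endpoint by swapping the base URL, which is exactly the drop-in behavior I'm hoping for.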