Lately, I've been tinkering with llama.cpp and the ollama server. The speed of these tools caught my attention, even on my modest 4060 setup, and I was quite impressed with the generation quality of models like Mistral.

At the same time, I was a bit unhappy: whenever I explore a topic, there is a lot of typing involved in the chat interface. I wanted a tool that not only gives a response but also generates a set of "suggestions" that can be explored further just by clicking.

My experience in front-end development is limited, but I tinkered together a small web app that does exactly that. It is built with Vue 3 + Vuetify.

Code: https://github.com/charstorm/llmbinge/
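Roughly, the core loop looks something like the sketch below (simplified, not the exact code in the repo; it assumes a local ollama server and its /api/generate endpoint, and the prompt wording is just illustrative):

    async function generate(prompt: string): Promise<string> {
      // ollama's generate endpoint; "mistral" is whatever model is pulled locally
      const res = await fetch("http://localhost:11434/api/generate", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ model: "mistral", prompt, stream: false }),
      });
      const data = await res.json();
      return data.response;
    }

    // One canned prompt produces the main answer, another produces the
    // suggestions. The exact wording here is hypothetical.
    async function explore(topic: string) {
      const answer = await generate(`Give a short explanation of: ${topic}`);
      const raw = await generate(
        `List 5 topics related to "${topic}", one per line, no numbering.`
      );
      const suggestions = raw
        .split("\n")
        .map((s) => s.trim())
        .filter((s) => s.length > 0);
      return { answer, suggestions }; // rendered as text + clickable items
    }

Clicking a suggestion just calls explore() again with that topic as the new query, which is what makes the click-to-browse loop work.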
Interesting concept! Can you share some more detail about the implementation? How are you generating the different portions of the interface? It seems like you have a couple of canned prompts that trigger a few exploratory ideas in addition to the primary response.