科技回声

Hi HN,We're a small team building AI tutors out of India, and as you might guess, this means we spend a ton of time writing, testing, and refining prompts for LLMs. When we started out, we were using the OpenAI playground but things became tedious when we wanted to compare responses from different models. We tried a bunch of other playgrounds but found them lacking in some features so we built our own.Quick Links:Github: <a href="https://github.com/supernova-app/ai-playground">https://github.com/supernova-app/ai-playground</a>Hosted demo: <a href="http://playground.getsupernova.ai" rel="nofollow">http://playground.getsupernova.ai</a>Demo video: <a href="https://www.youtube.com/watch?v=I01_t75FT-c" rel="nofollow">https://www.youtube.com/watch?v=I01_t75FT-c</a>TLDR:Main features are:- Monaco editor for writing prompts.- Variable support in prompts {{}}.- Syntax highlighting for tags like XML.- Generate multiple completions with same model.- Chat with multiple models simultaneously.- Save prompt and conversations as JSON.- Easy to self host.Key Features:1. Monaco Editor for Writing PromptsWhen we were working on long, detailed prompts, writing them in plain text felt clunky and error-prone. Small issues—like missing a tag or having weird formatting—could break things.So, we integrated the Monaco editor (used in VS Code). It gives us:- Line numbers (so we don't get lost in long prompts).- White space detection to catch formatting issues early.- Syntax highlighting for tags like XML.- Code folding to collapse parts of a prompt we're not actively working on.These might sound like small things, but they've been a huge help when we're dealing with large, complex prompts that need constant tweaking.2. Variable Support for Dynamic PromptsYou can define placeholders in your prompt using double curly braces ({{ }}) and fill them in via a friendly UI.3. Testing for Consistency Across CompletionsOne of the hardest parts of building AI tutors has been ensuring reliable outputs. Even when a prompt seems fine, it can fail unexpectedly—or worse, it works sometimes but not always.To address this, we made it easy to generate multiple completions from the same model at once. This lets us quickly see:- If the prompt is consistently producing good results.- Where the AI might misinterpret our intent.For example, we'd often run 5–6 completions to see if the AI consistently understood our instructions, rather than getting lucky once or twice.4. Comparing Models Side by SideThe main reason we built the playground in the first place. You can set up API keys for multiple providers and see how different models handle the same task. This helped us:- Optimize prompts for specific models.- Choose the best model for a particular use case.5. Saving Conversations as Test CasesAnother pain point was testing how a prompt or conversation would evolve over time. Sometimes, we'd go back and forth with the AI to simulate real user interactions, but we had no way to save that conversation for future reference.Now, with the playground, we can save these conversations as test cases. Here's how it works: 1. We create a conversation (or simulate a long interaction). 2. At any point, we can save it as a JSON file. 3. The JSON includes the full conversation, the system prompt, and any variables we used.We then use this JSON file and use it in our code for running test cases or run evals.6. Simple Self-HostingFinally, we wanted to make sure the playground was easy for others to set up. The only dependencies are:- A Postgres database.- API keys for the AI providers you want to use.It supports Google login, so you can setup OAuth and can restrict access to only your domain.The app is open source and we are running a hosted version of it here: <a href="http://playground.getsupernova.ai" rel="nofollow">http://playground.getsupernova.ai</a>.You can check out the repo here: <a href="https://github.com/supernova-app/ai-playground">https://github.com/supernova-app/ai-playground</a>. It's easy to self-host, and we're actively working on new features.If you give it a try, let us know what you think! Feedback, feature ideas, and contributions are all welcome.

Show HN: Open Source AI Playground for Prompt Engineers

暂无评论

Show HN: Open Source AI Playground for Prompt Engineers

暂无评论