Show HN: Empirical – test framework for JavaScript developers building with LLMs

4 points by arjun27 | 12 months ago
Hi HN!

This is Arjun and Saikat, and like other product engineers, we've been excited to build with LLMs. Getting powerful models available as off-the-shelf HTTP endpoints is a huge leap forward to integrate and ship ML to end-users.

While building on top of LLMs, we've also experienced the pain of non-deterministic behavior – especially for applications that require smaller models. Iterating through model configuration while ensuring no regressions across hundreds of scenarios is a tricky balance.

To make this easier, we built Empirical. Here’s a demo video: https://www.youtube.com/watch?v=p8gSGphcOSU

We've focused on:

- Fast iteration cycles and interactivity when you need to change the prompt or add a new sample. We wanted to build something that feels like “hot reload” for LLM development
- A capable UI that combines objective and subjective evaluation, since eye-balling outputs makes it easier to build intuition around model behavior
- Ability to customize which model to test, or how to score it, with JavaScript (or Python, if you really must)
- Embedded analytics for evaluation results, powered by DuckDB under the hood (more coming up on this!)

You can try Empirical today – with a one-line CLI command – locally or on CI/CD. And oh, Empirical is 100% open source – so file an issue and we’d be happy to make it work for your use-case

$ npx empiricalrun

GitHub: https://github.com/empirical-run/empirical

Docs: https://docs.empirical.run/

No comments yet
