Show HN: Ragas – Open-source library for evals and testing RAG systems

15 points by shahules, about 1 year ago
Ragas is an open-source library designed for evaluating and testing RAG (Retrieval-Augmented Generation) and other LLM applications. It offers a diverse set of metrics and methods, including synthetic test data generation, to help you assess your RAG applications. Ragas was initially developed to address our own needs for evaluating RAG chatbots last year.

### Problems Ragas Can Solve:

- How can you select the best components for your RAG, such as the retriever, reranker, and LLM?
- How can you create a test dataset without incurring significant expenses and time?

We believe there's a need for an open-source standard for evaluating and testing LLM applications. Our vision is to establish this standard for the community. We're addressing this challenge by adapting ideas from the traditional ML lifecycle for LLM applications.

### ML Testing Evolved for LLM Applications

Ragas is founded on the principles of metrics-driven development. Our goal is to develop and innovate techniques inspired by the latest research to address the challenges in evaluating and testing LLM applications.

We don't think that merely building a sophisticated tracing tool will solve the evaluation and testing challenges. Instead, we aim to tackle these issues from a foundational level. To this end, we're introducing methods such as automated synthetic test data curation, metrics, and feedback utilization. These approaches are inspired by lessons learned from deploying stochastic models throughout our careers as machine learning engineers.

While our current focus is on RAG pipelines, we intend to expand Ragas to test a broad spectrum of compound systems. This includes systems based on RAGs, agentic workflows, and various transformations.

### Try Ragas

Experience Ragas by trying it out in Google Colab [here](https://colab.research.google.com/github/shahules786/openai-cookbook/blob/ragas/examples/evaluation/ragas/openai-ragas-eval-cookbook.ipynb). For more information, read our [documentation](https://docs.ragas.io/).

We would love to hear feedback from the Hacker News community :)
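
To make the component-selection question in the post concrete, here is a minimal sketch of metrics-driven comparison between two retrievers. It does not use the Ragas API: the toy test set, the retriever outputs, and the helper names (`recall_at_k`, `mean_recall`, `pick_best`) are all hypothetical, and recall@k is only a stand-in for whatever metric a real evaluation would use.

```python
# Hypothetical toy test set: each question maps to the IDs of the documents
# a correct answer needs. Invented for illustration, not produced by Ragas.
TEST_SET = {
    "q1": {"d1", "d2"},
    "q2": {"d3"},
}

# Hypothetical ranked outputs from two candidate retrievers.
RETRIEVER_A = {"q1": ["d1", "d4", "d2"], "q2": ["d5", "d3", "d6"]}
RETRIEVER_B = {"q1": ["d4", "d5", "d1"], "q2": ["d6", "d7", "d8"]}


def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def mean_recall(results: dict[str, list[str]], k: int = 3) -> float:
    """Average recall@k over every question in the test set."""
    scores = [
        recall_at_k(results.get(question, []), relevant, k)
        for question, relevant in TEST_SET.items()
    ]
    return sum(scores) / len(scores)


def pick_best(candidates: dict[str, dict[str, list[str]]], k: int = 3) -> str:
    """Return the name of the candidate with the highest mean recall@k."""
    return max(candidates, key=lambda name: mean_recall(candidates[name], k))


if __name__ == "__main__":
    candidates = {"A": RETRIEVER_A, "B": RETRIEVER_B}
    for name, results in candidates.items():
        print(name, mean_recall(results))
    print("best:", pick_best(candidates))
```

The same loop would apply to rerankers or LLMs: hold the test set fixed, swap one component, and compare the scores.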

6 comments

jcyriac, about 1 year ago
The synthetic test data generation seems very useful. Do you have any idea of the cost of running this?
diyaliza, about 1 year ago
How does Ragas handle the challenge of adapting traditional ML testing methodologies to suit the intricacies of LLM applications?
kurianbenoy, about 1 year ago
How is the synthetic test data generation done in Ragas?

Can I use custom open-source models like Mistral 7B to generate synthetic test data?
ajinmsaji, about 1 year ago
Is there support for open-source models? Btw, love your work!!
thomaspeter1998, about 1 year ago
How do you actually use models for evaluation?
donalex98, about 1 year ago
Can I use OSS models like Mixtral with it?