Show HN: Ragas – Open-source library for evals and testing RAG systems

15 points by shahules about 1 year ago
Ragas is an open-source library for evaluating and testing RAG (Retrieval-Augmented Generation) and other LLM applications. It offers a diverse set of metrics and methods, including synthetic test data generation, to help you assess your RAG applications. We initially built Ragas last year to meet our own needs for evaluating RAG chatbots.

### Problems Ragas Can Solve:

- How can you select the best components for your RAG, such as the retriever, reranker, and LLM?
- How can you create a test dataset without incurring significant cost and time?

We believe there's a need for an open-source standard for evaluating and testing LLM applications. Our vision is to establish this standard for the community. We're addressing this challenge by adapting ideas from the traditional ML lifecycle for LLM applications.

### ML Testing Evolved for LLM Applications

Ragas is founded on the principles of metrics-driven development. Our goal is to develop and innovate techniques inspired by the latest research to address the challenges in evaluating and testing LLM applications.

We don't think that merely building a sophisticated tracing tool will solve the evaluation and testing challenges. Instead, we aim to tackle these issues from a foundational level. To this end, we're introducing methods such as automated synthetic test data curation, metrics, and feedback utilization. These approaches are inspired by lessons learned from deploying stochastic models throughout our careers as machine learning engineers.

While our current focus is on RAG pipelines, we intend to expand Ragas to test a broad spectrum of compound systems. This includes RAG-based systems, agentic workflows, and various transformations.

### Try Ragas

Experience Ragas by trying it out in Google Colab [here](https://colab.research.google.com/github/shahules786/openai-cookbook/blob/ragas/examples/evaluation/ragas/openai-ragas-eval-cookbook.ipynb). For more information, read our [documentation](https://docs.ragas.io/).

We would love to hear feedback from the Hacker News community :)
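To make the component-selection question above concrete, here is a minimal sketch of metrics-driven comparison between two retrievers. It does not use the Ragas API. The corpus, the test set, both retrievers, and the substring-based `context_recall` function are all hypothetical stand-ins invented for illustration, and the metric is far cruder than anything a real evaluation library would ship.

```python
from typing import Callable, Dict, List, Set

# Hypothetical toy corpus and test set (assumptions, not Ragas data structures).
CORPUS: List[str] = [
    "ragas is an open-source library that supports evaluation of llm applications",
    "rag means retrieval-augmented generation",
    "hacker news is a forum",
]

TEST_SET = [
    {"question": "What is Ragas?",
     "reference_facts": {"open-source", "evaluation"}},
    {"question": "What does RAG stand for?",
     "reference_facts": {"retrieval-augmented", "generation"}},
]

Retriever = Callable[[str], List[str]]


def overlap_retriever(question: str, k: int = 1) -> List[str]:
    """Rank documents by how many question words they share."""
    q_words = set(question.lower().replace("?", "").split())
    ranked = sorted(
        CORPUS,
        key=lambda doc: len(q_words & set(doc.split())),
        reverse=True,
    )
    return ranked[:k]


def first_k_retriever(question: str, k: int = 1) -> List[str]:
    """Naive baseline: ignore the question and return the first k documents."""
    return CORPUS[:k]


def context_recall(retrieved: List[str], reference_facts: Set[str]) -> float:
    """Toy metric: fraction of reference facts found verbatim in the retrieved text."""
    text = " ".join(retrieved)
    hits = sum(1 for fact in reference_facts if fact in text)
    return hits / len(reference_facts)


def score_retriever(retriever: Retriever) -> float:
    """Average the toy metric over the whole test set."""
    scores = [
        context_recall(retriever(item["question"]), item["reference_facts"])
        for item in TEST_SET
    ]
    return sum(scores) / len(scores)


def compare(retrievers: Dict[str, Retriever]) -> Dict[str, float]:
    """Score every candidate component on the same test set."""
    return {name: score_retriever(r) for name, r in retrievers.items()}


results = compare({"overlap": overlap_retriever, "first_k": first_k_retriever})
print(results)
```

On this toy data the overlap retriever scores higher than the baseline. The point is only the workflow: fix a test set, score each candidate component with the same metric, and pick by the numbers.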

6 comments

jcyriac about 1 year ago
The synthetic test data generation seems very useful. Do you have any idea of the cost of running this?
diyaliza about 1 year ago
How does Ragas handle the challenge of adapting traditional ML testing methodologies to suit the intricacies of LLM applications?
kurianbenoy about 1 year ago
How is the synthetic test data generation done in Ragas?

Can I use custom open-source models like Mistral 7B to generate synthetic test data?
ajinmsaji about 1 year ago
Is there support for open-source models? Btw, love your work!!
thomaspeter1998 about 1 year ago
How do you actually use models for evaluation?
donalex98 about 1 year ago
Can I use OSS models like Mixtral with it?