DeepEval aims to make writing tests for LLM applications (such as RAG) as easy as writing Python unit tests.

For any Python developer building production-grade apps, it is common to set up PyTest as the default testing suite, since it provides a clean interface for quickly writing tests.

This workflow is far less common among machine learning engineers, whose feedback usually arrives as an evaluation loss rather than a passing or failing test.

With the advent of agents, LLMs, and AI, there is not yet a tool that gives machine learning engineers software-style tooling and abstractions to significantly shorten this feedback loop.

It is therefore important to build a new type of testing framework for LLMs, so that engineers can keep iterating on their prompts, agents, and models while continuously growing their test suite.

Introducing DeepEval.
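
To make the "unit tests for LLM apps" idea concrete, here is a minimal sketch of what such a test might look like. The names here (`toy_relevancy_metric`, `test_llm_answer_is_relevant`) are hypothetical illustrations, not DeepEval's actual API: the pattern is simply to score an LLM output with a metric and assert it clears a threshold, just like any other pytest test.

```python
import string


def _words(text: str) -> set[str]:
    """Lowercase a string and split it into punctuation-stripped words."""
    return {w.strip(string.punctuation) for w in text.lower().split()}


def toy_relevancy_metric(question: str, answer: str) -> float:
    """Toy metric (illustrative only): fraction of question words found in the answer."""
    q_words, a_words = _words(question), _words(answer)
    return len(q_words & a_words) / len(q_words) if q_words else 0.0


def test_llm_answer_is_relevant():
    question = "What is the capital of France?"
    answer = "The capital of France is Paris."  # stand-in for a real LLM call
    score = toy_relevancy_metric(question, answer)
    # Fail the test, pytest-style, if the score falls below a chosen threshold.
    assert score >= 0.5, f"relevancy score {score:.2f} below threshold"
```

Running this file under pytest would pass or fail like any ordinary unit test, which is the feedback loop the paragraph above argues machine learning engineers are missing.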