A new open source project by the creator of Airflow & Superset for testing prompts at scale. The toolkit brings many of the ideas from test-driven development to prompt engineering, so that people integrating AI into their products can assert how it performs as they iterate on prompts and models. The author describes his own use case: testing text-to-SQL generation against a large corpus of thousands of prompt cases, across different models and through successive iteration cycles.
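To make the test-driven idea concrete, here is a minimal sketch in plain pytest terms, not the project's actual API: each prompt case pairs an input with an assertion on the model's output, and the whole suite is re-run after every prompt or model change. The `generate_sql` helper and the example cases below are hypothetical placeholders.

```python
# A minimal, hypothetical sketch of test-driven prompt evaluation.
# `generate_sql` stands in for whatever function calls your LLM with a
# text-to-SQL prompt template; the cases and assertions are illustrative,
# not the toolkit's API.
import pytest


def generate_sql(question: str) -> str:
    """Placeholder: send `question` to the model with your prompt
    template and return the generated SQL string."""
    raise NotImplementedError("wire this up to your LLM call")


# Each case pairs a natural-language question with substrings the
# generated SQL must contain -- a simple, deterministic assertion.
CASES = [
    ("How many users signed up last month?", ["COUNT", "users"]),
    ("Total revenue by country in 2023", ["SUM", "GROUP BY", "country"]),
]


@pytest.mark.parametrize("question, expected_fragments", CASES)
def test_text_to_sql(question, expected_fragments):
    sql = generate_sql(question)
    for fragment in expected_fragments:
        assert fragment.lower() in sql.lower()
```

Re-running a suite like this after each prompt tweak or model swap yields a pass rate you can track over time, which is essentially what the toolkit enables at the scale of thousands of cases.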