How do people feel about LLM-generated tests?

I tried creating some on a personal project just using ChatGPT, and it saved me a lot of toil on tests I probably wouldn't have written otherwise. I still had low confidence when refactoring my code afterward, though more than if I'd had no tests at all.

It seemed like a net positive for low-risk cases.
Per the cited real-world figures, that's about 1 in 40 generated tests passing human review, or a success rate of about 2.5%.

It's hard to see value in spending resources this way right now - most notably, engineer time to review the generated tests. Improve the hit rate by an order of magnitude and I suspect I'd feel differently.
Tried this out on a Ruby codebase and it generated Python tests: https://github.com/Codium-ai/cover-agent/issues/17. Is there any data available on whether this actually works?
Using ChatGPT to generate unit tests works great almost out of the box, but I guess this system solves the remaining 5% to make it fully automated end-to-end. I believe this will work and help us write better software, given that I've seen numerous cases where the generated tests (even with inferior models) caught not-so-obvious bugs.
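To make that concrete, here's a hypothetical sketch (not from my actual project, and the function name is made up): a small helper with an off-by-one in its upper bound, and the kind of parametrized boundary test an LLM tends to propose unprompted.

    import pytest

    # hypothetical helper with a subtle boundary bug:
    # the window is meant to be inclusive on both ends,
    # but the upper comparison uses < instead of <=
    def is_within_window(value, low, high):
        return low <= value < high

    # the kind of boundary-focused test an LLM typically generates
    @pytest.mark.parametrize("value,expected", [
        (0, True),    # lower bound, inclusive
        (5, True),    # middle of the window
        (10, True),   # upper bound -- fails, exposing the off-by-one
        (11, False),  # just outside the window
    ])
    def test_is_within_window_bounds(value, expected):
        assert is_within_window(value, 0, 10) == expected

The upper-bound case is exactly the test most of us skip when writing by hand, which is where the generated tests have earned their keep for me.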
To the OP:

Is your name a reference to Gronky Scripples? https://www.youtube.com/watch?v=4KG3v365mq4