Code coverage is useful as a metric as long as it's not turned into a target. But its primary usefulness is to tell a developer whether they've given enough thought to how to test that their code is correct.<p>A robot that back-fills coverage in tests seems... counterproductive to me.
Knee-jerk reaction is to hate this, but if the PRs are small enough to quickly review and/or fix, having your test suite start to execute branches of code that it previously wasn't has some value? It seems much (or exactly?) like using Copilot: it's going to be wrong, but sometimes it's 80% there nearly instantly, and you spend all of your time on the last 20%. Still, time saved and value gained, so long as you know what you're doing yourself. Maybe at least annotate the bot tests so they don't get mixed in with the intentionally added human tests; then it's even harder to justify throwing this idea out completely.<p>Even if the test is COMPLETELY off, it might be enough to get someone thinking "but wait, that gives me enough of an idea to test this branch of code" or, even better, "wait, this branch of code is useless anyway, let's just delete it"
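<p>One way to do that annotation, assuming a pytest-based suite (as in the timm repo; the "botgen" marker name below is made up): register a custom marker so bot-written tests can be filtered or reported separately.<p><pre><code>  # conftest.py -- register a marker for bot-generated tests (hypothetical name)
  def pytest_configure(config):
      config.addinivalue_line(
          "markers", "botgen: test back-filled by a coverage bot, not written by a human"
      )

  # in a test module -- tag the bot's tests so they stay distinguishable
  import pytest

  @pytest.mark.botgen
  def test_backfilled_branch():
      ...
</code></pre><p>Running pytest -m "not botgen" would then exercise only the human-written tests, and -m botgen only the bot's.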
Those are some pointless tests.<p>E.g. test_activation_stats_functions [1] just checks that the returned value is a float and that the function accepts random numbers as input.<p>test_get_state_dict_custom_unwrap [2] is probably supposed to check that custom_unwrap is invoked, but since it neither records being called nor transforms its input, the assertions can't actually verify that it was called.<p>[1] <a href="https://github.com/huggingface/pytorch-image-models/pull/2331/files#diff-33c13e0b177bacd2f02e29bcb8aea5b49e7ce34901fd8f41fefb65defba1bd33R116">https://github.com/huggingface/pytorch-image-models/pull/233...</a><p>[2] <a href="https://github.com/huggingface/pytorch-image-models/pull/2331/files#diff-33c13e0b177bacd2f02e29bcb8aea5b49e7ce34901fd8f41fefb65defba1bd33R164">https://github.com/huggingface/pytorch-image-models/pull/233...</a>
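<p>For the second one, a minimal sketch of how the unwrap hook could be made observable -- assuming timm exposes a get_state_dict(model, unwrap_fn) helper that applies unwrap_fn to the model before reading its state dict (the import path below is a guess). Wrapping the callback in a Mock records the call, so the assertion actually proves it ran:<p><pre><code>  # Sketch only; adjust import path and signature to the real helper.
  from unittest.mock import Mock

  import torch.nn as nn
  from timm.utils.model import get_state_dict  # assumed location


  def test_get_state_dict_custom_unwrap_is_called():
      model = nn.Linear(4, 2)
      custom_unwrap = Mock(wraps=lambda m: m)  # observable pass-through

      state_dict = get_state_dict(model, unwrap_fn=custom_unwrap)

      custom_unwrap.assert_called_once_with(model)  # the call itself is now checked
      assert "weight" in state_dict and "bias" in state_dict
</code></pre>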
Okay, but what do these tests <i>mean</i>? It's easy to add tests that are meaningless and either test the obvious or exist just for coverage.<p>But some of the buggiest stuff I've dealt with was in codebases that had full coverage. Because none of the tests were designed to test the original intent of the code.
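<p>A contrived example of the difference: both tests below give the buggy clamp() full line coverage, but only the second encodes the intent and would actually fail.<p><pre><code>  # Contrived illustration: 100% line coverage without testing intent.
  def clamp(value, low, high):
      # Intent: keep value within [low, high]; both early returns are deliberately wrong.
      if value < low:
          return high  # bug: should be low
      if value > high:
          return low   # bug: should be high
      return value


  def test_clamp_full_coverage_no_intent():
      # Hits all three branches, so the coverage report is perfect, yet catches nothing.
      assert isinstance(clamp(-1, 0, 10), int)
      assert isinstance(clamp(11, 0, 10), int)
      assert isinstance(clamp(5, 0, 10), int)


  def test_clamp_intent():
      # States what clamp is supposed to do; this one fails and exposes the bug.
      assert clamp(-1, 0, 10) == 0
      assert clamp(11, 0, 10) == 10
</code></pre>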
What would it look like if a tool like this were taken to its logical conclusion: an agent that could write unit tests for any existing code and get it to 100% code coverage? Not only that, it would have tests for every reachable combinatorial code path. Test files would outnumber human code by orders of magnitude. The ultimate pipe dream of some software developers would be fulfilled.<p>Would such a tool be helpful? Probably in some circumstances (e.g. spacecraft software), but I sure wouldn't want it. If this is less than ideal, then how do we reach a compromise? Which code branches and functions should be left untested? Is that question even answerable from the textual representation of code and documentation?
Will we reach Dead GitHub eventually?<p>- DependaBot<p>- CoverageBot<p>- ReplyBot<p>- CodeOfConductEnforcementBot<p>- KudosBot<p>Old projects could keep looking active forever without any human intervention.