The author's problem is pretty simple: the test repo is required for pre-merge tests to pass, but it can be updated independently, without having pre-merge tests pass.<p>And the answer is pretty simple: pin the specific test repo version! Use lockfiles, or git submodules, or put "cd tests && git checkout 3e524575cc61" in your CI config file _and keep that config in the same repo as the source code_ (that part is very important!).<p>This solves all of the author's problems:<p>> new test case is added to the conformance test suite, but that test happens to fail. Suddenly nobody can submit any changes anymore.<p>The conformance test suite is pinned, so the new test is not run. A separate PR has to update the conformance suite version/revision, and it must go through the regular driver PR process and therefore must pass. In practice, that's a PR with two changes: update the pin and disable the new test.<p>> are you going to remember to update that exclusion list?<p>That's why you use an "expect fail" list (not an exclusion list) and keep it in the driver's directory. As you submit your PR you might see a failure saying: "congrats, test X which was expect-fail is now passing! Please remove it from the list." You'll need to make one more PR revision, but then you get working tests.<p>> allowing tests to be marked as "expected to fail". But they typically also assume that the TB can be changed in lockstep with the SUT and fall on their face when that isn't the case.<p>And if your TB cannot be changed in lockstep with the SUT, you are going to have a truly miserable time. You cannot even reproduce the problems of the past!
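The expect-fail bookkeeping above is simple enough to sketch. This is a minimal, hypothetical version (test names and the list itself are made up, not from any real driver): CI fails both on an unexpected failure and on a test that is still on the list but now passes, which forces the list to stay current.

```python
# Hypothetical sketch of an "expect fail" list kept in the driver's directory.
# Names below are invented for illustration.

EXPECTED_FAILURES = {"test_vertex_clipping", "test_sparse_binding"}

def reconcile(results):
    """results: dict of test name -> True (passed) / False (failed).

    Returns (unexpected_failures, newly_passing). CI should fail on either:
    the first set is a real regression, the second means the expect-fail
    list is stale and the entry must be removed in the PR.
    """
    unexpected_failures = {name for name, ok in results.items()
                           if not ok and name not in EXPECTED_FAILURES}
    newly_passing = {name for name, ok in results.items()
                     if ok and name in EXPECTED_FAILURES}
    return unexpected_failures, newly_passing

# Example run: an expect-fail test now passes, so CI asks you to drop it.
bad, stale = reconcile({"test_vertex_clipping": True, "test_basic_draw": True})
# bad == set(), stale == {"test_vertex_clipping"}
```

Note this is stricter than a plain exclusion list: an exclusion list silently skips the test forever, while an expect-fail list still runs it and tells you the moment it starts passing.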
So make sure your kernel version is known or at least recorded, and your repos are pinned. Ideally the whole machine image, packages and all, is archived somehow -- maybe via docker, a raw disk image, or some sort of ostree system.<p>> Problem #2 is that good test coverage means that tests take a very long time to run.<p>The described system sounds very nice, and I would love to have something like it. I suspect it will be non-trivial to get working, however. Meanwhile, there is a manual solution: have more than one test suite. "Pre-merge" tests run before each merge and contain a small subset of the testing. A bigger "continuous" suite (if you use physical machines) or "every X hours" suite (if you use some sort of auto-scaling cloud) runs a bigger set of tests, and can be triggered manually on PRs if a developer suspects a PR is especially risky.<p>You can even have multiple levels (pre-merge, once per hour, 4 times per day), but this is often more trouble than it's worth.<p>And of course it is absolutely critical to have reproducible tests first -- if you come in to work and find a bunch of continuous failures, you want to be able to re-run with extra debugging or bisect what happened.
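The tiered-suite idea can be sketched in a few lines. This is a made-up illustration (tier and test names are invented): each test declares the cheapest tier that runs it, and a more expensive tier also runs everything from the tiers below, so "continuous" is always a superset of "pre-merge".

```python
# Hypothetical tiered test selection; all names here are invented.
TIERS = ["pre-merge", "hourly", "daily"]  # ordered cheap -> expensive

TESTS = {
    "test_smoke": "pre-merge",          # fast sanity check on every merge
    "test_full_conformance": "hourly",  # big suite, too slow for pre-merge
    "test_fuzz_long": "daily",          # very expensive, runs a few times a day
}

def select(tier):
    """Return the tests to run at the given tier, including all cheaper tiers."""
    budget = TIERS.index(tier)
    return sorted(name for name, t in TESTS.items()
                  if TIERS.index(t) <= budget)

# select("pre-merge") -> ["test_smoke"]
# select("hourly")    -> ["test_full_conformance", "test_smoke"]
```

The key design choice is that a test lives in exactly one place and the tiers nest, so there is no separate list to keep in sync when a slow test is promoted or demoted.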