SMURF: Beyond the Test Pyramid

72 points by BerislavLopac 7 months ago

13 comments

I _really_ have to dispute the idea that unit tests score the maximum on maintainability. The fact that they are _so_ tightly tied to lower-level code makes your code _miserable_ to maintain. Anyone who's ever had to work on a system that had copious unit tests deep within will know the pain of not just changing code to fix a bug, but having to change a half-dozen tests because your function interfaces have now changed and a healthy selection of your tests refuse to run anymore.

The "test diamond" has been what I've been working with for a long while now, and I find I greatly prefer it. A few E2E tests to ensure critical system functionality works, a whole whack of integration tests at the boundaries of your services/modules (which should have well-defined interfaces that are unlikely to change frequently when making fixes), and a handful of unit tests for things that are Very Important or just difficult or really slow to test at the integration level.

This helps keep your test suite size from running away on you (unit tests may be fast, but if you work somewhere that has a fetish for them, it can still take forever to run a few thousand), ensures you have good coverage, and helps reinforce good practices around planning and documentation of your system/module interfaces and boundaries.
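
To make the "test at the boundary" idea above concrete, here is a minimal sketch. Everything in it is hypothetical (the `checkout_total` function, its helpers, and the 10% discount rule are invented for illustration): the test only touches the module's public interface, so the private helpers can be renamed, merged, or rewritten without any test needing to change.

```python
# Hypothetical pricing module. `checkout_total` is the public boundary;
# the underscore-prefixed helpers are implementation details that are
# free to change during a bug fix or refactor.

def _subtotal_cents(items):
    """Sum price * quantity over (price_cents, quantity) pairs."""
    return sum(price * qty for price, qty in items)


def _discount_cents(subtotal):
    """Invented rule: 10% off orders of 100.00 or more."""
    return subtotal // 10 if subtotal >= 10_000 else 0


def checkout_total(items):
    """Public interface: total in cents for a list of (price_cents, qty)."""
    subtotal = _subtotal_cents(items)
    return subtotal - _discount_cents(subtotal)


def test_checkout_total_at_the_boundary():
    # Only the public function is exercised; the helpers are never
    # called directly, so their signatures are not pinned by the test.
    assert checkout_total([(999, 1)]) == 999
    assert checkout_total([(2_500, 4)]) == 9_000
```

A unit test written directly against `_discount_cents` would break the moment that helper's signature changed, which is the maintenance cost the comment describes.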

usbsea 7 months ago

This is obvious, as another commenter said, but it is nonetheless useful.

You can use it to show graduates. Why have them waste time relearning the same mistakes? You probably need a longer blog post with examples.

It is useful as a checklist, so you can pause when working earlier in the lifecycle to consider these things.

I think there is power in spelling out the obvious. Sometimes experienced people miss it!

The diagram can be condensed by saying SMUR + F = 1. In other words, you can slide towards Fidelity, or towards "Nice Testability", which covers the SMUR properties.

However, it is more complex!

Let's say you have a unit test for a parser within your code. For a parser, a unit test might have pretty much the same fidelity as an integration test (running the parse from a unit test, rather than, say, doing a compilation from something like Replit online). But the unit test keeps all its other properties unchanged in this instance.

Another point is that you are not testing anything if you have zero e2e tests. You get a lot (a 99-1, not 80-20) by having some e2e tests; then soon the other types of tests almost always make sense. In addition, e2e tests, if well written and well considered, can also be run in production as synthetics.

imiric 7 months ago

This is interesting, but I see a few issues with it:

- Maintainability is difficult to quantify, and often subjective. It's also easy to fall into a trap of overoptimizing or DRYing test code in the pursuit of improving maintainability, and actually end up doing the opposite. Striking a balance is important in this case, which takes many years of experience to get a feel for.

- I interpret the chart to mean that unit tests have high maintainability, i.e. it's a good thing, when that is often not the case. If anything, unit tests are the most brittle and susceptible to low-level changes. This is good since they're your first safety net, but it also means that you spend a lot of time changing them. Considering you should have many unit tests, a lot of maintenance work is spent on them.

I see the reverse for E2E tests as well. They're easier to maintain, since typically the high-level interfaces don't change as often, and you have fewer of them.

But most importantly, I don't see how these definitions help me write better tests, or choose what to focus on. We all know that using fewer resources is better, but that will depend on what you're testing. Nobody likes flaky tests, but telling me that unit tests are more reliable than integration tests won't help me write better code.

What I would like to see instead are concrete suggestions on how to improve each of these categories, regardless of the test type. For example, not relying on time or sleeping in tests is always good to minimize flakiness. Similarly for relying on system resources like the disk or network; that should be done almost exclusively by E2E and integration tests, and avoided (mocked) in unit tests. There should also be more discussion about what it takes to make code testable to begin with. TDD helps with this, but you don't need to practice it to the letter if you keep some design principles in mind while you're writing code that will make it easier to test later.

I've seen many attempts at displacing the traditional test pyramid over the years, but so far it's been the most effective guiding tool in all projects I've worked on. The struggle that most projects experience with tests stems primarily from not following its basic principles.
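
As one illustration of the "don't rely on time or sleeping" suggestion above, here is a minimal sketch of clock injection. The `Session` and `FakeClock` classes are hypothetical names invented for this example; the only real API used is `time.monotonic` from the standard library. The code under test takes its clock as a parameter, so the test can move time forward instantly instead of calling `time.sleep`.

```python
import time
from typing import Callable


class Session:
    """Hypothetical session that expires ttl_seconds after creation."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        # The clock is injected; production code uses the real one.
        self._clock = clock
        self._created_at = clock()
        self._ttl_seconds = ttl_seconds

    def is_expired(self) -> bool:
        return self._clock() - self._created_at >= self._ttl_seconds


class FakeClock:
    """Test double: time only moves when the test advances it."""

    def __init__(self, start: float = 0.0) -> None:
        self.now = start

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


def test_session_expires_without_sleeping() -> None:
    clock = FakeClock()
    session = Session(ttl_seconds=30, clock=clock)
    assert not session.is_expired()
    clock.advance(29)
    assert not session.is_expired()
    clock.advance(1)  # exactly at the TTL boundary
    assert session.is_expired()
```

The test is deterministic and runs in microseconds, whereas a version that slept for 30 seconds would be both slow and sensitive to scheduler jitter. This is also a small case of "making code testable to begin with": the only design change is that the clock is a parameter.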

sverhagen 7 months ago

This model ("mnemonic") feels like a good tool to reason about your testing strategy. I ran across the "testing trophy" in the past, which really changed my thinking already, having been indoctrinated with the testing pyramid for such a long time before that. I wanted to share my favorite links about the testing trophy, for those interested:

https://tanzu.vmware.com/content/videos/tanzu-tv-springonetour-win-a-spring-testing-trophy-day-1

https://kentcdodds.com/blog/the-testing-trophy-and-testing-classifications

Vadim_samokhin 7 months ago

I think the test pyramid is a great idea, in theory. In practice, having lots of unit tests with mocked dependencies doesn’t make me sure that everything works as it should. Thus, I use a real database in my unit tests. There, I test serialization errors, authentication problems, network issues, etc. All the real problems which can occur in real life. Leaving these scenarios for the integration test layer will turn a test pyramid into a test diamond.

And what was the rationale behind mocking a database in the first place, speed? Disable synchronous WAL writes, or run your Postgres instance in RAM. Your test suite execution speed will skyrocket.
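
A rough sketch of what "disable synchronous WAL writes, or run Postgres in RAM" could look like for a throwaway test database, assuming Docker and the official `postgres` image. The helper name `fast_postgres_command`, the image tag, the port mapping, and the password are assumptions made for this example. `synchronous_commit=off` is the setting closest to what the comment describes; `fsync=off` and `full_page_writes=off` are additional durability trade-offs added here, not something the comment mentions. These settings risk data loss on a crash, so they only make sense for disposable test data.

```python
def fast_postgres_command(image: str = "postgres:16") -> list[str]:
    """Build a `docker run` command for a disposable, non-durable Postgres.

    Hypothetical helper: it only assembles the argument list and does not
    start anything itself.
    """
    return [
        "docker", "run", "--rm",
        "-e", "POSTGRES_PASSWORD=test",  # test-only password (assumption)
        "-p", "5432:5432",
        # Keep the data directory on an in-memory tmpfs mount.
        "--tmpfs", "/var/lib/postgresql/data",
        image,
        # Arguments after the image name are passed to the server.
        "-c", "synchronous_commit=off",  # don't wait for WAL flush on commit
        "-c", "fsync=off",               # skip forcing writes to disk
        "-c", "full_page_writes=off",    # less WAL volume; unsafe on crash
    ]
```

The resulting list can be handed to `subprocess.run(fast_postgres_command(), check=True)` in a test harness, or the same flags can be copied into a compose file.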

shepherdjerred 7 months ago

I have found testcontainers to be an excellent way to write integration/end-to-end tests as easily as unit tests.

It takes care of the chore of setting up test environments, though it won’t solve all of your problems.

I took this approach when testing an application at my last workplace. It made writing tests significantly easier, and, IMO, fun.

https://testcontainers.com/

satisfice 7 months ago

This entire heuristic is not even about testing. The people who created it aren’t interested in testing; they want an excuse to release their shitty products.

They believe that experiencing a product is just an afterthought. They are like chefs who never taste the food they cook.

Testing is a process of investigation and learning. What this post covers is mechanical fact checking.

nemetroid 7 months ago

End-to-end tests verify high-level expectations based on the specification of the system. These high-level expectations generally stay stable over time (at least compared to the implementation details verified by lower-level tests), and therefore end-to-end tests should have the best maintainability score.

jbjbjbjb 7 months ago

Not to be pedantic, but practically speaking it looks like there are two dimensions: fidelity and then the rest (the SMUR).

lurking15 7 months ago

Another unit test defense: they're the most accessible and inspectable, in the sense that you have practically zero barrier to running one immediately in your IDE and stepping through it if necessary.

pydry 7 months ago

They should at least admit they made a mistake with the "test pyramid".

eiathom 7 months ago

It always amazes me how speed and testing are placed in the same bracket. I want solid verification, and a strong pattern to repeat verification no matter what. This then allows for fast implementation. So something that reliably integrates as many components as possible makes the most sense (verification-wise): I want to verify value early. It is eyebrow-raising that this pyramid nonsense has hung around.

grahamj 7 months ago

This is all pretty obvious
