I also wasn't aware that "unit" referred to an isolated test, not to the SUT. I usually distinguish tests by their relative level, since "unit" can be arbitrary and bring up endless discussions about what it actually means. So low-level tests are those that test a single method or class, and integration and E2E tests confirm the functionality at a higher level.<p>I disagree with the premise that "unit", or low-level, tests are not useful because they test the implementation. These are the tests that check every single branch in the code, every possible happy and sad path, use invalid inputs, etc. The reason they're so useful is that they should a) run very quickly, and b) not require any external state or setup, i.e. the traditional "unit". This does lead to a lot of work maintaining them whenever the implementation changes, but this is a necessary chore because of the value they provide. If I rely only on high-level integration and E2E tests, because there are far fewer of them and they are slower and more expensive to run, I might miss a low-level bug that only manifests under very specific conditions.<p>This is why I still think that the traditional test pyramid is the best model to follow. Every new school of thought since then is a reaction to the chore of maintaining "unit" tests. Yet I think we can all agree that projects like SQLite are much better for having very high testing standards[1]. I'm not saying that every project needs to do the same, but we can certainly follow their lead and aspire to that goal.<p>[1]: <a href="https://www.sqlite.org/testing.html" rel="nofollow noreferrer">https://www.sqlite.org/testing.html</a>
This resonates. I learned the hard way that you want your main tests to integrate all layers of your system: if the system is an HTTP API, the principal tests should be about using that API. All other tests are secondary and optional: they can be used if they seem useful during implementation or maintenance, but should never be relied upon to test correctness. Sometimes you have to compromise, because testing the full stack is too expensive, but that's the only reason to compromise.<p>This is largely because if you try to test parts of your system separately, you have to perfectly simulate how they integrate with the other parts, otherwise you'll get the worst-case scenario: false test passes. That's too hard to do in practice.<p>I suspect that heavy formalization of the parts' interfaces would go a long way here, but I've not yet seen that done.
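<p>To make that concrete, here is a minimal sketch of what such a principal test might look like, with a hypothetical handle_request() entry point standing in for the real HTTP stack (all names are invented for illustration):

  #include <cassert>
  #include <string>

  // Hypothetical entry point standing in for the whole HTTP layer; in a
  // real suite this would dispatch through the same router, middleware
  // and handlers that production traffic goes through.
  std::string handle_request(const std::string& method, const std::string& path) {
      if (method == "GET" && path == "/users/42") {
          return "{\"id\":42,\"name\":\"Ada\"}";
      }
      return "{\"error\":\"not found\"}";
  }

  int main() {
      // The principal tests drive the system the way a client would:
      // through its public API surface, not through internal classes.
      assert(handle_request("GET", "/users/42").find("\"id\":42") != std::string::npos);
      assert(handle_request("GET", "/users/999").find("error") != std::string::npos);
      return 0;
  }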
The big TDD misunderstanding is that most people consider TDD a testing practice.
The article doesn’t talk about TDD; it gives the reader some tips on how to write tests. That’s not TDD.
> Now, you change a little thing in your code base, and the only thing the testing suite tells you is that you will be busy the rest of the day rewriting false positive test cases.<p>If there is anything that makes me cry, it’s hearing “it’s done, now I need to fix the tests”
> <i>Tip #1: Write the tests from outside in.</i><p>> <i>Tip #2: Do not isolate code when you test it.</i><p>> <i>Tip #3: Never change your code without having a red test.</i><p>> <i>Tip #4: TDD says the process of writing tests first will/should drive the design of your software. I never understood this. Maybe this works for other people but it does not work for me. It is software architecture 101 — Non-functional requirements (NFR) define your architecture. NFR usually do not play a role when writing unit tests.</i><p>The one time I ever did "proper" red/green cycle TDD, it worked because I was writing a client library for an existing wire protocol, and knew in advance <i>exactly</i> what it needed to do and how it needed to do it.<p>Item 2 is right, but this also means that #1 is wrong. And knowing what order #2 requires means knowing how the code is designed (#4).
Part of the problem is caused by all sides using the same terms but with a different meaning.<p>> You just don’t know if your system works as a whole, even though each line is tested.<p>... even though each line has been executed.<p>One test per line is strongly supported by tools calculating coverage and calling that "tested".<p>A test for one specific line is rarely possible. It may be missing some required behavior that hasn't been challenged by any test, or it may be inconsistent with other parts of the code.<p>A good start would be to stop calling something that has merely been executed "tested".
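<p>A tiny hypothetical illustration of the difference: the "test" below gives absolute() full line and branch coverage, yet only one branch is actually checked, so a bug in the other one would still pass:

  #include <cassert>

  int absolute(int x) {
      if (x < 0) {
          return -x;  // a bug here, e.g. "return x", would go unnoticed below
      }
      return x;
  }

  int main() {
      absolute(-5);              // executed, but the result is never checked
      assert(absolute(5) == 5);  // the only line that actually tests anything
      return 0;
  }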
My view on unit testing is that if there are no dependencies, there is no real reason not to write tests for all behaviours. While you may have a wonderful integration testing suite, it is still great to know that the building blocks work as intended.<p>The problems arise with dependencies, as now you need to decide whether to mock them or use concrete implementations. The concrete implementation might be hard to set up, slow to run in a test, or both. Using a mock, on the other hand, is essentially an alternate implementation. So now your code has the real implementation + one implementation per test (in the limit), which is plainly absurd.<p>My current thinking (after writing a lot of mocks) is to try to shape code so that more of it can be tested without hard-to-set-up dependencies. When this can't be done, think hard about the right approach. Try to put yourself in the shoes of a future maintainer. For example, instead of creating a bespoke mock for just your particular test, consider creating a common test utility that mocks a commonly used dependency in accordance with common testing patterns. This is just one example. Annoyingly, a lot of creativity is required once dependencies of this nature are involved, which is why it is great to shape code to avoid them where possible.
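<p>A hypothetical sketch of that last suggestion: the dependency gets a small interface, and one shared fake lives with the common test utilities instead of a bespoke mock being re-implemented in every test file:

  #include <cassert>
  #include <cstdint>

  // The dependency, behind a small interface, so code under test doesn't
  // care whether it talks to the real thing or a test double.
  struct Clock {
      virtual std::int64_t now_ms() const = 0;
      virtual ~Clock() = default;
  };

  // One shared fake in a common test utility, reused by every test
  // that needs to control time.
  struct FakeClock : Clock {
      std::int64_t current = 0;
      std::int64_t now_ms() const override { return current; }
      void advance_ms(std::int64_t delta) { current += delta; }
  };

  // Code under test: reports whether a session has expired.
  bool session_expired(const Clock& clock, std::int64_t started_ms, std::int64_t ttl_ms) {
      return clock.now_ms() - started_ms >= ttl_ms;
  }

  int main() {
      FakeClock clock;
      clock.current = 1000;
      assert(!session_expired(clock, 1000, 500));  // just started
      clock.advance_ms(500);
      assert(session_expired(clock, 1000, 500));   // ttl elapsed
      return 0;
  }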
In my experience a lot of engineers are stuck thinking in MVC terms and fail to write modular code. As a result most business logic is part of a request / response flow. This makes it infeasible to even attempt to write tests first, thus leaving integration or e2e tests as the only remaining options.
> Never change your code without having a red test<p>I'll never understand why people insist on this. If you want to write your tests first, that is fine. No one is going to stop you. But why must you insist everyone does it this way?
I think that unit tests are super valuable because when used properly, they serve as micro-specifications for each component involved.<p>These would be super hard to backfill later, because usually only the developer who implements them knows everything about the units (services, methods, classes etc.) in question.<p>With a strongly typed language, a suite of fast unit tests can already be at feature parity with a much slower integration test, because even with dependencies mocked out, they essentially test the whole call chain.<p>They can offer even more, because unit tests are supposed to test edge cases, all error cases, wrong/malformed/null inputs etc. By using integration tests only, as the call chain grows on the inside, it would take an exponentially higher number of integration tests to cover all cases. (E.g. if a call chain contains 3 services, with 3 outcomes each, theoretically it could take up to 27 integration test cases to cover them all.)<p>Also, ballooning unit test sizes or resorting to unit testing private methods give the developer feedback that the service is probably not "single responsibility" enough, providing an incentive to split and refactor it. This leads to a more maintainable service architecture, which integration tests don't help with.<p>(Of course, let's not forget that this kind of unit testing is probably only reasonable on the backend. On the frontend, component tests from a functional/user perspective probably bring better results - hence the popularity of frameworks like Storybook and Testing Library. I consider these integration rather than unit tests.)
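<p>To make that arithmetic concrete with a hypothetical example: the middle service in such a chain can have its three outcomes pinned down directly with three unit tests, instead of being reached indirectly through every combination of its neighbours:

  #include <cassert>
  #include <optional>
  #include <string>

  // Hypothetical middle service in the chain, with three outcomes.
  enum class LookupResult { Found, NotFound, Malformed };

  LookupResult classify(const std::optional<std::string>& input) {
      if (!input) return LookupResult::NotFound;           // missing input
      if (input->empty()) return LookupResult::Malformed;  // bad input
      return LookupResult::Found;
  }

  int main() {
      // Each outcome is exercised directly, without dragging the other
      // two services of the chain into the test.
      assert(classify("user-42") == LookupResult::Found);
      assert(classify(std::nullopt) == LookupResult::NotFound);
      assert(classify("") == LookupResult::Malformed);
      return 0;
  }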
Was 'unit' originally intended to be a test you could run in isolation? I don't think so. I'm not an expert in testing history, but this Dec 2000 Software Quality Assurance guide from the Nuclear Regulatory Commission defines Unit Testing as:<p>> Unit Testing - It is defined as testing of a unit of software such as a subroutine that can be compiled or assembled. The unit is relatively small; e.g., on the order of 100 lines. A separate driver is designed and implemented in order to test the unit in the range of its applicability.<p>NUREG-1737 <a href="https://www.nrc.gov/docs/ML0101/ML010170081.pdf" rel="nofollow noreferrer">https://www.nrc.gov/docs/ML0101/ML010170081.pdf</a><p>Going back, this 1993 nuclear guidance has similar language:<p>> A unit of software is an element of the software design that can be compiled or assembled and is relatively small (e.g., 100 lines of high-order language code). Require that each software unit be separately tested.<p>NUREG/BR-0167 <a href="https://www.nrc.gov/docs/ML0127/ML012750471.pdf" rel="nofollow noreferrer">https://www.nrc.gov/docs/ML0127/ML012750471.pdf</a>
When I first learnt about unit tests / TDD, I was confused because everyone assumes you are doing OOP. What am I supposed to do with my C code? I can just test a function, right? Or do I have to forcefully turn my program into some OO-style architecture?<p>But then I realized it does not matter; there is only one important thing about unit tests: that they exist. All the rest is implementation detail.<p>Mocking or not, isolated "unit" or full workflow, it does not matter. All I care about is that I can press a button (or type "make test" or whatever) and my tests run and I know if I broke something.<p>Sure, your tests need to be maintainable, you should not need to rewrite them when you make internal changes, and so on. You'll learn as you go. Just write them and make them easy to run.
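<p>For plain C-style code this really can be as small as a function plus a handful of asserts, compiled and run by a "make test" target; a minimal sketch:

  #include <assert.h>
  #include <stdio.h>

  /* The function under test: plain C, no OO architecture needed. */
  int clamp(int value, int lo, int hi) {
      if (value < lo) return lo;
      if (value > hi) return hi;
      return value;
  }

  int main(void) {
      assert(clamp(5, 0, 10) == 5);    /* inside the range */
      assert(clamp(-3, 0, 10) == 0);   /* below the range  */
      assert(clamp(42, 0, 10) == 10);  /* above the range  */
      printf("all tests passed\n");
      return 0;
  }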
Read <a href="https://www.manning.com/books/unit-testing" rel="nofollow noreferrer">https://www.manning.com/books/unit-testing</a> - it's the best book on the subject and presents the matter with good evidence.<p>"Tip #4: TDD says the process of writing tests first will/should drive the design of your software."<p>Yes, and if that does not happen during TDD, I would argue you are not doing TDD. Sure, you always have some sort of boundaries, but design up front is a poor choice when you try to iterate towards the best possible solution.
This article is internally inconsistent. It leads with the claim that treating the "unit" as "the whole system" is bad, and then tip #1 is to test from the outside in, at whole-system granularity. On the other hand, it does point out that "design for test" is nonsense, so that meets my priors.<p>By far the worst part of TDD was the proposed resolution to the tension with encapsulation. The parts one wants to unit test are the small, isolated parts, aka "the implementation", which are also the parts one generally wants an abstraction boundary over. Two schools of thought on that:<p>- one is to test through the API, which means a lot of tests trying to thread the needle to hit parts of the implementation. The tests will be robust to changes in the implementation, but the grey box coverage approach won't be, and you'll have a lot of tests<p>- two is to change the API to expose the internals, market that as "good testable design" and then test the new API, much of which is only used from test code in the immediate future. Talk about how one doesn't test the implementation and don't mention the moving of the goalposts<p>Related to that is the enthusiasm for putting test code somewhere separate from production code, so it gets hit by the usual language isolation constraints that come with cross-module boundaries.<p>Both of those are insane nonsense. Don't mess up your API to make testing it easier; the API was literally the point of what you're building. Write the tests in the same module as the implementation and most of the API challenge evaporates. E.g. in C++, write the tests in an anonymous namespace in the source file. Have more tests that go through the interface from outside if you like, but don't only have those, as you need way more to establish whether the implementation is still working. Much like having end-to-end tests helps, but having only end-to-end tests does not.<p>I like test driven development. It's pretty hard to persuade colleagues to do it, so multi-developer stuff is all end-to-end tested. Everything I write for myself has unit tests that look a lot like the cases I checked in the repl while thinking about the problem. It's an automated recheck-prior-reasoning system; I wouldn't want to be without that.
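<p>A minimal sketch of the same-file approach, assuming a hypothetical WITH_UNIT_TESTS build flag and plain assert in place of whatever test framework is actually in use:

  // widget.cpp: implementation and its unit tests in one translation unit.
  #include <cassert>
  #include <vector>

  namespace {
  // Internal detail that the public API deliberately hides.
  int median_of_three(int a, int b, int c) {
      if ((a <= b && b <= c) || (c <= b && b <= a)) return b;
      if ((b <= a && a <= c) || (c <= a && a <= b)) return a;
      return c;
  }
  } // namespace

  // The public API: picks a pivot for partitioning.
  int pick_pivot(const std::vector<int>& v) {
      assert(!v.empty());
      return median_of_three(v.front(), v[v.size() / 2], v.back());
  }

  #ifdef WITH_UNIT_TESTS
  namespace {
  // Tests sit next to the implementation, so the internal helper is
  // reachable without widening the API just for testability.
  struct MedianTests {
      MedianTests() {
          assert(median_of_three(1, 2, 3) == 2);
          assert(median_of_three(3, 1, 2) == 2);
          assert(median_of_three(2, 2, 9) == 2);
      }
  } const run_median_tests;  // runs during static initialization in test builds
  } // namespace
  #endif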
For some classic wisdom about writing tests, see "The Art of Software Testing" by Glenford Myers. It's $149+++ on Amazon, but only $5 on ebay:<p><a href="https://www.ebay.com/sch/i.html?_from=R40&_trksid=p3671980.m570.l1313&_nkw=art+of+software+testing+myers&_sacat=0" rel="nofollow noreferrer">https://www.ebay.com/sch/i.html?_from=R40&_trksid=p3671980.m...</a><p>This was originally published before TDD was a thing, but is highly applicable.
Which companies or large projects use TDD at the moment? There's always such intense discussion about what it is and its benefits, yet I don't see anyone actually doing TDD.
I think it's both attention-getting and distracting to start with a definition of unit testing that hardly anybody uses. Now I'm not interested in the article because I have to see what your sources are and whether you're gaslighting me.<p>The reason people use the term unit test to mean the size of the system under test is that this is what it has generally meant. Before OO, it would mean module. Now it means class. The original approach would be to have smaller, testable functions that made up the functionality of the module and test them individually. Decoupling was done so that you didn't need to mock the database or the filesystem, just the logic that you're writing.<p>Some people disagree with unit testing and focus on functional testing. For example, the programming style developed by Harlan Mills at IBM was to specify the units very carefully using formal methods and write to the specification. Then, black-box testing was used to gain confidence in the system as a whole.<p>I feel that a refactor shouldn't break unit tests, at least not if the tools are smart enough. If you rename a method or class, its uses should have been renamed in the unit tests. If you push a method down or up in a hierarchy, a failing test tells you that the test is assuming the wrong class. But most cases of failing tests should be places where you made a mistake.<p>However, I agree that functional tests are the hurdle you should have crossed before shipping code. Use unit testing to get 100ms results as you work, and functional tests to verify that everything is working correctly. Write them so that you could confidently push to production whenever they're green.
The article highlights this claim:<p><i>"Now, you change a little thing in your code base, and the only thing the testing suite tells you is that you will be busy the rest of the day rewriting false positive test cases."</i><p>Whenever this is the case, it would seem at least one of the following is true:<p>1) There are many ways the 'little change' could break the system.<p>2) Many of the existing tests are testing for accidental properties which are not relevant to the correct functioning of the system.<p>If only the second proposition describes the situation, then, in my experience, it is usually a consequence of tests written to help get the implementation correct being retained in the test suite. That is not necessarily a bad thing: with slight modification, they <i>might</i> save time in writing tests that are useful in getting the new implementation correct.<p>I should make it clear that I don't think this observation invalidates any of the points the author is making; in fact, I think it supports them.
TDD can be valuable but sometimes hindering. I often find myself with an incomplete idea of what I want and, thus, no clear API to start testing. Writing a quick prototype -- sometimes on godbolt or replit -- and then writing tests and production code will actually yield better productivity for me.<p>I usually test all of the public API of something and only it. Exported functions, classes, constants and whatever should be tested and properly documented. If writing tests for the public surface is not enough, most likely the underlying code is poorly written, probably lacking proper abstractions to expose the state associated with a given behaviour (e.g. a class that does too much).
I think it also heavily depends on the language you are working with. For instance, unit tests are much more important in a dynamically typed language than in a statically typed one, since the compiler is less capable of catching a number of issues.
Huh, I found this more interesting than I thought I would. I hadn't heard before that the "unit" in "unit test" just meant "can run independently". I once failed an interview partly because of only writing "feature tests" and not "unit tests" in the project I showed. But actually those tests didn't depend on each other, so... looks like they really were unit tests!<p>Anyway, I'm still not totally sure about TDD itself - the "don't write any code without a red test" part. I get the idea, but it doesn't feel very productive when I try it. Of course maybe I'm just bad at it, but I also haven't seen any compelling arguments for it other than that it makes the tests stronger (against what? someone undoing my commits?). I think even Uncle Bob's underlying argument was that TDD is more "professional", leading me to believe it's just a song-and-dance secret handshake that helps you get into a certain kind of company. OR, it's a technique to combat lazy devs, to try to make it impossible for them to write bad tests. And maybe it is actually good, but only for some kinds of projects... I wish we had a way to actually research this stuff rather than endlessly share opinions and anecdotes.
One way I think about it is that unit tests help me maintain invariants that the compiler for the current language can't enforce for me.<p>That saves me from having to test every combination of inputs.
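<p>A hypothetical illustration: the type system checks that the arithmetic is on ints, but it cannot express the invariant that a balance never goes negative; one small test pins that invariant down without enumerating every caller's inputs:

  #include <algorithm>
  #include <cassert>

  // The compiler checks the types, but not the invariant that the
  // balance can never be driven below zero.
  class Account {
  public:
      void withdraw(int amount) {
          balance_ -= std::min(amount, balance_);  // clamp at zero
      }
      int balance() const { return balance_; }
  private:
      int balance_ = 100;
  };

  int main() {
      Account a;
      a.withdraw(40);
      assert(a.balance() == 60);
      a.withdraw(1000);          // over-withdrawal attempt
      assert(a.balance() >= 0);  // the invariant the type system can't express
      return 0;
  }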
Michael Feathers in "Working Effectively with Legacy Code" tackled the unit test definition, and in the end defined a unit test as a test that runs fast.<p>Also a very confusing topic for me early on as I tried to grasp how to actually do it. So I read Kent Beck's book on TDD and was even more confused, because he did not write tests in isolation there, and I was told to write unit tests like that. But then it hit me: most people are just cargo cultists and repeat what somebody told them.
> Write the tests from outside in. With this, I mean you should write your tests from a realistic user perspective. To have the best quality assurance and refactor resistance, you would write e2e or integration tests.<p>Yah, yah. But good luck trying to figure out what went wrong when the only failing test you have is an e2e or integration test.
I think this article misunderstands the motivation for unit tests being isolated and ends up muddying the definition unnecessarily. The unit of unit testing means we want to assign blame to a single thing when a single test fails. That’s why order dependencies have to be eliminated, but that’s an implementation detail.
<< Originally, the term “unit” in “unit test” referred not to the system under test but to the test itself. >><p>Retroactive continuity - how does it work?<p>For today's lucky 10,000: "unit test", as a label, was in wide use prior to the arrival of the Extreme Programming movement in the late 1990s. And the "unit" in question was the test subject.<p>But, as far as I can tell, _Smalltalk_ lacked a testing culture (Beck, 1994), so perhaps the testing community's definitions weren't well represented in Smalltalk spaces.<p>"The past was alterable. The past never had been altered."<p>(Not particularly fair to single out this one author - this origin myth has been common during the last 10 years or so.)
I am not going to say that some of these testing religions don't have a place, but mostly they miss the point. By focusing on TDD or "code coverage" the essential questions are missed. Instead of focusing on methodology, I recommend asking yourself simple questions starting with:<p>1. How do I meet my quality goals with this project (or module of a project)?<p>This is the root question and it will lead to other questions:<p>2. What design and testing practice is most likely to lead to this outcome?<p>3. What is the pay-off value for this module for a given type of testing?<p>4. How can I be confident this project/module will continue to work after refactoring?<p>etc. etc.<p>I have used TDD-style unit testing for certain types of modules that were very algorithm-centric. I have also used integration testing only for other modules that were I/O-centric without much logic. I personally think choosing a testing strategy as the "one right way" and then trying to come up with different rules to justify it is exactly the inverse of how one should think about it (top-down vs bottom-up design, of sorts).
I took a break from big tech and joined a startup. It was infuriating how people were actively opposing my TDD approach. It was redeeming when I was shipping one of my projects (a service for integration with a 3rd party) - product managers and others were expecting we would need weeks to test and fix the bugs, but instead the other party just said "it's perfect, no comments".<p>All because I was "wasting my time" on writing "useless tests" that helped me identify and discuss edge cases early in the process. Also, I could continue working on parts even while waiting for a response from product managers or while DevOps were still struggling to provision the resources.