Unit tests are a liability, and I am not being ironic.

I have always been an advocate of proper unit testing. I believe it is a necessary attribute of building maintainable software. I also thought myself pretty experienced in writing sensible (unit) tests, but reading Unit Testing Principles, Practices and Patterns was an eye-opener in several ways. Vladimir Khorikov spends his 300+ pages teaching us what distinguishes high-quality tests from the time-wasters. He assumes you know your way around the tooling already. It’s not a hard book, and if you have testing experience you should get through it quickly. The examples are in C# but translate easily to Java. Let me summarize what has stuck with me most.

[Image: Scaffolding work in the historic town of Sandwich, Kent]
  1. Tests are a liability. They cost time to write and maintain, sometimes without adding value. Focus on quality tests that bring value and never just drive for coverage.
  2. Unit tests should focus on units of observable behavior, not units of code. There’s no need to have a test suite for every class or source file. Implementation details don’t merit their own tests as long as their code is covered.
  3. Speed is of the essence: test suites should run quickly. Refactoring of production code should not require changes in tests.

A liability in accounting terms constitutes a debt of sorts, something you have to pay. It can also refer to a situation or a person that may cause trouble in the future. Tests of questionable quality can do just that. They take time to write and maintain. They take time to run while you wait for them to complete. And they take time to modify when the source code that they touch has changed.
In my experience, excellent test coverage in major enterprise projects is the exception, not the norm. Although we like to stress the importance of good unit testing, in practice it is often no more than lip service. I think that is because we have too often experienced writing and maintaining unit tests as a chore that gets in the way of doing our job, which is delivering valuable features. I have seen mission-critical code bases run fine without any test coverage, but most of them were still a tangled mess on the inside. Unit tests may not be a strict requirement for working software in principle, but in practice they are indispensable to building it the right way.

We want tests to validate the business rules manifested in our evolving software and to ring the alarm when it suddenly behaves differently. We want to ensure that changes affecting the observable behavior of the application are signaled by a failing test; in other words, we want to prevent regressions, a.k.a. bugs.

To focus on observable behaviour means that our tests should take a black-box approach: zero in on the public API and do not validate implementation details. If a public method returns a Boolean value that some private method converts from a numeric one or zero, we should be concerned with what bubbles to the surface, not with the nuts and bolts of the conversion. It follows that tests should cover units of behaviour, not necessarily units of code. This is a pretty big departure (and a game changer, I might add) from the tenets of TDD that I grew up with, of always writing a test and a production source file in tandem.

Khorikov makes a very good point against the established practice of tying tests to units of code (usually classes) and mocking or stubbing all dependencies, whether in your own code or in third-party libraries. Not only are these test doubles (mocks and stubs) awkward to set up, but they make your test code very sensitive to changes in the production code. Constant refactoring is essential to agile, emergent development, but if your unit tests are tightly coupled to implementation details, they will fail, or even refuse to compile. Such failures are really false positives: by definition, a refactoring does not change observable behaviour and therefore should not break any tests, provided your tests only validate observable behaviour. The one-test-one-class approach may be thorough in terms of coverage, but you cannot avoid the regular chore of manually adapting it to the refactored production code. That is not adding value. Instead of helping you forward, the test suite is now slowing you down. Which brings us to the subject of time and money.
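The Boolean example can be made concrete with a minimal Java sketch (the class and method names here are my own invention, not Khorikov's). The test exercises only the public API, so the private conversion remains an implementation detail that can be refactored freely without breaking anything:

```java
// Hypothetical sketch: a black-box test that validates observable behaviour
// through the public API. The private numeric-to-boolean conversion is an
// implementation detail; it is covered implicitly and gets no test of its own.
public class SubscriptionTest {

    // Minimal production class, inlined here for a self-contained example.
    static class Subscription {
        private final int statusCode; // e.g. a 0/1 flag from a legacy database

        Subscription(int statusCode) {
            this.statusCode = statusCode;
        }

        // Public API: this is the observable behaviour we test.
        public boolean isActive() {
            return toBoolean(statusCode);
        }

        // Implementation detail: free to change or disappear in a
        // refactoring without breaking the assertions below.
        private static boolean toBoolean(int value) {
            return value != 0;
        }
    }

    public static void main(String[] args) {
        // Assert on what bubbles to the surface, not on the conversion itself.
        if (!new Subscription(1).isActive())
            throw new AssertionError("status 1 should be active");
        if (new Subscription(0).isActive())
            throw new AssertionError("status 0 should be inactive");
        System.out.println("All tests passed");
    }
}
```

If `toBoolean` is later renamed, inlined, or replaced by an enum lookup, this test still compiles and still passes, which is exactly the resilience the book argues for.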

Writing code, any code, is an expensive way to solve problems. Test code should guard against regressions first and foremost. Therefore, you should focus your testing on the complicated, error-prone parts of the application. Be very thorough, but stay on the outward-facing boundaries: the API. You get the same line and branch coverage figures that way as with the one-class-one-test approach, and with much less hassle. Granted, the process of writing tests leads to better design, prevents wrong assumptions during coding, and has a documentary function: reading the test suite explains what the code is supposed to do. But those benefits can be achieved by other, cheaper means as well.

Last but not least, we have the need for speed. Tests should give quick feedback. Everyone who has ever batched up or otherwise postponed multiple code commits because the damn build takes half an hour knows what I am talking about. It breaks concentration and flow, and it is expensive too, once you add up all those idle developer minutes. The solution lies in having fewer but better tests.