TDD is Dead! Long Live TDD!

Imagine that you’re writing a web service. It is implemented with a bunch of classes. Pretend this circle represents your service, and the shapes inside it are classes.

The way I learned test-driven development[1], we wrote itty-bitty tests around every itty-bitty method in each class. Then maybe a few acceptance tests around the outside. This was supposed to help us drive design, and it was supposed to give us safety in refactoring. These automated tests would give us assurance, and make changing the code easier.

It doesn’t work out that way. Tests don’t enable change. Tests prevent change! In particular, when I want to refactor the internals of my service, any class I change means umpteen test changes. And all these tests include expected == actual, and I’ve gotta figure out the new magic values that should pass. No fun! These method- or class-level tests are like bars in a cage preventing refactoring.

Tests prevent change, and there’s a place I want to prevent unintentional change: it’s at the service API level. At the outside, where other systems interact with this service, where a change in behavior could be a nasty surprise for some other team. Ideally, that’s where I want to put my automated tests.

Whoa, that is an ugly cage. At the service level, there are often many possible input scenarios. Testing every single one of them is painful. We probably can’t even think of every relevant combination and all the various edge cases. Much easier to zoom in to the class level and test one edge case at a time. Besides, even if we did write the dozens of tests to cover all the possibilities, what happens when the requirements change? Then we have great big tests with long expected == actual assertions, and we have to rework all of those. Bars in a cage, indeed.

Is TDD dead? Maybe it’s time to crown a new TDD. There’s a style of testing that addresses both of the difficulties in API-level testing: it finds all the scenarios and tames the profusion of hard-coded expectations. It’s called generative testing.[2]

Generative testing says, “I’m not gonna think of all the possible scenarios. I’m gonna write code that does it for me.” We write generators, which are objects that know how to produce random valid instances of various input types. The testing framework uses these to produce a hundred different random input scenarios, and runs all of them through the test.
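Here’s a minimal sketch of that idea in plain Python, using only the standard library. Everything in it is made up for illustration: SearchRequest is a hypothetical input type, and for_all is a tiny stand-in for the runner a real generative testing framework would give you (along with much smarter generators).

```python
import random
import string
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchRequest:
    """Hypothetical input type for an imagined service endpoint."""
    query: str
    limit: int


def gen_query(rng):
    """Generator: a random lowercase query of 1 to 12 letters."""
    length = rng.randint(1, 12)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def gen_search_request(rng):
    """Generator: composes smaller generators into one valid request."""
    return SearchRequest(query=gen_query(rng), limit=rng.randint(0, 50))


def for_all(generator, check, runs=100, seed=None):
    """Stand-in for a framework: run `check` against `runs` random scenarios."""
    rng = random.Random(seed)
    for _ in range(runs):
        scenario = generator(rng)
        assert check(scenario), f"property failed for {scenario!r}"
    return runs
```

Calling for_all(gen_search_request, some_check) pushes a hundred different random requests through some_check. Passing a seed makes a failing run repeatable, which matters once the inputs are random.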

Generative testing says, “I’m not gonna hard-code the output. I’m gonna make sure whatever comes out is good enough.” We can’t hard-code the output when we don’t know what the input is going to be. Instead, assertions are based on the relationship between the output and input. Sometimes we can’t be perfectly specific because we refuse to duplicate the code under test. In these cases we can establish boundaries around the output. Maybe it should be between these values. Maybe it should go down as this input value goes up. Maybe it should never return more items than requested. That kind of thing.
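To make those three kinds of boundary concrete, here is a rough sketch in plain Python. The search and unit_price functions are hypothetical stand-ins for a service API, invented for this example, and the price limits of 5.0 and 10.0 are assumptions too. Notice that no assertion names an exact expected value; each one only relates the output to the input.

```python
import random

CATALOG = ["apple", "apricot", "banana", "blueberry", "cherry", "grape", "pear"]


def search(query, limit):
    """Hypothetical call under test: catalog items containing `query`."""
    return [item for item in CATALOG if query in item][:limit]


def unit_price(quantity):
    """Hypothetical call under test: bigger orders get a lower unit price."""
    return max(5.0, 10.0 - 0.05 * quantity)


def check_search(rng):
    query = rng.choice("abcdefghijklmnopqrstuvwxyz")
    limit = rng.randint(0, 10)
    results = search(query, limit)
    # Never more items than requested.
    assert len(results) <= limit
    # Everything returned actually matches the query.
    assert all(query in item for item in results)


def check_unit_price(rng):
    small = rng.randint(1, 500)
    large = small + rng.randint(0, 500)
    # The output stays between these values.
    assert 5.0 <= unit_price(small) <= 10.0
    # The output goes down (or holds) as the input goes up.
    assert unit_price(large) <= unit_price(small)


def run_properties(runs=100, seed=0):
    """Run both property checks against `runs` random scenarios each."""
    rng = random.Random(seed)
    for _ in range(runs):
        check_search(rng)
        check_unit_price(rng)
    return runs
```

If the pricing rule changes from 0.05 to some other discount, these assertions keep passing as long as the relationship holds. That’s the point: the boundary survives changes that would break a hard-coded expected value.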

With these, a few tests can cover many scenarios. Fortify with a few hard-coded examples if needed, and now half a dozen tests at the API level cover all the combinations of all the edge cases, as well as the happy paths.

This doesn’t preclude small tests that drive our class design. Use them, and then delete them. This doesn’t preclude example tests for documentation. Example-based, expected == actual tests are stories, and people think in stories. Give them what they want, and give the computer what it wants: lots of juicy tests in one.

There are obstacles to TDD in this style. It’s way harder. There’s more thinking and less typing here: lots more thinking, to find the assertions that draw a boundary around the acceptable output. That’s the hardest part, and it’s also the best part, because the real benefit of TDD is that it stops you from coding a solution to a problem you don’t understand.

Look for more posts on this topic, to go along with my talks on it. See also my video about Property Based Testing in Scala.


[1] The TDD I learned, at the itty-bitty level with mock all the things, was wrong. It isn’t what Kent Beck espoused. But it’s the easiest.

[2] Or property-based testing, but that has NOTHING to do with properties on a class, so that name confuses people. Aside from that confusion I prefer “property-based”, which speaks about WHY we do this testing, over “generative”, which speaks about how.