TDD with generative testing: an example in Ruby

Say I’m in retail, and the marketing team has an app that helps them evaluate the sales of various items. I’m working on a web service that, given an item, tells them how many purchases were influenced by various advertising channels: the mobile app, web ads, and spam email.

The service will look up item purchase records from one system, then access the various marketing systems that know about ad clicks, mobile app usage, and email sends. It returns how many purchases were influenced by each, and uses some magic formula to calculate the relevance of each channel to this item’s sales.

My goal is to test this thoroughly at the API level. I can totally write an example-based test for this, with a nice happy-path input and hard-coded expected output. And then, I need to test edge cases and error cases. When no channels have impact; when they all have the same impact; when one fails; when they timeout; et cetera et cetera.

Instead, I want to write a few generative tests. What might they look like?

When I’m using test-driven development with generative testing, I start at the outside. What can I say about the output of this service? For each channel, the number of influenced purchases can’t be bigger than the total purchases. And the relevance number should be between 0 and 100, inclusive. I can assert that in rspec.

expect(influenced_purchases).to be <= total_purchases
expect(relevance).to be >= 0
expect(relevance).to be <= 100

These kinds of assertions are called “properties”. Here, “property” has NOTHING TO DO with a field on a class. In this context, a property is something that is always true for specified circumstances. The generated input will specify the circumstances.

To test this, I’ll need to run some input through my service and then make these checks for each output circle. It needs some way to query the purchase and marketing services, and I’m not going to make real calls a hundred times. Therefore my service will use adapters to access the outside world, and test adapters will serve up data.

result = InfluenceService.new(TestPurchaseAdapter.new(purchases),
                make_adapters(channel_events)).investigate(item)

result.channels.each do |(channel, influence)|
  expect(influence.influenced_purchases).to be <= total_purchases
  expect(influence.relevance).to be >= 0
  expect(influence.relevance).to be <= 100
end

(relatively complete code sample here.)
To do this, I need purchases, events on each channel, and an item. My test needs to generate these 100 times, and then do the assertions 100 times. I can use rantly for this. The test looks like this:

it “returns a reasonable amount of influence” do
 property_of {
  … return an array [purchases, channel_events, item] …
 }.check do |(purchaseschannel_eventsitem)|
  total_purchases = purchases.size
  result = InfluenceService.new(TestPurchaseAdapter.new(purchases),
                make_adapters(channel_events)).investigate(item)

  result.channels.each do |(channel, influence)|
   expect(influence.influenced_purchases).to be <= total_purchases
   expect(influence.relevance).to be >= 0
   expect(influence.relevance).to be <= 100
  end
 end
end

(Writing generators needs a post of its own.)
Rantly will call the property_of block, and pass its result into the check block, 100 times or until it finds a failure. Its objective is to disprove the property (which we assert is true for all input) by finding an input value that makes the assertions fail. If it does, it prints out that input value, so you can figure out what’s failing.

It does more than that, actually: it attempts to find the simplest input that makes the property fail. This makes finding the problem easier. It also helps me with TDD, because it boils this general test into the simplest case, the same place I might have started with traditional TDD.

In my TDD cycle, I make this test compile. Then it fails, and rantly reports the simplest case: no purchases, no events, any old item. After I make that test pass, rantly reports another simple input case that fails. Make that work. Repeat.

Once this test passes, all I have is a stub implementation. Now what? It’s time to add properties gradually. Usually at this point I sit back and think for a while. Compared to example-based TDD, generative testing is a lot more thinking and less typing. How can I shrink the boundaries?

It’s time for another post on relative properties.