Why is it so hard to estimate how long a piece of work will take?
When I estimate how long it will take to add a feature, I break it down into tasks. Maybe I’ll need to create a table in the database, add a drop-down in the GUI, and connect the two with a few changes to the service calls and service back-end. I picture myself adding a table to the database. That should take about a day, including testing and deployment. And so on for the other tasks.
Maybe it works out like this:
Create Table = 1 day
Service back-end = 2 days
New drop-down = 2 days
+ Service call = 1 day
New feature = 6 days
It almost never happens that way, does it? The estimate above is the happy path of feature development. Each component estimate is probably accurate on its own. But if each of the four tasks has a 70% chance of going as expected, and the tasks are independent, then the chance of the feature being completed on time is 0.7^4 = 24%. Those aren’t very good odds.
It’s worse than that. Take the first task: create table. Maybe there’s a 70% chance of no surprises when we get to the details of schema design, a 70% chance that the tests pass and nothing bites us, and a 70% chance of no problems in deployment. Then there’s only a 34% chance (0.7^3) that Create Table will take a day. Break each of the other tasks into three 70% pieces as well, and our chance of completing the feature on time is 0.7^12, or about 1%. Yikes! No wonder we never get this right!
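The arithmetic above is easy to check. Here is a minimal sketch in Python; the function name is made up for illustration, and it assumes (as the multiplication above already does) that the steps succeed or fail independently of each other.

```python
def on_time_probability(step_success: float, steps: int) -> float:
    """Chance that every one of `steps` independent steps goes as planned."""
    return step_success ** steps


# Four tasks, each with a 70% chance of going as expected.
feature_four_tasks = on_time_probability(0.7, 4)

# One task (Create Table) split into schema design, tests, and deployment.
create_table = on_time_probability(0.7, 3)

# All four tasks split into three 70% pieces each: twelve pieces in total.
feature_twelve_pieces = on_time_probability(0.7, 12)

print(f"Feature, four tasks:    {feature_four_tasks:.0%}")
print(f"Create Table alone:     {create_table:.0%}")
print(f"Feature, twelve pieces: {feature_twelve_pieces:.0%}")
```

The finer we slice the work, the more honest the number gets, and the smaller it gets.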
We can picture the happy path of development. It’s much harder to incorporate failure paths – how can we? We can’t foresee that the deployment will fail because some library upgrade was incompatible with the version of Ruby in production (or whatever). The chance of each failure path is very low, so our brains approximate it to zero. For one likely happy path, there are hundreds of low-probability things that can go wrong. All those different failures add up, and then multiply, until our best predictions are useless. The most likely single scenario is still the happy path and 6 days, but the failures combine into millions of different possible scenarios, and each of those takes longer.
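To see how the happy path can be the most likely single scenario and still almost never happen, here is a rough sketch that enumerates every scenario for the twelve 70% pieces from the example. It is a deliberate simplification: it assumes each piece either goes fine or hits exactly one kind of surprise, and that the pieces are independent. Real failure paths are far more numerous. The names are invented for illustration.

```python
from itertools import product

STEP_OK = 0.7   # chance that one piece has no surprise (from the example)
PIECES = 12     # four tasks, three pieces each


def scenario_probability(outcomes):
    """Probability of one exact sequence of ok/surprise outcomes."""
    p = 1.0
    for ok in outcomes:
        p *= STEP_OK if ok else (1 - STEP_OK)
    return p


# Every combination of "fine" (True) and "surprise" (False) over all pieces.
scenarios = list(product([True, False], repeat=PIECES))
happy_path = tuple([True] * PIECES)

most_likely = max(scenarios, key=scenario_probability)
p_happy = scenario_probability(happy_path)
p_something_went_wrong = sum(
    scenario_probability(s) for s in scenarios if s != happy_path
)

print(f"Scenarios considered:         {len(scenarios)}")
print(f"Most likely is happy path:    {most_likely == happy_path}")
print(f"Happy path probability:       {p_happy:.1%}")
print(f"Any other scenario, combined: {p_something_went_wrong:.1%}")
```

No single failure scenario beats the happy path, but together they swamp it.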
It’s kinda like distributed computing. 99% reliability doesn’t cut it when we need twenty service calls to work for the web page to load – our page will fail roughly one attempt out of five (1 − 0.99^20 is about 18%). The more steps in our task, the more technologies involved, the worse our best estimates get.
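The service-call version of the same calculation, as a small sketch. The function name is hypothetical, and it assumes the calls fail independently and that any one failure breaks the page.

```python
def page_failure_rate(call_reliability: float, calls: int) -> float:
    """Chance that at least one of `calls` independent service calls fails."""
    return 1 - call_reliability ** calls


failure_rate = page_failure_rate(0.99, 20)
print(f"Page fails {failure_rate:.0%} of the time")

# How the failure rate climbs as the page depends on more calls.
for calls in (1, 5, 20, 50):
    print(calls, f"{page_failure_rate(0.99, calls):.0%}")
```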
Now I don’t feel bad for being wrong all the time.
What can we do about this?
1. Smooth out incidental complexity: some tasks crop up in every feature, so making them very likely to succeed helps every estimate. Continuous integration and continuous deployment spot problems early, so we can deal with them outside of any feature task. Move these ubiquitous subtasks closer to 99%.
2. Flush out essential complexity: the serious delays are usually here. When we write the schema, we notice tricky relationships with other tables. Or the data doesn’t fit well in standard datatypes, or it is going to grow exponentially. The drop-down turns out to require multiple selection, but only sometimes. Sensitive data needs to be encrypted and stored in the token service — any number of bats could fly out of this feature when we dig into it. To cope: look for these problems early. Make an initial estimate very broad, work on finding out which surprises lurk in this feature, then make a more accurate estimate.
Say, for instance, we once hit a feature a lot like this one that took 4 weeks, thanks to hidden essential complexity. Then my initial estimate is 1-4 weeks. (“What? That’s too vague!” says the business.) The range establishes uncertainty. To reduce it, spend the first day designing the schema and getting the details of the user interface, and then re-estimate. Maybe the drop-down takes some detail work, but the rest looks okay: the new estimate is 8-12 days, allowing for we-don’t-know-which minor snafus.
Our brains don’t cope well with low-probability events. The scenario we can predict is the happy path, so that’s what we estimate. Reality is almost never so predictable. Next time you make an estimate, try to think about the possible error states in the development path. When your head starts to hurt, share the pain by giving a nice, broad range.