Safety and Progress

At the Papers We Love conference, Dr Heidi Howard described two requirements for distributed consensus: Safety and Progress.

In distributed consensus, multiple servers decide on some value and then report that value to their clients. Safety means that no two clients ever learn different values; whatever the servers report is consistent. Progress means that clients do eventually learn the value. The system doesn’t get stuck.

There are lots of ways to guarantee safety. The trick is to find ones that allow progress in all circumstances.
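To make the two properties concrete, here is a minimal Python sketch. It is my own illustration, not something from the talk: the function names and the idea of checking a snapshot of what each client has learned are assumptions, and real progress is a claim about "eventually", which a single snapshot can only approximate.

```python
from typing import Optional, Sequence


def is_safe(learned: Sequence[Optional[str]]) -> bool:
    """Safety: no two clients have learned different values."""
    decided = {value for value in learned if value is not None}
    return len(decided) <= 1


def has_progressed(learned: Sequence[Optional[str]]) -> bool:
    """Progress (snapshot view): every client has learned some value."""
    return all(value is not None for value in learned)


# Each list holds what three hypothetical clients have learned so far.
stuck = [None, None, None]   # never answers: safe, but no progress
split = ["a", "b", "a"]      # answers without agreeing: progress, but unsafe
agreed = ["a", "a", "a"]     # the goal: both properties hold
```

The "stuck" case is why safety alone is easy: a system that never answers can never give conflicting answers.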

It reminds me of the conflict between security and development. Security teams are responsible for safety: prevention of bad things. Development teams are responsible for progress: making good things happen.

Separate these two responsibilities and you get deadlock. The obvious ways to get safety prevent progress, and the fastest routes to progress erode safety.

Algorithms, processes, and designs that give you both progress and safety exist, but they’re subtle. You won’t find them by fighting with each other.

In distributed consensus, the algorithm designer holds responsibility for safety and progress. To build the features that advance your business with the security that keeps it safe, put these responsibilities on the same team.

Safety and progress can be at odds. Don’t bake this conflict into your org structure.

Keep them entwined. Guarantee safety while allowing progress.