Limitations of Abstraction, and the Code+Coder symbiosis

Notes from #qconnewyork

I went into programming because I loved the predictability of it. Unlike physics, programs were deterministic at every scale. That’s not true anymore – and it doesn’t mean programming isn’t fun. This came out in some themes of QCon New York 2014.

In the evening keynote, Peter Wang told us we’ve been sitting pretty on a stable machine architecture for a long time, and that party is over. The days of running only on x86 architecture are done. We can keep setting up our VMs and pretending, or we can pay attention to the myriad devices cropping up faster than people can build strong abstractions on top of them. The Stable Dependencies Principle is crumbling under us.

Really we haven’t had a good, stable architecture to build on since applications moved to the web, as Gilad Bracha reminded us in the opening keynote. JavaScript has limitations, but even more, the different browsers keep programmers walking on eggshells trying not to break any of them. The responsibility of a developer is no longer just their programming language. They need to know where their code is running and all the relevant quirks of the platform. “It isn’t turtles all the way down anymore. We are the bottom turtle, or else the turtle under you eats your lunch.” @pwang

A developer’s scope is widening as well as deepening. Dianne Marsh’s keynote and Adrian Cockcroft’s session about how services are implemented at Netflix emphasized developer responsibility through the whole lifecycle of the code. A developer’s job ends when the code is retired from production. Dianne’s mantra of “Know your service” puts the power to find a problem in the same hands that can fix it. Individual developers implement microservices, deploy them gradually to production, and monitor them. Developers understand the business context of their work, and what it means to be successful.

It’d be wonderful to have all the tech and business knowledge in one head. What stops us is: technical indigestion. Toooo much information! The Netflix solution to this is: great tooling. When a developer needs to deploy, it’s their job to know what the possible problems are. It is the tool’s job to know how to talk to AWS, how to find out the status of running deployments, how to reroute between old-version and new-version deployments. The tool gives all the pertinent information to the person deploying, and the person makes the decisions. Enhanced cognition, just like Engelbart always wanted (from @pwang’s keynote).
“When you have automation plus people, that’s when you get something useful.” – Jez
“Free the People. Optimize the Tools.” – Dianne Marsh

Those gradual rollouts, they’re one of the new possibilities now that machines aren’t physical resources in data centers. We can deploy with less risk, because rollback becomes simply a routing adjustment. Lowering the impact of failure lets us take more risks, make more changes, and improve faster without impacting availability. To learn and innovate, do not prevent failure! Instead, detect it and stay on your

This changed deployment process is an example of something Adrian Cockcroft emphasizes: question assumptions. What is free that used to be expensive? What can we do with that, that we couldn’t before? One of those is immutable code, where every version of a service stays available until someone makes the decision to take it down. And since you’re on pager duty for all your deployed code, there’s incentive to take it down.
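To make "rollback becomes simply a routing adjustment" concrete, here is a minimal sketch in Python. It is not how Netflix's tooling works; the `Router` class, its method names, and the version labels are all made up for illustration. The idea it shows: every version stays deployed, and a rollout or rollback only changes the share of traffic each version receives.

```python
import random


class Router:
    """Hypothetical traffic router over side-by-side service versions.

    Versions are never removed here; only their traffic weights change.
    """

    def __init__(self, weights, rng=None):
        # weights: mapping of version label -> share of traffic
        self.weights = dict(weights)
        self.rng = rng or random.Random()

    def shift(self, version, share):
        """Gradual rollout: give `version` this share, split the rest evenly."""
        others = [v for v in self.weights if v != version]
        remainder = (1.0 - share) / len(others) if others else 0.0
        for v in others:
            self.weights[v] = remainder
        self.weights[version] = share

    def rollback_to(self, version):
        """Rollback: send all traffic to `version`; other versions stay deployed."""
        if version not in self.weights:
            raise KeyError(version)
        for v in self.weights:
            self.weights[v] = 1.0 if v == version else 0.0

    def pick_version(self):
        """Choose which version serves the next request."""
        versions = list(self.weights)
        shares = [self.weights[v] for v in versions]
        return self.rng.choices(versions, weights=shares, k=1)[0]


router = Router({"v1": 1.0, "v2": 0.0})
router.shift("v2", 0.1)   # canary: 10% of requests go to the new version
router.rollback_to("v1")  # monitoring looks bad: route everything back
```

Nothing is redeployed on rollback, which is why the impact of a bad release can stay small.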

When developers are responsible for the code past writing it, through testing and deploy and production, this completes a feedback loop. Code quality goes up, because the consequences of bugs fall directly on the person who can prevent them. This is a learning opportunity for the developer. It’s also a learning opportunity for the code! Code doesn’t learn and grow on its own, but widen the lines. Group the program in with the programmer into one learning organism, a code+coder symbiote. Then the code in production, as its effects are revealed by monitoring, can teach the programmer how to make it better in the next deployment.

Connection between code and people was the subject of Michael Feathers’ talk. Everyone knows Conway’s Law: architecture mirrors the org chart. Or as he phrases it, communication costs drive structure in software. Why not turn it to our advantage? He proposed structuring the organization around the needs of the code. Balance maintaining an in-depth knowledge base of each application against getting new eyes on it. Boundaries in the code will always follow the communication boundaries of social structure, so divide teams where the code needs to divide, by organization and by room. Eric Evans also suggested using Conway’s Law to maintain boundaries in the code. Both of these talks also emphasized the value of legacy code, and also the need for renewal: as the people turn over, so must the code. Otherwise that code+coder symbiosis breaks down.

Eric Evans emphasized: When you have a legacy app that’s a Big Ball of Mud, and you want to work on it, the key is to establish boundaries. Use social structure to do this, create an Anti-Corruption Layer to mediate between the legacy code and the new code, and consider using a whole new programming language. This discourages accidental dependencies, and (as a bonus) helps attract good programmers.
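For anyone who hasn't met an Anti-Corruption Layer, here is a small sketch of the idea in Python. It is my own illustration, not code from the talk: the legacy field names (`CUST_NO`, `STAT_CD` and friends), the `legacy_lookup` stand-in, and the `CustomerGateway` class are all invented. The point is that the new code only ever sees its own model, and one translator owns all knowledge of the legacy format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """The new bounded context's model of a customer."""
    customer_id: int
    full_name: str
    active: bool


def legacy_lookup(cust_no):
    """Stand-in for a call into the Big Ball of Mud (invented record format)."""
    return {
        "CUST_NO": str(cust_no).zfill(6),
        "NM_LAST": "LOVELACE",
        "NM_FIRST": "ADA",
        "STAT_CD": "A",
    }


class CustomerGateway:
    """Anti-Corruption Layer: the only place that knows the legacy format."""

    def __init__(self, lookup=legacy_lookup):
        self._lookup = lookup

    def find(self, customer_id):
        raw = self._lookup(customer_id)
        # Translate legacy conventions into the new model's terms.
        return Customer(
            customer_id=int(raw["CUST_NO"]),
            full_name=f'{raw["NM_FIRST"].title()} {raw["NM_LAST"].title()}',
            active=raw["STAT_CD"] == "A",
        )


customer = CustomerGateway().find(42)
```

If the legacy side changes its record layout, only `CustomerGateway` has to change, which is what keeps accidental dependencies from creeping across the boundary.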

Complexity is inevitable in software; bounded contexts are part of the constant battle to keep it from eating us. “We can’t eliminate complexity any more than a physicist can eliminate gravity.” (@pwang)

In code and with people, successful relationships are all about establishing boundaries. At QCon it was a given that people are writing applications as groups of services, and probably running them in the cloud. A service forms a bounded context; each service has its internal model, as each person has a mental model of the world. Communications between services also have their own models. Groups of services may have a shared interstitial context, as people in the same culture have established protocols. (analogy mine) No one model covers all of communications in the system. This was the larger theme of Eric Evans’ presentation: no one model, or mandate, or principle applies everywhere. The first question of any architecture direction is “When does this apply?”

As programmers and teams are going off in their own bounded contexts doing their own deployments, Jez Humble emphasized the need to come together, or at least bring the code together, daily. You can have a separate repo for every service, like at Netflix, or one humongoid Perforce repository for everything, like at Google. You can work on feature branches or straight on master. The important part is: everyone commits to trunk at the end of the day. This forces breaking features into smaller pieces; they may hide behind feature flags or unused APIs, but they’re in trunk. And of course that feeds into the continuous deployment pipeline. Prioritize keeping that trunk deployable over doing new work. And when the app is always deployable, a funny thing happens: marketing and developers start to collaborate. There’s no feature freeze, no negotiating of what’s going to be in the next release. As developers take responsibility for the post-coding lifecycle, they gain insight into the pre-coding part too. More learning can happen!
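Hiding unfinished work behind a feature flag can be as simple as the sketch below. This is an assumed, minimal setup of my own (the `FLAGS` dict, the `new_checkout` flag, and the discount rule are all invented), not anything shown in the talk. Real systems usually read flags from configuration or a flag service.

```python
# Hypothetical flag store; off by default so trunk stays deployable.
FLAGS = {"new_checkout": False}


def is_enabled(flag, flags=FLAGS):
    """Unknown flags count as off, so half-built features stay hidden."""
    return flags.get(flag, False)


def new_checkout_total(prices_cents):
    """In-progress feature: 10% off orders of 100.00 or more (invented rule)."""
    total = sum(prices_cents)
    if total >= 10_000:
        total -= total // 10
    return total


def checkout_total(prices_cents, flags=FLAGS):
    """Committed to trunk daily; the new path only runs when the flag is on."""
    if is_enabled("new_checkout", flags):
        return new_checkout_total(prices_cents)
    return sum(prices_cents)


current = checkout_total([6_000, 5_000])                          # flag off
preview = checkout_total([6_000, 5_000], {"new_checkout": True})  # flag on
```

The new code path ships with every deployment but changes nothing for users until someone flips the flag, so there is nothing to freeze or negotiate before a release.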

As developers start to follow the code more closely, organizational structure can’t hold to a controlled hierarchy. Handoffs are the enemy of innovation, according to Adrian. The result of many independent services is an architecture diagram that can only be observed from production monitoring, and it looks like the Death Star:

I wonder how long before HR diagrams catch up and look like this too?

Dianne and Jez both used “Highly aligned, loosely coupled” to describe code and organization. Leadership provides direction, and the workers figure out how to reach the target by continually trying things out. Managers enable this experimentation. If the same problem is solved in multiple ways, that’s a win: bring the results together and learn from both. No one solution applies in all contexts.

Overall, QCon New York emphasized: question what you’re used to. Question what’s under you, and question what people say you can’t do. Face up to realities of distributed computing: consistency doesn’t exist, and failure is ever present. We want to keep rolling through failure, not prevent it. We can do this by building tools that support careful decision making. If we each support our code, our code will support our work, and we can all improve.

This post draws from talks by Peter Wang, Dianne Marsh, Adrian Cockcroft, Eric Evans, Michael Feathers, Jez Humble, Ines Sombra, Richard Minerich, Charles Humble. It also draws from my head.
Most of the talks will be available on InfoQ eventually.