Control Loops and Rice Cookers


Rice cookers are fascinating machines. I’ve owned one for years, as rice is a significant part of my regular diet, and it completely removes the stress of preparing rice. They also operate on a simple principle that can help us operate cloud infrastructure – the control loop.

Tools for a Culture of Writing


One of the hardest things we do, as humans, is try to communicate what is going on in our minds to each other. With significant room for misunderstanding, biases, assumptions and cultural differences, communicating with other engineers (or with stakeholders) can appear fraught. However, there are tools we can leverage to make ourselves understood, and to smooth the passage of information to make sure it gets to the right people at the right time.

Point and Call


It’s 2AM. You’re paged to respond to a failing set of components that you are the Subject Matter Expert (SME) for. Sleepy, you load up the playbook for when the SplineReticulatorBlocked alert has gone off, and start executing. The Incident Commander (IC) is vaguely aware of what you are doing, and checks in now and then.

Unreasonably Effective Patterns


Much of my current job is maintaining and enhancing control planes for Heroku’s managed data services. This post explores three patterns used to reduce operational burden and increase system safety and resiliency: state machines (and their associated state-transition tables), transducers, and re-entrant and idempotent operations.

Everything I Know About Operations, I Learned From NHS 111


Ever heard someone say “It’s only software/money/<trivial thing>, not life or death” in the context of incidents at your company? Although that is mostly true, I want to talk about a time in my career when sometimes, just sometimes, it really was life or death, and how it shaped my approach to operating and owning services.