Operational Software Design from Theo Schlossnagle



Operational Software Design; Ignorance Buys Pain


There are two common tenets of operations: "hell is other people's software" and "better software is produced by those forced to operate it." In this session I'll take a fly-by tour of two pieces of software that were built from the ground up for operability, applying the hard-earned lessons of their inoperable predecessors: a distributed datastore replacing PostgreSQL, and a message queue replacing RabbitMQ.


We'll discuss specific design aspects that increase resiliency in the event of failure and observability at all times.