A CoPE’s Guide to Alert Management
Alerts are a perennial topic, and a CoPE will need to engage with them. The bounds of this problem space are formed by two types...
The CoPE and Other Teams, Part 2: Custom Instrumentation and Telemetry Pipelines
The previous post laid out the basic idea of instrumentation and how OpenTelemetry’s auto-instrumentation can get teams started. However, you can’t rely only on auto-instrumentation....
The CoPE and Other Teams, Part 1: Introduction & Auto-Instrumentation
The CoPE is made to affect, meaning change, how things work. The disruption it produces is a feature, not a bug. That disruption pushes things...
Staffing Up Your CoPE
Getting the right people working in the CoPE is crucial to success because these change agents must limber up the organization and promote the flexibility...
Independent, Involved, Informed, and Informative: The Characteristics of a CoPE
In part one of our CoPE series, we compared the CoPE to safety departments. David Woods says that those safety departments must be independent, involved,...
Establishing and Enabling a Center of Production Excellence
Software is in a crisis. This is nothing new. Complex distributed systems are perpetually in a state far from equilibrium, operating in what Richard Cook...
Evolving by Involving
In this post, we’re going to lay out the guiding principle that unifies the diverse world of CS as we see it—and show how we...
Autocatalytic Adoption: Harnessing Patterns to Promote Honeycomb in Your Organization
When an organization signs up for Honeycomb at the Enterprise account level, part of their support package is an assigned Technical Customer Success Manager. As...
Sense and Signals
Part of understanding a complex, distributed software system as a socio-technical system means taking seriously that the signals the stewards receive aren’t just chatter....