Ask Miss O11y: When should you delete instrumentation?
By Charity Majors | April 29, 2022
When do you delete instrumentation?
You delete instrumentation when you delete code. Other than that, if you’re doing things right: almost never. One of the best things about Honeycomb is that it completely transforms the incentives around preserving instrumentation.
With metrics-based tools, the most valuable metrics are always custom metrics. You need to define a custom metric for literally any question you might ever want to ask about the app and its utilization or performance. So companies tend to accrue a ton of them.
Unfortunately, the cost scales up at least linearly as you add custom metrics. And metrics with relatively high(er) cardinality can be blindingly expensive, like tens of thousands of dollars per year, PER METRIC. That is the reason why most instrumentation gets removed: when the cost of storing it outstrips the likelihood it will ever be useful.
But Honeycomb isn't built on metrics. It's built on arbitrarily wide structured data blobs, aka events: a single wide event per request, per service. The Honeycomb equivalent of "defining a new custom metric" is adding a new dimension to the event.
You can add as many dimensions as you want to your events: appending more data to them is free(!!). It’s literally free for you, because we charge by the event, and effectively free for us, because appending a few bits to the event store in S3 is goddamn cheap. There's no real need to ever go back and reap rarely-used dimensions. You never know what wisp of context might turn out to be relevant in the future, so why not keep it all?
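To make the "dimension instead of custom metric" idea concrete, here is a minimal sketch in plain Python. It deliberately does not use any Honeycomb SDK; the event is just a dict, and every field name and value below (`user.id`, `cart.item_count`, and so on) is a made-up example rather than a required schema.

```python
def build_request_event(method, path, status, duration_ms):
    """Start one wide event for one request handled by one service."""
    return {
        "http.method": method,
        "http.path": path,
        "http.status_code": status,
        "duration_ms": duration_ms,
    }


event = build_request_event("GET", "/cart", 200, 41.7)

# Where a metrics tool would need a new custom metric per question,
# here each new question is just one more key on the same event.
# All field names and values are hypothetical.
event["user.id"] = "u-48213"  # a high-cardinality value, stored as a plain field
event["cart.item_count"] = 3
event["feature_flag.new_checkout"] = True
event["build.id"] = "2022-04-29.3"

# Still exactly one event for this request, however wide it gets.
events_sent = [event]
print(len(events_sent), "event,", len(event), "dimensions")
```

The point of the sketch is the last line: the number of events stays at one per request no matter how many fields you append, and the event count is the unit you are billed on.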
A maturely instrumented service in Honeycomb tends to have 300-500 dimensions per event; that being said, we have a hard cap of 2,000 dimensions per event or 10,000 per dataset, mostly as a safeguard to catch folks who accidentally mix up their keys and values.
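Here is a rough sketch of why a cap like that catches swapped keys and values. It is a local illustration only, not how Honeycomb enforces its limits; the two constants come from the numbers above, while the function names and sample data are hypothetical.

```python
MAX_FIELDS_PER_EVENT = 2_000     # per-event cap quoted above
MAX_FIELDS_PER_DATASET = 10_000  # per-dataset cap quoted above


def field_counts(events):
    """Return (size of the widest event, distinct field names across all events)."""
    dataset_fields = set()
    widest = 0
    for event in events:
        widest = max(widest, len(event))
        dataset_fields.update(event.keys())
    return widest, len(dataset_fields)


def within_caps(events):
    """True if the events respect both caps."""
    widest, distinct = field_counts(events)
    return widest <= MAX_FIELDS_PER_EVENT and distinct <= MAX_FIELDS_PER_DATASET


# Correct instrumentation: one stable key, many different values.
good = [{"user.id": f"u-{i}"} for i in range(20_000)]

# The mix-up: the value is used as the key, so every user mints a new field.
swapped = [{f"u-{i}": "user.id"} for i in range(20_000)]

print(field_counts(good), within_caps(good))
print(field_counts(swapped), within_caps(swapped))
```

With keys and values the right way round, the dataset has a single field however many users there are. With them swapped, the number of distinct field names grows with the number of users and runs past the dataset cap.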
When you’re instrumenting your code, you should keep your team (and your future self) in the back of your mind. How will they debug this code in the future? What details might help them find an outlier or an edge case?
The longer you’ve been working on this code, the richer your instrumentation set should become. So delete that which is no longer relevant. But keep the rest.
BTW: if you’re interested in seeing Honeycomb in action but don’t want to sit through a demo or book a meeting with sales just yet, check out our interactive Honeycomb Play, where you can see the product in three different use cases. If you want to explore even further, don’t forget we have a free tier (always free, not just for two weeks) that you can start pulling your data into and exploring.