The New Rules of Sampling
By Rachel Perkins | Last modified on May 9, 2019

One of the most common questions we get at Honeycomb is about how to control costs while still achieving the level of observability you need to debug, troubleshoot, and understand what is happening in production. Historically, the answer from most vendors has been to aggregate your data: to offer you calculated medians and averages rather than the deep context you gain from having access to the actual events coming from your production environment.
This is exactly what it sounds like: a poor tradeoff made in the name of performance. With classic metrics and APM tools, you can never get back to the raw events that are your source of truth, which means you'll regret that choice when debugging a complex, distributed system. When you're working with metrics, the data must be numeric, and any other type of data has to be stored as metadata, either attached to the datapoints themselves or out-of-band in some way ("tags", "dimensions", etc.) -- in other words, more limits on what you can store and retrieve.
Honeycomb's answer is: Sample your data.
But, you say, sampling means I'm throwing away some (or a lot) of my data. How is that OK? I won't know what I am not seeing, right?
What if you had more flexibility? What if sampling offered a greater breadth of options than just "send a percentage of my data"?
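To make that flexibility concrete, here is a minimal sketch of what per-key sampling can look like, as opposed to a flat "keep 1 in N" rate. The keys, thresholds, and rates below are invented for illustration; the point is that you can keep every rare, interesting event (like errors) while thinning high-volume routine traffic, and record the sample rate on each kept event so counts can still be re-weighted accurately downstream.

```python
import random

# Hypothetical per-key sample rates (illustrative only):
# keep every error, 1 in 20 slow requests, 1 in 200 routine successes.
SAMPLE_RATES = {
    "error": 1,
    "slow": 20,
    "ok": 200,
}

def classify(event):
    """Bucket an event into one of the illustrative sampling keys."""
    if event.get("status_code", 200) >= 500:
        return "error"
    if event.get("duration_ms", 0) > 1000:
        return "slow"
    return "ok"

def maybe_send(event, send):
    """Keep the event with probability 1/rate, recording the rate so each
    kept event can stand in for `rate` original events during analysis."""
    rate = SAMPLE_RATES[classify(event)]
    if random.randint(1, rate) == 1:
        event["sample_rate"] = rate
        send(event)

# Every error survives sampling; routine successes are heavily thinned.
maybe_send({"status_code": 503, "duration_ms": 87}, print)
maybe_send({"status_code": 200, "duration_ms": 12}, print)
```

This is only a sketch of the idea, not Honeycomb's implementation, but it shows why "sampling" doesn't have to mean "losing the events you care about."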
Find out what's possible in The New Rules of Sampling.