
The New Rules of Sampling

By Rachel Perkins  |   Last modified on May 9, 2019

One of the most common questions we get at Honeycomb is how to control costs while still achieving the level of observability needed to debug, troubleshoot, and understand what is happening in production. Historically, the answer from most vendors has been to aggregate your data: to offer you calculated medians, means, and averages rather than the deep context you gain from having access to the actual events coming from your production environment.

This is exactly what it sounds like: a poor tradeoff for performance. With classic metrics and APM tools, you can never get back to the raw event source of truth, which means you'll regret that choice when debugging a complex, distributed system. When you're working with metrics, the data must be numeric, and any other type of data must be stored as metadata, either attached to the datapoints themselves or out-of-band in some way ("tags", "dimensions", etc.). In other words: more limits on what you can store and retrieve.

Honeycomb's answer is: Sample your data.

But, you say, sampling means I'm throwing away some (or a lot) of my data. How is that OK? I won't know what I am not seeing, right? 

What if you had more flexibility? What if sampling offered a greater breadth of options than just "send a percentage of my data"?
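To make the contrast concrete, here is a minimal sketch of what "more flexibility" can look like. Instead of keeping a flat percentage of all traffic, you can choose a sample rate per event based on its attributes (keep every error, heavily downsample noisy health checks) and record that rate on each kept event so downstream aggregations can weight it correctly. The attribute names, rates, and `sample_rate` field below are illustrative assumptions, not Honeycomb's actual implementation or configuration:

```python
import random

def choose_rate(event):
    """Pick a sample rate based on event attributes (illustrative rules).

    A rate of N means "keep roughly 1 in N such events."
    """
    if event.get("status_code", 200) >= 500:
        return 1    # keep every error -- these are rare and valuable
    if event.get("path") == "/healthz":
        return 100  # keep 1 in 100 noisy health checks
    return 10       # default: keep 1 in 10 ordinary events

def sample(events):
    """Keep a weighted subset of events, tagging each with its rate.

    Because each kept event carries its sample_rate, counts and sums
    can be reconstructed later by weighting each event by that rate.
    """
    kept = []
    for event in events:
        rate = choose_rate(event)
        if random.random() < 1.0 / rate:
            event["sample_rate"] = rate
            kept.append(event)
    return kept
```

With a scheme like this, a sampled dataset can still answer "how many requests failed?" accurately: every error survives at rate 1, while the bulk of routine traffic is thinned out, and the stored rates let you multiply back up to estimated totals.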

Find out what's possible in The New Rules of Sampling.
