Refine Your Observability Experience at Scale
By George Miranda | Last modified on March 8, 2021
Today, we announced that Refinery is now generally available. With Refinery, it’s now easy to highlight the critical debugging data you need and to stop paying for the rest. Refinery is a sampling solution that lets you control resource costs at scale without sacrificing data fidelity.
Support for Refinery is now also included in Honeycomb Enterprise plans. Enabling a high-fidelity debugging experience while reducing resource costs ensures that enterprises have what they need to achieve production excellence at scale.
At a large enough scale (think: tens of billions of events per month), the costs incurred to collect, transmit, and save every single event that your system generates dramatically outweigh the benefits that data is intended to provide. Refinery is Honeycomb’s sampling solution to help you get the most out of your data. It improves your observability experience at scale by providing intuitive yet sophisticated methods to keep only the data that you want and stop paying for the rest.
For the vast majority of applications, the incremental cost of collecting observability data is incredibly minimal and unobtrusive. But at that hundreds-of-billions-per-year scale, suddenly every byte generated can add up to immense amounts of data and traffic to manage. When your stream of observability data instead turns into a raging flood, there’s a tradeoff to consider: Exactly how much data do you need to achieve the desired result, without breaking other systems or the bank? Liz Fong-Jones unpacks those concerns in her post about refining your sampling data.
Let’s be clear. You don’t need to sample your data with Honeycomb. We handle some of the world’s biggest and most powerful production applications (send us your biggest traffic loads!). But some of you who think about production event volume at that aforementioned scale might find yourselves wanting to sample your data once you hit those tradeoff limits. For you, Refinery makes tackling the tradeoff easy.
The reality of most applications is that the vast majority of their events are successful and virtually identical. In order to debug effectively, what’s needed is a representative sample of successful (or “good”) events, against which to compare the unsuccessful (or “bad”) events. Until now, the hard part has been figuring out how to keep enough of the right “bad” events while only occasionally sampling the “good” events. To dig into that problem in more depth than Liz’s post above, check out the “Cheap and Accurate Enough: Sampling” chapter of our upcoming O’Reilly book.
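To make the “keep the bad, occasionally keep the good” idea concrete, here is a minimal Python sketch. It is not Refinery's implementation. The event shape (a dict with a `status` field), the `good_rate` parameter, and both function names are assumptions made up for illustration. Each kept event records the rate it was sampled at, so totals can still be estimated afterward.

```python
import random


def sample_events(events, good_rate=100, rng=None):
    """Keep every failed event and roughly 1 in `good_rate` successful ones.

    Assumption: an event is a dict with an integer "status" field, and a
    status of 500 or above counts as "bad".
    """
    rng = rng or random.Random()
    kept = []
    for event in events:
        if event["status"] >= 500:
            # Bad events are rare and valuable: always keep them.
            kept.append({**event, "sample_rate": 1})
        elif rng.randrange(good_rate) == 0:
            # Each kept good event stands in for `good_rate` similar events.
            kept.append({**event, "sample_rate": good_rate})
    return kept


def estimated_total(kept):
    """Estimate the original event count by summing the sample rates."""
    return sum(event["sample_rate"] for event in kept)
```

Feeding this 10,000 successful events and 50 failures keeps all 50 failures plus roughly 100 successes, and `estimated_total` stays close to the original volume because each kept success is weighted by its sample rate.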
Refinery supports several sampling methods by default, including both dynamic and tail sampling. Dynamic sampling helps you identify and keep the data you care about most by doing things like automatically capturing rare traffic frequently and frequent traffic rarely, or automatically adjusting your sample rates in response to shifting traffic patterns. Refinery helps you make decisions using head-based factors like endpoint call frequency, or tail-based factors like request latency and status code. You can mix and match sampling methods, verify the results in advance with a dry-run mode, and lean on Honeycomb's enterprise support as needed to get exactly the configuration you need. When you combine that with our predictable event-based pricing with no hidden fees, using Refinery means you can easily control your spend without sacrificing data fidelity.
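As a rough illustration of “rare traffic frequently, frequent traffic rarely,” the sketch below makes a tail-based decision over a batch of completed traces. It is a simplified stand-in, not Refinery's algorithm or configuration format. The span fields (`endpoint`, `status`), the `target_per_key` parameter, and the function names are all hypothetical, and a real sampler would work over a streaming window rather than a finished batch.

```python
import random
from collections import Counter


def trace_key(spans):
    """Summarize a completed, non-empty trace by its first span's endpoint
    and the worst status code seen anywhere in the trace."""
    return (spans[0]["endpoint"], max(span["status"] for span in spans))


def dynamic_sample(traces, target_per_key=10, rng=None):
    """Keep about `target_per_key` traces for each distinct key.

    `traces` maps a trace ID to its list of spans. Keys seen often get a
    high sample rate (kept rarely); keys seen rarely get a rate of 1
    (always kept).
    """
    rng = rng or random.Random()
    keys = {trace_id: trace_key(spans) for trace_id, spans in traces.items()}
    counts = Counter(keys.values())
    kept = {}
    for trace_id, spans in traces.items():
        rate = max(1, counts[keys[trace_id]] // target_per_key)
        if rng.randrange(rate) == 0:
            # Keep or drop the whole trace together, and record the rate
            # on every span so counts can be re-weighted later.
            kept[trace_id] = [{**span, "sample_rate": rate} for span in spans]
    return kept
```

With 1,000 identical successful traces and 5 failing ones, the successful traces get a sample rate of 100 while every failing trace is kept at a rate of 1. Because the decision is made per trace, no trace ends up with only some of its spans.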
Refinery allows us to save costs without compromising what we know about our systems. We don’t have to turn off integrations that will reduce our understanding of what is happening in our stack or settle for simple sampling logic that would miss details and traces we need. We're still able to use our thorough application instrumentation, we get reliable tracing, and we reduce our spend.
Refinery has reduced our event volume to 25% of what it used to be, and that reduced traffic is also more consistent on a daily basis. Now our team has confidence that we can use Honeycomb even more in our services to better understand what's happening, without worrying so much about our spend.
As of today, Refinery is generally available (GA) and ready for anyone to use. Refinery runs on your infrastructure, enabling you to reduce your overall spend by decreasing your cloud provider network egress traffic and your Honeycomb events-per-month (EPM) volume. We put you in full control: you determine the level of spend that’s right for your organization, with Honeycomb right behind you at each step. Support for Refinery is included as part of Honeycomb’s Enterprise plan.
Achieving production excellence at enterprise scale
As the pioneer that first brought observability tooling to market, Honeycomb is always looking for new ways to democratize the tools and expertise that used to be reserved for only the world’s most elite companies. Before Honeycomb, observability was only feasible in hyper-scale settings to tackle seemingly impossible problems. Previously, commercial solutions for sampling have been proprietary and closed-source. Now, with the Refinery GA, Honeycomb makes it easy for anyone to understand, implement, and control an effective sampling strategy.
Similarly, observability enables a set of capabilities that can help anyone—not just the world’s elite organizations—achieve production excellence. Honeycomb Enterprise plans bundle those capabilities into a cohesive set of features for companies that operate at enterprise scale. In addition to Refinery, those features include:
- Debuggable Service-Level Objectives. SLOs are useful measures to help align technical and business stakeholders around focusing on customer experience. Unlike SLO tools that only provide numbers and alerts, Honeycomb’s SLOs are debuggable so that teams can identify and fix the issues affecting their error budgets from within a single interface.
- Secure Tenancy. Honeycomb’s patented approach to data privacy ensures that none of your plaintext data ever touches Honeycomb’s infrastructure, without affecting functionality. Organizations can ensure off-premises data security compliance with a solution that is transparent to end users.
- Exclusive training and onboarding. Also available as of today, our Onboarding Accelerator Packages let you choose how best to engage our Customer Success team. We’ll work with your specific organizational needs to get you started and set up to scale observability adoption across various teams.
- Top-priority support for engineers by engineers. Every day, we use Honeycomb—including Refinery—to manage Honeycomb. Whether you’re just getting started or deep in the weeds of sampling strategies, our industry-leading observability expertise means that when you engage with support, we know what you’re going through and we know how to help.
Honeycomb Enterprise builds on the capabilities in our other Honeycomb plan tiers, such as toggling trace views without switching context, quickly diagnosing and understanding outliers with BubbleUp, and getting answers within seconds to queries across billions of rows and hundreds of high-cardinality fields. On top of those, Honeycomb Enterprise enables practices that help teams foster a culture of production excellence at enterprise scale.
Production excellence isn’t just reserved for the world’s elite anymore. By adding Refinery’s sampling features to this roster of capabilities, Honeycomb continues its mission to further democratize access to capabilities that foster elite levels of performance for all software delivery and operations teams.
Try it for yourself
As of today, we’re also making it easier than ever to try this for yourself. First, for a limited time, you can register for a free 30-day trial of Honeycomb Enterprise.
Next, if you want a closer look at what production excellence means, register to attend our upcoming webinar, “Achieving Production Excellence at Enterprise Scale,” on March 17, where our own Charity Majors and Liz Fong-Jones will join RedMonk’s James Governor to examine how the practices that shape production excellence are implemented in an enterprise environment.