Honeycomb Blog

The Price is Right

Here at the hive, we’re working on something that isn’t code or new features(!), but is a big part of our business nonetheless: figuring out the best way to help people understand how we price Honeycomb and the built-in assumptions we make about how they use it. There are some issues (pricing is hard, film at 11!), but we’ve found the challenges to be more about words than about technology or sales process. Let me explain. We’ve heard from people who’ve been confused by the way we price and bounced without even trying Honeycomb because they think we are either…

Read More...

How Honeycomb Uses Honeycomb Part 8: A Bee’s Life

This post continues our dogfooding series from How Honeycomb Uses Honeycomb, Part 7: Measure twice, cut once: How we made our queries 50% faster…with data. To understand how Honeycomb uses Honeycomb at a high level, check out our dogfooding blog posts first — they do a better job of telling the story of problems we’ve solved with Honeycomb. This blog post peeks under the hood to go into greater detail around the mechanics of what we track, how we track it all, and how we think about the sorts of questions we want to answer. We’ve built up a culture…

Read More...

Metrics: not the observability droids you’re looking for

I went to Monitorama last year for the first time. It was great; I had a terrific time. But I couldn’t help but notice how speaker after speaker in talk after talk spent time either complaining about the limitations of their solutions, or proudly/sadly showing off whatever terrible hacks they had done to get around the limitations of how events were being stored on disk. I went to Strange Loop a couple of weeks ago, and the same thing happened in all the talks I saw or heard of that were about monitoring- or analytics-related topics. People were saying things…

Read More...

Reflections on Monitorama 2017: From the Metrics We Love to the Events We Need

There were a bunch of talks at Monitorama 2017 that could be summed up as “Let me show you how I built this behemoth of a metrics system, so I could safely handle billions of metrics.” I saw them, and they were impressive creations, but they still made me a little sad inside. The truth is that most of us don’t actually need billions of metrics. Sure, there are the Googles and Facebooks (and legit – one of these presentations was from Netflix, who actually does need billions of metrics), but most of us don’t really need billions of metrics….

Read More...

Instrumenting High Volume Services: Part 2

This is the second of three posts focusing on sampling as part of your toolbox for handling services that generate large amounts of instrumentation data. The first one was an introduction to sampling. Sampling is a simple concept for capturing useful information about a large quantity of data, but it can manifest in many different ways, varying widely in complexity. Here in Part 2, we’ll explore techniques to handle simple variations in your data, introduce the concept of dynamic sampling, and begin addressing some of the harder questions raised in Part 1.

Constant Sampling

This code should look familiar from Part…
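
For flavor, here’s a minimal sketch of what constant sampling looks like in Go. The Event type, send helper, and sampleRate value are stand-ins for illustration, not the actual code from the series:

```go
package main

import (
	"fmt"
	"math/rand"
)

// A hypothetical event: one request's worth of instrumentation data.
type Event map[string]interface{}

const sampleRate = 4 // keep roughly 1 in 4 events

// shouldKeep implements constant sampling: every event has the same
// 1/sampleRate chance of being recorded, regardless of its contents.
func shouldKeep() bool {
	return rand.Intn(sampleRate) == 0
}

func send(e Event) {
	// The stored event carries its sample rate so that query-time
	// counts can be scaled back up: each kept event "stands in" for
	// sampleRate events.
	e["sample_rate"] = sampleRate
	fmt.Println("sending:", e)
}

func main() {
	for i := 0; i < 20; i++ {
		e := Event{"request_id": i, "status": 200}
		if shouldKeep() {
			send(e)
		}
	}
}
```

The important detail is that the sample rate travels with each kept event, so downstream analysis can weight it accordingly.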

Read More...

Instrumenting High Volume Services: Part 1

This is the first of three posts focusing on sampling as part of your toolbox for handling services that generate large amounts of instrumentation data. Recording tons of data about every request coming into your service is easy when you have very little traffic. As your service scales, the impact of measuring its performance can cause its own problems. There are three main ways to mitigate this problem: measure fewer things, aggregate your measurements before submitting them, or measure a representative portion of your traffic. Each method has its place; this series of posts focuses on…
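
As a rough illustration of the second approach — aggregating measurements before submitting them — here’s a small Go sketch. The Aggregator type and its flush behavior are hypothetical, not code from the posts:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// A hypothetical in-process aggregator: instead of emitting one
// measurement per request, it counts requests per status code and
// flushes a single rolled-up value on an interval.
type Aggregator struct {
	mu     sync.Mutex
	counts map[int]int
}

func NewAggregator(flushEvery time.Duration) *Aggregator {
	a := &Aggregator{counts: make(map[int]int)}
	go func() {
		for range time.Tick(flushEvery) {
			a.flush()
		}
	}()
	return a
}

// Record is called on every request; it only increments a counter.
func (a *Aggregator) Record(status int) {
	a.mu.Lock()
	a.counts[status]++
	a.mu.Unlock()
}

// flush submits the pre-aggregated counts and resets them.
func (a *Aggregator) flush() {
	a.mu.Lock()
	defer a.mu.Unlock()
	fmt.Println("submitting rollup:", a.counts)
	a.counts = make(map[int]int)
}

func main() {
	agg := NewAggregator(1 * time.Second)
	for i := 0; i < 1000; i++ {
		agg.Record(200)
	}
	time.Sleep(1500 * time.Millisecond)
}
```

The trade-off is the subject of the pre-aggregated metrics series below: the rollup is cheap to store, but the individual requests behind it are gone.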

Read More...

The Problem with Pre-aggregated Metrics: Part 3, the “metrics”

This is the third of three posts focusing on the limitations of pre-aggregated metrics. The first one explained how, by pre-aggregating, you’re tightly constrained when trying to explore data or debug problems; the second one discussed how implementation and storage constraints further limit what you can do with rollups and time series. Finally, we arrive at discussing “metrics.” Terminology in the data and monitoring space is incredibly overloaded, so for our purposes, “metrics” means: a single measurement, used to track and assess something over a period of time. While simple to reason about (“this number is the overall error rate…
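
As a toy illustration of that definition, here’s a Go sketch that reduces a window of requests to one such number, an error rate. The Request type and its fields are made up for the example:

```go
package main

import "fmt"

// A hypothetical request record with the context an event would carry.
type Request struct {
	Status   int
	Endpoint string
	UserID   string
}

// errorRate reduces a window of requests to a single number — a
// "metric" in the sense above. The endpoint and user context that
// would let you ask *which* requests failed is discarded along the way.
func errorRate(window []Request) float64 {
	if len(window) == 0 {
		return 0
	}
	errors := 0
	for _, r := range window {
		if r.Status >= 500 {
			errors++
		}
	}
	return float64(errors) / float64(len(window))
}

func main() {
	window := []Request{
		{200, "/home", "alice"},
		{500, "/export", "bob"},
		{200, "/home", "carol"},
		{503, "/export", "bob"},
	}
	fmt.Printf("error rate this interval: %.2f\n", errorRate(window))
}
```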

Read More...

The Very Long And Exhaustive Guide To Getting Events Into Honeycomb No Matter How Big Or Small, In Any Language Or From Any Log File

How do you get events into Honeycomb? This gets confusing for lots of people, especially when you look at all the gobs of documentation and don’t know where to start. But all you need are these three easy steps:

1. Form JSON blob. Go nuts! Smush as many keys and values as you want into one fat request. This is your “event”.
2. Send blob to Honeycomb API. curl will do fine, or use one of our handy SDKs, or honeytail your logfiles in.
3. Profit. Remember, if you don’t charge for your business you aren’t a sustainable enterprise.

Okay, that’s just…
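
If you’d rather see the same three steps as code instead of curl, here’s a rough Go sketch over plain HTTP. The dataset name, write key, and exact endpoint shown here are placeholders and assumptions for illustration — check the API docs before relying on them:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// Step 1: form a JSON blob — as many keys and values as you like.
	event := map[string]interface{}{
		"method":      "GET",
		"endpoint":    "/export",
		"status":      200,
		"duration_ms": 153.2,
	}
	body, _ := json.Marshal(event)

	// Step 2: send the blob to the Honeycomb events API. The URL and
	// header below are assumptions for illustration — substitute your
	// own dataset name and write key.
	req, _ := http.NewRequest("POST",
		"https://api.honeycomb.io/1/events/my-dataset", bytes.NewReader(body))
	req.Header.Set("X-Honeycomb-Team", "YOUR_WRITE_KEY")
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status) // Step 3: profit.
}
```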

Read More...

The Problem with Pre-aggregated Metrics: Part 2, the “aggregated”

This is the second of three posts focusing on the limitations of pre-aggregated metrics. The first one explained how, by pre-aggregating, your flexibility is tightly constrained when trying to explore data or debug problems. The third can be found here. The nature of pre-aggregated time series is such that they all ultimately rely on the same general steps for storage: a multidimensional set of keys and values comes in the door, that individual logical “event” (say, an API request or a database operation) gets broken down into its constituent parts, and attributes are carefully sliced in order to increment discrete…
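
A tiny Go sketch of that general shape, with made-up attribute names, shows what happens when an event is sliced into discrete counters:

```go
package main

import "fmt"

// A hypothetical incoming event: one API request with several attributes.
var event = map[string]interface{}{
	"endpoint": "/export",
	"status":   503,
	"region":   "us-east-1",
}

// Pre-aggregated storage doesn't keep the event. Each attribute (or
// chosen combination of attributes) becomes its own time series key,
// and the event's only lasting effect is to increment those counters.
var counters = map[string]int{}

func ingest(e map[string]interface{}) {
	for k, v := range e {
		key := fmt.Sprintf("requests.%s.%v", k, v) // e.g. "requests.status.503"
		counters[key]++
	}
	// Note: the connection between attributes is gone — you can no
	// longer ask "which endpoint produced the 503s in us-east-1?"
}

func main() {
	ingest(event)
	fmt.Println(counters)
}
```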

Read More...

The Problem with Pre-aggregated Metrics: Part 1, the “Pre”

This is the first of three posts focusing on the limitations of pre-aggregated metrics, each corresponding to one of the “pre”, “aggregated”, and “metrics” parts of the phrase. The second can be found here. Pre-aggregated, or write-time, metrics are efficient to store, fast to query, simple to understand… and almost always fall short of being able to answer new questions about your system. This is fine when you know the warning signs in your system, can predict those failure modes, and can track those canary-in-a-coal-mine metrics. But as we level up as engineers, and our systems become more complicated, this…

Read More...