Honeycomb Blog

Observability: A Manifesto

Everybody and their freaking grandpa is now claiming to do observability, not stodgy old monitoring. Fine, great. Nice to be trendy, I guess. But are they? What’s the difference? You all know how I define observability: the power to ask new questions of your system without having to ship new code or gather new data in order to ask them. Monitoring is about known-unknowns and actionable alerts; observability is about unknown-unknowns and empowering you to ask arbitrary new questions and explore where the cookie crumbs take you. Observability means you can understand how your systems are working on…

Read More...

Diving into Kubernetes clusters with Honeycomb

At Honeycomb, we’re excited about Kubernetes. In fact, we’re in the early stages of moving some of our services to k8s. Tools like kops have made getting started with k8s easier than ever. But building clusters is only the beginning – before long you might find yourself with a large number of deployments, pods, and services, and new things coming online every week. Observability is critical to cluster operations. Fortunately, Honeycomb provides multiple Kubernetes integrations to help you get started exploring your cluster’s events and metrics. What can we do with Kubernetes data in Honeycomb? Let’s look at a…

Read More...

The New Best Engineer

If you make a habit of reading Twitter or the writings of various thought lords and ladies of the internet, you’ve probably heard a lot of advice on what software engineers should do, like:

Software engineers should deploy their own code!
Software engineers must know how to operate their own services!
EVERYBODY is in ops now! #devops
Engineers need to care about business objectives!
Even operations engineers need to care about design and utility! #designops

At some point even the most game among us starts to feel overwhelmed. Can’t we just go heads down and care about writing code sometimes?…

Read More...

The New Stack Writes up Charity’s Monkigras Talk on the Sustainability of On-Call Culture

If you’ve been keeping up with our blog, you’ll remember that we published a related post about the sustainability of on-call culture a couple of weeks ago. This writeup from The New Stack of the talk Charity gave at Monkigras in London this year is worth a read as well: giving a talk and writing a blog post are similar, but you get a different angle on the topic from each. Read the article here and let us know what you think: is being on call a privilege and an honor at your enterprise?

Read More...

Sam Stokes talks about data infrastructure on the Data Engineering Podcast

This past week, Honeycomb engineering manager Sam Stokes was interviewed on the Data Engineering Podcast. Besides hearing him talk a little about himself (which, as far as I can tell, he almost never does), I thought you might want to hear all about Honeycomb’s data infrastructure in Sam’s voice (which is extremely soothing): Listen to or download the podcast here. In addition to talking about the characteristics of our event data, Sam describes how we leverage our own use of Honeycomb to support and analyze our customer usage rapidly and at scale, by slicing and…

Read More...

Oncall and Sustainable Software Development

Yes, being on call typically and anecdotally sucks. I understand! If you’ve heard me speak, you’ve heard me point out that I’ve been oncall since I was 17 years old, so I know how terrible it can be. But I believe strongly that it doesn’t have to be. Oncall can and should be different. It can be like being a superhero: if you’re on call and an issue comes up, you should get to feel like you’re saving the world (or at least your users), in a good way. It shouldn’t eat your life, or have a serious negative impact on your day-to-day…

Read More...

Development at Honeycomb: Crossing the Observability Bridge to Production

For years, the “DevOps” community has felt focused on one main idea: what if we pushed our ops folks to do more development? To automate their work and write more code? That’s cool—and it’s clearly been working—but it’s time for the second wave of that movement: for developers (that’s us!) to own our code in production, and to be on point for operating / exploring our apps in the wild. Observability is that bridge: the bridge from developers understanding code running on local machines to understanding how it behaves in the wild. Observability is all about answering questions about your…

Read More...

Use Derived Columns To Prioritize Development Work

We recently released a new feature for Honeycomb, derived columns, and we promised at the time that we’d show you some more examples of how it can make your life easier. Here are a couple that are about helping you figure out wtf you should do next: What SDK(s) should we work on the most/next? We’ve got so much to do, but we can’t do it all at once. You know how it is. Sometimes, you’ve got to prioritize things. Using a derived column, we can look at the contents of user_agent from data our customers send us and use…

Read More...

Event Foo: Building Better Events

This post from new Honeycomber Rachel Perkins is the seventh in our series on the how, why, and what of events. An event is a record of something that your system did. A line in a log file is typically thought of as an event, but events in Honeycomb can be a lot more than that: they can include data from different sources, fields calculated from values within or external to the event itself, and more. An event represents a unit of work. It can tell a story about a complete thing that happened: for example, how long a given request…
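To make the idea of "one event per unit of work" a bit more concrete, here is a minimal Python sketch. It deliberately does not use any Honeycomb SDK; the function names (`handle_request`, `lookup_plan`) and every field name are hypothetical, chosen only for illustration. It shows a single record that combines data from the request, data from the environment, and a calculated duration field.

```python
import json
import socket
import time


def handle_request(user_id, handler):
    """Run handler(user_id) and return one event describing that unit of work.

    Hypothetical helper for illustration; not part of any Honeycomb API.
    """
    event = {
        "user_id": user_id,                # data from the request itself
        "hostname": socket.gethostname(),  # data from a different source
    }
    start = time.monotonic()
    try:
        event["result"] = handler(user_id)
        event["status"] = "ok"
    except Exception as exc:
        # Record the failure on the same event instead of a separate log line.
        event["status"] = "error"
        event["error"] = type(exc).__name__
    # Calculated field: how long this request took, in milliseconds.
    event["duration_ms"] = round((time.monotonic() - start) * 1000, 3)
    return event


def lookup_plan(user_id):
    """Hypothetical request handler: fails for one particular user."""
    if user_id == 43:
        raise ValueError("sprocket needs adjusting")
    return "pro"


for uid in (42, 43):
    # One JSON object per request, ready to send to whatever collects events.
    print(json.dumps(handle_request(uid, lookup_plan)))
```

Because the outcome, the error, and the timing all live on the same record as the user ID, there is no need to correlate separate lines after the fact to work out which request failed.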

Read More...

Build Observable Systems

What should you log? When your systems break, it’s great to be able to look at what they were doing just before they broke. A log is a common solution. But hands up if you’ve come across a log that looks like this:

    11:32:33 Processing request for user 42
    11:32:33 Request processed successfully
    11:32:33 Processing request for user 43
    11:32:33 WARNING: user 43 sprocket needs adjusting!
    11:32:33 Processing request for user 44
    11:32:33 Request processed successfully
    11:32:34 NullPointerException in adjustSprocket()
    11:32:34 Processing request for user 43
    11:32:34 Request processed successfully
    11:32:34 Processing request for user 44

What caused the exception?…

Read More...