BLOG

Reflections on Monitorama 2019

This year was my third in a row attending (and now speaking at!) Monitorama. Because the organizers do a great job of turning introverts into extroverts for three days straight, it’s always a fun…

Investigating Timeouts with Tracing

Tracing is one of the key tools that Honeycomb offers to make sense of data. Over the last few weeks, we’ve made a number of improvements to our tracing interface — and, put together,…

Toward a Maturity Model for Observability

Access to observability is becoming critical to organizations shipping software, running modern infrastructures in production, and to understanding how users are experiencing their service. To achieve success in delivering a complex service, it’s no…

Welcome (to) Home

Our latest product update features an intuitive home (landing) page that orients users with a quick, real-time view into what’s happening right now in your production systems. Home displays commonly used queries and breakdowns,…

Dynamic Sampling by Example

Last week, Rachel published a guide describing the advantages of dynamic sampling. In it, we discussed varying sample rates to achieve a target collection rate overall, and having different sample rates for distinct kinds…

The New Rules of Sampling

One of the most common questions we get at Honeycomb is about how to control costs while still achieving the level of observability needed to debug, troubleshoot, and understand what is happening in production….

Anatomy of a Cascading Failure

In Caches Are Good, Except When They Are Bad, we identified four separate problems that combined together to cause a cascading failure in our API servers. This followup post goes over them in detail,…