Heatmaps Make Ops Better

In this blog miniseries, I’d like to talk about how to think about doing data analysis “the Honeycomb way.” Welcome to part 1, where I cover what a heatmap is, and how using them can…

Postmortem: RDS Clogs & Cache-Refresh Crash Loops

On Thursday, October 4, we experienced a partial API outage from 21:02-21:56 UTC (14:02-14:56 PDT). Despite some remediation work, we saw a similar (though less serious) incident on Thursday, October 11, from 15:00-16:02 UTC (8:00-9:02 PDT). To implement a more permanent fix, we scheduled an emergency maintenance window, which completely interrupted service on Friday, October 12, for approximately two minutes, from 4:38-4:40 UTC (Thursday, October 11, 21:38-21:40 PDT).

There And Back Again: A Honeycomb Tracing Story

In our previous post about Honeycomb Tracing, we used tracing to better understand Honeycomb’s own query path. When doing this kind of investigation, you typically have to go back and forth, zooming out and back…