
Intro to o11y Topic 8: Using Heatmaps of Duration in Honeycomb


In this video, Developer Advocate Jessica Kerr explains how to use the heatmap of duration, or latency graph, in Honeycomb. What is a heatmap? How is it constructed? How do we read it? 

Note: This video assumes you have already connected your app to Honeycomb. Jessica Kerr is using a sample app called Sequence of Numbers. If you would like to download the app and follow along, you can do so using the process from Intro to o11y Topic 3. If you need help connecting your app to Honeycomb, see Intro to o11y Topic 4.



Jessica Kerr [Developer Advocate|Honeycomb]:

When I want to know what is slow, then I want to look at a graph of latency. Latency basically means slowness. There’s a handy one on the homepage. So I click the Honeycomb logo to start. Then I click on the latency graph to zoom in and mess with it. For now, I’m going to click these buttons to shrink the sidebars and focus on the graph.

Then I’m going to zoom in on the part of the timeline that I care about. I go to the beginning of the part I care about. I click and drag across the part of the timeline that I care about. I release. And then I push this little plus button in a magnifying glass to zoom in.

In the upper right-hand corner, the time range has changed to include only the time that I selected. Now, what do we have here? The query specification at the top tells us exactly what we’re looking at. It is a visualization of a heatmap of duration milliseconds. We’ve seen that field before. It comes in the telemetry data sent to Honeycomb in each event.

Remember that Honeycomb receives an event per span, and the duration in milliseconds tells how long the span took. Honeycomb uses that data to create this heatmap. To make a heatmap, Honeycomb puts each event in a bucket. First, going across on the x-axis, find the timestamp representing when the span started.

Each box is 250 milliseconds wide. That’s the granularity on the timestamp axis. Then going up on the y-axis, find the event’s duration in milliseconds. This one took like 10 milliseconds. So it goes in this box, right here. Now Honeycomb does that with every qualifying event and some boxes get lots of events in them.

And those get a darker color. Like the scale says, the darkest boxes have three events and the lightest ones have one. So the heatmap shows a count of events by timestamp and duration. Why is this called a heatmap? I don’t know. It’s kind of like a hot or not contest for each space on the graph. Which spot is popular? The dark ones!
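Editor's note: the bucketing described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Honeycomb's actual implementation. The field names `start_ms` and `duration_ms`, the sample events, and the 10 millisecond box height are assumptions made for the example; only the 250 millisecond box width comes from the video.

```python
from collections import Counter

TIME_BUCKET_MS = 250     # box width on the x-axis, as stated in the video
DURATION_BUCKET_MS = 10  # box height on the y-axis (assumed for illustration)


def heatmap_counts(events, time_bucket_ms=TIME_BUCKET_MS,
                   duration_bucket_ms=DURATION_BUCKET_MS):
    """Count events per (time column, duration row) box."""
    counts = Counter()
    for event in events:
        # Which column: how many whole time buckets fit before the span started.
        col = int(event["start_ms"] // time_bucket_ms)
        # Which row: how many whole duration buckets fit into the span's duration.
        row = int(event["duration_ms"] // duration_bucket_ms)
        counts[(col, row)] += 1
    return counts


# Hypothetical events: three fast spans close together, one slightly slower, one outlier.
events = [
    {"start_ms": 0, "duration_ms": 4},
    {"start_ms": 100, "duration_ms": 8},
    {"start_ms": 240, "duration_ms": 9.5},
    {"start_ms": 300, "duration_ms": 10},
    {"start_ms": 600, "duration_ms": 160},
]

counts = heatmap_counts(events)
for (col, row), n in sorted(counts.items()):
    print(f"column {col}, row {row}: {n} event(s)")
```

The box with the highest count would be drawn darkest. Here the three fast spans land in the same box, while the 160 millisecond span sits alone, far up the duration axis.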

When I read this, I can see that there are lots of squares and especially dark ones below like 60 milliseconds. And there are a few above that. And then, like, this one way-high one at 160 milliseconds. Is that an outlier because it’s random, or for a reason? That is a question for later. By the way, in a production system, the heatmap of duration will look more like this.

It’s much denser. Most of the boxes have a color. You can see where the events cluster, whether they’re getting slower or faster over time. Times of the day when requests are less frequent. There are millions of requests represented in this graph. In contrast, the data for Sequence of Numbers is sparse, because it’s just me pushing go and stop. Dozens of requests instead of millions. Still, you can see a spread of fast and slow requests.

And you can dig into the fast or slow requests. Check this out. If I want to see a fast request, I click one of these dark boxes down here with low duration. Here’s the request to GET /fib that took four milliseconds. Yup. That’s a fast one. Click this back arrow to return to the heatmap. When I want to see a slow request, I click on one of the boxes near the top of the latency heatmap. How about this? This trace took 138 milliseconds.