
Intro to o11y Topic 11: Using “Group By” with a Heatmap


In this video, Developer Advocate Jessica Kerr explains how to use the “group by” field in a heatmap to further refine your data.

Note: This video assumes you have already connected your app to Honeycomb. Jessica Kerr is using a sample app called Sequence of Numbers. If you would like to download the app and follow along, you can do so using the process from Intro to o11y Topic 3. If you need help connecting your app to Honeycomb, see Intro to o11y Topic 4.



Jessica Kerr [Developer Advocate|Honeycomb]: 

Here we have a heatmap of the duration of all incoming requests (remember that “trace.parent_id does not exist” translates to “only the root span of each trace”). And now we’re grouping by http.url. The heatmap looks the same, but below it there’s this whole table. Each URL has its own row. When I hover over the different rows in the table, the heatmap changes to show just the events with that value of http.url.

So really, Honeycomb has returned a dozen different heatmaps, and they’re overlaid on the same graph. You can see the shape of each individual heatmap on each row in the table. This little graph is sideways compared to the heatmap; it’s a count of events by duration.
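To make the "count of events by duration" idea concrete, here is a minimal sketch in Python of grouping events by one field and counting them per duration bucket. It is not how Honeycomb computes its graphs. The sample events, the "/fib?index=N" URL format, the 10 ms bucket width, and the function name histograms_by_group are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical sample events. Field names follow the transcript;
# the URL format and the duration values are invented.
events = [
    {"http.url": "/fib?index=0", "duration_ms": 3.0},
    {"http.url": "/fib?index=0", "duration_ms": 4.5},
    {"http.url": "/fib?index=0", "duration_ms": 12.0},
    {"http.url": "/fib?index=12", "duration_ms": 95.0},
    {"http.url": "/fib?index=12", "duration_ms": 140.0},
]


def histograms_by_group(events, group_field, value_field, bucket_width):
    """Return {group value: {bucket start: event count}}."""
    result = defaultdict(Counter)
    for event in events:
        # Round the value down to the start of its bucket.
        bucket_start = int(event[value_field] // bucket_width) * bucket_width
        result[event[group_field]][bucket_start] += 1
    return {group: dict(counts) for group, counts in result.items()}


histograms = histograms_by_group(events, "http.url", "duration_ms", 10)
for url, counts in sorted(histograms.items()):
    print(url, counts)
```

Each entry in the result plays the role of one row's small graph: a separate distribution of durations for one value of http.url.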

This one says that most of the requests with a URL of fib index=0 had a low duration. On this one, it’s getting a little more spread out. And down here by index=12, that’s a high duration. Let’s sort by the URL: click on this arrow next to the field name.

Fib with index of 0 and 1 and 2 are fast. These are in alphabetical order, not numeric, sadly. As index increases, it looks like it’s getting more spread out and slower, and then 10 and 12 here are the slowest on our list. That seems to make a difference. Grouping by http.url has given us data broken down by each value in that field. We can see this more clearly if we add another visualization.
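The alphabetical-versus-numeric ordering mentioned above is easy to reproduce outside the UI. The following Python sketch assumes the URLs look like "/fib?index=N", which is a guess at the format and not confirmed by the video; index_of is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical URL values; the exact format in the sample app is assumed.
urls = [
    "/fib?index=10",
    "/fib?index=2",
    "/fib?index=0",
    "/fib?index=12",
    "/fib?index=1",
]

# Plain string sort: "10" and "12" land before "2".
alphabetical = sorted(urls)


def index_of(url):
    """Extract the index query parameter as an integer."""
    return int(parse_qs(urlparse(url).query)["index"][0])


# Sort by the parsed number instead of the raw string.
numeric = sorted(urls, key=index_of)

print(alphabetical)
print(numeric)
```

In the string sort, index=10 and index=12 come before index=2 because strings are compared character by character.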

I click on the visualize section and add another one. Let’s see the max of duration_ms: for each time increment (here, each 500 ms), what is the maximum duration of a request for each URL? If this were a real production app, I’d have way more data points, so I’d ask for the 90th or 99th percentile of duration instead of max. Hmm, it looks like nothing happened. Oh right, I need to hit Run Query.
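As a rough sketch of what "max of duration_ms for each time increment, for each URL" means, the Python below buckets hypothetical events into 500 ms windows per URL and keeps the largest duration in each. It also includes a simple nearest-rank percentile as one possible stand-in for a 90th or 99th percentile; Honeycomb's own percentile calculation may differ. The events, the timestamp_ms field, and both function names are assumptions.

```python
import math
from collections import defaultdict

# Hypothetical events; timestamp_ms is milliseconds since the start of the query.
events = [
    {"timestamp_ms": 100, "http.url": "/fib?index=1", "duration_ms": 4.0},
    {"timestamp_ms": 450, "http.url": "/fib?index=1", "duration_ms": 9.0},
    {"timestamp_ms": 700, "http.url": "/fib?index=1", "duration_ms": 6.0},
    {"timestamp_ms": 200, "http.url": "/fib?index=12", "duration_ms": 120.0},
    {"timestamp_ms": 900, "http.url": "/fib?index=12", "duration_ms": 310.0},
]


def max_per_bucket(events, bucket_ms=500):
    """Return {url: {bucket start in ms: max duration_ms in that bucket}}."""
    series = defaultdict(dict)
    for event in events:
        bucket = (event["timestamp_ms"] // bucket_ms) * bucket_ms
        url = event["http.url"]
        previous = series[url].get(bucket)
        duration = event["duration_ms"]
        series[url][bucket] = duration if previous is None else max(previous, duration)
    return dict(series)


def percentile(values, p):
    """Nearest-rank percentile of a non-empty list (illustrative only)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


lines = max_per_bucket(events)
for url, points in sorted(lines.items()):
    print(url, points)
```

Each inner dictionary corresponds to one line on the graph. With only a few events per bucket, the line is really a handful of points.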

Now a second graph appears. This one is a line graph, with a line for each group. The sparseness of our data makes this a bunch of points instead of smooth lines, but hover over each row in the table and that row’s line stands out.

It looks like, as the index to /fib gets higher, the max duration gets higher. Requests get slower. And I can say how slow! This table gives the overall max duration for requests with each URL over the whole time range of my query. That time range, by the way, is visible up here at the top of the page. This is data I can use. There’s definitely something here; the requests get slower as the index to /fib gets higher.
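The table's "overall max per URL over the whole time range" can be sketched the same way. This Python example uses invented events and a hypothetical max_by_group function; the half-open time range [start_ms, end_ms) is an assumption about how the boundaries are treated.

```python
# Hypothetical events; timestamp_ms is milliseconds since some reference point.
events = [
    {"timestamp_ms": 100, "http.url": "/fib?index=1", "duration_ms": 4.0},
    {"timestamp_ms": 700, "http.url": "/fib?index=1", "duration_ms": 9.0},
    {"timestamp_ms": 200, "http.url": "/fib?index=12", "duration_ms": 120.0},
    {"timestamp_ms": 900, "http.url": "/fib?index=12", "duration_ms": 310.0},
    # Outside the queried time range below, so it is ignored.
    {"timestamp_ms": 5000, "http.url": "/fib?index=12", "duration_ms": 999.0},
]


def max_by_group(events, start_ms, end_ms):
    """Return {url: max duration_ms} for events with start_ms <= timestamp < end_ms."""
    result = {}
    for event in events:
        if start_ms <= event["timestamp_ms"] < end_ms:
            url = event["http.url"]
            result[url] = max(result.get(url, float("-inf")), event["duration_ms"])
    return result


table = max_by_group(events, start_ms=0, end_ms=1000)
for url, max_duration in sorted(table.items()):
    print(url, max_duration)
```

Changing the time range changes which events count, which is why the range shown at the top of the page matters when reading the table.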