Level Up with Derived Columns: Taking Things Apart and Putting Things Together

When we released derived columns last year, we already knew they were a powerful way to manipulate and explore data in Honeycomb, but we didn’t realize just how many different ways folks could use them. We use them all the time to improve our perspective when looking at data as we use Honeycomb internally, so we decided to share. So, in this series, Honeycombers share their favorite derived column use cases and explain how to achieve them.

This installment follows the previous post, “Two Neat Tricks That Will Improve Your Observability.”

Have a favorite derived column use case of your own? Send us a screenshot and description and we’ll send you something in the mail 🙂

When many things are sort of the same

When writing some code that can have multiple different types of transient and recoverable errors, I tend to put them in different fields so that they don’t clobber each other. You can picture json_decoding_error, write_error, read_error, connect_error, etc.

After adding a few breakdowns, I’ll wind up with a table like this:

screenshot of results table of transient errors

Many times these differences in errors are super useful! For example, I can focus on one type by asserting read_error exists! But when I want to ask “Please, just give me all the errors. I don’t care where they are! ERROR ME111!!!” I struggle.

Derived columns let me live in peace and concentrate on errors! Enter the COALESCE function. Coalesce takes the first non-empty value it finds and uses that in the resulting column.

screenshot of creating a derived column to combine errors into one

Take anything in any one of those three columns and SMUSH them together!

screenshot of combined error column
Footnote – there are two functions that SMUSH (technical term, of course): COALESCE and CONCAT. I like COALESCE because it gives me only the first value: if one field has a generic error (eg failure in GET) and a different has a specific error (eg connection to 10.0.0.2:45 failed) then by ordering my fields right and choosing the first one I get better grouping. But there are certainly times when you’d want all the errors instead of whichever came first – in that case use CONCAT instead.

When one thing is sort of different

Tracing is amazing! So much win when every section of your code has a timer and you can see execution flow wandering with timings through a waterfall diagram. But how do you effectively look at these things in aggregate? Many Honeycomb queries are straightforward, but I recently found myself wanting to look at the amount of time spent in two different parts of my code simultaneously.

Easy, add a breakdown, right?

animation of adding a breakdown

Ok, I also wanted it on a Board. In the same graph. I know, I know, so many demands.

How can I get both of these graphs in one query? They’re both graphing duration_ms but with different filters. This isn’t something I can express using a query, is it?

Derived Columns to Save the Day!

Let’s make two new columns: write_duration_ms that matches the first span and read_duration_ms that matches the second.

  • write_duration_ms will match the first span. IF(EQUALS($name, "submitProbe"), $duration_ms, null)
  • read_duration_ms will match the first span. IF(EQUALS($name, "checkProbe"), $duration_ms, null)

We now have two fields! We can add two heatmaps! Hooray!!!

screenshot with two heatmaps

So pretty, so nice to see both in one place.

Just as a derived column with COALESCE can take many columns and combine them into one, a derived column can separate one column into many to make it easier to view the data you need side by side.

Data++

Thanks for coming on this short tour of how I use derived columns. Please write in and tell us about your most awesome (and terrible) derived column! And if you haven’t made any yet…

screenshot showing what to click to make a derived column

As always, if you want to try out Honeycomb, go on and sign up! Have fun!