When we released derived columns last year, we already knew they were a powerful way to manipulate and explore data in Honeycomb, but we didn’t realize just how many different ways folks could use them. We use them all the time internally to get a better perspective on our own data, so we decided to share. In this series, Honeycombers share their favorite derived column use cases and explain how to achieve them.
This installment follows the previous post, “Two Neat Tricks That Will Improve Your Observability.”
Have a favorite derived column use case of your own? Send us a screenshot and description and we’ll send you something in the mail.
When many things are sort of the same
When writing some code that can have multiple different types of transient and recoverable errors, I tend to put them in different fields so that they don’t clobber each other. You can picture
After adding a few breakdowns, I’ll wind up with a table like this:
Many times these differences in errors are super useful! For example, I can focus on one type by asserting read_error exists! But when I want to ask “Please, just give me all the errors. I don’t care where they are! ERROR ME111!!!” I struggle.
Derived columns let me live in peace and concentrate on errors! Enter the COALESCE function. COALESCE takes the first non-empty value it finds and uses that in the resulting column.
Take anything in any one of those three columns and SMUSH them together!
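To make "first non-empty value" concrete, here is a minimal Python sketch of what a coalescing column does to each event. It only emulates the behavior described above and is not Honeycomb's implementation. The field names write_error and connect_error and the sample events are hypothetical (only read_error appears in this post), and treating a missing or None field as "empty" is an assumption.

```python
from typing import Any, Optional

# Hypothetical error fields; only read_error is named in this post.
ERROR_FIELDS = ["read_error", "write_error", "connect_error"]


def coalesce(event: dict[str, Any], fields: list[str]) -> Optional[Any]:
    """Return the first value among `fields` that is present and not None."""
    for field in fields:
        value = event.get(field)
        if value is not None:
            return value
    return None


# Made-up events: each one carries at most one kind of error.
events = [
    {"name": "a", "read_error": "unexpected EOF"},
    {"name": "b", "write_error": "disk full"},
    {"name": "c"},  # no error at all
]

# One combined "any error" value per event.
any_error = [coalesce(e, ERROR_FIELDS) for e in events]
print(any_error)  # ['unexpected EOF', 'disk full', None]
```

Events with no error in any field come out empty in the combined column too, so a filter along the lines of "combined column exists" would pick out every event with an error.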
Footnote: there are two functions that SMUSH (technical term, of course): COALESCE and CONCAT. I like COALESCE because it gives me only the first value: if one field has a generic error (e.g. failure in GET) and a different one has a specific error (e.g. connection to 10.0.0.2:45 failed), then by ordering my fields right and choosing the first one I get better grouping. But there are certainly times when you’d want all the errors instead of whichever came first; in that case use CONCAT.
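The difference between picking the first value and keeping everything can be sketched in a few lines of Python. This is an illustration under assumptions rather than the real functions: first_non_empty and join_all are hypothetical stand-ins for COALESCE and CONCAT, and the "; " separator is passed in by hand purely for readability.

```python
from typing import Optional


def first_non_empty(*values: Optional[str]) -> Optional[str]:
    """Stand-in for COALESCE: the first argument that is not None."""
    return next((v for v in values if v is not None), None)


def join_all(*values: Optional[str]) -> str:
    """Stand-in for CONCAT: every non-None argument, joined in order."""
    return "".join(v for v in values if v is not None)


generic = "failure in GET"
specific = "connection to 10.0.0.2:45 failed"

# Listing the specific field first makes it win when both are set.
print(first_non_empty(specific, generic))  # connection to 10.0.0.2:45 failed

# When only the generic error is set, the result falls back to it.
print(first_non_empty(None, generic))  # failure in GET

# Joining keeps both messages instead of choosing one.
print(join_all(specific, "; ", generic))
```

Argument order is the whole trick with the first approach: put the most specific field first and the generic one last.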
When one thing is sort of different
Tracing is amazing! So much win when every section of your code has a timer and you can see execution flow wandering with timings through a waterfall diagram. But how do you effectively look at these things in aggregate? Many Honeycomb queries are straightforward, but I recently found myself wanting to look at the amount of time spent in two different parts of my code simultaneously.
Easy, add a breakdown, right?
Ok, I also wanted it on a Board. In the same graph. I know, I know, so many demands.
How can I get both of these graphs in one query? They’re both graphing duration_ms but with different filters. This isn’t something I can express using a query, is it?
Derived Columns to Save the Day!
Let’s make two new columns: write_duration_ms that matches the first span and read_duration_ms that matches the second.
write_duration_ms will match the first span.
IF(EQUALS($name, "submitProbe"), $duration_ms, null)
read_duration_ms will match the second span.
IF(EQUALS($name, "checkProbe"), $duration_ms, null)
We now have two fields! We can add two heatmaps! Hooray!!!
So pretty, so nice to see both in one place.
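For anyone who wants to see what those two expressions do to individual spans, here is a rough Python emulation. It is a sketch under assumptions, not how Honeycomb evaluates derived columns: duration_if_named is a hypothetical helper, and the sample spans (including otherSpan) and their durations are made up.

```python
from typing import Any, Optional


def duration_if_named(event: dict[str, Any], span_name: str) -> Optional[float]:
    """Emulate IF(EQUALS($name, span_name), $duration_ms, null)."""
    if event.get("name") == span_name:
        return event.get("duration_ms")
    return None


# Made-up spans; submitProbe and checkProbe come from the expressions above.
spans = [
    {"name": "submitProbe", "duration_ms": 12.5},
    {"name": "checkProbe", "duration_ms": 3.0},
    {"name": "otherSpan", "duration_ms": 40.0},
]

# Attach both derived values to every span.
for span in spans:
    span["write_duration_ms"] = duration_if_named(span, "submitProbe")
    span["read_duration_ms"] = duration_if_named(span, "checkProbe")

# Each derived column only has a value on the spans it matches.
write_values = [s["write_duration_ms"] for s in spans if s["write_duration_ms"] is not None]
read_values = [s["read_duration_ms"] for s in spans if s["read_duration_ms"] is not None]
print(write_values)  # [12.5]
print(read_values)   # [3.0]
```

Because each new column is empty everywhere except on its own span, graphing it needs no extra filter, which is what lets both heatmaps sit in a single query.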
Just as a derived column with COALESCE can take many columns and combine them into one, a derived column can separate one column into many to make it easier to view the data you need side by side.
Thanks for coming on this short tour of how I use derived columns. Please write in and tell us about your most awesome (and terrible) derived column! And if you haven’t made any yet…
As always, if you want to try out Honeycomb, go on and sign up! Have fun!