Level Up with Derived Columns: Better Math(s)
When we released derived columns last year, we already knew they were a powerful way to manipulate and explore data in Honeycomb, but we didn’t realize just how many different ways folks could use them. We use them all the time to improve our perspective when looking at data as we use Honeycomb internally, so we decided to share. So, in this series, Honeycombers share their favorite derived column use cases and explain how to achieve them.
This installment follows the previous post, “Taking Things Apart and Putting Things Together.”
Have a favorite derived column use case of your own? Send us a screenshot and description and we’ll send you something in the mail 🙂
Only math the things that matter
When making a derived column that does math based on other columns, it’s tempting to just include the math parts. Sometimes this is fine. Other times it lays a trap; if one of the columns used in the mathematical expression doesn’t exist, it will be filled with a zero in order to fulfill the calculation.
But fear not, you can use the EXISTS test and the null value to protect you from this horrible fate!
Let’s see how this works. First, we’ll generate some test data to play with. Half of these events have only the id field and the other half have both an id field and a time_millis field. Our derived column wants to transform time_millis into seconds to make it easier to read. (The translation is simple: divide time_millis by 1,000!)
Generate content (using bash)! In this example, my writekey is stored in a variable called $wk. honeyvent is a small convenience to send individual events to Honeycomb (easier than curl).
for i in {10000..11000}
do
  honeyvent -k $wk -d dctest -n id -v $RANDOM -n time_millis -v $i
  honeyvent -k $wk -d dctest -n id -v $RANDOM
done
Ok, we’ve got some data to play with. Let’s make two derived columns:
normal: the division just on its own. No fancy tricks.
with_null: a fancy derived column using EXISTS and null.
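The column definitions themselves only appear in screenshots, so here is a rough sketch of what the two columns compute. The expressions in the comments are an assumption based on Honeycomb's DIV, IF, and EXISTS derived column functions, not a copy of the originals. The derive function is a hypothetical local stand-in written with awk. It does not talk to Honeycomb and only mimics the zero-fill behavior described above.

```shell
# Assumed derived column expressions (an assumption, not copied from the post):
#   normal:    DIV($time_millis, 1000)
#   with_null: IF(EXISTS($time_millis), DIV($time_millis, 1000), null)

# derive: hypothetical local emulation of both columns.
# Each input line is either "id" or "id time_millis".
# Prints "id,normal,with_null"; an empty with_null field means undefined.
derive() {
  awk '{
    normal    = (NF >= 2) ? $2 / 1000 : 0     # missing field is treated as zero
    with_null = (NF >= 2) ? $2 / 1000 : ""    # missing field stays undefined
    print $1 "," normal "," with_null
  }'
}

# One event with time_millis, one without.
printf '%s\n' "101 10500" "102" | derive
```

The second row is the interesting one: normal reports a time of 0 seconds for an event that never had a time at all, while with_null leaves the cell empty.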
Already in the preview we can see the effects of our fancy version, and predict how this will change our calculations. In the preview data, the undefined rows turn into zeroes in the normal column, while in the with_null column they remain undefined.
Ok, let’s look at how this lands!
So pretty!
Pretty’s not really good enough here though – the real benefit comes when we do other calculations using this derived column. Let’s take the average… I think you can all see what’s going to happen here…
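To put numbers on it, here is a small local sketch of the arithmetic using the same shape as the test data: 1001 events with time_millis from 10000 to 11000, and 1001 events without it. The avg_seconds function and its zero and null modes are hypothetical names for this illustration. Nothing here queries Honeycomb.

```shell
# avg_seconds MODE
#   zero : events missing time_millis count as 0 (like the normal column)
#   null : events missing time_millis are skipped (like the with_null column)
avg_seconds() {
  seq 10000 11000 | awk -v mode="$1" '
    { sum_ms += $1; n++ }          # the event that has time_millis
    { if (mode == "zero") n++ }    # its partner without time_millis adds 0 to the sum
    END { printf "%.2f\n", sum_ms / 1000 / n }
  '
}

avg_seconds zero   # prints 5.25
avg_seconds null   # prints 10.50
```

The zero-filled average is exactly half the real one, because half of the events contribute a 0 that was never actually measured.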
Data++
Thanks for exploring derived columns with me. Please write in and tell us about your most awesome (and terrible) derived column! And if you haven’t made any yet…
As always, if you want to try out Honeycomb, go on and sign up! Have fun!