Imagine you’re running a production service and you’ve been notified that something is wrong, but it’s not obvious what. Let’s take a look at how we can use the Honeycomb Query Interface to drill down and find the source of an issue.
In our case, we have a dataset that contains information about the API service we’re running. Let’s do a quick count to see activity over the last six hours. Honeycomb quickly returns the activity over that time range. Nothing looks particularly wrong.
Let’s use Honeycomb’s breakdown operator and break down by status code. It gives us a picture of what’s happening inside this time range. As we can see, there are some successes and failures. Let’s use the filter capability to narrow the results to just the events where status code equals 500. As you can see, Honeycomb’s query is extremely fast. The count gives us a nice pattern of activity.
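To make the breakdown and filter steps concrete, here is a minimal sketch in plain Python of what those two operations compute. It does not call Honeycomb; the events and the field names (`endpoint`, `status_code`, `user_id`) are made up for illustration.

```python
from collections import Counter

# Hypothetical events; the field names are assumptions, not a real schema.
events = [
    {"endpoint": "/export", "status_code": 500, "user_id": 20109},
    {"endpoint": "/export", "status_code": 200, "user_id": 3},
    {"endpoint": "/login", "status_code": 200, "user_id": 7},
    {"endpoint": "/export", "status_code": 500, "user_id": 20109},
]


def count_by(rows, column):
    """Count rows, broken down by the value of one column."""
    return Counter(row[column] for row in rows)


def where_equals(rows, column, value):
    """Keep only the rows whose column equals the given value."""
    return [row for row in rows if row[column] == value]


# Breakdown: one count per distinct status code.
by_status = count_by(events, "status_code")

# Filter: narrow the data to the failing requests only.
errors = where_equals(events, "status_code", 500)

print(dict(by_status))
print(len(errors))
```

The breakdown groups before counting, while the filter discards everything that does not match, which is why the filtered count shows only the failures.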
Let’s go ahead and look at the P95, or 95th percentile, of latency. As we can see, we have a few spikes, but it’d be nice if we could dig a bit deeper. Let’s use Honeycomb’s breakdown operator and break down by endpoint and user ID to give us a little more visibility into where something might be happening and who it’s affecting.
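As a rough illustration of a P95 broken down by endpoint and user ID, the sketch below groups some made-up events and takes a nearest-rank 95th percentile per group. The field names (`endpoint`, `user_id`, `duration_ms`) are assumptions, and the nearest-rank method is chosen here for simplicity; Honeycomb’s own percentile calculation may differ.

```python
import math
from collections import defaultdict

# Hypothetical events; field names and values are illustrative only.
events = [
    {"endpoint": "/export", "user_id": 20109, "duration_ms": 900},
    {"endpoint": "/export", "user_id": 20109, "duration_ms": 1200},
    {"endpoint": "/export", "user_id": 20109, "duration_ms": 4000},
    {"endpoint": "/login", "user_id": 7, "duration_ms": 40},
    {"endpoint": "/login", "user_id": 7, "duration_ms": 55},
]


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


def p95_by(rows, group_columns, value_column):
    """Group rows by the given columns and compute P95 per group."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[column] for column in group_columns)
        groups[key].append(row[value_column])
    return {key: p95(values) for key, values in groups.items()}


latency = p95_by(events, ["endpoint", "user_id"], "duration_ms")
for key, value in sorted(latency.items()):
    print(key, value)
```

Grouping by both columns at once is what lets a single slow endpoint-and-user combination stand out instead of being averaged into the overall line.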
Very quickly, we see a spike for a particular user. We’re noticing a high number of status code 500 errors on the export endpoint for a specific user, 20109. Honeycomb’s query interface makes it easy for us to drill down through a diverse set of data, and its blazing-fast query engine gives us our results right when we need them. That, combined with other features like tracing, BubbleUp activity, and history, makes it easy for everyone to become the best debugger in production in a short amount of time.
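For reference, the query built up in this walkthrough could be written down roughly as the structure below. This is a sketch only: the key names (`time_range`, `calculations`, `filters`, `breakdowns`) follow our understanding of Honeycomb’s published query specification and should be checked against the current documentation, and the column names (`status_code`, `duration_ms`, `endpoint`, `user_id`) are assumptions about the dataset.

```python
import json

# Sketch of the walkthrough's query as a plain dictionary.
# Key names are our reading of Honeycomb's query specification;
# column names are hypothetical and depend on your dataset.
query = {
    "time_range": 6 * 60 * 60,  # last six hours, in seconds
    "calculations": [
        {"op": "COUNT"},
        {"op": "P95", "column": "duration_ms"},
    ],
    "filters": [
        {"column": "status_code", "op": "=", "value": 500},
    ],
    "breakdowns": ["endpoint", "user_id"],
}

print(json.dumps(query, indent=2))
```

Nothing here is sent anywhere; the block only prints the structure so the pieces of the query (time range, calculations, filter, breakdowns) can be seen side by side.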