Event Foo: What Should I Add to an Event?
By Ben Hartshorne | Last modified on February 5, 2019
When we’re talking with people about how they should start using Honeycomb, many ask for guidance about what should go into an event. Though there are longer posts on this blog about what it means to be an event, this one is a “short” list of things to consider when you’re building events.
What is actually useful is of course dependent on the details of your service, but most services can get something out of these suggestions. As a point of reference, the Honeycomb front end web server generates events with an average of 40 fields (±20), and our API server is closer to 70 (±30). Building and collecting wide events (with many fields) give you the context you’ll need later when trying to understand your production service.
- Add redundant information when there’s an enforced unique identifier and a separate column that is easier for the people reading the graph to understand. For example, at Honeycomb, the Team ID is globally unique, and every Team has a name. We add the ID to get a unique breakdown and add the Name so that it’s easier to recognize (“honey” is easier to remember than “122”).
- Add two fields for errors: the error category and the returned error itself, especially when getting back an error from a dependency. For example, the category might describe what you’re trying to do in your code (error reading file), and the second field holds what you get back from the dependency.
- Opt for wider events (more fields) when you can. It’s easier to add in more context now than it is to discover missing context later.
- Don’t be afraid to add fields that only exist in certain contexts. For example, add user information if there is an authenticated user, don’t if there isn’t. No big deal.
- Think about your field names some but don’t bikeshed (http://bikeshed.com/). Common field name prefixes help when skimming the field list since they’re alphabetized.
- Add units to field names, not values.
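To make a few of these suggestions concrete, here is a minimal sketch in plain Python that builds one event as a dictionary. Everything in it is an assumption made for illustration: the function `read_config_event`, the dotted field names, and the choice of a file read as the dependency. It is not Honeycomb's SDK or our own instrumentation code.

```python
import time


def read_config_event(team_id: int, team_name: str, path: str) -> dict:
    """Build one wide event for a (hypothetical) config-file read."""
    event = {
        # Redundant on purpose: the ID gives a unique breakdown,
        # the name is what a person recognizes on a graph.
        "team.id": team_id,
        "team.name": team_name,
        "file.path": path,
    }
    start = time.perf_counter()
    try:
        with open(path, "rb") as f:
            data = f.read()
        # Unit lives in the field name; the value stays a plain number.
        event["file.size_bytes"] = len(data)
    except OSError as exc:
        # Two error fields: what we were trying to do,
        # and what the dependency handed back.
        event["error.category"] = "error reading file"
        event["error.detail"] = str(exc)
    event["read.duration_ms"] = (time.perf_counter() - start) * 1000
    return event
```

The shared prefixes (`team.`, `file.`, `error.`) keep related fields next to each other in an alphabetized field list, and the error fields only show up on events where something actually went wrong.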
Ok, let’s talk about specific fields!!
Who’s talking to your service?
- Remote IP address (and intermediate load balancer / proxy addresses)
- If they’re authenticated:
  - user ID and user name (or other human-readable identifier)
  - company / team / group / email address / extra information that helps categorize and identify the user
- Any additional categorization you have on the source (SDK version, mobile platform, etc.)
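As a rough sketch of how the caller fields above might be collected, the helper below adds user details only when there is an authenticated user. The function name `caller_fields`, the shape of the `user` dictionary, and the field names are all hypothetical; the one real-world detail assumed here is that proxies commonly pass the chain of addresses in an `X-Forwarded-For` header.

```python
from typing import Optional


def caller_fields(
    remote_addr: str,
    forwarded_for: Optional[str],
    user: Optional[dict],
    sdk_version: Optional[str] = None,
) -> dict:
    """Describe who is talking to the service (hypothetical field names)."""
    fields = {"request.remote_addr": remote_addr}
    if forwarded_for:
        # Raw X-Forwarded-For value: the proxy / load balancer chain.
        fields["request.forwarded_for"] = forwarded_for
    fields["user.authenticated"] = user is not None
    if user is not None:
        # Both the unique ID and a human-readable name.
        fields["user.id"] = user["id"]
        fields["user.name"] = user["name"]
        if "team" in user:
            fields["user.team"] = user["team"]
    if sdk_version:
        fields["client.sdk_version"] = sdk_version
    return fields
```

An anonymous request simply produces a narrower event; nothing needs to be filled with placeholder values.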
What are they asking of your service?
- URL they request
- Handler that serves that request (such as rails route or goji handler or django view or whatever it’s called these days.)
- Other relevant HTTP headers
- Did you accept the request, or was there a reason to refuse?
- Was the question well formed? (or did they pass you garbage as part of the request)
- Other attributes of the request (was it batched? gzipped? if editing an object, what’s that object’s ID? etc.)
How did your service deal with the request?
- How much time did it take?
- What other services did your service call out to as part of handling the request?
  - Did they hand back any metadata (like shard, or partition, or timers) that would be good to add?
  - How long did those calls take?
- Was the request handled successfully?
- Other timers (such as around complicated parsing)
- Other attributes of the response (if an object was created, what was its ID?, etc.)
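One way to capture the timers and outcomes listed above is a small context manager that writes a duration field onto the event when a block finishes. This is a sketch under assumptions: `timed`, `handle_request`, the `lookup` callable standing in for a downstream service, and the field names are all made up for the example.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(event: dict, name: str):
    """Record how long the wrapped block took as `<name>.duration_ms`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs on success and on failure, so slow errors are timed too.
        event[f"{name}.duration_ms"] = (time.perf_counter() - start) * 1000


def handle_request(event: dict, lookup) -> None:
    """Handle one request; `lookup` is a hypothetical downstream call."""
    with timed(event, "request"):
        try:
            with timed(event, "db"):
                result = lookup()
            # Metadata the dependency handed back.
            event["db.shard"] = result.get("shard")
            event["response.object_id"] = result.get("id")
            event["request.success"] = True
        except Exception as exc:
            event["request.success"] = False
            event["error.detail"] = str(exc)
```

Because the inner timer sits inside the outer one, each event carries both the total request time and the share spent waiting on the dependency.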
Business-related fields are obviously optional, as this type of information is often unavailable to each server, but when available it’s surprising how useful it can be at empowering different groups to easily use the data you’re generating. Some examples:
- Pricing plan - is this a free tier, pro, enterprise? etc.
- Specific SLAs - if you have different SLAs for different customers, including that info here can let you issue queries that take it into account.
- Account rep, business unit, etc.
Additional context about your service / process / environment
- Hostname or container ID or …
- Build ID
- Environment, role, and additional environment variables
- Attributes of your process, e.g. amount of memory currently in use, number of threads, age of the process, etc.
- Your broader cluster context (e.g. AWS availability zone, instance type, Kubernetes pod name, etc.)
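Most of this context can be gathered with the standard library and merged into every event. The sketch below assumes the build ID and environment name arrive through environment variables called `BUILD_ID` and `APP_ENV`; those variable names, the field names, and the function `process_fields` are hypothetical, so substitute whatever your deploy tooling actually sets.

```python
import os
import socket
import threading
import time

# Captured at import time as a stand-in for "when the process started".
_PROCESS_START = time.monotonic()


def process_fields() -> dict:
    """Context about the host and process, to merge into each event."""
    return {
        "host.name": socket.gethostname(),
        # BUILD_ID and APP_ENV are assumed variable names.
        "build.id": os.environ.get("BUILD_ID", "unknown"),
        "env.name": os.environ.get("APP_ENV", "unknown"),
        "process.pid": os.getpid(),
        "process.thread_count": threading.active_count(),
        "process.age_sec": time.monotonic() - _PROCESS_START,
    }
```

Calling `event.update(process_fields())` just before sending keeps the per-request code free of this bookkeeping.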
Just start. You can always add fields later. Use your events to make your life easier, and have fun along the way :)