Event Foo: Thou Shalt Not Aggregate at Source


This guest post from Alex Rasmussen of Freenome is the sixth in our series on the how, why, and what of events.

When your application emits events, it’s usually emitting them for the benefit of a human operator – maybe you at 3am, if you’re unlucky. The operator wants as much information as possible, with as much context as possible. Keeping this in mind, here are three things I always consider when creating events.

Provide as much context as possible

At minimum, an event should contain information about the process and host that emitted it, and the time at which it was emitted. Request, session and user IDs should also be recorded if applicable and available. More context is always better than less, and filtering context out is a lot easier than injecting it back in later.
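
As a minimal sketch of what that baseline could look like in Python: the helper name `make_event`, the field names, and the sample IDs below are all hypothetical, chosen for illustration rather than taken from any particular library.

```python
import json
import os
import socket
from datetime import datetime, timezone


def make_event(name, **context):
    """Build an event dict that always carries host, process, and time."""
    event = {
        "event": name,
        "host": socket.gethostname(),
        "pid": os.getpid(),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    # Request, session, and user IDs are attached only when the caller has them.
    event.update({k: v for k, v in context.items() if v is not None})
    return event


# Hypothetical call site: no session is available here, so that field is omitted.
event = make_event(
    "checkout.completed",
    request_id="req-123",
    user_id="u-42",
    session_id=None,
)
print(json.dumps(event))
```

Attaching the baseline fields in one place means no individual call site can forget them.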

When in doubt, split it out

The operator should never have to resort to regular expressions when analyzing events. If a field can be decomposed into multiple fields, do so. If a field is useful in both decomposed and combined forms (date-times are a great example of this) provide both forms. Storing and transferring bits is far cheaper than wasting your operator’s time and energy.
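
Using the date-time example, here is a rough sketch of emitting both the combined and the decomposed forms. The function name `timestamp_fields` and the field names are assumptions made for illustration.

```python
from datetime import datetime, timezone


def timestamp_fields(moment):
    """Return a timestamp in combined form alongside its decomposed parts."""
    moment = moment.astimezone(timezone.utc)
    return {
        # Combined form: easy to sort on and to pass to other tools.
        "timestamp": moment.isoformat(),
        # Decomposed form: filter or group by any part without a regex.
        "year": moment.year,
        "month": moment.month,
        "day": moment.day,
        "hour": moment.hour,
        "minute": moment.minute,
        "second": moment.second,
        "iso_weekday": moment.isoweekday(),  # 1 = Monday ... 7 = Sunday
    }


fields = timestamp_fields(datetime(2024, 3, 5, 3, 0, 0, tzinfo=timezone.utc))
print(fields)
```

With fields like these, a question such as "what happens between 3am and 4am?" becomes a filter on `hour` rather than a pattern match against a string.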

Aggregate as lazily as possible

It may be tempting to aggregate events at the emitting process. Unfortunately, once you aggregate in-process, you’re stuck with that aggregate until you change the code. Unless you absolutely can’t avoid it, emit raw events and let your observability system handle aggregation. Chances are that system can handle a lot more raw events without falling over than you think.
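
To illustrate why raw events keep your options open, here is a small sketch in which a plain Python function stands in for the observability system. The event shape, the endpoint names, and the `aggregate` helper are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Raw events as the application might emit them: one per request, no rollups.
raw_events = [
    {"endpoint": "/login", "duration_ms": 120},
    {"endpoint": "/login", "duration_ms": 80},
    {"endpoint": "/search", "duration_ms": 300},
]


def aggregate(events, key, value, fn):
    """Group events by `key` and reduce each group's `value` list with `fn`."""
    groups = defaultdict(list)
    for event in events:
        groups[event[key]].append(event[value])
    return {group: fn(values) for group, values in groups.items()}


# The same raw events answer different questions, chosen after the fact.
mean_by_endpoint = aggregate(raw_events, "endpoint", "duration_ms", mean)
max_by_endpoint = aggregate(raw_events, "endpoint", "duration_ms", max)
count_by_endpoint = aggregate(raw_events, "endpoint", "duration_ms", len)
print(mean_by_endpoint, max_by_endpoint, count_by_endpoint)
```

Had the emitting process reported only a mean per endpoint, the maximum and the count could not be recovered without changing and redeploying the application.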