Ask Miss O11y: Baggage in OTel
By Martin Thwaites | Last modified on April 25, 2022
Miss O11y is delighted to welcome our newest band member: Martin Thwaites! Martin has been a member of the Honeycomb user community practically since its inception. He is a UK-based consultant who specializes in helping teams scale up and tackle challenging business problems, and a long-time contributor to the Azure and .NET communities. We think he looks ✨amazing✨ in a tiara.
Dear Miss O11y,
What on earth is "Baggage" in OpenTelemetry? Why does it exist and what would I use it for? Please de-mystify it for me.
kthxbye
Thanks for the question, and it's a common thing to ask. Honestly, OpenTelemetry (OTel) Baggage is the footgun you never wanted, but we'll get to that in a bit.
So what is this OTel Baggage thing?
Imagine you wanted to have the CustomerId appear on all your spans, but it's only available on the initial API request because your Stock Check API doesn't need a Customer context. This is where OpenTelemetry Baggage comes to the rescue.
In OpenTelemetry, "Baggage" is a fancy term for contextual information that's passed between spans. More specifically, it's about passing that context across service boundaries. So really, it's about pushing that context over an HTTP request, a gRPC call, or a message so the other service can use it to add context to its spans. In Honeycomb distros for OpenTelemetry, we take this a step further and allow you to add the Baggage data to all the spans as attributes (more on this footgun later).
OpenTelemetry uses a concept called "Propagation" to pass this context around, and each of the language implementations has "propagators" that will parse that Baggage and make it available without you needing to implement anything explicitly.
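To make propagation a bit more concrete: with the W3C Baggage format, the values travel in a single `baggage` header as comma-separated `key=value` pairs with percent-encoded values. The following is a rough, standard-library-only sketch of that wire format, not the SDK's actual code. `BaggageHeaderSketch`, `Format`, and `Parse` are hypothetical names used only for illustration, and in a real service the propagator does this work for you.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustration only: a simplified reader/writer for a W3C-style "baggage" header value.
public static class BaggageHeaderSketch
{
    // Builds a header value such as "AccountId=1234,UserId=jo%20smith".
    public static string Format(IEnumerable<KeyValuePair<string, string>> items) =>
        string.Join(",", items.Select(kv => $"{kv.Key}={Uri.EscapeDataString(kv.Value)}"));

    // Parses a header value back into key/value pairs.
    public static Dictionary<string, string> Parse(string headerValue)
    {
        var result = new Dictionary<string, string>();
        foreach (var member in headerValue.Split(',', StringSplitOptions.RemoveEmptyEntries))
        {
            // Ignore optional ";property" metadata that may follow the value.
            var keyValue = member.Split(';')[0];
            var separator = keyValue.IndexOf('=');
            if (separator <= 0) continue; // Skip malformed members.

            var key = keyValue.Substring(0, separator).Trim();
            var value = keyValue.Substring(separator + 1).Trim();
            if (key.Length == 0) continue;

            result[key] = Uri.UnescapeDataString(value);
        }
        return result;
    }
}
```

Because every language reads and writes the same header in the same shape, a .NET service and a Go service can exchange these values without either team writing custom helpers.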
But why the hell does OTel Baggage exist?
That's a really good question. We have HTTP and message headers, right? They're a key-value list, right? Why re-invent something that already exists? Is it just about "Not-invented-here"?
All valid questions, but there is something special about Baggage that makes it different. Let's talk about standardization! The brilliance of OpenTelemetry is that it's cross-platform and cross-framework. What Baggage gives you is a requirement that the context values live in the same place, have the same format, and follow the same pattern. That means that all your applications, no matter what the language, will be able to read them, parse them, and use them. This is important when you're building a massively distributed system, and you want to provide autonomy to teams to work in whatever language or framework they want.
You could absolutely use something else for this; e.g., you could standardize on custom headers in your organization. However, what you'll soon find is that you end up building helpers in every framework and language, and they never get maintained.
What should I use OTel Baggage for?
This is where the footgun comes in: the best answer is "nothing sensitive, and nothing that you don't want third parties to see." Additionally, don't always trust what you get, because there are no built-in integrity checks to confirm that the Baggage items you receive are the ones you set.
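Since nothing guarantees where a Baggage value came from, one option is to treat it like any other untrusted input and check it before using it. Here is a minimal sketch, assuming a made-up rule that account IDs are 1 to 32 ASCII letters or digits. `BaggageValidationSketch`, `TryGetAccountId`, and the rule itself are hypothetical, so swap in whatever your own identifiers actually look like.

```csharp
#nullable enable
using System;
using System.Linq;

// Illustration only: validate a value read from Baggage before trusting it.
public static class BaggageValidationSketch
{
    // Hypothetical rule: an account ID is 1 to 32 ASCII letters or digits.
    public static bool TryGetAccountId(string? rawValue, out string accountId)
    {
        accountId = string.Empty;

        // Reject missing or suspiciously long values.
        if (string.IsNullOrEmpty(rawValue) || rawValue.Length > 32) return false;

        // Reject anything outside the expected character set.
        if (!rawValue.All(c => c < 128 && char.IsLetterOrDigit(c))) return false;

        accountId = rawValue;
        return true;
    }
}
```

A value that fails the check can simply be ignored, so a bad or forged entry never makes it onto your spans or into your business logic.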
Common use cases we've seen involve information that's only accessible further up a stack: things like account identification, user IDs, product IDs, maybe even origin IPs. Passing these down your stack allows you to add them to your spans in descendant services, which makes it easier to filter when you're searching in the UI.
As we can see in the diagram, unless the AccountId is passed via Baggage, the Stock API cannot add the AccountId to the spans. This gets really important when we're debugging live, high-volume systems, as we may want to know whether a load on our Stock API is being caused by a particular Account or even a particular IP address.
So tell me about this footgun thing?
Baggage can be prolific ... it goes EVERYWHERE. Because it's in the background and OTel is passing it around without you doing anything, you don't know it's happening.
Imagine all your secrets being shared with your neighbors, but imagine you were the one doing it in your sleep or simply having them flash up on your T-shirt while you're talking to them. That would be ... not good, right?
That's what can happen if you're not careful with how you use Baggage Propagation and what you use Baggage for.
I've also seen these kinds of shared context be abused. If you can imagine baggage being similar to "Session" data that's stored for a user, you can start to get an idea of where I'm going with this. If you've ever worked in .NET or Java, you'll have seen people pushing entire object trees into session state because they might need some of the properties. Just imagine that, on top of storing that object tree, you're also passing it between all of your services.
If you've seen someone add an extension method to the Baggage functions in your language that allows serialization of an object into a string that can be used in Baggage, please just put them out of their misery and save yourself a 5 a.m. alert because the system is running slow.
Baggage != Span attributes
One final thing on Baggage, and it's the biggest misconception we've found: Baggage is not a subset of the Span attributes, and Baggage items are not added to spans as attributes when you set them.
It's not that unreasonable to assume that when you add something as Baggage, you're doing it so it ends up on the attributes of the child system's spans. However, it doesn't; at least not automatically. You must explicitly take something out of Baggage and append it as attributes.
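// Read the value from Baggage, then copy it onto the current span (Activity) as a tag.
// Needs: using System.Diagnostics; using OpenTelemetry;
var accountId = Baggage.GetBaggage("AccountId");
Activity.Current?.SetTag("AccountId", accountId);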
To make this easier, we've added the BaggageSpanProcessor to our .NET and Java Honeycomb libraries for OpenTelemetry, which does this automatically. Keep the footgun in mind when deciding whether you want to use these or build your own.
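If you do decide to build your own, one way to keep the footgun pointed away from your feet is to copy only the keys you explicitly allow. What follows is a minimal sketch of that idea, not the Honeycomb BaggageSpanProcessor itself. `BaggageTagCopierSketch`, `CopyAllowed`, and the allowlist are hypothetical names, and the sketch uses only `System.Diagnostics` so it stands on its own.

```csharp
using System.Collections.Generic;
using System.Diagnostics;

// Illustration only: copy an allowlisted subset of Baggage entries onto a span.
public static class BaggageTagCopierSketch
{
    public static void CopyAllowed(
        Activity activity,
        IEnumerable<KeyValuePair<string, string>> baggage,
        ISet<string> allowedKeys)
    {
        foreach (var entry in baggage)
        {
            // Anything not on the allowlist stays in Baggage and off the span.
            if (allowedKeys.Contains(entry.Key))
            {
                activity.SetTag(entry.Key, entry.Value);
            }
        }
    }
}
```

Assuming the OpenTelemetry .NET package, you would typically call something like this from the `OnStart` method of a class deriving from `BaseProcessor<Activity>`, passing in the current Baggage entries, so that every new span picks up the allowed values.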
Let me tell you a (true) (funny) story...
As you're likely aware, we dogfood Honeycomb and OpenTelemetry here. So everything we use in our telemetry ingest uses OpenTelemetry to instrument itself.
Now, imagine you're a Honeycomb customer and using OpenTelemetry, and you start to add some context on your applications to Baggage. Now, because you're using OpenTelemetry, that Baggage that you were using internally gets pushed onto your telemetry provider, aka us...
The above scenario did happen, and when it did, there wasn't the concept in the Go libraries we were using at the edge to say "I'm an external endpoint, ignore everything about tracing context." We then ended up with the customer's Baggage information in our spans. Most importantly though, as things like user_id and team_id are pretty standard names, they were overriding our own names.
What makes this worse is that the trace_id and parent_id properties suffered the same fate, but with worse implications. That's a story for another day though!
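One way to guard a public edge against this is to drop the incoming context headers before anything downstream reads them. Below is a minimal sketch, assuming the request headers are available as a simple dictionary. `EdgeHeaderSketch` and `StripIncomingContext` are hypothetical names, and this is not how our Go edge was fixed; your framework or SDK may offer a more direct hook for ignoring incoming context. The header names are the W3C Trace Context and Baggage ones.

```csharp
using System;
using System.Collections.Generic;

// Illustration only: remove caller-supplied tracing context at a public endpoint.
public static class EdgeHeaderSketch
{
    // W3C Trace Context and Baggage header names.
    private static readonly string[] ContextHeaders = { "traceparent", "tracestate", "baggage" };

    public static Dictionary<string, string> StripIncomingContext(IDictionary<string, string> headers)
    {
        // Header names are case-insensitive, so compare them that way.
        var cleaned = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (var header in headers)
        {
            cleaned[header.Key] = header.Value;
        }

        foreach (var name in ContextHeaders)
        {
            cleaned.Remove(name);
        }
        return cleaned;
    }
}
```

With the context headers gone, the edge service starts a fresh trace with empty Baggage, so a customer's user_id can no longer overwrite yours.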
Learn more about OpenTelemetry and Honeycomb and why we're all in on OTel. Have an OTel-related question (or any other observability-related question) for Miss O11y? Send us an email!