Training Videos › Observability › Distributed Tracing

Intro to o11y Topic 7: Interpreting Honeycomb’s Trace View

Description:

In this video, Developer Advocate Jessica Kerr explains how to interpret the data in Honeycomb’s powerful trace view. By the end of the video, you’ll understand how to see:

  • times, durations, and concurrency in the waterfall
  • causality in the tree of spans
  • how to remove, add, and color by fields
  • the library.name field, which shows where the telemetry came from
  • the span.kind field, which shows the client/server role

Note: This video assumes you have already connected your app to Honeycomb. Jessica Kerr is using a sample app called Sequence of Numbers. If you would like to download the app and follow along, you can do so using the process from Intro to o11y Topic 3. If you need help connecting your app to Honeycomb, see Intro to o11y Topic 4.

Transcript

Jessica Kerr [Developer Advocate|Honeycomb]: 

Okay, here we are on a trace view. Pro-tip number one: this little arrow on the left sidebar gets it out of the way and gives you more screen real estate. Now, what are all these pretty colored rectangles? This one says 3.3 milliseconds. So it represents a span of time. Conceptually, each of these spans represents a unit of work.

Each of these spans was transmitted to Honeycomb as a wide event. That wide event has all of these attributes associated with it. This whole thing with the pretty rectangles is called the waterfall view. You can see how long each piece of work took and when they ran with respect to each other. For instance, it looks like this activity waited until after this one finished.
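The waterfall reading Jessica describes — durations on each bar, and whether two spans overlapped or ran one after the other — falls out of the spans' start and end timestamps. Here's a toy sketch of that arithmetic; the span names and millisecond values are invented for illustration (real spans carry much higher-precision timestamps).

```javascript
// Toy spans with invented start/end timestamps, in milliseconds.
const spans = [
  { name: "HTTP GET", start: 0.0, end: 6.1 },
  { name: "middleware - query", start: 0.4, end: 0.6 },
  { name: "middleware - expressInit", start: 0.5, end: 0.9 },
  { name: "request handler", start: 1.0, end: 3.3 },
];

// Duration is just end - start, the number printed on each waterfall bar.
for (const s of spans) {
  console.log(`${s.name}: ${(s.end - s.start).toFixed(1)} ms`);
}

// Two spans ran concurrently if their time ranges overlap.
const overlaps = (a, b) => a.start < b.end && b.start < a.end;
console.log(overlaps(spans[1], spans[2])); // query and expressInit overlapped
console.log(overlaps(spans[1], spans[3])); // the handler waited for query to finish
```

With this data, the two middleware spans overlap (they ran at the same time), while the request handler starts only after the query span ends — the "this activity waited until after this one finished" pattern in the waterfall.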

While these ran at the same time. Are they part of each other? Over here on the left is the tree representation showing that yes, this little middleware query span was part of this GET /fib span. We say that the GET /fib span is the parent of the middleware query span, and that GET /fib had three children.

In turn, its parent is this HTTP GET span, which has only one direct child. This gives us some sense of causality. What caused this middleware? Why, it was GET /fib. What caused that? This HTTP GET. And who did it? It was this service, the fib microservice. The service name is all the same. Boring! Let's get rid of it.
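The tree on the left is built from each span's parent reference: every span except the root carries the ID of its parent, and grouping spans by that ID yields the hierarchy. A toy sketch of that grouping, with invented IDs and names roughly shaped like the trace in the video:

```javascript
// Toy span records with invented IDs; each child points at its parent.
const spans = [
  { id: "a", parentId: null, name: "HTTP GET" },
  { id: "b", parentId: "a",  name: "GET /fib" },
  { id: "c", parentId: "b",  name: "middleware - query" },
  { id: "d", parentId: "b",  name: "middleware - expressInit" },
  { id: "e", parentId: "b",  name: "request handler" },
];

// Group children under their parent span, as the tree view does.
const children = new Map();
for (const s of spans) {
  if (s.parentId === null) continue; // the root span has no parent
  if (!children.has(s.parentId)) children.set(s.parentId, []);
  children.get(s.parentId).push(s);
}

console.log(children.get("b").length); // GET /fib is the parent of three spans
console.log(children.get("a").length); // HTTP GET has only one direct child
```

That parent-pointer structure is exactly the "causality" Jessica reads off the tree: follow parentId upward and you find what caused each unit of work.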

I click on this little dropdown next to the field name and choose hide this field. Let's get something more interesting instead. Who told us about each of these spans? Click on fields over here and I can add more. This field, library.name, represents the instrumentation code that sent this telemetry. It's an OpenTelemetry semantic convention to put that attribution in a field called library.name.

Looks like the HTTP auto-instrumentation told us about this span. And then the Express auto-instrumentation told us about this one. Express is the web application framework this Node.js app uses. The app includes a library from OpenTelemetry to provide the instrumentation code that sends telemetry data about what Express is doing.

And then this one was the HTTP instrumentation library again. Are there any other library names in here? I can check quickly by changing the color of the rectangles. Check this out! Click on that little dropdown next to the field name and choose color rows based on the values in this field. Okay. Now everything from HTTP is purple.

Everything from Express is orange. Looks like it's all purple and orange. So those are the only two libraries that sent spans as part of this trace. I'm a little curious why we have two spans here: an HTTP GET, and then, uh, a GET. Is it calling out to a different service? Let's check the service name again.

No, those are all the same. What is going on here? Hmm, what other fields could help? Let's scroll through these attributes. Everything in that wide event that Honeycomb received is available to us here. Poking around all these fields in the span... oh, hey! I recognize this one: span.kind.

That's an OpenTelemetry standard field too. Let's see this field for every span of the trace, and color by it while we're at it. Okay, okay. Check this out. This microservice starts this trace as a server, does some internal stuff, then makes a client call by HTTP. Then, poof, it's a server again, does some internal stuff, then responds at the end of this span. Why, here's an insight into the workings of this application.

It calls out over HTTP to itself. Didn’t expect that! There’s always something to learn when looking at a trace. We’ll come back to this trace view later. It has a lot more tricks than we’ve seen.
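The self-call Jessica spots by eye — a client span whose child is a server span from the same service — can also be checked programmatically. A toy sketch, with invented IDs and a hypothetical service name standing in for the fib microservice:

```javascript
// Toy spans with span.kind and a hypothetical service name, mirroring the
// server -> client -> server pattern from the trace (values invented).
const spans = [
  { id: "1", parentId: null, kind: "server", service: "fib", name: "GET /fib" },
  { id: "2", parentId: "1",  kind: "client", service: "fib", name: "HTTP GET" },
  { id: "3", parentId: "2",  kind: "server", service: "fib", name: "GET /fib" },
];

// A server span whose parent is a client span from the same service means
// the service called itself over HTTP.
const byId = new Map(spans.map((s) => [s.id, s]));
const selfCalls = spans.filter(
  (s) =>
    s.kind === "server" &&
    s.parentId !== null &&
    byId.get(s.parentId).kind === "client" &&
    byId.get(s.parentId).service === s.service
);

console.log(selfCalls.length); // one self-call in this trace
```

With different services on either side of the client/server boundary, the filter would come up empty — which is why checking service.name alongside span.kind settles the "is it calling out to a different service?" question.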
