I was in the Early Access program. Do I need to do anything to move to GA?

No. If you were using Agent Timeline during Early Access, it stays on and your existing instrumentation keeps working. If you weren't in Early Access, Agent Timeline is available to you today.

What exactly counts as an agent conversation, or trajectory?

It's everything that happens in response to one agentic interaction, bound together by a shared conversation_id on your spans. That covers each agent execution, the model calls and tool invocations inside it, handoffs between agents, retries, human escalations, and the downstream system spans those actions set off. You set the conversation_id on your spans, and Agent Timeline handles the binding.
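As a rough illustration of that binding, the sketch below groups a flat list of spans by a shared conversation ID. The span dictionaries, the `bind_conversations` helper, and the literal attribute key are hypothetical stand-ins, not Honeycomb's implementation; the Agent Instrumentation Guide is the authority on the exact attribute name.

```python
from collections import defaultdict

# Hypothetical attribute key, used here for illustration only.
CONVERSATION_KEY = "conversation_id"


def bind_conversations(spans):
    """Group spans by conversation ID, ordering each group by start time."""
    conversations = defaultdict(list)
    for span in spans:
        conv_id = span.get("attributes", {}).get(CONVERSATION_KEY)
        if conv_id is not None:  # spans without the ID cannot be bound
            conversations[conv_id].append(span)
    for group in conversations.values():
        group.sort(key=lambda s: s["start"])
    return dict(conversations)


# Made-up spans: two conversations plus one unrelated span with no ID.
spans = [
    {"name": "execute_tool issue_refund", "start": 3,
     "attributes": {"conversation_id": "conv-42"}},
    {"name": "invoke_agent supervisor", "start": 1,
     "attributes": {"conversation_id": "conv-42"}},
    {"name": "chat", "start": 2,
     "attributes": {"conversation_id": "conv-42"}},
    {"name": "invoke_agent supervisor", "start": 1,
     "attributes": {"conversation_id": "conv-7"}},
    {"name": "GET /healthz", "start": 1, "attributes": {}},
]

timeline = bind_conversations(spans)
print([s["name"] for s in timeline["conv-42"]])
```

The point of the sketch is that the ID is the only thing you have to supply: any span that carries it, whether it came from an agent, a model call, or a downstream service, lands in the same conversation.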

Will it work with the agent framework I'm already running?

If your framework emits the OpenTelemetry GenAI semantic conventions, yes, it will work out of the box. Agent Timeline reads standard OpenTelemetry spans, so there's no Honeycomb SDK to adopt and nothing proprietary going into your code. Any framework producing the required GenAI spans works the same way. The Agent Instrumentation Guide lists the attributes you need.

My prompts contain customer data. How do I keep PII out of Honeycomb?

You decide what leaves your environment. Capture full prompt and completion content where it's safe to, send metadata-only spans where it isn't, or redact sensitive fields before the spans are exported. The conversation structure, failure signals, and trace context hold up no matter how much content you emit, so you can run production-safe instrumentation and still get the full timeline.
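One way to picture the three options is a small scrubbing step that runs before export. This is a minimal sketch under stated assumptions: the `scrub` helper, the `prompt` and `completion` keys, and the email-only pattern are all hypothetical, and a real deployment would cover far more than email addresses.

```python
import copy
import re

# Assumed content-bearing attribute keys; adjust to what your instrumentation emits.
CONTENT_KEYS = ("prompt", "completion")
# Deliberately simple pattern for the example; not a complete PII detector.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def scrub(span, mode):
    """Return a copy of the span prepared for export.

    mode="full"      keep content as-is
    mode="redact"    mask email addresses inside content fields
    mode="metadata"  drop content fields, keep every other attribute
    """
    if mode not in ("full", "redact", "metadata"):
        raise ValueError(f"unknown mode: {mode}")
    out = copy.deepcopy(span)  # never mutate the caller's span
    attrs = out["attributes"]
    for key in CONTENT_KEYS:
        if key not in attrs:
            continue
        if mode == "metadata":
            del attrs[key]
        elif mode == "redact":
            attrs[key] = EMAIL.sub("[REDACTED_EMAIL]", attrs[key])
    return out


span = {
    "name": "chat",
    "attributes": {
        "conversation_id": "conv-42",
        "model": "example-model",
        "input_tokens": 12,
        "prompt": "Refund order 1001 for jo@example.com",
        "completion": "Refund started.",
    },
}

redacted = scrub(span, "redact")
metadata = scrub(span, "metadata")
print(redacted["attributes"]["prompt"])
print(sorted(metadata["attributes"]))
```

In every mode the conversation ID, model, and token counts survive, which is why the timeline structure is unaffected by how much content you choose to send.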

When should I use Agent Timeline versus Canvas?

Agent Timeline is for going deep on one conversation: what this agent did, where it broke, what caused it downstream. Canvas is the wider view, for asking a question across many conversations and the rest of your platform. Most investigations start in Agent Timeline, since you usually arrive with a conversation ID from a support ticket or an alert. When the answer lives outside that single conversation, you pivot into Canvas without losing your place.

Agent Timeline Is Now Generally Available

Honeycomb’s Agent Timeline gives you a unified view of LLM behavior and multi-agent workflows, so you can investigate entire conversations and quickly see where prompts, tool calls, and failures happened, in the order they happened.

By: Dan Juengst | June 18, 2026

A few weeks ago I wrote about a customer’s refund request that stopped halfway through at 11:47 p.m. on a Tuesday. That post walked through the 40 minutes it took to work out what happened when an agentic application had a problem: a tool retried against a rate-limited payments API, the error responses filled up the context window, and the agent gave up. The whole reason we built Agent Timeline was to turn that 40 minutes into five. To reduce MTTR. To solve the problem and get back to sleep.

With Agent Timeline, when an SLO burn-down alert fires on a key agentic workflow, such as a failing customer refund request, you open the affected agent conversation and flip on Show Failures Only. The failing tool call is sitting right there with its six retries and the 502 underneath it.

Screenshot of the timeline of a failing tool call in Honeycomb Agent Timeline

Today, we are excited to announce that Agent Timeline is generally available to every Honeycomb customer, so you can debug agent conversations this quickly too.

GA feels like the right moment to say something I've been thinking about for a bit. Observability platforms have organized themselves around the trace for more than a decade. A request arrives, you follow it across services, you find where it broke. Agents don't fit that model. One person asking "where is my refund" can spin up a supervisor agent, which might hand off to a refund agent and an order-status agent, make a dozen model calls, fire 17 tool calls, and touch half your backend before it answers anything. A trace captures one thread and stops there. It can't show you the whole interaction.

This agent activity has a name. We call it an agent conversation (some refer to the same thing as an agent trajectory). It's the full arc of an agentic workflow: the agent executions, the LLM calls, the tool invocations, the handoffs and retries, all bound together by a conversation_id and connected down into the system spans they trigger. This is the telemetry you need to debug an agent. A platform that stops at the model can't tell you about the downstream 502 that actually killed the run. And if all it understands is the trace, you're back to copying timestamps between two browser tabs while the fire is still going. The agent conversation is becoming required telemetry for anyone serious about running agents in production, the way distributed tracing became required when we moved to microservices.

Agent Timeline is how Honeycomb renders it. You start at the conversation and drill down, instead of starting from a single span and trying to reconstruct what the agent was attempting. The summary across the top gives you duration, model calls, tool calls, agents involved, retries, and a failure count. Horizontal lanes show each agent running in parallel, so a misbehaving one stands out visually instead of hiding inside nested traces. Click any AI span and you get the prompt, completion, tokens, model, tool name, and error type, with any quality signals you emit attached as attributes. From there, you pivot into the full trace waterfall, where the agent's decision connects to the backend root cause without switching tools. Check out our Agent Timeline documentation for a deep dive.

Screenshot of Honeycomb's Agent Timeline
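To make the summary row and the failures view concrete, here is a small sketch that rolls one conversation's spans up into the same headline numbers. The span shape, field names, and both helpers are hypothetical and chosen for the example; they are not how Agent Timeline computes its summary.

```python
def summarize(spans):
    """Roll one conversation's spans up into headline counts."""
    start = min(s["start_ms"] for s in spans)
    end = max(s["end_ms"] for s in spans)
    return {
        "duration_ms": end - start,
        "model_calls": sum(s["kind"] == "model" for s in spans),
        "tool_calls": sum(s["kind"] == "tool" for s in spans),
        "agents": sorted({s["agent"] for s in spans}),
        "retries": sum(s.get("retry", False) for s in spans),
        "failures": sum(s.get("error_type") is not None for s in spans),
    }


def failures_only(spans):
    """Keep only spans that recorded an error, in start order."""
    failed = [s for s in spans if s.get("error_type") is not None]
    return sorted(failed, key=lambda s: s["start_ms"])


# Made-up conversation: a refund tool call fails with a 502 and is retried once.
conversation = [
    {"agent": "supervisor", "kind": "model", "start_ms": 0, "end_ms": 400},
    {"agent": "refund", "kind": "model", "start_ms": 400, "end_ms": 900},
    {"agent": "order-status", "kind": "tool", "start_ms": 450, "end_ms": 700},
    {"agent": "refund", "kind": "tool", "start_ms": 900, "end_ms": 1200,
     "error_type": "502"},
    {"agent": "refund", "kind": "tool", "start_ms": 1300, "end_ms": 1600,
     "error_type": "502", "retry": True},
]

summary = summarize(conversation)
failed = failures_only(conversation)
print(summary)
print([(s["agent"], s["error_type"]) for s in failed])
```

Grouping the same spans by their `agent` field would give one lane per agent, which is the idea behind the horizontal lanes described above.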

Since you are likely using OpenTelemetry today, none of this asks you to change how you instrument. Instrument with the OpenTelemetry GenAI semantic conventions, send your spans, and Agent Timeline lights up and binds them by conversation. If you're just starting, the Agent Instrumentation Guide will get you there.
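For a sense of what those spans carry, the sketch below builds attribute dictionaries for a model call and a tool call. The keys follow the OpenTelemetry GenAI semantic conventions as I understand them at the time of writing; the conventions are still evolving, so treat every key as an assumption to verify against the current spec and the Agent Instrumentation Guide. The helper names and sample values are hypothetical.

```python
def chat_attributes(conversation_id, model, input_tokens, output_tokens):
    """Attributes for a model-call span (assumed GenAI convention keys)."""
    return {
        "gen_ai.operation.name": "chat",
        "gen_ai.conversation.id": conversation_id,
        "gen_ai.request.model": model,
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
    }


def tool_attributes(conversation_id, agent_name, tool_name, error_type=None):
    """Attributes for a tool-invocation span (assumed GenAI convention keys)."""
    attrs = {
        "gen_ai.operation.name": "execute_tool",
        "gen_ai.conversation.id": conversation_id,
        "gen_ai.agent.name": agent_name,
        "gen_ai.tool.name": tool_name,
    }
    if error_type is not None:
        attrs["error.type"] = error_type  # set only when the call failed
    return attrs


chat = chat_attributes("conv-42", "example-model", 812, 96)
tool = tool_attributes("conv-42", "refund", "issue_refund", error_type="502")

# Both spans share the conversation ID, which is what binds them together.
print(chat["gen_ai.conversation.id"] == tool["gen_ai.conversation.id"])
print(tool)
```

With the OpenTelemetry Python API, a dictionary like this would typically be passed as the `attributes` argument when starting a span or applied with `span.set_attributes(...)`; the sketch stays in plain dictionaries so it runs without any SDK installed.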

Agent Timeline runs on the high-cardinality, event-based engine Honeycomb has had for 10 years, which happens to suit the messy, high-dimensional telemetry agents throw off. Your prompts often carry sensitive data, so you can capture full content when it's appropriate, send metadata-only spans when it isn't, or redact before anything reaches us, and the conversation structure holds up either way.

When the trail moves past a single conversation, you can move into Honeycomb Canvas and ask the broader question across your whole platform without leaving the investigation.