Honeycomb MCP Is Now In GA With Support for BubbleUp, Heatmaps, and Histograms

If you’ve been following my public journey with LLMs this year, it probably won’t surprise you to learn that this blog post is an announcement about the general availability of Honeycomb’s hosted MCP server. I want to share a few updates about what’s new in the GA release, discuss some interesting learnings from building it, and share examples of how we’re using MCP internally.

September 9, 2025

First: if you're still in the dark about MCP and AI agents, go read the earlier blogs I linked. They do a pretty good job of covering the space. That said, let’s dive in!

What’s new?

In the months since we launched our public beta, we’ve been hard at work making Honeycomb MCP more useful and capable for AI agents and human operators alike. Our goal with this project has been, from the start, to allow AI to engage in the same kind of investigatory loops that we guide users towards. Many of the new features are designed expressly with this in mind, the most exciting of which is BubbleUp, now available in MCP. Agents can now select interesting heatmap sections or time ranges of Honeycomb queries and run BubbleUp on them, detecting outliers and surfacing what’s different about slow requests or API endpoints in your system.

To support BubbleUp, we’ve also added a variety of new visualization types for MCP queries, including heatmaps and histograms. We’ve also put a lot of time into the query tool and into improving how our tools perform with non-Anthropic models such as GPT-5. We still recommend Claude Code as your daily driver, but our team uses it with Cursor quite a bit as well! Any modern MCP client that supports streamable HTTP transport should work just fine, though.

If I spelled it out narratively, we’d be here all day, so here’s a list of the highlights:

  • The query tool has parity with the UI, supporting ad-hoc calculated fields, usage mode, saved queries, chart and table visualizations, and more.
  • New search and filter tools are available for inspecting field names and descriptions, queries, boards, and other Honeycomb metadata.
  • Optimized context functions to reduce hallucinations in tool calling.
  • Built-in prompts for adding OpenTelemetry instrumentation to applications and services.
  • Support for board creation through MCP, so you can ask your agent to investigate and then create a board for you to save the results.
  • Support for our Service Map feature, allowing you to ask MCP for a map of how your services fit together and the traffic between them.

Behind the scenes, hundreds of smaller fixes and tweaks have been made to tool schemas and output formats in order to make our MCP as efficient as possible for agents.

What did we learn from building this?

Let’s talk about some of those efficiencies! One of the biggest challenges in doing pretty much anything with LLMs these days is context management, and it’s even more of a challenge when you don’t control the client. When context management fails, the results are pretty dire: more powerful models will figure out that they don’t know what’s going on, but less powerful ones will often hallucinate relationships in the data that don’t exist. Even state-of-the-art models will struggle as their context window fills, and will often over-fit on patterns when making follow-up tool calls.

I wish I could say there was a science here, but it’s really more of an art. Our design goal has been to bias towards returning the data most likely to be valuable in any given call rather than all of the data, when we can make a choice about such things. An early realization was that we needed to avoid JSON in responses as much as possible, as it’s the least token-efficient way to communicate anything back to the model, especially for data with repeated fields. What works better? For tabular data, CSV! The exact same payload represented in CSV offers a roughly 40% savings in terms of tokens with no drop in comprehension on our evals.
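To make the repeated-keys effect concrete, here’s a minimal sketch of the comparison. The field names and data are invented for illustration, and character counts are only a crude proxy for tokenizer output, but the shape of the result is the same: JSON repeats every key in every row, while CSV states the column names once.

```python
import csv
import io
import json

# Hypothetical query result: tabular data with the same fields in every row.
rows = [
    {"service": "checkout", "endpoint": "/cart", "p99_ms": 412, "count": 1890},
    {"service": "checkout", "endpoint": "/pay", "p99_ms": 987, "count": 731},
    {"service": "frontend", "endpoint": "/home", "p99_ms": 123, "count": 9042},
]

# JSON repeats every key in every row.
as_json = json.dumps(rows)

# CSV emits the header once, then only values.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=rows[0].keys())
writer.writeheader()
writer.writerows(rows)
as_csv = buf.getvalue()

print(f"JSON: {len(as_json)} chars, CSV: {len(as_csv)} chars")
```

The gap widens with more rows, since the per-row key overhead in JSON is constant while CSV pays for the column names only once.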

This realization also applies to trace visualization. Many of our customers rely on Honeycomb for its best-in-class distributed tracing engine; we wanted to make sure that we supported this fully. However, we learned a lot about the best way to show an LLM a trace. You can give it a picture, you can give it ASCII art, you can even give it a markdown table. We found that CSV, again, was a solid option. Topologically sorting the spans in a trace and including the span relationships through their IDs wound up being highly efficient in nearly all scenarios (and to be fair, when it didn’t, that had more to do with the quality of the trace than anything else).
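A sketch of that idea follows. The span fields and column names here are my own invention rather than Honeycomb’s actual output format: sort the spans so every parent precedes its children, then emit one CSV row per span, carrying the parent relationship through the IDs.

```python
import csv
import io

# Hypothetical span records; parent_id of None marks the root span.
spans = [
    {"span_id": "c", "parent_id": "b", "name": "db.query", "duration_ms": 40},
    {"span_id": "a", "parent_id": None, "name": "GET /cart", "duration_ms": 120},
    {"span_id": "b", "parent_id": "a", "name": "cart.load", "duration_ms": 90},
]

def topo_sort(spans):
    """Order spans so every parent appears before its children (depth-first)."""
    by_parent = {}
    for s in spans:
        by_parent.setdefault(s["parent_id"], []).append(s)
    ordered = []
    stack = list(reversed(by_parent.get(None, [])))  # start from root spans
    while stack:
        span = stack.pop()
        ordered.append(span)
        stack.extend(reversed(by_parent.get(span["span_id"], [])))
    return ordered

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["span_id", "parent_id", "name", "duration_ms"])
writer.writeheader()
writer.writerows(topo_sort(spans))
print(buf.getvalue())
```

Because a child always appears after its parent, the model can reconstruct the tree in a single pass over the rows, without needing the whole trace in view at once.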

I don’t think this last learning should be surprising, but I do think it’s telling: MCP is rough to work with as a standard. I think it’s valuable, but I have to believe we can do better. I would certainly appreciate it if clients and model providers could get their ducks in a row on supporting it better! It’s quite frankly baffling that I saw companies shipping agents without streamable HTTP or OAuth support as recently as this summer.

How are we using MCP?

Over 50 Honeycomb teams have used MCP since it went into beta (thanks y’all!) and I’ve talked to many of them. The story that comes up the most is that developers who previously had to page out of their IDE to look at something in Honeycomb can now just ask the agent or paste in a trace URL, keeping them focused on their task. We’ve heard stories of users discovering inefficiencies in AWS resource usage and using MCP to help bridge language gaps, and we’ve seen developers integrate it into their daily workflows (it’s basically the only way I use Honeycomb these days).

What’s been really cool, though, is watching how my colleagues have been adopting it. We’ve always encouraged people outside R&D to use Honeycomb to understand how our customers use Honeycomb, but our query interface and overall idioms can be challenging for people who are just starting out. Claude Desktop’s MCP integration has let us roll out Honeycomb to pretty much our entire organization, and they’ve been using it pretty actively!

Customer success is able to quickly answer questions and generate reports based on telemetry data, or pinpoint customer-reported issues and escalate them to the right engineers. Our sales engineers and account managers are using it to understand the health of trial accounts and to discover opportunities to help existing teams right-size their usage. Yeah, it’s cool that our devs can use it, but I think the real transformative impact is on everyone else. Bringing data to people turns out to be pretty powerful.

Want to try it out for yourself? MCP is available to all of our accounts at no additional charge. Sign up, check out our docs on how to get data flowing (or just ask Claude!), and get connected. I can’t wait to see what you do.

New to Honeycomb? Get your free account today.