Building Your Observability Practice with Tools that Co-exist
Honeycomb is also well suited to use alongside the tools you already rely on to develop, deliver, and operate your service. If you’ve been in business for any amount of time, you’ve already made choices: what data you collect, how you collect it, where you send it, and how you visualize it. You’ve no doubt put work into making those tools serve your needs. And although Honeycomb can achieve much of what you’re doing with your in-house tools today, you don’t need to use every feature and capability of Honeycomb to get your money’s worth.
To put it another way, the goal is to augment and expand your ability to deliver your service and improve your bottom line, not to rip out the tooling your organization has depended on. You can benefit from using Honeycomb alongside what has been working for you until now.
The reality is that team, skills, and cultural transformation for DevOps is hard and takes time. As your organization evolves toward service ownership and a practice of observability-driven development, you’ll require a different set of tools and processes, and this kind of transformation doesn’t occur between one deploy and the next. While you develop and improve your observability practice, Honeycomb coexists as part of the ecosystem of tools that bring value to your business.
The things everyone knows they need
Most organizations share the following basic requirements:
Some metrics, to keep a high-level eye on infrastructure
Metrics tools do well at counting things: the number of jobs queued, host-level resource usage, the number of requests served by the system, and so on. Your organization has likely built many dashboards to monitor your service and tuned them over months and years to reflect the particular foibles of your production environment and user base. There may be legacy documentation instructing engineers to optimize their logging for output to Graphite or Librato. These systems aren’t going to be dismantled overnight, nor should they be.
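For illustration only, here is a minimal sketch of the kind of counter and gauge emission those dashboards are built on, using the Python statsd client. The metric names and the localhost agent address are assumptions for the example, not values from any real deployment:

```python
# A sketch of basic metric emission, assuming a StatsD agent
# (feeding Graphite or similar) is listening on localhost:8125.
import statsd

client = statsd.StatsClient("localhost", 8125)

client.incr("myapp.jobs.queued")          # count a queued job
client.gauge("myapp.workers.active", 12)  # report a point-in-time value
client.timing("myapp.request.ms", 87)     # record a duration in milliseconds
```

Counters and gauges like these are cheap to emit and easy to graph, which is exactly why they remain the right tool for high-level infrastructure health.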
An archive, for longer-term storage and compliance
Many businesses have legal requirements for data retention, and many executives are simply more comfortable knowing that the organization has a searchable archive going back six months, a year, or even several years. For most forensic or compliance use cases, ETL and query response times don’t need to be fast, as long as you can provably retrieve the data your business requires. Perhaps you built out an ELK cluster for this purpose. It might even have started as your team’s primary troubleshooting interface, only to be relegated to the role of archive once scaling it to respond quickly during a production incident no longer proved cost-effective.
A query and graphing tool, to investigate and debug
Here’s where you’re most likely to see an obvious need for Honeycomb to augment your existing tooling. If you’ve been using a classic APM product such as Datadog or New Relic for a while, its approach may have met your needs at the outset. But as your environment and use cases have grown broader and more complex, you may be experiencing slower performance and cost spikes as you add more tags, or noticing that more and more issues can’t be solved with the preset views available.
Your tools should coexist while your enterprise scales
When Honeycomb coexists with classic log-aggregation and metrics tooling, most organizations start by using an integration to send existing log output to Honeycomb, sometimes duplicating that stream to their existing archive and/or metrics tools as well. They typically also install one or more Honeycomb Beelines, which automatically instrument their code with events and traces and send that telemetry too.
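As a rough sketch of that setup, initializing a Beeline in a Python service looks something like the following. The write key, dataset, and span names are placeholders, not values from any real deployment:

```python
# A minimal sketch of Beeline instrumentation in a Python service.
# The write key, dataset, and names below are placeholder values.
import beeline

beeline.init(
    writekey="YOUR_HONEYCOMB_API_KEY",
    dataset="my-service",
    service_name="my-service",
)

# Wrap a unit of work in a trace span and attach a field to its event.
with beeline.tracer(name="handle_request"):
    beeline.add_context_field("user_id", 42)
    # ... application logic ...

beeline.close()  # flush pending events before the process exits
```

From there, the Beeline’s framework middleware can take over most event and trace generation automatically, so coexistence doesn’t require rewriting the service.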
In this situation, when investigating something turned up by metrics, an on-call engineer may begin troubleshooting with what’s familiar: Datadog, New Relic, Splunk, or another classic APM tool. When an issue’s complexity outstrips what those tools can do, and the services involved are sending data to Honeycomb, the engineer can drill in more deeply and reach a resolution.
[Diagram: Incident Response · Ongoing Optimization · Ongoing Development]
Observability-Driven Development is an ongoing journey
It’s important to remember that observability has a definition: the ability to ask and answer the questions you need about your systems without shipping new code. This matters because logs, traditional APM, and monitoring tools all claim to give you observability today, but ultimately can’t do so in every case, or not without vastly exceeding your budget.
As legacy services are replaced with new code, the engineering team can take the opportunity to instrument with observability in mind, deploy additional Beelines, and improve the overall telemetry sent to Honeycomb. When they do, much more becomes possible: the entire organization benefits from more comprehensive performance analysis tooling, greater visibility into third-party services that affect your bottom line, and better data-driven insights when prioritizing both product work and technical debt.
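As one example of what instrumenting with observability in mind can look like in rewritten code, beeline-python provides a decorator for wrapping individual functions in trace spans. The function and field names here are invented for illustration:

```python
# A sketch of instrumenting a rewritten code path with a trace span.
# Function and field names are illustrative, not from any real service.
import beeline

@beeline.traced(name="render_invoice")
def render_invoice(customer_id, invoice_id):
    beeline.add_context_field("customer_id", customer_id)
    beeline.add_context_field("invoice_id", invoice_id)
    # ... business logic for the rewritten service ...
```

Attaching high-cardinality fields like a customer ID at write time is what later lets you slice performance by individual user rather than by coarse aggregates.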
Ongoing Performance Analysis
When a new feature ships, Honeycomb can show you the impact it has on your system. Observability means you can more easily decide where to add capacity or optimize code, and where to focus to make the most impact and keep your important customers happy.
Intercom used Honeycomb to evaluate performance across all the dimensions required to understand how different users and types of usage affected a given endpoint. They were able to both identify
Honeycomb Beelines Verify Fast Time to Value at hCaptcha