Ask Miss O11y: Is the OpenTelemetry Collector useful?
By Martin Thwaites | Last modified on May 19, 2022
Dear Miss O11y,
I've been told I need to use the OpenTelemetry Collector, but I have no idea what it is or why I need it. Should I be using it? What value does it add? What the hell am I missing?
Kthxbye
That is a perfectly valid mental state to be in when looking at the OpenTelemetry Collector. As an OpenTelemetry community, we've not done a superb job of explaining when, where, and why it adds value to your deployment infrastructure. The great news is, we're working on better docs, better configuration, and better information on best practices. This will take some time, so let me outline some of that here.
TL;DR: Collectors are really useful if you're running a lot of services, or want to centralize and control your configs and telemetry processing. Not using a Collector is a perfectly valid choice, and there will be no adverse effects from excluding it from your design.
What is the OpenTelemetry Collector?
At a very high level, the Collector is something that you can deploy as a service in your infrastructure. By that I mean you can run it on a virtual machine (VM), alongside your workload as a separate process, or in a container. It's up to you.
You can choose to run it next to your application (as a sidecar in a Kubernetes Pod, for instance), or you could run it in a central place on your network. The decision will be based on what you want to get out of it.
There are two "modes" in the OpenTelemetry docs (Agent and Gateway). However, the only difference is conceptual: the Collector is either running next to each deployment of your application (Agent) or in a central place for multiple services to send their traces to (Gateway). In both modes, the Collector accepts or scrapes data from one or more services, and forwards it on via an exporter.
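If it helps to see that flow as code, here is a toy model of it in plain Python. This is only a sketch of the receive, process, export shape described above, not how the Collector is implemented or configured. The `Pipeline` class, the `drop_health_checks` function, and the span dictionaries are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable

Span = dict  # simplification: a span is just a dict of fields here


@dataclass
class Pipeline:
    """Toy model of a pipeline: receive spans, process them, export them."""

    processors: list[Callable[[list[Span]], list[Span]]] = field(default_factory=list)
    exporter: Callable[[list[Span]], None] = print

    def receive(self, spans: list[Span]) -> None:
        # Run each processor in order, then hand the result to the exporter.
        for process in self.processors:
            spans = process(spans)
        self.exporter(spans)


def drop_health_checks(spans: list[Span]) -> list[Span]:
    # Hypothetical processor: filter out noisy health-check spans.
    return [s for s in spans if s.get("name") != "GET /healthz"]


sent: list[Span] = []  # stands in for the third-party backend
pipeline = Pipeline(processors=[drop_health_checks], exporter=sent.extend)
pipeline.receive([{"name": "GET /healthz"}, {"name": "GET /orders"}])
```

The same pipeline could sit next to one application (Agent) or in front of many (Gateway). Nothing in the code changes, only where it runs and who sends to it.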
What value does it add?
The Collector can provide a few features that can be very useful.
- Centralizing config: When you're sending data to a third party, you'll be putting secrets like API keys into config files. Having those in a central location so they can be updated and cycled in one place is definitely a benefit in a lot of environments.
- Centralizing egress: In some secure networks, individual services will be limited in their ability to connect to the internet, and their traffic will likely be filtered or monitored. Putting an OpenTelemetry Collector instance in place allows all internal services to send to an internal location, and the Collector can then forward to your third-party observability tool, like Honeycomb.
- Filtering/securing trace/span data: If you're worried about your tracing data leaving your network with personal or private information in it, the Collector can provide a central point to redact that data before it leaves your environment. This is done using something called a "Processor" in the Collector, specifically the Redaction Processor, which is part of the add-ons for the Collector.
- Sync application performance: If you're using an OpenTelemetry integration that relies on synchronous sending of trace data, the latency of the endpoint can impact your response times. If you use a Collector close to your application (e.g., a sidecar in Kubernetes), you can eliminate that, as the Collector will take the request immediately and then forward on the spans and metrics from there.
- Telemetry enrichment: Having a central point in your infrastructure that can add additional metadata to your telemetry entries can be powerful. Using the Collector to add attributes such as cloud region and availability zone, or Pod information like Kubernetes Namespaces, means that engineers don't need to care about where the application is hosted.
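To make the redaction and enrichment ideas a little more concrete, here is a small Python sketch of what such processing steps do to a span's attributes. It is a conceptual illustration only, not the Redaction Processor's code or its configuration format. The allow-list, the "looks like a card number" pattern, and the attribute values are assumptions picked for the example.

```python
import re

# Hypothetical rules, chosen for illustration only.
ALLOWED_KEYS = {"http.method", "http.route", "user.note"}
BLOCKED_VALUE = re.compile(r"\b\d{13,16}\b")  # something that looks like a card number


def redact(attributes: dict) -> dict:
    """Keep only allow-listed keys and mask values matching the blocked pattern."""
    cleaned = {}
    for key, value in attributes.items():
        if key not in ALLOWED_KEYS:
            continue  # drop anything not explicitly allowed
        if isinstance(value, str):
            value = BLOCKED_VALUE.sub("****", value)
        cleaned[key] = value
    return cleaned


def enrich(attributes: dict, environment: dict) -> dict:
    """Add infrastructure metadata without overwriting what the app already set."""
    return {**environment, **attributes}


span_attributes = {
    "http.method": "POST",
    "http.route": "/checkout",
    "user.note": "card 4111111111111111 declined",
    "user.email": "someone@example.com",
}
environment = {"cloud.region": "eu-west-1", "k8s.namespace.name": "shop"}

result = enrich(redact(span_attributes), environment)
```

In this sketch the application never has to know its region or Namespace, and the email address and card number never leave the building. Doing this in one central place means the rules live in one config rather than in every service.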
Should I add it?
This is entirely your choice. It's valid to send data directly to your observability product from your application; it's equally valid to send it through a Collector instance. It's all about tailoring things to what's important to you, and weighing the trade-off of more things to manage vs. more places to configure.
If you're still not sure, I encourage you to book office hours with one of our developer advocates to chat more about what might be right for you!