Managing OpenTelemetry Semantic Convention Migrations With the Collector

By: Mike Goldsmith

Real production data tells the story better than I can. Juraci Paixão Kröhling, a friend and fellow observability practitioner at OllyGarden, recently shared an example from an anonymized production environment: 1,830 occurrences of http.url and 23,984 occurrences of url.full in the same dataset. Both attributes describe the same thing. Both are actively being written to the same backend at the same time.
This is what a semantic convention migration looks like in practice: not a clean cutover, but months of coexistence where old and new attribute names overlap. Any query, SLO, or trigger that references http.url is now only seeing a fraction of your traffic. And the problem runs in both directions: services that haven't upgraded yet still emit the old names, while services that have upgraded ahead of your dashboards and alerts emit the new ones. Either way, your operational tooling returns incomplete results.
In this post, I'll explain why this happens, how the OpenTelemetry Collector's schema processor is designed to automate migrations in both directions, and what we're actively working on to get it into a state where everyone can use it.
Why attribute names change
OpenTelemetry's semantic conventions define a shared vocabulary for telemetry attribute names. They're the reason http.response.status_code means the same thing regardless of whether it came from a Java service or a Python service. As the project matures, the conventions get refined: attributes are renamed for consistency, split to be more precise, or reorganized into clearer namespaces.
OpenTelemetry v1.21.0 was a significant example of this. A large set of HTTP attributes were renamed:
| Old name | New name |
|------------------|---------------------------|
| http.url | url.full |
| http.method | http.request.method |
| http.status_code | http.response.status_code |
| net.host.name | server.address |
| net.host.port | server.port |

And more recently, in v1.27.0:
| Old name | New name |
|------------------------|-----------------------------|
| deployment.environment | deployment.environment.name |

The problem isn't that the renames happen, it's that not everything upgrades at the same time. Your Java auto-instrumentation agent upgrades to a version that emits url.full, but your dashboards, SLOs, and alert triggers still reference http.url. Your Python service hasn't upgraded yet and is still emitting the old names. Now your SLO only captures traffic from services still on the old convention, or only traffic from services on the new one, depending on which side of the rename you queried.
The schema file: migrations as structured data
Here's what makes OpenTelemetry's approach to this problem powerful: every published semantic convention version ships alongside a schema file that describes what changed and how to migrate.
The schema file format uses a versions block where each version lists the changes made in that release. The example below is illustrative—the real published schema files follow the same structure but contain the full set of changes for each version. A rename looks like this:
```yaml
file_format: 1.1.0
schema_url: https://opentelemetry.io/schemas/1.27.0
versions:
  1.27.0:
    all:
      changes:
        - rename_attributes:
            attribute_map:
              deployment.environment: deployment.environment.name
  1.21.0:
    spans:
      changes:
        - rename_attributes:
            attribute_map:
              http.url: url.full
              http.method: http.request.method
              http.status_code: http.response.status_code
              net.host.name: server.address
              net.host.port: server.port
```

The all section applies changes to all signal types (traces, metrics, logs). Signal-specific sections (spans, metrics, logs, resources) scope changes more precisely. The versions are chained, which means if your instrumentation is on v1.20.0 and you want to migrate to v1.27.0, the schema file gives you the complete, ordered sequence of changes to apply.
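To make the version chaining concrete, here's a hand-rolled Python sketch of an upgrade walking every version between the incoming signal's and the target, oldest first. This is illustrative only (it is not the Collector's implementation, and the rename table is a toy subset of the real schema files):

```python
# Toy subset of the published schema files: version -> {old name: new name}.
SCHEMA = {
    "1.21.0": {
        "http.url": "url.full",
        "http.method": "http.request.method",
        "http.status_code": "http.response.status_code",
        "net.host.name": "server.address",
        "net.host.port": "server.port",
    },
    "1.27.0": {
        "deployment.environment": "deployment.environment.name",
    },
}

def parse(version):
    """Turn '1.21.0' into a comparable tuple (1, 21, 0)."""
    return tuple(int(part) for part in version.split("."))

def upgrade(attrs, incoming, target):
    """Apply every rename published after `incoming`, up to and
    including `target`, in ascending version order."""
    for version in sorted(SCHEMA, key=parse):
        if parse(incoming) < parse(version) <= parse(target):
            renames = SCHEMA[version]
            attrs = {renames.get(k, k): v for k, v in attrs.items()}
    return attrs

span = {"http.url": "https://example.com", "deployment.environment": "prod"}
print(upgrade(span, "1.20.0", "1.27.0"))
# {'url.full': 'https://example.com', 'deployment.environment.name': 'prod'}
```

Because the chain is ordered, a signal already at v1.21.0 skips that version's renames and only picks up the v1.27.0 change, which is exactly what the chained schema file encodes.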
Beyond renames, the format also supports renaming metrics, renaming span events, and splitting metrics, for cases where a single metric was divided into multiple more specific ones.
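For illustration, a metric split in the schema file format looks roughly like this. The snippet is modeled on the example in the OpenTelemetry schema file format spec, which divides system.paging.operations by its direction attribute; the version number here is a placeholder:

```yaml
versions:
  1.7.0:
    metrics:
      changes:
        - split:
            apply_to_metric: system.paging.operations
            by_attribute: direction
            metrics_from_attributes:
              system.paging.operations.in: in
              system.paging.operations.out: out
```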
Each schema file is immutable once published and is versioned by its URL: https://opentelemetry.io/schemas/1.27.0.
The schema processor
The schema processor is a component in the OpenTelemetry Collector Contrib project. You configure it with a list of target schema versions (one per schema family) and it handles the rest.
When a signal arrives, the processor reads the schema_url (a version identifier set by the instrumentation library that produced the signal), fetches the corresponding schema file, and determines whether the signal needs upgrading or downgrading to reach the target version. It then applies the appropriate sequence of renames from the schema file, updates the schema_url in the signal to the target version, and passes it downstream.
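The per-signal decision is simple to picture. Here's an illustrative Python sketch (not the processor's actual code) of how the direction falls out of comparing the version embedded in the signal's schema URL against the target:

```python
def version_from_url(schema_url: str) -> str:
    """Schema files are versioned by URL, e.g. .../schemas/1.21.0 -> '1.21.0'."""
    return schema_url.rsplit("/", 1)[-1]

def parse(version: str) -> tuple:
    """'1.21.0' -> (1, 21, 0), so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

def decide(signal_url: str, target_url: str) -> str:
    """What the processor must do to bring a signal to the target version."""
    incoming = parse(version_from_url(signal_url))
    target = parse(version_from_url(target_url))
    if incoming < target:
        return "upgrade"    # apply the rename chain forward, oldest version first
    if incoming > target:
        return "downgrade"  # apply the rename chain in reverse, newest first
    return "noop"           # already at the target

print(decide("https://opentelemetry.io/schemas/1.21.0",
             "https://opentelemetry.io/schemas/1.27.0"))  # upgrade
```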
Note: The schema processor currently has development stability status and is not yet included in official Collector distributions. We're actively working to fix known issues and bring it to alpha, at which point it will be available in the standard contrib distribution without any custom build required. Scroll down for details.
Configuration
```yaml
processors:
  schemaprocessor:
    # Optional: pre-fetch schema files for versions you expect to receive
    prefetch:
      - https://opentelemetry.io/schemas/1.21.0
    # Required: target version per schema family
    targets:
      - https://opentelemetry.io/schemas/1.27.0
```

Then, include it in your pipeline:
```yaml
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [schemaprocessor]
      exporters: [otlp]
    metrics:
      receivers: [otlp]
      processors: [schemaprocessor]
      exporters: [otlp]
    logs:
      receivers: [otlp]
      processors: [schemaprocessor]
      exporters: [otlp]
```

With this in place, a signal arriving with schema_url: https://opentelemetry.io/schemas/1.21.0 would have every rename published after v1.21.0, up to v1.27.0, applied before reaching your backend. Your queries, SLOs, and triggers only need to reference the new attribute names.
Upgrade vs downgrade
The processor is designed to be bidirectional: it upgrades signals older than the target and downgrades signals newer than it. You configure a single target version and the processor determines the direction automatically based on the incoming signal's version.
The downgrade direction is arguably the more operationally important of the two. Instrumentation upgrades don't always happen on a schedule you control—a dependency update, a library patch, or a new service deployment can start emitting newer attribute names before your queries, SLOs, and triggers have been updated to match. Without downgrade support, you'd be back to the same silent divergence problem, just in reverse: newer services emitting url.full while your SLO still filters on http.url. Configuring a target version at the Collector means even signals from ahead-of-target instrumentation are brought back in line, keeping your operational tooling consistent regardless of what's happening across your fleet.
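Mechanically, a downgrade applies the same rename tables inverted (new name back to old), walking from the signal's version down to the target. An illustrative Python sketch of a single version step, using the v1.27.0 rename as an example:

```python
# The v1.27.0 rename table as published (old name -> new name).
RENAMES_1_27_0 = {"deployment.environment": "deployment.environment.name"}

def downgrade(attrs, renames):
    """Invert the old->new map and rename new names back to old ones."""
    inverted = {new: old for old, new in renames.items()}
    return {inverted.get(k, k): v for k, v in attrs.items()}

span = {"deployment.environment.name": "prod", "url.full": "https://example.com"}
print(downgrade(span, RENAMES_1_27_0))
# {'deployment.environment': 'prod', 'url.full': 'https://example.com'}
```

A full downgrade repeats this step per version, newest first, until the signal reaches the target.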
For downgrades, the recommendation is not to revert more than three versions. Keep your target version reasonably current so you're not maintaining long translation chains.
Instrumentation libraries and schema_url
For the schema processor to work, it needs to know which semantic convention version each incoming signal was produced with. Instrumentation libraries communicate this by including a schema_url field in the OTLP payload they emit. For example, https://opentelemetry.io/schemas/1.21.0.
Most official OpenTelemetry auto-instrumentation libraries set this automatically. Manual instrumentation often doesn't, depending on how the SDK is initialized. If a signal arrives without a schema_url, the schema processor has no basis for translation and passes it through unchanged. We'll come back to how to handle that.
Gaps: what the schema processor doesn't cover
The schema processor can only work with what's in the schema files. Most systematic renames in the OpenTelemetry semconv history are codified—the HTTP and deployment renames above are good examples. But some renames are not in the schema files, typically when a rename was context-dependent and couldn't be encoded as a simple attribute map.
For example, net.peer.name → server.address appears in the HTTP convention history but is not in the schema file because the mapping was ambiguous depending on which side of the connection the span represented.
For these gaps, the transform processor with explicit OTTL rules is the right tool. You can run both processors together:
```yaml
processors:
  schemaprocessor:
    targets:
      - https://opentelemetry.io/schemas/1.27.0
  transform:
    trace_statements:
      - context: span
        statements:
          # Handle renames not covered by the schema file
          - set(attributes["server.address"], attributes["net.peer.name"])
            where attributes["net.peer.name"] != nil and attributes["server.address"] == nil
          - delete_key(attributes, "net.peer.name")
            where attributes["net.peer.name"] != nil
```

The two statements mirror what the schema processor does internally for a rename: copy the value to the new name, then remove the old one. The set is guarded so it won't overwrite server.address if it already exists; the delete_key then cleans up the old attribute regardless, keeping the output consistent with schema processor behavior.
The example above covers span attributes only. If the same attribute appears in your metrics or logs, you'll need equivalent rules under metric_statements and log_statements too.
Signals that arrive without a schema_url also fall into this category. The schema processor skips them, so if you have services that don't emit schema_url, explicit transform rules are the only way to normalize their attribute names.
The schema processor also performs hard renames: once an attribute is migrated, the old name is removed. This is correct for steady-state operation but disruptive during an active migration, where your queries and SLOs may still reference the old name. We've proposed a copy_mode to address this.
Protecting your operational tooling
The practical goal is keeping your queries, SLOs, and alert triggers reliable through instrumentation upgrades. A few things that help:
- Normalize at the Collector. Running schema migrations at the Collector layer means your backend always receives attributes under a single canonical name, regardless of which instrumentation version produced them. You can upgrade instrumentation on a per-service basis at your own pace without updating backend queries in lockstep.
- Target a specific version and maintain it. Pick the semconv version that matches what your newest instrumentation emits and set that as your target. Review it when you upgrade instrumentation again—you may need to advance the target.
- Use prefetch for reliability. The processor fetches schema files from incoming signal URLs at runtime. Use prefetch to list the semconv versions you expect to receive so those schema files are cached at startup, and a transient network issue won't affect your pipeline.
- Audit for gaps. Check your dataset for overlapping old and new attribute names. In Honeycomb, a GROUP BY on schema_url shows which semconv versions are active across your services. Any attributes that appear under both old and new names are candidates for explicit transform rules while the schema processor matures.
- Use the Honeycomb MCP to investigate. The Honeycomb MCP server understands OpenTelemetry semantic conventions, so you can ask it directly which semconv versions your services are reporting, where attribute name divergence exists across your dataset, and which attributes may need explicit transform rules to normalize.
What about custom schemas?
The schema processor isn't limited to OpenTelemetry's own schema family. The same mechanism works for organization-defined schemas. If you publish your own schema file following the OpenTelemetry schema file format spec, the processor will apply it to signals that carry your custom schema_url. This opens up schema-managed migrations for your own internal attribute conventions.
This pairs well with the OpenTelemetry Weaver project, which lets you define layered schema registries—combining OpenTelemetry conventions with your own additions—and use them for validation, documentation, and tooling. That's a topic for another post.
What we're working on
The schema processor has the right design—machine-readable schema files, bidirectional migrations, automatic version chaining—but there are a few things that need to land before it's ready for general use.
Fixing the upgrade path. There is a known bug where upgrades (incoming signal older than the target) apply no changes. The processor fetches the schema file at the incoming signal's URL, which only contains history up to that version, so the forward migration steps it needs aren't there. The fix is to fetch the target version's schema file instead. This is the most important open item.
Adding copy_mode. Hard renames are disruptive during active migrations. We've proposed a copy_mode option that writes the new attribute name while preserving the old one, giving teams a window to update their queries, SLOs, and triggers before committing to the new names.
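To illustrate the proposed semantics (the option name and final shape are still under discussion upstream), a hard rename and a copy-mode rename differ only in whether the old key survives. A Python sketch of the intended behavior:

```python
def rename(attrs, old, new, copy_mode=False):
    """Hard rename removes the old key; copy_mode keeps both during migration."""
    if old in attrs and new not in attrs:
        attrs = dict(attrs)          # don't mutate the caller's dict
        attrs[new] = attrs[old]
        if not copy_mode:
            del attrs[old]
    return attrs

span = {"http.url": "https://example.com"}
print(rename(span, "http.url", "url.full"))                  # only url.full remains
print(rename(span, "http.url", "url.full", copy_mode=True))  # both keys present
```

With copy_mode on, queries referencing either name keep working while teams migrate; once dashboards and SLOs reference only the new names, the mode can be switched off.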
Reaching alpha stability. Once these are resolved, we're aiming to bring the processor to alpha, which would include it in the standard otelcol-contrib distribution. Most teams running a Collector today use otelcol-contrib—alpha means no custom build required and a stability signal that it's ready to use in production pipelines.
If either of these issues affects your team, head over to the GitHub issues and show your support. Community interest helps prioritize the work.
In the meantime, you can start preparing: use a GROUP BY on schema_url in Honeycomb to understand which semconv versions are active across your services today, identify where attribute name divergence exists, and plan which target version you'll configure once the processor is ready.
Semantic convention migrations don't have to be a months-long overlap of attribute names eroding your query reliability. The schema processor is being built to handle this automatically, and we're working to get it there.
Sign up today
Want to see how your own data looks across semconv versions? Sign up for a free Honeycomb account and start querying, or book office hours with one of our Developer Advocates if you want a hand getting set up.