Action Recommended: OpenTelemetry HTTP Attributes Breaking Changes

By Phillip Carter  |   Last modified on December 11, 2023

Earlier this year, the folks working on OpenTelemetry launched an effort to stabilize HTTP Semantic Conventions. In November 2023, OpenTelemetry announced that HTTP Semantic Conventions were stable. They accomplished this by merging the existing HTTP Semantic Conventions with the Elastic Common Schema (ECS) HTTP attribute conventions. The benefits of doing so are numerous:

  • Stabilizes the format of data for the most-used OpenTelemetry instrumentation libraries
  • Makes OpenTelemetry compatible with any system that also uses the ECS HTTP schema
  • Provides a way for all HTTP instrumentation libraries (the most popular libraries in OpenTelemetry) to offer stable, 1.0-quality releases

However, there are many breaking changes in these attributes. Most attributes were renamed or removed, new ones were added, and the http.target attribute was split into two attributes, url.path and url.query.
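To make the http.target split concrete, here is a minimal Python sketch of how an old http.target value could be broken into url.path and url.query. The helper name split_http_target is hypothetical and the code is only an illustration; it is not how any particular instrumentation library does it.

```python
from urllib.parse import urlsplit


def split_http_target(target: str) -> dict[str, str]:
    """Split a legacy http.target value into url.path and url.query.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(target)
    attributes = {"url.path": parts.path}
    # Only set url.query when the target actually carried a query string.
    # The value does not include the leading "?".
    if parts.query:
        attributes["url.query"] = parts.query
    return attributes


print(split_http_target("/search?q=otel&page=2"))
# {'url.path': '/search', 'url.query': 'q=otel&page=2'}
```

Anything that previously matched on the full http.target string, such as a filter on "/search?q=", now needs to look at one or both of the new attributes.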

Who will this affect?

Later this month (December 2023), Java and .NET will release instrumentation versions that emit the new, stable values for HTTP by default. Until now, they have supported emitting the new values behind an environment variable, but that will no longer be an option: only the new values will be emitted.

  • If you have Java or .NET services, you should prepare for the change 
  • If you do not have Java or .NET services, you won’t be affected by these changes yet 
  • If you have a mix of languages that includes Java or .NET, you may want to hold off on upgrading until the changes are consistent in all instrumentations (via library or auto-instrumentation agent) relevant to you, then coordinate a release

As of now, no other languages (Node, Go, Python, Ruby, etc.) have planned library changes, but they most likely will in the future.

What’s changing?

The following attributes will change:

| Current | New | Comments |
| --- | --- | --- |
| http.method | http.request.method | Now captures only nine common HTTP methods by default (configurable) plus _OTHER |
| http.status_code | http.response.status_code | |
| http.request.header.<key> | (unchanged) | Dash ("-") to underscore ("_") normalization in <key> has been removed; on HTTP server spans: now must be provided to sampler |
| http.response.header.<key> | (unchanged) | Dash ("-") to underscore ("_") normalization in <key> has been removed |
| http.request_content_length | http.request.body.size | |
| http.response_content_length | http.response.body.size | |
| user_agent.original | (unchanged) | On HTTP client spans: Recommended → Opt-In; on HTTP server spans: now must be provided to sampler |
| net.protocol.name | network.protocol.name | |
| net.protocol.version | network.protocol.version | |
| net.sock.family | (removed) | |
| net.sock.peer.addr | network.peer.address | On HTTP server spans: if http.client_ip was unknown, then also net.sock.peer.addr → client.address; client.address must be provided to sampler |
| net.sock.peer.port | network.peer.port | Now captured even if same as server.port |
| net.sock.peer.name | (removed) | |
| (new) | http.request.method_original | Only captured when http.request.method is _OTHER |
| (new) | error.type | |
| http.url | url.full | |
| http.resend_count | http.request.resend_count | |
| net.peer.name | server.address | |
| net.peer.port | server.port | Now captured even when same as default port for scheme |
| http.target | url.path AND url.query | Split into two separate attributes |
| http.scheme | url.scheme | Now factors in X-Forwarded-Proto, Forwarded#proto headers |
| http.client_ip | client.address | If http.client_ip was unknown (i.e., no X-Forwarded-For, Forwarded#for headers), then net.sock.peer.addr → client.address; now must be provided to sampler |
| net.host.name | server.address | Now based only on Host, :authority, X-Forwarded-Host, Forwarded#host headers |
| net.host.port | server.port | Now based only on Host, :authority, X-Forwarded-Host, Forwarded#host headers |
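Two rows in the table change values rather than just names: http.request.method collapses unknown methods to _OTHER, and client.address falls back to the socket peer address. The Python sketch below shows our reading of those two rules. The function names are hypothetical, the nine methods listed are the default set as we understand it from the OpenTelemetry conventions, and matching is assumed to be case-sensitive.

```python
from typing import Optional

# Assumption: the default "known" set is the nine methods below.
KNOWN_METHODS = frozenset(
    {"CONNECT", "DELETE", "GET", "HEAD", "OPTIONS", "PATCH", "POST", "PUT", "TRACE"}
)


def method_attributes(raw_method: str) -> dict:
    """Build the new method attributes from the method seen on the wire."""
    if raw_method in KNOWN_METHODS:
        return {"http.request.method": raw_method}
    # Unknown methods are reported as _OTHER, with the original preserved.
    return {
        "http.request.method": "_OTHER",
        "http.request.method_original": raw_method,
    }


def client_address(old_attributes: dict) -> Optional[str]:
    """Pick the value that ends up in client.address on a server span.

    Prefers http.client_ip; falls back to net.sock.peer.addr when it is unknown.
    """
    return old_attributes.get("http.client_ip") or old_attributes.get("net.sock.peer.addr")


print(method_attributes("PROPFIND"))
print(client_address({"net.sock.peer.addr": "10.0.0.7"}))
```

A query that groups by method may therefore start showing an _OTHER bucket where it used to show the raw method name.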

What can break?

If you update your HTTP instrumentations or Java auto-instrumentation agents, the following things may break:

  • Triggers or SLOs

    • For example, an SLO tracking latency with an SLI that checks for http.status_code, or a trigger with a GROUP BY on http.route

  • Refinery rule conditions and field lists

    • Conditions or fields that rely on http.status_code, http.route, or http.method

  • OpenTelemetry Collector configurations

    • Processors like the transform, filter, or redaction processors that look for HTTP attribute fields

  • Boards and saved queries

    • Any Boards and associated queries that use HTTP attributes in the VISUALIZE, WHERE, or GROUP BY sections of the query builder

  • Derived Columns not used by SLOs

    • Look for Derived Columns that use conditionals, concatenation, or regex functions against HTTP attribute columns; changes to these may also affect your saved Boards and queries

The consequences of this change can include the following:

  • Collector configurations fail, leading to specific data being dropped or included unexpectedly
  • Sampling rules with Refinery fail, leading to a large and unexpected uptick in events sent to Honeycomb
  • Queries are no longer accurate
  • Triggers send an alert unexpectedly, or fail to send an alert when they should have
  • SLOs stop accurately counting events, leading to incorrect burn or burn rate alerts

We highly recommend that you plan changes in any of the applicable areas above before the updated HTTP instrumentations roll out across your services.
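One way to find the affected areas is to scan exported configuration or query text for the old attribute names. The Python sketch below does this for a block of text; the function name find_legacy_attributes is hypothetical, and the list of names is taken from the table of renamed and removed attributes above.

```python
import re

# Old attribute names that were renamed or removed (from the table above).
LEGACY_HTTP_ATTRIBUTES = (
    "http.method", "http.status_code", "http.request_content_length",
    "http.response_content_length", "http.url", "http.resend_count",
    "http.target", "http.scheme", "http.client_ip",
    "net.protocol.name", "net.protocol.version", "net.sock.family",
    "net.sock.peer.addr", "net.sock.peer.port", "net.sock.peer.name",
    "net.peer.name", "net.peer.port", "net.host.name", "net.host.port",
)

# Match whole attribute names only: not preceded by a word character or dot,
# and not followed by a word character or by ".<word character>".
_PATTERN = re.compile(
    r"(?<![\w.])(?:"
    + "|".join(re.escape(name) for name in LEGACY_HTTP_ATTRIBUTES)
    + r")(?!\w|\.\w)"
)


def find_legacy_attributes(text: str) -> dict[str, list[int]]:
    """Return each old attribute name found in text with its 1-based line numbers."""
    hits: dict[str, list[int]] = {}
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            hits.setdefault(match.group(0), []).append(line_number)
    return hits


sample = """\
WHERE http.status_code >= 500
GROUP BY http.route, http.method
WHERE http.response.status_code >= 500
"""
print(find_legacy_attributes(sample))
# {'http.status_code': [1], 'http.method': [2]}
```

The same function can be pointed at a Collector configuration file, a Refinery rules file, or exported query definitions, assuming you can get them as text.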

What can you do about it today?

First, if you update HTTP instrumentations or auto-instrumentation agents to the latest version, do so carefully. To mitigate the impact of the update, audit the areas listed above for any use of HTTP attributes. We recommend the following, in this order:

  1. Use the OTEL_SEMCONV_STABILITY_OPT_IN environment variable at the application/service level to double-send old and new values

    • This is relevant for Java today

  2. Check your OpenTelemetry Collector configuration

    • Consider using the transform processor to write both new and old attributes if you’ll be sending new attributes in at least one service

  3. Check your Refinery configuration

    • Consider writing duplicate sampling rules, if applicable
    • If you are an Enterprise customer, reach out to your account team to get help

  4. Check your triggers and SLOs and how you alert

    • Be aware that alerts may fire unexpectedly or fail to fire, and adjust them to the new attribute names if you plan to take the new attributes
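The idea behind steps 2 and 3 is the same: during the transition, either write both names or check both names. The Python sketch below illustrates that idea on a plain dictionary of attributes. It is not Collector or Refinery configuration, the function names are hypothetical, and the mapping covers only a subset of the one-to-one renames from the table.

```python
# Subset of one-to-one renames; split or ambiguous attributes
# (http.target, net.peer.name, net.host.name, ...) are left out on purpose.
OLD_TO_NEW = {
    "http.status_code": "http.response.status_code",
    "http.url": "url.full",
    "http.scheme": "url.scheme",
    "http.resend_count": "http.request.resend_count",
    "http.request_content_length": "http.request.body.size",
    "http.response_content_length": "http.response.body.size",
    "net.protocol.name": "network.protocol.name",
    "net.protocol.version": "network.protocol.version",
}


def with_both_conventions(attributes: dict) -> dict:
    """Return a copy that carries both the old and new name of each mapped attribute."""
    result = dict(attributes)
    for old, new in OLD_TO_NEW.items():
        if old in attributes and new not in attributes:
            result[new] = attributes[old]
        elif new in attributes and old not in attributes:
            result[old] = attributes[new]
    return result


def first_present(attributes: dict, *names: str):
    """Return the value under the first name that exists, or None."""
    for name in names:
        if name in attributes:
            return attributes[name]
    return None


new_style = {"http.response.status_code": 503, "url.scheme": "https"}
print(with_both_conventions(new_style))

# A rule that works for services on either convention.
status = first_present(new_style, "http.response.status_code", "http.status_code")
print(status is not None and status >= 500)
```

Writing both names costs extra columns but keeps existing queries, triggers, and SLOs working; checking both names keeps the data lean but means every rule has to be touched.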

If you have not taken any updates to HTTP libraries or auto-instrumentation, you do not need to make changes yet.

Consider centralizing how you update OpenTelemetry libraries and/or agents. If anyone in your organization can push an update, they may inadvertently break things in Honeycomb. By centralizing your telemetry instrumentation and version management, you can prevent this scenario.

What’s OpenTelemetry doing about it?

The migration plan section of the OpenTelemetry blog post contains details about how OpenTelemetry will handle a migration period.

Libraries that update to the stable conventions are encouraged to introduce an environment variable that will let users control which attributes are emitted. They are also encouraged to continue to patch older versions for six months.
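As a rough illustration of how such an environment variable can control emission, the Python sketch below reads OTEL_SEMCONV_STABILITY_OPT_IN and decides whether to emit old attributes, new attributes, or both. The values http and http/dup, and the rule that http/dup wins when both are present, reflect our understanding of the OpenTelemetry migration plan; the function names are hypothetical, and your instrumentation's documentation is the authority on what it actually supports.

```python
import os
from typing import Mapping


def http_semconv_mode(environ: Mapping[str, str] = os.environ) -> str:
    """Return "old", "new", or "dup" based on OTEL_SEMCONV_STABILITY_OPT_IN.

    Assumes a comma-separated list in which "http" selects the new names
    only and "http/dup" selects both old and new names.
    """
    raw = environ.get("OTEL_SEMCONV_STABILITY_OPT_IN", "")
    values = {item.strip() for item in raw.split(",") if item.strip()}
    if "http/dup" in values:
        return "dup"
    if "http" in values:
        return "new"
    return "old"


def status_code_attributes(status: int, mode: str) -> dict:
    """Emit the status code under the names selected by the mode."""
    attributes = {}
    if mode in ("old", "dup"):
        attributes["http.status_code"] = status
    if mode in ("new", "dup"):
        attributes["http.response.status_code"] = status
    return attributes


mode = http_semconv_mode({"OTEL_SEMCONV_STABILITY_OPT_IN": "http/dup"})
print(mode, status_code_attributes(200, mode))
```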

However, it’s important to understand that this period will not be uniform across all OpenTelemetry supported languages. Different languages will introduce stable conventions at different times, so if you have a multi-language system, you may need to hold off migrating until every HTTP instrumentation you use emits the new values.

What’s Honeycomb doing about it?

This notice is the first of several communications we will make. As more relevant updates happen to OpenTelemetry, such as a prominent instrumentation updating to the new semantic conventions, we will update you and provide refined guidelines on how to take action.

Thank you for your patience!

We’re working out how to best handle this transition. Although we’re excited about the HTTP attribute Semantic Conventions becoming stable, we’re very aware that the migration to these stable attributes may be painful. We will continue to notify you of this change and will provide updates when appropriate.

 
