Running the OpenTelemetry Collector in Azure Container Apps
By Martin Thwaites | Last modified on September 13, 2022
In this post, we’ll look at how to host the OpenTelemetry Collector in Azure Container Apps. There are a few gotchas with how it’s deployed, so hopefully this will stop you from hitting the same issues.
If you don’t care about the details and just want to run a script, I’ve created one here.
What are Azure Container Apps?
Azure Container Apps are the latest offering of a Managed Container Runtime in Azure. They allow you to run a container without having to think about anything in the infrastructure. Think of it as having someone else manage your Kubernetes cluster and you just push containers.
There are tons of other benefits that come with Container Apps, like built-in Authentication, and SSL termination. For this post, we won’t be using Authentication. We’ll cover securing the infrastructure in VNETs and providing authentication for the frontend app flows later.
Recently, the Container Apps team enabled Azure Files support, meaning that they also have persistent storage for your containers. What we’ll be doing is using that mechanism to host the config file, which will allow us to use the stock images, and also update the container to new images easily.
What is the OpenTelemetry Collector?
The Collector is an application that can work as an aggregation point for traces, metrics, and logs within your system. We recently wrote a post about why it’s useful.
The Collector configuration we’re going to deploy includes two processors that show what the Collector can do from a central place: filtering spans (in this case, healthchecks) and augmenting spans with static information.
Deploying the Collector
Unfortunately (or fortunately, depending on how averse to UI Portals you are), the mechanism for adding Azure Files support is only through the APIs (CLI, etc.), so we’ll be using the Azure CLI for deploying all the components to keep it consistent.
There is also a nuance that not even the Azure CLI can create containers that use File Shares, so we’ll need to do some YAML changes to make that work.
Step 1: Create a Storage Account and Azure File Share
The first step is to create a storage account that we’ll be using for the share. This is where we’ll place the config file for the OpenTelemetry Collector so that it persists across different restarts and revisions of the Container App.
We’ll first set some environment variables in the console so that our commands will use consistent values. Make sure to add your own Honeycomb API key. You can also update each of them to use your own names.
    export RESOURCE_GROUP=honeycomb-collector
    export ENVIRONMENT_NAME=collector-env
    export LOCATION=uksouth
    export STORAGE_ACCOUNT_NAME="collectorappstorage$RANDOM"
    export STORAGE_SHARE_NAME="collector-config"
    export STORAGE_MOUNT_NAME="configmount"
    export CONTAINER_APP_NAME="collector"
    export COLLECTOR_IMAGE=otel/opentelemetry-collector
    export HONEYCOMB_API_KEY=<your key>
Note: the Storage Account’s name needs to be unique, so we’re using bash’s inbuilt `$RANDOM` to provide a unique number.
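Azure storage account names must be 3 to 24 characters long and use only lowercase letters and digits, so it can be worth checking a generated name before creating anything. The sketch below is one way to do that; `is_valid_storage_name` is a hypothetical helper written for this post, not part of the Azure CLI.

```shell
# Hypothetical helper (not part of the Azure CLI): returns 0 when the name
# is 3-24 characters long and contains only lowercase letters and digits.
is_valid_storage_name() {
  name="$1"
  case "$name" in
    # Reject any character outside a-z and 0-9. The letters are spelled out
    # so the check does not depend on the shell's locale.
    *[!abcdefghijklmnopqrstuvwxyz0123456789]*) return 1 ;;
  esac
  [ "${#name}" -ge 3 ] && [ "${#name}" -le 24 ]
}

# Example: the largest value $RANDOM can produce is 32767 (five digits),
# which gives the longest name this tutorial can generate.
if is_valid_storage_name "collectorappstorage32767"; then
  echo "name is valid"
else
  echo "name is invalid"
fi
```

If you change the `collectorappstorage` prefix to your own, keep it to 19 characters or fewer so that the random suffix still fits.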
Create the resource group:
    az group create \
      --name $RESOURCE_GROUP \
      --location $LOCATION
Create the Storage Account:
    az storage account create \
      --resource-group $RESOURCE_GROUP \
      --name $STORAGE_ACCOUNT_NAME \
      --location "$LOCATION" \
      --kind StorageV2 \
      --sku Standard_LRS \
      --enable-large-file-share
We’re using Locally Redundant Storage (LRS) which is fine, because if you’re deploying a Collector to another region, it will need its own storage account for better isolation.
Now, we create the File Share on the Storage Account:
    az storage share-rm create \
      --resource-group $RESOURCE_GROUP \
      --storage-account $STORAGE_ACCOUNT_NAME \
      --name $STORAGE_SHARE_NAME \
      --quota 1024 \
      --enabled-protocols SMB \
      --output table
We need to use the SMB protocol, as no other protocol will work. This is due to the way that Container Apps works. Since we don’t have a performance requirement as the config is loaded at startup, this is fine.
Now, let's get a storage account key. It’s unknown right now whether RBAC will work. However, our config file will not contain any sensitive values as they’ll be injected through environment variables.
    STORAGE_ACCOUNT_KEY=$(az storage account keys list -n $STORAGE_ACCOUNT_NAME --query "[0].value" -o tsv)
This command will get the account keys for the storage account and put the primary key into a variable for usage in the next steps.
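The remaining steps depend on every variable set so far, and an empty value (for example, a key lookup that silently returned nothing) tends to produce confusing errors later on. As a rough guard, something like the following could be run before continuing; `require_vars` is a hypothetical helper, not an Azure CLI feature.

```shell
# Hypothetical helper: reports every named variable that is unset or empty
# and returns non-zero if any were missing.
require_vars() {
  missing=0
  for var_name in "$@"; do
    # Indirect lookup of the variable whose name is stored in var_name.
    eval "var_value=\${$var_name:-}"
    if [ -z "$var_value" ]; then
      echo "missing: $var_name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# The names below are the ones used in Step 1 of this post.
require_vars RESOURCE_GROUP LOCATION STORAGE_ACCOUNT_NAME \
  STORAGE_SHARE_NAME STORAGE_MOUNT_NAME STORAGE_ACCOUNT_KEY \
  HONEYCOMB_API_KEY || echo "set the variables listed above before continuing"
```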
Step 2: Upload the config file
We’re going to use a very basic config file for this tutorial. This config opens up an HTTP endpoint for receiving traces, metrics, and logs, then sends them to Honeycomb using the OTLP exporter. It also adds some attributes to the spans and filters out healthchecks. These are the two additions I see in most Collector installations, as both add known value to a telemetry pipeline.
Put this into a file named `config.yaml`:

    receivers:
      otlp:
        protocols:
          http:
    processors:
      batch:
      attributes/collector_info:
        actions:
          - key: collector.hostname
            value: $HOSTNAME
            action: insert
          - key: azure.container_app.revision
            value: $CONTAINER_APP_REVISION
            action: insert
          - key: azure.container_app.name
            value: $CONTAINER_APP_NAME
            action: insert
          - key: source.blog
            value: "true"
            action: insert
      filter/healthcheck:
        spans:
          exclude:
            match_type: strict
            attributes:
              - key: http.target
                value: /health
    exporters:
      otlp:
        endpoint: "api.honeycomb.io:443"
        headers:
          "x-honeycomb-team": "$HONEYCOMB_API_KEY"
      otlp/logs:
        endpoint: "api.honeycomb.io:443"
        headers:
          "x-honeycomb-team": "$HONEYCOMB_API_KEY"
          "x-honeycomb-dataset": "$HONEYCOMB_LOGS_DATASET"
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch,filter/healthcheck,attributes/collector_info]
          exporters: [otlp]
        metrics:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlp]
        logs:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlp/logs]
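Because the config relies on environment variable substitution, it can help to list which variables it expects before uploading it. The following is a minimal sketch that assumes the config is saved as `config.yaml` and that variables are written in the plain `$NAME` form used above; `list_config_vars` is a hypothetical helper.

```shell
# Hypothetical helper: prints each distinct $NAME reference found in a file,
# one per line, without the leading dollar sign.
list_config_vars() {
  LC_ALL=C grep -o '\$[A-Z_][A-Z0-9_]*' "$1" | LC_ALL=C sort -u | sed 's/^\$//'
}

# Only runs if the config file from this step is in the current directory.
if [ -f config.yaml ]; then
  list_config_vars config.yaml
fi
```

Of the names this prints, the two `HONEYCOMB_*` values are set on the container app in Step 4. The others are assumed here to be supplied by the container runtime.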
This command will upload the file to our Azure File Share:
    az storage file upload -s $STORAGE_SHARE_NAME \
      --source config.yaml \
      --account-key $STORAGE_ACCOUNT_KEY \
      --account-name $STORAGE_ACCOUNT_NAME
Step 3: Create the container app environment
A Container App Environment is similar to a namespace in Kubernetes with a specific network. It can host multiple containers (or containers with sidecars, like a Pod), and therefore, it’s where our storage mount will go.
Create the environment:
    az containerapp env create \
      --name $ENVIRONMENT_NAME \
      --resource-group $RESOURCE_GROUP \
      --location "$LOCATION"
Add the storage mount from Step 1:
    az containerapp env storage set \
      --access-mode ReadWrite \
      --azure-file-account-name $STORAGE_ACCOUNT_NAME \
      --azure-file-account-key $STORAGE_ACCOUNT_KEY \
      --azure-file-share-name $STORAGE_SHARE_NAME \
      --storage-name $STORAGE_MOUNT_NAME \
      --name $ENVIRONMENT_NAME \
      --resource-group $RESOURCE_GROUP
Step 4: Add the container
    az containerapp create \
      --name $CONTAINER_APP_NAME \
      --resource-group $RESOURCE_GROUP \
      --environment $ENVIRONMENT_NAME \
      --image $COLLECTOR_IMAGE \
      --min-replicas 1 \
      --max-replicas 1 \
      --target-port 4318 \
      --ingress external \
      --secrets "honeycomb-api-key=$HONEYCOMB_API_KEY" \
      --env-vars "HONEYCOMB_API_KEY=secretref:honeycomb-api-key" "HONEYCOMB_LOGS_DATASET=azure-logs"
Now comes the tricky part. Since the CLI doesn’t allow you to create the container with the mounts, we need to download the `yaml` config, make some changes, and push it back up.
We download the config using the following command, which will put the current config into a file called `app.yaml`:

    az containerapp show \
      --name $CONTAINER_APP_NAME \
      --resource-group $RESOURCE_GROUP \
      --output yaml > app.yaml
Option 1: Manual changes
You’ll need to add two sections to the config.
The first is to add the Storage Mount as a volume to the container app. This is done in the `template` element:

    template:
      …
      volumes:
      - name: config
        storageName: configmount
        storageType: AzureFile

The `storageName` attribute here should match the value passed to `--storage-name` in Step 3.
The second is adding the volume to the container. This is done in the `containers` element:

    template:
      containers:
      - …
        volumeMounts:
        - mountPath: /etc/otelcol
          volumeName: config

The `mountPath` here is special, in that it’s the location where the Collector expects to find its config.
Finally, you’ll need to remove the `secrets` element as we don’t want to update that.

    properties:
      configuration:
        secrets:
        - name: honeycomb-api-key
Option 2: Use the yq CLI
`yq` is a utility that will allow you to amend YAML files on the CLI in a more programmatic way. The download and installation instructions can be found on their GitBook.

    yq -i '
      .properties.template.volumes[0].name = "config" |
      .properties.template.volumes[0].storageName = strenv(STORAGE_MOUNT_NAME) |
      .properties.template.volumes[0].storageType = "AzureFile" |
      .properties.template.containers[0].volumeMounts[0].volumeName = "config" |
      .properties.template.containers[0].volumeMounts[0].mountPath = "/etc/otelcol" |
      del(.properties.configuration.secrets)
    ' app.yaml
Much simpler, right?
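Whichever option you chose, a quick sanity check of `app.yaml` can catch a missed edit. The sketch below is a rough text-based check rather than a real YAML validation, and `check_app_yaml` is a hypothetical helper. It looks for the Azure Files volume, the `/etc/otelcol` mount, and confirms that the `secrets` element is gone.

```shell
# Hypothetical helper: returns 0 only if the file mentions the Azure Files
# volume and the /etc/otelcol mount, and no longer has a "secrets:" key.
check_app_yaml() {
  file="$1"
  status=0
  if ! grep -q 'storageType: AzureFile' "$file"; then
    echo "no AzureFile volume found" >&2
    status=1
  fi
  if ! grep -q 'mountPath: /etc/otelcol' "$file"; then
    echo "no /etc/otelcol volume mount found" >&2
    status=1
  fi
  if grep -Eq '^[[:space:]]*secrets:' "$file"; then
    echo "secrets element is still present" >&2
    status=1
  fi
  return "$status"
}

# Only runs if app.yaml is in the current directory.
if [ -f app.yaml ]; then
  check_app_yaml app.yaml && echo "app.yaml looks ready"
fi
```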
Now, you can upload the file back to Azure.
    az containerapp update \
      --name $CONTAINER_APP_NAME \
      --resource-group $RESOURCE_GROUP \
      --yaml app.yaml
At this stage, you should have a Collector running in Azure Container Apps. Go you!
The domain name is provided in the app.yaml. Alternatively, you can use this command to get it:
    az containerapp ingress show \
      --name $CONTAINER_APP_NAME \
      --resource-group $RESOURCE_GROUP \
      --query "fqdn" -o tsv
Note: The port is 443, and HTTPS only. Some SDKs will require you to add `/v1/traces` to the end of the URL to get traces in. This is particularly important in the .NET SDK for OpenTelemetry.
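To make the URL shape concrete, here is a small sketch that builds the full traces endpoint from the domain name. `otlp_traces_url` is a hypothetical helper, and `collector.example.test` is a placeholder rather than a real deployment.

```shell
# Hypothetical helper: builds the OTLP/HTTP traces URL from the app's FQDN.
otlp_traces_url() {
  fqdn="${1%/}"              # drop a trailing slash, if any
  fqdn="${fqdn#https://}"    # tolerate a value that already has a scheme
  printf 'https://%s/v1/traces\n' "$fqdn"
}

# Placeholder domain; substitute the FQDN of your own container app.
otlp_traces_url "collector.example.test"
# prints: https://collector.example.test/v1/traces
```

For SDKs that follow the OpenTelemetry environment variable conventions, the result can be exported as `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT`, which is used as given, path included.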
Azure Container Apps are a great way to get yourself a Collector for your infrastructure and start centralizing and securing your config. In this blog, you’ve learned the steps needed to make it work, and also how you can easily apply some customizations.
Take a look at the GitHub repository for a script that will make this easier, and will allow you to start using the Collector as your egress!
In future posts, we’ll look at how you can secure this inside of your VNETs—and provide scaling, too.
Thanks for reading!