Honeycomb’s Deployment Protection Rule for GitHub Actions

Honeycomb’s Deployment Protection Rule for GitHub Actions quickly enables canary deployments by letting you use Honeycomb query results to prevent deploying to your next target environment.

By: George Miranda | April 20, 2023

Honeycomb Launches Support For New GitHub Actions Deployment Protection Rules

Today, GitHub announced the public beta of Deployment Protection Rules for GitHub Actions for GitHub Enterprise users. In support of that launch, we’ve partnered with GitHub to create the Honeycomb Deployment Protection Rule (available as a GitHub App). This rule lets you run Honeycomb queries so that you can get real-time performance feedback from your services before deciding whether to prevent deployment of your code to a specific environment.

Using real staging or canary data from Honeycomb to prevent a deployment provides additional safety against your CI/CD workflow catastrophically breaking production.

What are GitHub Actions Deployment Protection Rules?

Deployment Protection Rules are essentially an automatic gating mechanism in your GitHub Actions workflows. Previously, only certain gating mechanisms (like manual approvals) existed for deployments in GitHub Actions. Starting today, any GitHub App can provide Deployment Protection Rules that make decisions automatically in your deployment workflows.

To support that change, Honeycomb’s new GitHub App provides a Deployment Protection Rule that lets you use Honeycomb query results to decide whether it’s safe for your deployment to proceed.

What is the Honeycomb Deployment Protection Rule?

Our Deployment Protection Rule allows you to block deployments based on the results of Honeycomb queries. That feedback mechanism isn’t intended to replace your pre-deploy CI checks. Rather, it’s a complementary layer of protection that can help you catch real-time performance issues that might otherwise slip through CI unnoticed.

Deploying to production can be stressful for many teams (see: arbitrary rules like No Friday Deploys) because they lack the ability to quickly find, let alone fix, small, subtle failures in production until it’s too late: they only start seeing those issues once they’ve become a much bigger problem. But Honeycomb users can build and deploy with confidence. Their ability to surface deeply buried issues in complex systems, correctly locate the sources of those issues, and speed up diagnosis makes deployments predictable and routine. That is, unless an issue that would quickly and catastrophically break production somehow gets through.

The new Honeycomb Deployment Protection Rule mitigates that problem. Before unpacking functionality and use cases, let’s first see it in action.

The Honeycomb Deployment Protection Rule runs as a pre-deployment gate. It determines whether the current deployment step is allowed to proceed. That determination is made by running a Honeycomb query and comparing the results to a specified threshold. You can specify one Honeycomb query to run per GitHub Actions environment you wish to deploy to.

For example, if you’ve already set up a Honeycomb SLO for a service you’re deploying to production, you could set up a Deployment Protection Rule that queries the corresponding SLI column for that service in its staging environment to find out if a recent build caused severe stability issues. If that happened, you’d want to prevent that deployment from going to production.

When a Honeycomb Deployment Protection Rule fails your deployment, it also provides a permalink to the exact query results that failed. In this example, clicking the link would start your investigation by showing you the exact conditions that triggered the SLI disruption.

How does the Honeycomb Deployment Protection Rule work?

The Honeycomb Deployment Protection Rule mirrors the Honeycomb workflows you already use, so there’s very little to learn. Query functionality is decoupled from the Honeycomb UI, and its configuration lives inside your code repo. Queries also operate with the same mechanism as Honeycomb Triggers. If you’ve written a Honeycomb query and used a Trigger before, you know most of what you need to make our Deployment Protection Rule work.

As a developer, you control deployments by defining a query payload (the query, a threshold, and an operator) in a .honeycomb.yml file within your repo. When a deployment is requested, the GitHub App picks the appropriate query payload for the target GitHub Actions environment, sends it to Honeycomb to run, and waits for a response. After the query results return (typically within a few seconds), Honeycomb sends a pass or fail response back to the App.
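The pass-or-fail decision described above can be sketched in a few lines of Python. This is an illustration only, not Honeycomb’s implementation: the function name `evaluate_gate`, the set of supported operators, and the assumption that the threshold expresses the passing condition are all hypothetical.

```python
import operator

# Hypothetical operator table; the operators the real rule accepts may differ.
OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def evaluate_gate(result: float, op: str, threshold: float) -> bool:
    """Return True (pass) when `result <op> threshold` holds.

    Assumption: the threshold describes the condition under which the
    deployment is allowed to proceed.
    """
    if op not in OPERATORS:
        raise ValueError(f"unsupported operator: {op!r}")
    return OPERATORS[op](result, threshold)


if __name__ == "__main__":
    # A query result of 0.93 checked against "> 0.8".
    print(evaluate_gate(0.93, ">", 0.8))
    # A query result of 0.42 checked against the same threshold.
    print(evaluate_gate(0.42, ">", 0.8))
```

Under that assumption, a healthy query result lets the deployment continue, while a result on the wrong side of the threshold blocks it.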

GitHub Deployment Protection Rules only work with GitHub Actions workflows that use GitHub environments as part of deployments. You can specify different Honeycomb query payloads for each target GitHub environment or reuse the same payload across all target environments.

Returning to the earlier example of using an SLI column as a Deployment Protection Rule, the entry in your .honeycomb.yml file might look like this if you wanted to ensure that the SLI’s 30-minute average success rate stays above 80%:

queries:
  - honeycomb_environment: staging
    spec: '{
        "time_range": 1800,
        "calculations": [
          {
            "op": "AVG",
            "column": "management_api_sli"
          }
        ],
        "filters": [
          {
            "column": "management_api_sli",
            "op": "exists"
          },
          {
            "column": "global.build_id",
            "op": "=",
            "value": "${GITHUB_RUN_ID}"
          }
        ],
        "filter_combination": "AND"
      }'
    threshold:
      # Quoted so that YAML does not read > as a folded-scalar indicator.
      operator: ">"
      value: 0.8

The Honeycomb query in that payload is set via the Honeycomb Query Specification. You can compose that query yourself, or you can use the Honeycomb Query Builder UI to generate the JSON for you. Note how you can interpolate the run_id of the GitHub Workflow that triggered the protection rule to target only a particular build.
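To make the interpolation step concrete, here is a minimal Python sketch of filling in the run ID and parsing the resulting query specification. How the GitHub App actually performs the substitution isn’t described here, so `string.Template` is only a stand-in, and `render_spec` is a hypothetical helper name. The spec text is a shortened version of the example above.

```python
import json
from string import Template

# Shortened version of the spec string from the .honeycomb.yml example.
SPEC_TEMPLATE = """{
  "time_range": 1800,
  "calculations": [{"op": "AVG", "column": "management_api_sli"}],
  "filters": [
    {"column": "global.build_id", "op": "=", "value": "${GITHUB_RUN_ID}"}
  ]
}"""


def render_spec(spec_text: str, run_id: str) -> dict:
    """Substitute the run ID placeholder, then parse the result as JSON."""
    rendered = Template(spec_text).substitute(GITHUB_RUN_ID=run_id)
    return json.loads(rendered)


if __name__ == "__main__":
    # The run ID below is made up for the example.
    spec = render_spec(SPEC_TEMPLATE, "4815162342")
    print(spec["filters"][0])
```

Parsing the rendered text as JSON doubles as a quick local check that the spec string in your config is well formed before you commit it.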

See docs for the Honeycomb Deployment Protection Rule for more details.

Which queries can you use in a Honeycomb Deployment Protection Rule?

As with Honeycomb Triggers, any Honeycomb query whose results can be evaluated against a threshold is valid. In other words, any query that can be used for a Trigger can also be used as a Deployment Protection Rule.

You have a lot of flexibility with these queries. As with Triggers, a common use case is checking service health measures. But Honeycomb lets you ask any arbitrary question of your data, so when it comes to gating deployments, you might want to ask any number of things, like:

What’s the pool size of healthy containers in the target environment and do they have enough resources to support this new feature?
How long has an app process been running in production and is that longer than the interval since the last deploy (possibly indicating a wedge issue preventing restarts)?
Is there a flag set (like a zk-lock) that would inadvertently block this deploy?
Were any of the newly deployed binaries in the last environment failing to start correctly?
Is there a pending migration for a database that we depend on, which hasn’t yet completed?
Did requests from the loadtest user hitting the /payments endpoint return 500 errors in the qa environment during the last part of this deployment chain?
… and so on.
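As an illustration, the loadtest question above could be expressed as a query specification along these lines. This is a sketch under stated assumptions: the column names (`user.name`, `http.route`, `http.status_code`) and the 15-minute window are hypothetical and depend entirely on how your services are instrumented, and the qa environment would be selected separately through the `honeycomb_environment` setting.

```python
import json


def loadtest_500s_spec(time_range_seconds: int = 900) -> dict:
    """Build a spec that counts 500 responses on /payments from loadtest."""
    return {
        "time_range": time_range_seconds,
        "calculations": [{"op": "COUNT"}],
        "filters": [
            # Column names below are hypothetical; use your dataset's own.
            {"column": "user.name", "op": "=", "value": "loadtest"},
            {"column": "http.route", "op": "=", "value": "/payments"},
            {"column": "http.status_code", "op": "=", "value": 500},
        ],
        "filter_combination": "AND",
    }


if __name__ == "__main__":
    # Print the JSON that would go into the spec field of .honeycomb.yml.
    print(json.dumps(loadtest_500s_spec(), indent=2))
```

Paired with a threshold such as "fewer than 1", a spec like this would block the deployment as soon as a single matching error shows up.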

These examples are all valid Honeycomb queries that could be supported by our Deployment Protection Rule. But with many things being possible, which queries should you use?

At Honeycomb, our guiding philosophy is that deploys should happen quickly and often. There’s safety in speed, and we believe deploys should flow with as few delays as possible. Similarly, we believe that using Honeycomb enables our customers to focus on big-picture application performance, rather than triaging and correlating hundreds of tiny measurements and alerts. Honeycomb users are well equipped to quickly see and diagnose any unknown-unknowns.

Our Deployment Protection Rule is designed with those principles in mind. It’s intended as a backstop that halts a deploy when something is obviously wrong: your SLO is about to burn through its error budget, latency has suddenly spiked, an incident is still ongoing, and so on. What is that critical and obvious performance indicator for the service you’re deploying? We recommend using a query that focuses on that as your protection rule.

To catch any other possible issue that could occur, there’s Honeycomb.

Try it today

GitHub Actions Deployment Protection Rules are in public beta and limited to GitHub Enterprise users. The Honeycomb Deployment Protection Rule is available to all Honeycomb users in every tier.

We can’t wait to see how you’ll use this in your deployments. Try out the Honeycomb Deployment Protection Rule and let us know what you think.

Have an interesting use case? Find us in the Pollinators Slack group. We’d love to hear about it!
