High Availability with the Honeycomb AWS Bundle

The time has finally come: you can now run the Honeycomb AWS Bundle in a high availability setup! No more anxiously hoping your instance stays up, or that you can restart it quickly if it doesn’t!

image of a cupcake with a hurray flag

Let’s talk about the changes:

How state is maintained

Let’s use Elastic Load Balancer log ingestion as an example – honeyelb, the component of the Honeycomb AWS Bundle which processes ELB logs, works by asking S3 for the last hour of logs from your specified ELBs.

Here’s how it works without high availability:

  • honeyelb consults a local state file listing the S3 log objects it has already processed, stored by their S3 path and timestamp
  • If a log is not found in the state file, it is downloaded and parsed, and the resulting events are sent to Honeycomb
    • The log path and timestamp are then added to the processed objects file
  • honeyelb publishes these events and waits for new logs to arrive in the S3 bucket
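The steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not honeyelb's actual code: the `LocalState` class, the JSON file format, and the `ingest` function are all hypothetical names made up for this example, and `send_event` stands in for whatever sends an event to Honeycomb.

```python
import json
import os
import tempfile


class LocalState:
    """Hypothetical stand-in for a local file of processed S3 objects."""

    def __init__(self, path):
        self.path = path
        self.processed = {}
        # Load previously recorded objects, if this instance has any.
        if os.path.exists(path):
            with open(path) as f:
                self.processed = json.load(f)

    def is_processed(self, key):
        return key in self.processed

    def mark_processed(self, key, timestamp):
        # Record the object and persist the whole map to local disk.
        self.processed[key] = timestamp
        with open(self.path, "w") as f:
            json.dump(self.processed, f)


def ingest(state, objects, send_event):
    """Process each (s3_key, timestamp, lines) tuple not yet in the state file.

    Returns the number of events sent.
    """
    sent = 0
    for key, timestamp, lines in objects:
        if state.is_processed(key):
            continue  # already handled by this instance
        for line in lines:
            send_event(line)
            sent += 1
        state.mark_processed(key, timestamp)
    return sent


if __name__ == "__main__":
    state_path = os.path.join(tempfile.mkdtemp(), "state.json")
    objects = [("logs/a.log", "20180212T0000Z", ["line 1", "line 2"])]
    events = []
    ingest(LocalState(state_path), objects, events.append)
    # A second run against the same file skips the object entirely.
    ingest(LocalState(state_path), objects, events.append)
    print(len(events))
```

Note that a second agent pointed at a different state file would happily process the same objects again, which is exactly the gap described next.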

In this setup (without high availability), there’s no way for agents running on different instances to ensure that a log hasn’t already been processed. Since the file of processed objects is stored locally on each instance, we can’t check across multiple agents. What happens if one of your agents goes down? Or the box OOMs and restarts? If that happens, we’ve lost all records of what objects have already been processed!

How do we fix this? What if we could move the state that maintains processed objects out of the instances themselves?

Using DynamoDB for High Availability

We decided to solve this problem with a global conditional write/lock system, and DynamoDB provides the backend we need to make this happen. Records of processed logs are written to DynamoDB instead of a local file, so they can be referenced from any Honeycomb AWS Bundle agent.
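To make the conditional write idea concrete, here is a minimal sketch that simulates it in memory rather than calling AWS. `ConditionalTable` and `claim_and_process` are hypothetical names invented for this example; they are not part of the Honeycomb AWS Bundle. In real DynamoDB, the "write only if absent" step would typically be a put with a condition expression such as `attribute_not_exists(...)` on the key, which fails if another agent wrote the item first.

```python
import threading


class ConditionalTable:
    """In-memory stand-in for a table that supports conditional writes."""

    def __init__(self):
        self._items = {}
        self._lock = threading.Lock()

    def put_if_absent(self, key, item):
        # Atomically write the item only if no item with this key exists.
        # Returns True if this caller won the write, False otherwise.
        with self._lock:
            if key in self._items:
                return False
            self._items[key] = item
            return True


def claim_and_process(table, agent, key, process):
    """Process a log object only if this agent is the first to claim it."""
    if not table.put_if_absent(key, {"claimed_by": agent}):
        return False  # some other agent already recorded this object
    process(key)
    return True


if __name__ == "__main__":
    table = ConditionalTable()
    handled = []
    print(claim_and_process(table, "agent-a", "logs/a.log", handled.append))
    print(claim_and_process(table, "agent-b", "logs/a.log", handled.append))
```

One design choice worth noting in this sketch: the object is claimed before it is processed, so an agent that crashes mid-download leaves the object marked as done. Whether to claim first or record afterwards is a trade-off between duplicate and missed events, and this example does not claim to show which one honeyelb makes.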

Entries in DynamoDB have a TTL (Time to Live) of 7 days, and by default agents backfill 1 hour of logs. This means an agent looks back at most one hour in the S3 logs, and after 7 days the DynamoDB entries expire, so your DynamoDB table will not grow endlessly. You can also specify how long you’d like the backfill to be, up to 7 days.

./honeyelb --highavail --writekey={{ Your Writekey }} --dataset="{{ Your DataSet }}" --backfill=120 ingest {{ Your LoadBalancers }}

INFO[2018-02-12T12:29:37-05:00] State tracking with high availability enabled - using DynamoDB
INFO[2018-02-12T12:29:38-05:00] Attempting to ingest LB                       lbName=shepherd-dogfood-lb
INFO[2018-02-12T12:29:38-05:00] Access logs are enabled for ELB ♥             bucket=honeycomb-elb-access-logs lbName=shepherd-dogfood-lb
INFO[2018-02-12T12:29:38-05:00] Getting recent objects                        entity=shepherd-dogfood-lb prefix=shepherd-dogfood/AWSLogs/702835727665/elasticloadbalancing/us-east-1/2018/02/12/702835727665_elasticloadbalancing_us-east-1_shepherd-dogfood-lb
INFO[2018-02-12T12:29:38-05:00] Downloading access logs from object           entity=shepherd-dogfood-lb from_time_ago=17h28m20.822074s key=shepherd-dogfood/AWSLogs/702835727665/elasticloadbalancing/us-east-1/2018/02/12/702835727665_elasticloadbalancing_us-east-1_shepherd-dogfood-lb_20180212T0000Z_107.23.163.113_3bgrbnrm.log size=5887379
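The interplay between the 7-day TTL and the backfill window described above can be sketched as follows. The function names and constants here are hypothetical, chosen for illustration only. The sketch assumes the expiry is stored as an epoch timestamp in seconds, which is how DynamoDB's TTL feature expects it, and it takes the backfill as a `timedelta` so that it makes no assumption about the unit of the `--backfill` flag.

```python
from datetime import datetime, timedelta, timezone

TTL = timedelta(days=7)               # how long a processed-object entry lives
DEFAULT_BACKFILL = timedelta(hours=1)  # how far back an agent looks by default


def expires_at(processed_at, ttl=TTL):
    """Epoch seconds at which the entry for a processed object should expire."""
    return int((processed_at + ttl).timestamp())


def in_backfill_window(object_time, now, backfill=DEFAULT_BACKFILL):
    """True if a log object is recent enough for the agent to consider it."""
    if backfill > TTL:
        # Looking back further than the TTL would reach objects whose
        # entries have already expired, so they could be processed twice.
        raise ValueError("backfill must not exceed the 7 day TTL")
    return now - backfill <= object_time <= now


if __name__ == "__main__":
    now = datetime(2018, 2, 12, 12, 0, tzinfo=timezone.utc)
    recent = now - timedelta(minutes=30)
    old = now - timedelta(hours=2)
    print(in_backfill_window(recent, now))
    print(in_backfill_window(old, now))
    print(in_backfill_window(old, now, backfill=timedelta(hours=3)))
```

Under these assumptions, capping the backfill at the TTL is what keeps the two settings consistent: every object an agent might look at still has its entry in the table if it was processed before.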

You can Chaos Monkey away and now your logs will still be processed!

gif of a windup monkey

Wrapping up

With high availability now an option, Honeycomb AWS agents can run on multiple instances and all share the same source of truth. For more on the requirements to run --highavail and setup details, check out the README.
If this has piqued your interest in Honeycomb, sign up for a free trial!