Why I'm Grateful For Our Observability Community
By Deirdre Mahon | Last modified on April 1, 2021
It’s that season when we take time to consider what we're grateful for and extend thanks to those we value and the experiences we treasure. One special aspect of America’s Thanksgiving holiday is its inclusiveness: communities of all kinds celebrate by simply sharing and taking time to enjoy the fruits of the land. Giving thanks in late November can bring some fulfillment, but it should also be a reminder that we need to practice gratitude more regularly. For one thing, gratitude has been shown to improve both physical and psychological health, enhance empathy, and reduce aggression. We should all train ourselves to take moments throughout the year to say thanks.
According to Psychology Today, grateful people sleep better and have higher self-esteem. At Honeycomb we take time for gratitude every two weeks during our all-hands, and this regular practice brings us together, forming a strong bond and a shared purpose. When Christine and Charity founded the company, their commitment was to build a company that not only provided value to all software engineers but also served a larger goal: “Honeycomb is trying to democratize and simplify modern programming with ‘consumer-quality’ software monitoring and debugging tools” for the benefit of all. For the Observability community, there are many things to be grateful for. I’ve picked my top 3:
Collaboration & building a blameless culture
When everyone works towards a set of common goals, it creates a culture of ownership and engenders a strong community and feeling of togetherness. A key tenet of Honeycomb’s engineering philosophy is to bring every team member up to the level of the most experienced engineer, quickly and efficiently. In conversation, CJ, Principal Engineer at Eaze, said, “I almost never build a query from scratch in Honeycomb. I always start with what my colleagues have built before because the dashboards are a really good starting point.”
Query permalinks will be there for new team members coming on board, helping them get up to speed and learn how the system has behaved up to that point. As the service grows and matures and new code is shipped, we make cognitive leaps using the social parts of our brains. Asking questions like “who on our team is the expert in this area?” or “who was last on call when we saw a problem like this, and what did they do?” will not only speed up resolution but also reduce toil and allow the team to move on to new priorities.
In Google’s SRE handbook there’s a chapter dedicated to conducting postmortems, which are a necessary part of learning: “The cost of failure is education.” Free of finger-pointing, a blameless culture focuses on learning the causes of an incident; it is never about assuming ill intent or instilling fear of punishment.
Open source contributions
Observability has gained momentum over the past 18 months, and we continue to see new and existing vendors adopt the terminology as they recognize it as the only way forward when managing complex systems at scale. The main problem with everyone “jumping on the o11y bandwagon” is that confusion arises when terms and definitions become diluted. Words are powerful and meanings are important, especially for newly emerging practices, as engineering teams adjust process and tooling to spend less time debugging and more time innovating. In a recent article in The New Stack titled Observability - a 3 year Retrospective, Charity Majors states: "I believe it will set the industry back by years if we cannot clearly articulate the (substantial) technical differences between monitoring and observability. But this will be up to the engineers in the field, the only people with the ability to hold vendors accountable for their language — or not."
Observability is less about logs, metrics, and traces, and more about finding the most efficient approach to understanding what is happening to your code in prod, solving real problems that affect both the engineers and the end-user customers. Throughout the software lifecycle, whether you are managing an incident on call, shipping new features, or instrumenting code for better telemetry, you are involved in the practice of observability. Achieving stability and resilience while building new features for business benefit should always be the prime focus.
The most recent open source project that I am grateful for is OpenTelemetry. This is a combined effort by many contributors across both open and closed source organizations that will bring distributed tracing, metrics, and logging into a single set of system components and language-specific libraries. Because having two overlapping projects caused confusion, several groups decided to come together and combine efforts, creating OpenTelemetry as the next major version of OpenCensus and OpenTracing. Once OpenTelemetry is generally available (in 2020), the other two projects will officially “sunset” and will receive no major new features and only minimal bug fixes.
We are very grateful for the many engineers who have contributed to this effort over many months. At the recent KubeCon + CloudNativeCon conference, Liz Fong-Jones, Honeycomb’s developer advocate, co-presented with Sarah Novotny, giving a demo to show how far this project has come. Honeycomb engineer Alyson van Hardenberg realized how confusing all this new terminology is and wrote a special blog post making it clearer. There’s still plenty of work ahead, with everyone looking forward to standardizing on a way to instrument systems and build useful telemetry so you can better observe and debug in prod. Asking questions of telemetry data and getting answers in order to learn from your systems is imperative to moving forward.
Customer love and sharing o11y journeys
Users of observability sharing their progress is absolutely the most joyous aspect of this community. Every day we read about user delight on our fave social channels because they now understand what’s going on in prod, using rich insights from the new questions they're able to ask of their systems. As more team members adopt observability, the shared knowledge grows rapidly. At Honeycomb, we enjoy this sharing so much that we take time each month to celebrate it across our internal team. My recent favorite is this from Glen Mailer. If you enjoy reading positive stories, follow us on Twitter.
One such customer sharing their o11y journey is Eaze. On Thursday, Dec 5th at 10am PT, join CJ Silverio, Principal Engineer, and Ben Gardella, Manager of Infrastructure, along with Honeycomb’s writer and community leader, Rachel "pie" Perkins, as they share how they are building a new microservices platform, using observability insights to decide what to build in what order while maintaining the older monolith app.
For now, I’m all out of gratefuls. Please take the time to thank your colleagues and I hope you come together this Thanksgiving, push forward to reduce toil, and celebrate progress. Have a wonderful holiday.
Want your engineers to be grateful? Sign up for Honeycomb and make their lives a lot easier!