Raw & Real Ep. 2
Tame Your Alerts! SLOs cut the noise
& calm the team

 

Transcript:

Kelly Gallamore [Manager, Demand Gen|Honeycomb]:

Hi, everyone. Welcome to Raw and Real, our short and sweet product demo series where we focus on how Honeycomb uses Honeycomb. If you’re listening out there in the audience, let me know in the question box, or wherever you can, that you can hear me. I’d appreciate that. While everybody’s finishing heating up their coffee this morning, we’ll give it two minutes. We’ll start promptly at two minutes after the hour with our episode for today.

If you’re just joining us for Raw and Real, we’re going to begin in a couple of minutes. We do have a live stream going for this event. One second. This live stream has our live captions, so if you are interested in live captions for this event, you can find them on StreamText, at Honeycombio captions. The link is also on our Raw and Real signup page. For those of you interested in live captions, check us out on StreamText. We’ll give it another minute here and get started.

Thank you for saying you can hear me. "Are you as awesome in person as you sound on air?" That’s a great question, and I just want to let you all know that you can ask any questions at any time. We’ll get to them at the end if we don’t cover them in the presentation. You’re here with Kelly and Wilde. Michael, do you want to get going here? Should we get this show on the road?

Michael Wilde [Dir of Sales Engineering|Honeycomb]:

I think that’s a fantastic idea. 

2:34

Kelly Gallamore: 

Today’s episode, episode 2, is Tame Your Alerts! SLOs Cut the Noise and Calm the Team. And I’m really excited to talk about this, Michael. These are kind of stressful times right now, and being able to focus on what matters most is really helping me keep sane. How are you doing?

Michael Wilde: 

Tell me about it. It’s really interesting in this weird time. Some of us are stoked we get to work at home. Some of us are going crazy because we are working from home. Most of the folks listening to my voice right now are probably some sort of tech worker, right? And, the jobs that we can do, we can do almost 24 hours a day so sometimes you might find yourself working more than you’d hoped, so just be careful out there. 

Kelly Gallamore: 

I appreciate you talking about that. I’m Kelly, and Michael is my teammate at Honeycomb. I’m excited to focus on SLOs & BubbleUp today. SLO stands for Service Level Objective. Michael, can you tell everybody what that means?

Michael Wilde: 

Service Level Objective. The easiest way to think about it is, all it is is a goal: a goal that you, as a team, decide on for how well you want your service to work and what working actually means. Uptime is vastly different than working, as we know. If you agree on how well you want it to work and you agree on what working means, you can also agree on when to bother each other when it’s not working right. Today, we’re going to learn a whole bunch about it and how we do it at Honeycomb. But it’s not such an esoteric topic. It’s really easy to do, and it can make your life better.

Kelly Gallamore: 

I’m excited to dig into it. As stressful as it is around the world, I’m really thinking about our folks on call, people who build and maintain software, people who are in operations. A lot of our customers have seasonal times when they know they’re going to be busy, right? We see this at Christmas time. We see it on Thanksgiving. Some folks, on Earth Day. All of their experience tells them this is when it’s going to hit. In the past few weeks, there are definitely some services, some companies that use Honeycomb and some who don’t, that have seen an unpredictable amount of attention on their products. With all the alerts they have reporting on what they knew about their system, with all their experience, I can imagine the noise. My partner is on call. Our team is on call. What I see our team doing is making smart decisions about what to say “yes” to. Service Level Objectives, as I understand it, help us focus on what our users need. Could you talk about that?

Michael Wilde: 

That’s a good point, the saying “yes” to things. Honeycomb, like every other product in the world, can create alerts. Alerts are great because you get this technology the engineering team built to do something for you. Instead of you having to look at it all the time, you have the robots let you know. The challenge is that a lot of times alerts lack context. We’ve all been in the situation where we start a job and there are alerts already there, alerts created by a person who is now gone, and no one wants to get rid of an alert because, what might happen? We might miss something. The idea of Service Level Objectives gives us a way through that in Honeycomb. And I’d like to show you a little bit about that.

Kelly Gallamore:

Yes, please show us what that looks like.

Michael Wilde:

Okay. First, before I show you something, we’re going to talk about dogfooding right here. It is the process of using your own product. So we’ve got Dixie. Do you want a cookie, Dixie? Do you want to say hi, Dixie? Dixie’s my chocolate lab. I brought Dixie with me because some folks, like Ryan and Vlad in our Slack group, Honeycomb Pollinators, thought it was a good idea.

Let me show you a few things. One of the key parts that Kelly mentioned was on call. Okay. Honeycomb is a product that you use and may work on with me and my other teammates. But Honeycomb is also a production service. So, right now, I’m showing you this Slack window because there are two people from our team, Martin and Alyson, who are on call.

Kelly Gallamore: 

I’m sorry to interrupt. Are you sharing your screen? 

Michael Wilde: 

You should have seen Slack. 

Kelly Gallamore: 

I don’t see it yet. 

 Michael Wilde: 

Okay. Let me retry the whole screen. 

Kelly Gallamore: 

Okay. 

Michael Wilde: 

Weird. 

Kelly Gallamore: 

No problem. Folks definitely see Dixie.

Michael Wilde: 

Are you seeing my screen now?

Kelly Gallamore: 

I am not seeing your screen. 

Michael Wilde: 

Maybe folks on the other end might let me know if they were seeing my screen at all? 

Kelly Gallamore: 

Yeah.

Michael Wilde:

It’s possible that Kelly might not be seeing my screen.

Kelly Gallamore: 

Yeah, let’s find out if it’s me here. One second. We can see the screen. I hear it from our team over here. 

8:20

Michael Wilde: 

Okay. Great. So, back to what we were talking about. Of course, it’s Raw and Real, so it’s okay to make mistakes. Alyson and Martin are on call right now. They build and support Honeycomb, so I have to respect their time, because the challenge for an engineer is that they’re not solving problems 24/7; they’re also building code to solve problems or create features.

So, given that I know Alyson and Martin are on call, one thing I actually don’t want to do is chat with them right now, because they’re probably in the middle of something, and I don’t want to create a bunch of spurious alerts for them. Let’s take a look at the approach we’re taking at Honeycomb for a generally better way to alert.

If we check out the screen, this is the default Honeycomb user interface, except this, like Dixie, is Dogfood. Well, Dixie’s not actually dog food, she’s my dog. But this is one of the few Honeycomb deployments in the world, and the job of Dogfood is to monitor and provide observability for the production Honeycomb that our customers use, the very thing Alyson and Martin are supporting now. If we look at the left-hand side of the screen, we see “recent activity.” We did another Raw & Real, called Remote, But Not Alone, that was all about this activity view. Yes, we can see Alyson’s on call. Molly’s in support. They’re at work. They’re doing what we need them to do.

One of the things about when Alyson debugs Honeycomb is that she has the ability to look at any of the services we have and ask a vast number of questions. So, we’ll look at two different things today: the ingestion API, as far as services go, and our web front end, which we call Poodle. But before we get there, you’re going to see me clicking and dragging, showing BubbleUp and a bunch of other things, and telling you about features that you shouldn’t really use very often.

How does data get into Honeycomb? It gets there via structured events being generated by systems. It could be a log. But in our case, we add fields for every feature that we create. So, from the user interface, this is a little bit of Honeycomb’s source code showing how we’re adding fields and traces. If Alyson wants to ask a question about a specific user, having the field user.email in there is fantastic. Or maybe, if it was the very first time you got invited to Honeycomb, having the invite token there. All of these things that are available on every log event or span make it possible for folks to ask any number of interesting questions.
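
To make that idea concrete, here is a minimal sketch of adding fields to an event with Honeycomb’s Python SDK, libhoney. The write key, dataset name, and field names are illustrative assumptions, not the actual fields from Honeycomb’s source code shown in the demo.

```python
import libhoney

# Initialize the SDK (write key and dataset are placeholders).
libhoney.init(writekey="YOUR_WRITE_KEY", dataset="poodle")

# Build one wide, structured event: every field you add here becomes
# something you can query, BubbleUp on, or build an SLI from later.
ev = libhoney.new_event()
ev.add_field("user.email", "alice@example.com")   # hypothetical field
ev.add_field("invite.token", "abc123")            # hypothetical field
ev.add_field("duration_ms", 152)                  # hypothetical field
ev.send()

# Flush any pending events before the process exits.
libhoney.close()
```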

Now, if we skip ahead a bit, when it comes to alerting, how do we alert? As I said, most systems have a way for you to automate their querying and searching. In Honeycomb, that’s called triggers. You can create triggers, and you can create the same kind of triggers that nobody will want to delete three years from now. But if you look at what we do, half of ours are disabled, mostly because we now do SLOs. Many of the rest are really informative. Right? If a specific customer’s traffic drops low, that goes to Molly in support, not to on-call. We’re working on this Grafana plugin: are people using it? So, we use Honeycomb Dogfood to notify ourselves of non-critical events with triggers. Sure, there are some critical events too. A trigger itself allows us to run a query on a frequent basis, and since Honeycomb is a high-cardinality analytics system, you could run queries every minute and not have them stack up.

But ultimately I get to run a query and then do something about it. In this case, we’re just looking at any time someone’s using this plugin, and it has alerted a project channel in Slack. So, that’s great. But if we’re not using most of these random triggers for production, what are we using? Okay. So, let’s show you the mother of all alerts. The mother of all alerts is some query or some search. Earlier we were talking about Alyson and what she can and can’t do in Honeycomb. Let’s look at that specific service, which in this case we call Poodle, the front end of Honeycomb. Granted, you’re looking at a front end; just follow me on this. It’s the thing that everyone logs into.

On the right-hand side of the screen, while we see my history and team activity, we also see details. That’s all the fields that were generated from the code that’s running to build Honeycomb. So, as you can see, we’ve got everything from feature flags in here, to email, to, you know, all sorts of different things that are created. I’ve never counted; there are probably 200, 300, 400 fields. Who knows. There are a lot. It doesn’t cost the Honeycomb user anything extra because we like to deal with wide events. You add 200 fields to a span and it’s no big deal. It makes it possible for her to ask a lot of questions, but it also makes it possible for you to create a billion alerts.

The first thing we start doing is figuring out: is there something interesting? So we have a query here. BubbleUp is a really great tool because it helps me look at something I find interesting; in this case, some page load time might be more than 14,000 milliseconds. Maybe that’s long. What BubbleUp does is run this analysis and help me understand, well, what’s going on? It looks like it’s happening on one particular dataset. Okay. You know, it’s one particular team. Maybe it’s a specific team ID. And of course, I could drill into a deep level and actually go and look at a trace or an event. BubbleUp becomes really powerful because its job is to point you in a direction that you might not even know you should look. I say it’s like asking the second question. So we get down to this trace and we find that somebody took 15 seconds to load the page.

15:20

Kelly Gallamore: 

I see what you’re saying. So BubbleUp lets me select data points that aren’t really in the baseline so you can really compare just, like, what’s interesting over the baseline of everything that’s happening. Do I understand that right? 

Michael Wilde: 

Yeah, you’re right. But it helps me make sense of these points that are on a chart, because often we can see something slow, but we’re not sure why, and we’re actually not sure what the next thing we’re supposed to look at is. Ultimately we get to this point where, hey, we found someone’s experience, and we have lots of fields over here, and that’s fantastic. And it looks like Uyen, one of my fellow sales reps here, had a slower page load time for whatever reason. The real question is, should we alert because Uyen had a bad time? We’re not going to alert because a Honeycomb employee had a bad time. I might run a P95, find the 95th percentile latency, and basically filter down to all the slow things.

So, we have some that are slow, and okay, that’s 18 seconds, what should we do about it? Well, the mother of all alerts is a well-crafted search or query. You can do this in any tool, but in this case, we might say: let’s do a P95 on duration (we don’t alert on a heat map, per se), and if it’s more than, you know, 12,000 milliseconds, send an alert. This is where I was saying that maybe you shouldn’t do this. Maybe you shouldn’t just randomly create alerts, because no one will know why Wilde created this alert unless I put a description that says “delete this after a year” or “throw it away after I’m no longer here.” So what should we do? We could do ourselves a little bit better by using Service Level Objectives.
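
For reference, here is roughly what that “P95 on duration over 12,000 milliseconds” trigger could look like if you defined it against Honeycomb’s Triggers API instead of the UI. This is a hedged sketch: the dataset slug, column name, Slack channel, and exact payload field names are assumptions, so treat it as an illustration of the shape of a trigger (a query, a schedule, a threshold, and recipients), not a copy-paste example, and check the current API docs before using it.

```python
# Sketch of the "P95(duration_ms) > 12,000 ms" trigger Michael describes.
# Field names and the endpoint path are approximations of Honeycomb's
# Triggers API; verify against the current API documentation.
import requests

trigger = {
    "name": "Poodle page loads are slow (P95)",
    "description": "Delete this after a year if nobody remembers why it exists.",
    "query": {
        "calculations": [{"op": "P95", "column": "duration_ms"}],
        "time_range": 300,                       # look at the last 5 minutes
    },
    "frequency": 60,                             # run the query every 60 seconds
    "threshold": {"op": ">", "value": 12000},    # alert above 12,000 ms
    "recipients": [{"type": "slack", "target": "#poodle-alerts"}],
}

resp = requests.post(
    "https://api.honeycomb.io/1/triggers/poodle",  # dataset slug is illustrative
    headers={"X-Honeycomb-Team": "YOUR_API_KEY"},
    json=trigger,
)
resp.raise_for_status()
```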

To Kelly’s question about alerting: if you’re not going to do random alerts, then how do you make an agreement on your team about what your service is and how it’s supposed to perform? You sit down, look at your data, and actually talk about it. Right? And you also talk about what you want to do. So, we talked amongst ourselves and asked, where are people going to notice that Honeycomb is performing slowly, or not right, or giving a bad experience? Some of those people are not you, the customer; some of them are internal engineers. If build time is slow, maybe that matters, because we don’t want grumpy engineers waiting for their builds to finish.

Let’s look specifically at two SLOs, starting with the ingestion API. There’s a lot of stuff on the screen here, but let me break it down for you. This is unique about Honeycomb, because its back-end data store allows for the analysis of the raw data in near real-time, at very high dimensionality. It’s not the kind of thing a metrics tool could do. It’s not the kind of thing a log analytics tool could do. This is really the kind of analytics your CFO has access to, but in this case, your developers have access to it.

We sat down and said: our ingestion pipeline needs to work. It needs to work 99.9% of the time doing specific things, not just being available. Okay? It’s not just that Pingdom says we’re up. Yes, it’s important that we’re up; if we’re not up, we can’t succeed. But what is an eligible event? The SLI, or Service Level Indicator, is the filter that says, of all the stuff that comes in, this is the class of events we’ll consider worthy of evaluation, because some random HTTP 400 error is not our fault. I would never want to alert on that. I would never want to consider it a success or a failure. This simple SLI is a filter that says, in English: it has to be on two specific endpoints, it has to be a POST, it can’t be the following things, and it has to return a 200. This is an error SLO.
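
Conceptually, an SLI like the one Michael describes is just a function over each event that answers two questions: is this event eligible, and if so, did it succeed? Here is a minimal sketch in plain Python, with hypothetical field names (request.method, request.endpoint, response.status_code); in Honeycomb itself you would express the same logic as a derived column that evaluates to true, false, or nothing.

```python
ELIGIBLE_ENDPOINTS = {"/1/events", "/1/batch"}   # hypothetical endpoint names

def sli(event: dict):
    """Classify one ingestion event for the SLI.

    Returns True (success), False (failure), or None (not eligible,
    so it does not count against the error budget either way).
    """
    # Eligibility: only POSTs to the two ingestion endpoints count.
    if event.get("request.method") != "POST":
        return None
    if event.get("request.endpoint") not in ELIGIBLE_ENDPOINTS:
        return None
    # Success: the request has to come back with a 200.
    return event.get("response.status_code") == 200
```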

Because Honeycomb’s got this fast analytics engine, it knows the exact number of events that should succeed in that 30-day period. It builds an error budget. It lets you know where you are. It allows you to have exhaustion alerts before you fall below your goal: not before you’re down, but before you get lower than your goal. Alyson and Martin are not getting alerted right now just because someone is having a less-than-stellar time ingesting their events. However, the cool thing about observability is that we can go in and look at any time and debug anything.
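
The error-budget arithmetic behind that is simple enough to sketch. Assuming a 99.9% target over 30 days and a made-up event volume, the budget is just the number of eligible events you are allowed to fail, and an exhaustion-style warning fires when you are spending it faster than the window is elapsing.

```python
def error_budget(eligible_events: int, target: float) -> int:
    """How many eligible events may fail in the SLO window."""
    return round(eligible_events * (1.0 - target))

# Hypothetical numbers: 500 million eligible ingestion events over 30 days
# at a 99.9% target leaves a budget of 500,000 failed events.
budget = error_budget(500_000_000, 0.999)

def burning_too_fast(failed_so_far: int, budget: int, days_elapsed: float,
                     window_days: float = 30.0) -> bool:
    """Crude burn check: budget spent faster than time elapsed."""
    return (failed_so_far / budget) > (days_elapsed / window_days)

print(budget)                                    # 500000
print(burning_too_fast(120_000, budget, 10.0))   # False: 24% spent, 33% of window gone
```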

So, I’ve not alerted these folks, but customer success, customer support, anybody else, even a random engineer could go in here and say, I wonder how things are doing? Looks like someone is sending events that are too big, and we can see the fields associated with that. Should we alert on the front end? So, interestingly, you know, we saw that alert before. We saw that maybe Uyen was having a slow page load time. This is the SLO for Poodle page loads, which is the front end. We do have triggering. Granted, we’re still above 99% of eligible events, which in this case is any page load event with a load time of less than 10 seconds. So, as we can see, Uyen’s would have fallen out of that. Whether it’s triggered or not, I have the ability to see exactly which users are affected, potentially which teams, and all these other facets, and that BubbleUp has been done for me automatically.

21:21

A couple of other things you might see: the home page itself, when you load not www.Honeycomb.io but the main in-product home page, should load within a certain amount of time. So, it seems like there’s some correlation between slow page loads on both queries and the home page itself. But as we can see, this is a specifically different SLI. In this case, we don’t alert if a Honeycomb employee is having an issue. They’re very important.

But as you can see, Uyen’s slow page load time wouldn’t count here. There are a couple of other things in here that say: can’t be support, has to be logged in, has to be less than 250 milliseconds. That is the agreement you set. What is success? How good do you want to be at your service, and what should you do about it when it’s not working so well? That’s what we generally think at Honeycomb: set some goals, try to meet them. Make sure that you see... let’s stop sharing. Are we back? Trying to click the “stop sharing” button here. Am I still sharing, Kelly?

Kelly Gallamore: 

It looks like it. I actually had to pull up my phone so I could see. 

Michael Wilde: 

Okay. I’m trying to click on the “not share” screen. It thinks I’m sharing.

Kelly Gallamore: 

Good. Perfect. 

Michael Wilde: 

You can take over or kill that. The point here is that there should be a way of having much saner alerting. I like the idea of service level objectives because we’ve got this ability to kind of keep Alyson focused on, you know, non-random stuff basically.

Kelly Gallamore: 

Gotcha. I really appreciate you talking about that. I want to remind everyone listening at home if you have any questions you can put them in the Question Box at the bottom of your screen. We will take questions at this time. And while we’re waiting, what I’m curious about Michael is, I can imagine a team coming together with some goals. You just get started. You have one goal. This is a new thing for us. We’re going to start with Service Level Objectives. Do you know what one of the first goals is that our team set? 

Michael Wilde: 

Some of the first goals that we set were basically for the two core services that we have, right? There are two main things that Honeycomb actually has to be good at: taking events in, which goes through our Shepherd event service, and presenting a user interface, which is Poodle. Is the query actually working? For most situations, as I said before, I think about where users will see your stellar software working great, or will notice that it’s not so great.

The one thing about SLOs is you also want to match the user’s memory. If you set an SLO for 99.5% of eligible events over a one-day period, and you fail yesterday and succeed today, they’ll remember yesterday. You’ll actually be in the same boat of people freaking out over alerts, versus saying, hey, let’s judge ourselves on a weekly or monthly basis and then have data that backs that up.

Kelly Gallamore: 

I can see how you have to start with something and see how it works and then iterate from there. Is it easy to adjust SLOs in Honeycomb? 

Michael Wilde: 

It is, and that’s great. The SLO comes in two specific parts. One is the SLI: the filter that says it’s on this endpoint, it’s within this latency, it’s this error code. It’s pretty common that when someone makes an SLO, they’ve judged themselves in a way that might be a little too harsh, or they might have too many events, and it turns out the instant they make the SLO they’re blown out: negative 9,000 percent. You go through this process of evaluating: what are the events you have? What counts as a success and what counts as a failure? You tweak the SLI to get that scope of events, and then Honeycomb will measure it for you.

Does it mean you can perform at four 9s, five 9s, three 9s? Not necessarily. But you may set your goal and say, hey, we can’t be at four 9s. We’ve got the successes versus failures, but we can’t be at four 9s; let’s try to be at three 9s or two 9s. Then, when your error budget is exhausted, that’s the point in time you want to stop shipping code, fix the things causing the errors, and decide whether you want to continue that way. It’s a refinement process. We do it over time. It’s very, very easy to do in Honeycomb.

26:32 

Kelly Gallamore: 

I really like how you describe it. It seems to me that all of these nuances actually allow production to be more resilient, more reliable. And I feel like we could actually talk about that for a really long time. I’m going to switch over. We have a few questions coming in. How many SLIs make up an SLO? 

Michael Wilde: 

A Service Level Objective has a Service Level Indicator. One SLI makes up an SLO. But an SLI itself could be a fairly complex filter on your events, or a complex set of scenarios with a bunch of “ifs, ands, ors,” things like that, that give you the answer. From that SLI, we get a successful or failed event, and that corresponds to an SLO. What you don’t want to do is make a thousand SLOs, because then you’ve basically recreated the problem of triggers or alerts all over again. You want to look at your service and say: how should we measure ourselves? How do we want to perform? The great thing is, you can actually report to your executive team and your whole team: hey, you know what? I know you might not think we’re up, but actually we’re still above four 9s. Sure, some people complain, but our team is doing what they’re supposed to do.

Kelly Gallamore: 

You’re right. That story can often stand out. I appreciate that SLOs are a data-driven way to measure how production is actually performing, where everybody can key into it so you can take the stories that people remember and talk about the data behind it to say, okay, this is what’s actually going on. You get a deeper meaning behind it. 

I’m going to skip to this one right here, you talked about not doing thousands of SLOs to start. What’s your recommendation, generally, for how many SLOs someone should start out with? Or a team should start out with? 

Michael Wilde: 

Is there a chance you could take over sharing? All I see is a black screen. Kill my ability to share so we can talk on the screen. Anyway, to answer that question: how many SLOs should you start out with? I would try to get coverage on most of your services. If you have 150 microservices, does that mean you want 150 SLOs? Maybe. Maybe not. You might look at the core things the service you build does. Let’s say you’re a financial services institution: you have a login page, authentication, that’s a good one. And then perhaps that next screen, which paints balances, retrieves accounts, or whatever; maybe there are two or three services that make up parts of that page, and you may want SLOs for those. You also might have a back-end service that front-end users would never notice but that other developers’ services call frequently, and they’re the ones who have to ask, hey, are we up?

The one thing about SLOs is you can literally give this to a first-line support person, because the very first thing people ask is, are we up? The SLO says, yes, you’re up, or no, we’re not. Even on the front line, you want other people to be able to know... awesome, somebody is seeing both of us talking, that’s great. If you’re running a production system, you want awareness for everybody on how we’re doing, right? So it’s not like Molly has to ask, hey, are we up? Molly can look at the production SLO for the front end of Honeycomb and say, oh, it looks like we’re experiencing some trouble and it’s been triggered. Awesome. I have two pieces of info that I can now tell my end customer, and then I can contact on-call and see what they’re doing about the problem.

Then the other thing is the key to getting management buy-in. One great way to look at this is: how do you measure success? Sure, if you make software as a service, or you provide something to customers who actually sign contracts with you, you often have an SLA. But those are very high level. Sometimes, internally, and I’m sounding incredulous here, the way people measure themselves is the number of complaints, or thumbs up on your Yelp page, or the number of times Pingdom says you’re up, or the status page. How is anybody in ops and engineering able to know that the stuff I shipped is working and that I did a good job? Yeah, it works. Great, it works in tests and everything looks fine. But what’s the real experience of the user?

I think you should ask yourselves, ask management: how is it that we measure ourselves? I hope to heck it isn’t the number of times alerts happen. I hope to heck it is looking at a user, what their experience is, whether it’s internal or external, understanding what a successful experience is, and whether you’re meeting your goals. You look at an SLO and the number of nines, and that translates to some number of allowable failures, which is important because you actually have to give yourself permission to fail. If you don’t, you’re going to go mad. You have to tell your executives we cannot succeed on 100% of transactions; not every single trade goes through 100% of the time on NASDAQ. It’s a fact.
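
As a quick back-of-the-envelope sketch (the event volume here is made up, not from the demo), this is how the number of nines turns into a number of allowable failures:

```python
# Made-up volume: how many failures do different targets allow
# out of 10,000,000 eligible events in a 30-day window?
events = 10_000_000
for nines, target in [("two 9s", 0.99), ("three 9s", 0.999), ("four 9s", 0.9999)]:
    allowed = round(events * (1 - target))
    print(f"{nines}: {allowed:,} allowable failures")
# two 9s: 100,000 | three 9s: 10,000 | four 9s: 1,000
```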

So, that said, if you can understand how to measure yourselves, give yourselves some sanity, you end up having a happier team, that stays working at your company, and your executives have the ability to say, oh, look, our authentication service is performing. What if we add 10,000 users a second to it? Those are some of the things to think about and I’ll bet if you ask the internal question of, how do we measure ourselves, you’ll be quite surprised. 

Kelly Gallamore: 

Okay. That’s really, really helpful. And I like how you put it. If anybody had the idea, in the before times, as we call them right now, that perfection exists, the past several weeks have proven globally that it doesn’t. It’s a good time to have these nuanced conversations about what we can do if we talk about the math behind our experience. If we talk about our users and their happiness as the ultimate goal, observability gives you the nuance to get in there and focus at a more individual level, but without sacrificing speed or reliability. That’s a really big thing to wrap our heads around, and something we could talk about for a while. What I’m learning is people have tons of questions. How do we ensure that our alerts are actionable?

34:09

Michael Wilde: 

That’s a good question. So, like, you know, how do you ensure they’re actionable? I would say that, first, I think actionable is an interesting word. Right? Because it could mean the alert that goes off, is it being sent to the right place? Is it being sent to the right people? Is it going to the right PagerDuty group? Is it going to the right Slack channel? Is it going to the right email address? 

So, you have to make sure it’s going to somebody who can actually help with it. Just like we showed at the very beginning, we have a group in Slack called eng on-call, and that group changes every day. If we look at who is going to service these alerts, are they getting the alerts they need? And then, if you look at actionability, can they do something about it when it actually fires? Right? So, we might hope, and that’s another whole diatribe, that the folks who build and support Honeycomb are the ones on call. Alyson and Martin are engineers, and our engineers are on call; we believe strongly in that. That means, is it possible for Martin to take action on that particular alert? In this case, it is. He has access to production code and can ship code to production with review. So, that question is a little bit more than just monitoring. Finally, you do want to take a look at the alerts you have and make sure they’re worthy of someone’s action, because that was how I started this out. I don’t want to create an alert because some random person had a slow time.

Because if you just do it like that, then what am I giving Martin and Alyson? Just, hey, here’s some stuff that lacks context. If Uyen had a bad time, maybe it really wasn’t a bad time; maybe it was a super long query, and that’s a good thing. That’s why you’ve got to be mindful of the people who are on call. Put people on call. Put engineers on call. It’s not a bad thing. It allows them to take action on the alerts they actually get.

Kelly Gallamore: 

I also like what you’re talking about. We learn this in emergency response situations in physical environments here in the Bay Area: the team becomes more resilient as each person has more capability and more context. If everybody can do that, the whole group gets stronger. I’m blown away by that. That’s all we have time for today. Michael, any final wrap-up?

Michael Wilde: 

You know, I would say: stop creating alerts on CPU. That’s just my opinion. We’re in a cloud-native world; CPU alerts are just fine, but also do the kind of alerts that let you know if someone’s having a bad time. And my dog, Dixie, was quite good on the call, so she did get another cookie, if anyone was wondering how Dixie is doing.

Kelly Gallamore: 

Thank you. I’m going to have a few words of wrap-up. Thank you, Michael, for your time and for showing us this. If you have more questions, you can reach out to Team@Honeycomb.io. If you know us and love us, there’s a support chat on our website; reach out there and you’ll get routed to the right place. Nathan LeClaire, part of our team, talks in our webcast, SLOs: Get Started and Build Just One Simple SLO. It goes into detail about how one of our customers, Clover Health, got started getting their team on the same page and bridging engineering and management, and then Nathan specifically demos how to get started and build that first SLO. So, I recommend that you check it out. It’s on our website.

Also, there will be a survey after this. If you could fill it out, that would help us be better; give us some information so we can improve for you. If you’re new to Honeycomb and want to play with an anonymized data set, you can go to Honeycomb.io. If you’re ready to get started, you can sign up for a free trial and explore what observability can bring to the table. Thanks again to Vanessa over at White Coat Captions; I really appreciate you doing these live captions for us. Thank you to our audience and the team here. Y’all stay safe, and we’ll see you in the future.

Michael Wilde:

See ya, thanks. 

If you see any typos in this text or have any questions, reach out to marketing@honeycomb.io.