Raw & Real Ep. 1
Remote, But Not Alone
How to Team Up in Honeycomb

 

Transcript:

Kelly Gallamore [Manager, Demand Gen|Honeycomb]:

Hi, everyone. Welcome to Raw and Real, Honeycomb’s series of short and sweet product demos. We really want to show you how Honeycomb uses Honeycomb, so we’re going to put our dog food where our mouth is. It’s 10:00 Pacific time on a Wednesday, and it’s a beautiful day here in the Bay Area. I hope it’s a beautiful day wherever you are at this moment. We are going to give folks a few more minutes to sign in. We will begin our session promptly at 10:02. So we’re going to hang out for a minute. Glad to have you here.

For anyone still signing up, we’re going to begin in just a few moments. Short, sweet product demos showing how Honeycomb uses Honeycomb. We’ll start here in just a minute.

Nice gloves. 

Michael Wilde [Dir of Sales Engineering|Honeycomb]:

We’re in a different time, ready for whatever happens.

2:06

Kelly Gallamore:

This is fantastic. For everybody just joining, welcome to Raw and Real, our short and sweet product demo series. We’ve been asked to show more about how Honeycomb actually uses Honeycomb and put a little of our dog food where our mouth is. And I want to let you know that live captions are happening for this event; here is the link for anyone who would like to follow along. I’ll give you a moment to grab it.

Today’s episode will definitely be recorded, so you can pass it along to those you think it will help.

And welcome to today’s episode: Remote, But Not Alone. How to Team Up in Honeycomb. Just some quick housekeeping. Our agenda for today is to talk for a minute and tell you who we are, then show a product demo answering the basic questions: how do I collaborate inside of Honeycomb? How do I help new teammates onboard to how my team uses Honeycomb? We’ll do a short product demo. You can ask questions at any time; just hit “ask a question” and submit it that way. If we don’t answer it during the demonstration, we will take questions at the end, and we love the interaction. So please ask us anything.

 And with that, my name is Kelly, on the marketing team at Honeycomb. I’m here with Michael Wilde. 

Michael Wilde:

Hello, there. Hello Kelly and fellow earthlings. I’m fully ready for whatever is raw and however real it gets. 

Kelly Gallamore: 

For whatever may come. We do want to let you know that teaming up in Honeycomb is very safe by nature and Michael is going to talk to you about it today. Michael, what can you tell us about — well, I’ll tell you this. I think that engineering, developing, and maintaining software is a team sport. And we have features inside of Honeycomb that really cater to that, that make it easier, as though it’s kind of in your DNA. Can you talk to us more about that, and show how Honeycomb enables team collaboration?

Michael Wilde: 

Totally. This is, as I rub my hands together, one of my favorite things to talk about with Honeycomb. In my past, I’ve had a 25-year career, if I can keep my AirPods in, in IT and ops and SRE, and I myself, even though I’m an account exec guy, was on call. And it was always difficult to find out what people were doing other than just chatting, which is awesome, and to learn from each other and leave bread crumbs. So to see the team that builds and supports Honeycomb use our product in a really cool way is something I’m passionate about and want to show you. What do you say, Kelly? Should I jump into a little bit of demo for everybody and show them what I think is so cool about sharing? 

Kelly Gallamore: 

Yeah. I’m really excited to see it. Let’s go for it.

Michael Wilde: 

I’m going to go. Everybody, ask questions if you can. After I finish this, we’ll stop for questions or whatever. Let’s give it a shot. When you’re onboarding to a new product, there are all sorts of nuances to learn. But when you’re onboarding to a new team, joining an engineering team, maybe as a new employee or moving to a new group, one of the things you need to do is become aware of how they work, what their systems are, and how they look at the world. Here at Honeycomb, we use Honeycomb in a special deployment called Dogfood to keep an eye on Honeycomb’s own production environment.

So let’s take a look at what we’ve got here. We’ve got one of our core services; internally, we call it Shepherd. It’s our ingestion API. And while these charts might be fairly interesting, what I see as a great way to get onboarded is to understand what the rest of your team is doing. You’ll see that in a few places in Honeycomb, and the first place you see it is here on the recent activity feed. Instead of approaching the world by looking at every system, these are all of the users, all of the members of my team (in this case, they work at Honeycomb) who are making queries about parts of our system. It looks like Martin was running a number of queries about the user interface service for Honeycomb. Travis is looking at system statistics. And Irving is examining things that are happening in the ingestion pipeline. 

In the same way that you might use a collaboration tool like Slack to gain awareness of what your team is doing, that very same behavior is here in Honeycomb. In fact, we can use this history to see what’s going on. Now, if I wanted to shadow one of my teammates, let’s say Irving, I can click on his history and see the queries he might have been running yesterday, for example. If I were to click on this query history result, that query is instantly loaded for me, and it looks like Irving was looking into some information about particular data sets and doing statistics associated with that.

You also get an awareness of what’s happening inside the team that you’re working on. Irving was looking at Shepherd, which is our ingestion API located at api.honeycomb.io. All of our user and customer events come to that one service. But we also see a visual record of every query that has been run. It’s almost like I can follow an engineer and their thought path. In the case of Alyson, she might have been running queries to see how data was coming into Honeycomb. How did she arrive at this last query? Clicking on the previous queries in that history allows me to understand how she’s doing her work, how she arrived at something, and maybe what her thought pattern was. And I may use some of what she ended up building as a starting point for myself. In fact, I have my own history here of all the queries that I’ve ever run. And there’s a chance I might want to go back in time and take a look at things that happened earlier this year.
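For readers following along at home: all of those customer events reach Shepherd through the public ingestion endpoint. Here is a minimal sketch of sending one event to it with the libhoney Python SDK; the write key, dataset name, and field names are placeholders, not Honeycomb’s internal ones.

```python
# A minimal sketch of sending one event to Honeycomb's ingestion API with the
# libhoney Python SDK. The write key, dataset name, and field names are
# placeholders, not Honeycomb's internal ones.
import libhoney

libhoney.init(
    writekey="YOUR_API_KEY",             # per-team API key
    dataset="my-service",                # dataset the event lands in
    api_host="https://api.honeycomb.io", # the public ingestion endpoint
)

event = libhoney.new_event()
event.add({
    "service_name": "my-service",
    "endpoint": "/widgets",
    "duration_ms": 23.4,
    "response.status_code": 200,
})
event.send()

libhoney.close()  # flush anything still queued before exiting
```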

8:58

Now we end up back here at the home page. Another thing we might want to ask ourselves when we’re trying to get the lay of the land of a particular system is: what’s been deployed recently? As we can see over here, there are some deploy markers in this particular time range. If we click on any of them, they go directly to that workflow, in this case in CircleCI, showing exactly what was deployed. 
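Deploy markers like these come from outside Honeycomb: your CI pipeline tells Honeycomb that a deploy happened. Below is a rough sketch of creating one with the Markers API; the dataset name, build number, and CI URL are hypothetical.

```python
# A rough sketch of creating a deploy marker like the ones on the home page,
# using Honeycomb's Markers API. The dataset name, build number, and CI URL
# are hypothetical.
import time
import requests

resp = requests.post(
    "https://api.honeycomb.io/1/markers/my-service",
    headers={"X-Honeycomb-Team": "YOUR_API_KEY"},
    json={
        "message": "deploy build 182369",
        "type": "deploy",
        "start_time": int(time.time()),
        "url": "https://app.circleci.com/pipelines/example/123",  # link back to the CI workflow
    },
)
resp.raise_for_status()
```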

While keeping an eye on production, the ability to understand what’s happening in the external environment gives me awareness, because that’s mostly what engineers need: what’s going on, who’s having a good time and who’s having a bad time, and how does the system react to the software that we’re deploying? Was there any direct correlation between a deploy and some behavior? So that’s another thing: looking at deploy markers to get an idea of what’s happening in the external environment. An engineer also might take a look at a series of boards. We call these boards in Honeycomb, rather than dashboards, for the sole reason that they can be displayed visually or as a list. So this becomes a list of how Molly goes through her day and looks at useful queries. It allows someone who is supporting the system, keeping software up and running and keeping customers happy, to use what others create. 

Sometimes logging in to Honeycomb isn’t even the first time you become aware of what’s happening in Honeycomb. When Honeycomb users share the results of a query in Slack, they’re properly unfurled. In fact, we can see the comments from the person who created the query, how the query was crafted, and even a rendering of exactly what it looked like. So maybe you’re not in Honeycomb right now, but this becomes really informative because you can see what’s going on.

Two more things I want to show you. First, an engineer might wonder: is the feature that I’ve built in production yet? We can see some deploys that actually happened. We see a count of events with a specific build ID, in this case 182263 going offline and 182369 going online. Not having to ask the build team, or wonder, makes it really easy to gain awareness of how frequently Honeycomb is building and deploying.
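That “is my build live yet?” question is just a query. Here is roughly what it looks like written out as a Honeycomb query specification, expressed as a Python dict; the build_id field name is an assumption about your own schema.

```python
# "Is my build in production yet?" as a query: count events per build ID over
# the last hour. This dict is shaped like the JSON query specification that
# Honeycomb's Query API accepts; the "build_id" field name is an assumption
# about your own schema.
build_check_query = {
    "breakdowns": ["build_id"],
    "calculations": [{"op": "COUNT"}],
    "orders": [{"op": "COUNT", "order": "descending"}],
    "time_range": 3600,  # seconds
}
```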

The last thing I want to show you is, aside from taking a look at a specific service, we might wonder: what is the experience like for users? How are they doing? To that end, we might ask how ingestion into Honeycomb is going and how the users are doing. And for that, we might use Honeycomb SLOs. SLOs allow us as a team to create an agreement with each other that our service is going to work the way we want it to work, over the time range we choose. In this case, we want the ingestion API to work 99.99% of the time over a 30-day period. As you can see, our error budget, which starts at 100%, is doing great. These charts below, which we call BubbleUp, help us understand events that are failing our SLI.

On the user side, if we look at our UI service, our goal is to have 99% of the requests that happen in the UI load in less than 10 seconds. It looks like we’re doing fairly well, according to our SLO. But as we can see, maybe particular users are having a little bit of slowness. The ability to not only see how our service is doing for our users, but also understand what the rest of my team is doing and what I’ve been doing over time, makes this a really powerful way to gain awareness. It makes the onboarding process on a team a lot easier, because we can see what everybody is doing, and I can participate a lot better than in other systems where I’m just logged in and no one else knows I’m there. 
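In Honeycomb, an SLI is a derived column that evaluates to true or false for each event, and the SLO wraps a target and a time window around it. Here is a hedged sketch of the two SLOs Michael describes, written out as data; the expressions and field names are illustrative guesses, not the ones Honeycomb actually uses internally.

```python
# A hedged sketch of the two SLOs described above, written out as data. In
# Honeycomb, the SLI is a derived column that evaluates to true/false per
# event; the expressions and field names here are illustrative guesses, not
# the ones Honeycomb actually uses internally.
slos = {
    "shepherd-ingest-availability": {
        "sli": "LT($response.status_code, 500)",  # the event is "good" if it did not 5xx
        "target_per_million": 999_900,            # 99.99%
        "time_period_days": 30,
    },
    "poodle-ui-latency": {
        "sli": "LT($duration_ms, 10000)",         # the page request finished in under 10 seconds
        "target_per_million": 990_000,            # 99%
        "time_period_days": 30,
    },
}
```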

So what do you think about that, Kelly? 

Kelly Gallamore: 

Well, I think it’s really important to be able to understand what other people are thinking, especially when a lot of us are working separately from each other right now. I know a bunch of our team works remotely, and a bunch of other teams are learning to work remotely as well. What I really like about this is that it helps me understand the questions my teammates are asking. Could you talk about that just a little bit more? Like when you talk about Molly creating her boards, or what Alyson is doing: okay, I’m a new person on a team. What’s a specific scenario where I’m going to need to follow along? Is that on-call?

14:15

Michael Wilde: 

Sure. That’s a great idea. One of the first things that happens when you’re onboarding to a new team at a new company: all new credentials, all your 2FA, getting your access to PagerDuty or ServiceNow or whatever you use, and boom, you get put on call. Which is not a bad place to be, actually. It’s a great place to learn if you’ve got the right tools. And you’re able to watch an engineer or an ops person or someone supporting the system while they’re running queries. So you might be in what we sometimes used to call a war room; you might be in a Zoom call or a WebEx or Microsoft Teams with three or four people on there, and you’re listening and you’re watching this P1 or P0 incident happen, and you’re seeing what other engineers are doing, what they’re querying. You learn a lot faster by shadowing people. And it becomes a really cool thing. One of our teammates, Irving, is our customer success director, and he joined us a couple of months ago. One of the things he’s able to do is say, hey, let’s take a look at all the queries that have ever been run for a particular Honeycomb customer. He can see how we were thinking about them and learn a heck of a lot quicker. I love that idea.

Kelly Gallamore: 

I like how you’re talking about it not just being for debugging; somebody else on the team that I wouldn’t have considered could use it too. That’s really important. That ties everybody together.

Michael Wilde: 

You know, what I was showing on the side, my history, is kind of like a visual browser history. The cool thing is, those queries are saved forever as results. Our user interface team sends events from the UI. If you happen to have a window open, a tab open, that’s state. The sidebar being open is state. If you have the details open, like the schema for the fields, it knows that’s there; it knows if you have it closed or in any other state. The UI team can actually run queries in Honeycomb to see how people are using it: is it closed, are they an old customer, a new customer, on a mobile phone, or whatever. We use it for a lot of different things. The average Honeycomb user starts out with some logs or traces, and we wanted to show this raw and real so you could see what a fairly mature team does when they have observability locked down. 
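This kind of UI-state instrumentation is just extra fields on events. Here is a minimal sketch using the Honeycomb Beeline for Python; the field names (sidebar_open, schema_panel_open, plan_type) are hypothetical examples, not the actual fields Honeycomb’s UI team sends.

```python
# A minimal sketch of annotating events with UI state using the Honeycomb
# Beeline for Python. The field names (sidebar_open, schema_panel_open,
# plan_type) are hypothetical, not the actual fields Honeycomb's UI team sends.
import beeline

beeline.init(writekey="YOUR_API_KEY", dataset="my-ui", service_name="web-ui")

with beeline.tracer(name="render_query_page"):
    beeline.add_context_field("sidebar_open", True)
    beeline.add_context_field("schema_panel_open", False)
    beeline.add_context_field("plan_type", "enterprise")
    # ... render the page ...

beeline.close()  # flush before the process exits
```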

Kelly Gallamore: 

Okay. Awesome. And just really, really quick. Can you tell me what SLO stands for? 

Michael Wilde: 

Totally. SLO stands for service level objective. And as I said, it’s really just an agreement between us, the folks that build and support the software. Companies like us will have an SLA, an agreement with our customers on the service level we’re going to provide at Honeycomb; you can see that at status.honeycomb.io. But internally, we have to be able to say: I’m not going to get alerted on every single possible failure, so how are we going to commit to each other, and obviously to our customers, on how this service is going to work? 

When you saw things like Poodle at 99% and Shepherd at 99.99%, that’s how we measure ourselves. And we try to have balance, because balance is key: when we’ve created a great product, there can be a whole bunch of alerting, but you don’t want to over-alert. So the SLO gives us the ability to say we’re going to commit to having it work the following amount of the time, but we’re not going to bug Liz and Travis or the rest of the team; Irving and Molly are going to check it out to make sure you’re having a good time. Happy customers, happy engineers, happy life. 
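To make that commitment concrete: an SLO target plus a window implies an error budget, which is just the fraction of failure you are allowed over that window. A quick back-of-the-envelope calculation, treating the budget as time (Honeycomb actually counts failing events against the SLI, but the proportion is the same):

```python
# Back-of-the-envelope error budget math: the budget is the fraction of
# failure allowed over the SLO window, shown here as minutes of a 30-day month.
def error_budget_minutes(target: float, window_days: int = 30) -> float:
    return window_days * 24 * 60 * (1 - target)

print(error_budget_minutes(0.9999))  # Shepherd at 99.99%: ~4.3 minutes per 30 days
print(error_budget_minutes(0.99))    # Poodle at 99%: ~432 minutes (7.2 hours)
```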

Kelly Gallamore: 

I like the idea that the team can work together to keep an eye on that too, based on their own focus. We have a question here as well, and Michael, I want you to tell me whether or not we have time to get into this today. The home page error rate seems noisy, like ours. Any tips on making the home page a clearer signal? Is that something you can go into? 

Michael Wilde: 

Just the error rate on the home page?

Kelly Gallamore:

Yeah. 

18:48

Michael Wilde: 

Yeah. Well, sure. Let me give you an example. Let me pop over to Dogfood and just take a look at a few things. Here we go. Share the screen again. Awesome. I shared the very screen I was sharing. It’s raw and real. In this case, my error rate doesn’t look that big because I’m looking at the ingestion API. But it wouldn’t surprise me if there were errors on the front end. Things like that happen. This is just a general set of golden signals. Does the error rate matter? It might. Even if there is a high error rate on this chart over 24 hours, we might end up looking at it over the last 10 minutes, and obviously it might look the same or different. We have some errors that happen at a certain time. I might use this actual error rate to go and further debug. Or I might ignore it, actually. 

There’s a chance that if the SLO for Poodle is doing well, then someone may not drill into this. But this is an example of what some people call RED metrics: requests, right here, error rate, and duration. Just because someone has an error doesn’t mean you have to do something about it. But you may choose to improve either the algorithm here or the description. As you can see, an error in this case might be a response status code of 400, which most people don’t care about; it’s still an error, but it’s just a wrong URL. So we don’t want to pay attention to every single one. But if you’re doing observability, most folks try to look at what’s happening as frequently as possible without raising an alert or an alarm unless an SLO is violated. 
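One practical way to make that home-page error rate a clearer signal, following Michael’s point about 400s, is to count only server-side failures. A sketch of such a query, again as a Python dict in the shape of a Honeycomb query specification, with column names that are assumptions about your own schema:

```python
# One way to sharpen a noisy "error rate": count only 5xx responses, broken
# down by endpoint, instead of everything flagged as an error. The column
# names are assumptions about your own schema.
real_errors_query = {
    "calculations": [{"op": "COUNT"}],
    "filters": [{"column": "response.status_code", "op": ">=", "value": 500}],
    "breakdowns": ["endpoint"],
    "time_range": 600,  # the last 10 minutes, as in the demo
}
```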

Kelly Gallamore: 

I know my own partner has been on call recently, and I know he had a pile of noise; he just couldn’t get it through his teammates’ heads that if they would just turn this one thing off, it would really surface the most important things, and he wished he had something to show them. I haven’t talked them into Honeycomb yet; we’ll get there one day. I appreciate the idea of having a goal to help set your noise level, and that everybody can see it.

For the person who asked that question, if you have more questions about that, please reach out to team@honeycomb.io. I want to make sure you get the answer you were looking for. We have another one for you, Michael. Can you see the questions?

Michael Wilde: 

I can. I have somebody asking: we have a service-oriented architecture, which means that in order for traces to span several services, we have data sets like production and staging that everything is written to, and we use the standard service name field to break down and filter by individual services. Is there a way we can tweak the user history to be grouped by service name instead of data set?

That’s actually a pretty decent idea. I’ll definitely forward this to product, because I’m sure they would like to talk to you more. One of the things you can do with the query history at least (and the person asking this is a customer) is to do a search. The feature isn’t there today, but can I do a search in the query history where I’m searching by specific service names? If I have 10 services, I might even keep 10 links to query history searches. It’s a little bit of a hack, but I understand what they’re asking for: making that kind of grouping work across the entire set of queries. And for the folks that haven’t used Honeycomb, this is exactly what we’re talking about here.

When we were looking at a new query, we saw history on the right-hand side a little bit earlier, team history and all that. That’s for a specific data set. But inside of query history, it’s just a general feed. And if you’ve got a situation where you’ve got things across multiple data sets, you can see everything here. And you can do some searching here: you can search by user, by time range, or by any of the terms that are in there. It might actually be kind of cool to have a little bit more filtering around here. Obviously you can do things like “data set equals this,” but it sounds like maybe they want something different. And that’s a totally cool idea. I’ll pass it on to the team. And actually, it would be great to learn more about what you’re thinking as well. 

Kelly Gallamore: 

Okay. So for the person at Shipps, I hope that answers your question for now; if you have more questions, you can reach out to team@honeycomb.io or to michael@honeycomb.io. Michael, we have about six minutes. Can you tag your own queries in any way to make the query history more useful to others, and maybe highlight ones that are more likely to be useful? I understand this in the tools that I work in every day. What do you do, or what does our team do, in this situation, Michael? Got any pointers?

24:42

Michael Wilde: 

Yeah, I do. That is a fantastic question. Because fairly often, during the day, you’ll just do your work. But there are some situations, like when you’re writing code and you do a commit message, where you may want to describe what it is you’re doing. Obviously, in every commit message you generally do that, and you might have a comment on it. That’s actually really helpful to do in Honeycomb too. So if we scroll through the query history here, we see one query that’s called total span. If I were to load up this query that Alyson did, there are a couple of fields at the very top here. I’m going to do exactly what I said: I’m going to take Alyson’s query and do something different with it. 

Before I jump ahead, check this out on the lower right-hand side, if I can zoom in here. In this case, I’ve actually got to take off my protective gloves to do this. If I scroll to zoom in, do you see how it says “what integrations are customers using to send us, post-Beeline”? That means something to Alyson. Because she, or whoever, put in a title and description, it’s not only searchable, it also shows up here. So if I were to take her query, let’s just say: show where response status code equals 500. Or while we’re at it, I could add a heat map of duration or something like that. So let’s do something raw and real. We’ll say heat map of duration. This is just going to show latency and a count where the request is actually an error. So there’s not too much going on here, but we have a few of these. 

If I click on BubbleUp, I’m going to draw a box around here. Honeycomb is going to do its thing and find the differences between the baseline and the selection and, okay, actually it looks like this is kind of interesting. We see all the errors for this particular request are coming from a Python library. So while we’re at it, I’m going to say: show only where the field is this value. So we’ve done a filter, done a count, done a heat map, filtered by 500s, and filtered to the Python library. And now I’m going to name it “500 status errors coming in through libhoney Python” and save it. So that is saved. It’s not sitting on a dashboard anywhere. I could add it to a board; we could add it to Molly’s board as well, so people could use this as something to look at. But if I end up reloading... I don’t even think I have to reload the page. The title’s right there.

As I’m scrolling through, we see that. And of course, if we go to query history and do a search on something like libhoney Python, we’ll probably end up seeing my query in there. So there’s mine, and then there are also other things, like queries Paul was running. Having the description is a really good idea, because you’ve got this set of artifacts that lasts forever. And if, you know, two years down the road I want to go look at something like that, I won’t remember what I queried, but I’ll kind of remember the scenario. So I might end up searching on that, and it’s a really helpful thing to do.
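For reference, here is an approximation of the query Michael built and saved, written out as data: a count plus a heat map of duration, filtered to 500s coming from the Python library, with the name and description that make it discoverable later. The column names are guesses at the schema, not confirmed ones.

```python
# An approximation of the query built in the demo, written out as data: a
# count plus a heat map of duration, filtered to 500s coming from the Python
# library, saved with a name and description so teammates can find it later.
# The column names are guesses at the schema, not confirmed ones.
saved_query = {
    "name": "500 status errors coming in through libhoney Python",
    "description": "Latency and count for failed ingest requests from the Python library",
    "query": {
        "calculations": [
            {"op": "COUNT"},
            {"op": "HEATMAP", "column": "duration_ms"},
        ],
        "filters": [
            {"column": "response.status_code", "op": "=", "value": 500},
            {"column": "request.user_agent", "op": "contains", "value": "libhoney-py"},
        ],
        "time_range": 7200,  # seconds
    },
}
```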

Kelly Gallamore: 

Wonderful. My favorite part of that is being able to see the question the person is asking. That helps me understand. I can imagine being a new person on a team, not just a new engineer, but someone who’s coming from a different place, trying to jump into the rhythm of a team that’s already going. I really appreciate how that query history works, how the activity shows up, and how the permalinks last forever. Plus you can share them in Slack. How many times do I jump into Slack, which is right over here, and go: oh, I need context, can you send me a screenshot? Can you send me anything? I just came from something else; I’m context switching. I really appreciate how these features exist in our product to help bring everybody together, because the stronger the team, the more resilient they are. 

Michael Wilde: 

The awesome thing about that is that without a feature like history, and without it being searchable, you would end up taking a bunch of links and putting them in your internal documentation, your Quip, your Confluence, whatever you have. That stuff goes stale, your super engineer moves on to a different job, and nobody updates it. Being able to have the entire mental history of what everybody does on your team means people can learn from each other. That’s why I really dig these features in Honeycomb.

Kelly Gallamore: 

Being able to walk around inside people’s brains and figure out what they’re thinking, it’s a really magical place to be. 

So that’s our time for today. It looks like we’ve answered all of our questions. Our next episode of Raw and Real will come out sometime in April. We’re still figuring out our next topics, but we’re thinking about things like: once you find latency or errors, then what? How do you ask the second question? How do you search and find out what’s been deployed lately, especially when something hiccups and it’s either a big issue or you want to get ahead of it before somebody sees it? Or, we talk a lot about SLOs: how can you configure SLOs, or how can you get started with them? Or if you walk into a team and you’re trying to understand it, how can you help iterate to make things better? Those are a few things we’re thinking about. Your questions are going to help us figure out what else you want to know. So I really appreciate the people who have interacted with us today. Here’s what you can do; let me see if I can be smooth about this on this TV station here. 

Michael Wilde: 

Live from Raw and Real.

Kelly Gallamore: 

Exactly. I want to make sure you can get in touch with us if you have more questions. You can reach out to team@honeycomb.io at any time. If you haven’t played with Honeycomb at all and you want to jump into our sandbox, we have a play scenario at play.honeycomb.io. After this episode ends, we’ll send out a survey. We really want to know how you feel about today’s episode, what worked for you and what didn’t, so we can make the next one even better. And if you haven’t started with us yet, you can find our free trial; just go ahead and sign up. All right. I think that’s it for this episode of Raw and Real. Michael?

Michael Wilde: 

Thanks, everybody. That was awesome. I look forward to you all tuning in to Raw and Real TV later on down the road.

Kelly Gallamore: 

Thank you so much, everybody. Have a great day and stay safe and healthy, and we’ll see you in the future. 

If you see any typos in this text or have any questions, reach out to marketing@honeycomb.io.