What’s the practical value of DORA metrics, and how are real engineering organizations using them? To find out, we invited a panel of engineering leaders & industry experts to share their experiences.
Code Climate Senior Product Manager Madison Unell moderated the conversation, which featured:
- Scott Aucoin, Technology Director at Liberty Mutual
- Emily Nakashima, VP of Engineering at Honeycomb
- Karthik Chandrashekar, SVP of Customer Organizations at Code Climate
Over 45 minutes, these panelists discussed real-world experiences using DORA metrics to drive success across their engineering organizations.
Below are some of the key takeaways, but first, meet the panelists:
Scott Aucoin: I work at Liberty Mutual. We’ve got about 1,000 technology teams around the world. We’ve been leveraging DORA metrics for a while now. I wouldn’t say it’s perfect across the board, but I’ll talk about how we’ve been leveraging them. The role that I play is across about 250 teams working on our Agile enablement, just improving our agility, improving things like DevOps and the way that we operate in general; and our portfolio management work…from strategy through execution; and user experience to help us build intuitively-designed things, everything from the frontend of different applications to APIs. So finding ways to leverage DORA throughout those three different hats, it’s been awesome, and it’s really educating for me.
Emily Nakashima: I’m the VP of Engineering at a startup called Honeycomb. I manage an engineering team of just about 40 people, and we’re in the observability space. So basically, building for other developers, which is a wonderful thing. And we had been following along with DORA from the early days and have been enthusiasts and just made this switch over to using the metrics ourselves. So I’m excited to talk about that journey.
Karthik Chandrashekar: I’m the Senior VP of our Customer Organization at Code Climate. I have this cool job of working with all our engineering community customers solving and helping them with their data-driven engineering challenges. DORA is fascinating because I started out as a developer myself many years back, but it’s great to see where engineering teams are going today in a measurement and management approach. And DORA is central to that approach in many of the customer organizations I interact with. So I’m happy to share the insights and trends that I see.
Why did your organization decide to start using DORA?
Emily Nakashima: I first came to DORA metrics from a place of wanting to do better because we’re in the DevOps developer tooling space ourselves. Our executive team was familiar with the DORA metrics, and we had used them for years to understand our customers, using them as a tool to understand where people were in their maturity and how ready they would be to adopt our product…we had this common language around DORA…[At the same time,] our engineering team was amazing, and we weren’t getting the credit for it that we deserved. And by starting to frame our performance around the DORA metrics and show that we were DORA Elite on all these axes, I think it was a really valuable tool for helping to paint that story in a way that felt more objective rather than just me going, “We’ve got a great team.” And so far, honestly, it’s been pretty effective.
Scott Aucoin: Liberty Mutual being a 110-year-old insurance company, there are a lot of metrics. There are some metrics that I think we might say, “Okay, those are a little bit outdated now.” And then there are other ones that the teams use because they’re appropriate for the type of work the teams are doing. What we found to be really valuable about DORA metrics is their consistency…and the ability to really meet our customers and their needs through leveraging DORA metrics.
Karthik Chandrashekar: Speaking with a lot of CTOs and VPs of different organizations, I think there’s a desire to be more and more data-driven. And historically, that has been more around people, culture, teams, all of that, but now that’s transcended to processes and to data-driven engineering.
How did you go about securing buy-in?
Scott Aucoin: This has been pretty grassroots for us. We’ve got about 1,000 technology teams across our organization. So making a major shift is going to be a slow process. And in fact, when it’s a top-down shift, sometimes there’s more hesitancy or questioning like, “Why would we do this just because this person said to do this? Now, all of a sudden, it’s the right thing to do?” So instead, what we’ve been doing and what’s happened in different parts of our organization is bringing along the story of what DORA metrics can help us with.
Emily Nakashima: The thing I love about Scott’s approach is that it was a top-down idea, but he really leveraged this bottom-up approach, starting with practitioners and getting their buy-in and letting them forge the way and help figure out what was working rather than dictating that from above. I think that it’s so important to really start with your engineers and make sure that they understand what and why. And I think a lot of us have seen the engineers get very rightly a little nervous about the idea of being measured. And I think that’s super-legitimate because there have been so many bad metrics that we’ve used in the past to try to measure engineering productivity, like lines of code or PRs merged. I think we knew we would encounter some of that resistance and then just a little bit of concern from our engineering teams about, what does it mean to be measured? And honestly, that’s something we’re still working through. I think the things that really helped us were, one, being really clear about the connection to individual performance and team performance and saying, we really think about these as KPIs, as health metrics that we’re using to understand the system, rather than something we’re trying to grade you on or assess you on. We also framed it as an experiment, which is something our culture really values.
DORA’s performance buckets are based on industry benchmarks, but you’re all talking about measuring at the company level. How do you think about these measures within your company?
Emily Nakashima: This was absolutely something that was an internal debate for us. When I first proposed using these, actually, our COO Jeff was a proponent of the idea as well. So the two of us were scheming on this, but there was real resistance: people pointed out that these metrics were meant for looking at entire cohorts. And there was some real debate as to whether they were meaningful on the individual team or company level. And we are the engineering team that just likes to supplement disagreements with data. So we just said, that might be true, let’s try to measure them and see where it goes. And I will say they are useful for helping us see where we need to look in more detail. They don’t necessarily give you really granular specifics about what’s going wrong with a specific team or why something got better or worse. But I do think that they have had a value just for finding hotspots or seeing trends before you might have an intuition that the trend is taking place. Sometimes you can start to see it in the data, but I think it was indeed a valid critique, ’cause we’re, I think, using them in a way that they’re not designed for.
Something important about the DORA metrics that I think is cool is that each time they produce the report, the way they set the Elite and High and other tiers can change over time. And I like that. And you also see a lot of movement between the categories…And to me, it’s a really good reminder that as engineering teams, if we just keep doing the same thing over and over and don’t evolve our practices, we fall behind the industry and our past performance.
Scott Aucoin: I look at the DORA metrics with the main intent of ensuring that our teams are learning and improving and having an opportunity to reflect in a different way than they’re used to. But also, because of my competitive nature, I look at it through the lens of how we are doing, not just against other insurance companies, which is critical, but setting the bar even further and saying, technology worldwide, how are we doing against the whole industry? And it’s not to say that the data we can get on that is always perfect, but it helps to set this benchmark and say, how are we doing? Are we good? Are we better than anyone else? Are we behind on certain things?
Karthik Chandrashekar: One thing I see with DORA as a framework is its flexibility. So to the debate that Emily mentioned that they had internally, it’s a very common thing that I see in the community where some organizations essentially look at it as an organizational horizontal view of how the team is doing as a group relative to these benchmarks.
What pitfalls or challenges have you encountered?
Karthik Chandrashekar: From a pure trend perspective, best practice is a framework of “message, measure, and manage.” And not doing that first step of messaging appropriately with the proper context for the organization means that it actually can cause more challenges than not. So a big part of that messaging is psychological safety, bringing the cultural safety of, “this is to your benefit for the teams.” It empowers. The second thing is we all wanna be the best, and here’s our self-empowered way to do that. And then thirdly, I think, “how do we use this to align with the rest of the organization in terms of showcasing the best practices from the engineering org?”
So the challenges would be the inverse of the three things I mentioned. When you don’t message, people look at it as, “Oh, I’m being measured. I don’t wanna participate in this.” Or when you measure, you go in with a hammer and say, “Oh, this is not good. Go fix it.” Or then you do measure, and everything is great, but then when you are communicating company-wide or even to the board, then it becomes, “Hey, everything’s rosy, everything is good,” but under the hood, it may not necessarily be…Those are some of the challenges I see.
Emily Nakashima: To me, the biggest pitfall was just, you can spend so much time arguing about how to measure these exact things. DORA has described these metrics with words, but how do you map that to what you’re doing in your development process?
For us in particular, we have an hour-long timed wait for various reasons, because things roll to a staging environment first and get through some automated tests. Our deployment process is an hour. We will wait for 60 minutes plus our test runtime. So we can make incredible progress, making our tests faster and making the developer experience better. And we can go from 70 to 65 minutes, which doesn’t sound that impressive but is incredibly meaningful to our team.
And people could get focused on, “Wait, this number doesn’t tell us anything valuable.” And we had to just say, “Hey, this is a baseline. We’re gonna start with it.” We’re gonna just collect this number and look at it for a while and see if it’s meaningful, rather than spend all this time going back and forth on the exact perfect way to measure. It was so much better to just get started and look at it, ’cause I think you learn a lot more by doing than by finding the perfect metric and measuring it the perfect way.
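To illustrate why a drop from 70 to 65 minutes can matter more than it looks, here is a minimal Python sketch. It assumes the 60-minute wait is fixed and that everything above it is test runtime, which is one reading of "60 minutes plus our test runtime"; the function and variable names are hypothetical and not from Honeycomb's tooling.

```python
# Assumption: the staging wait is a constant 60 minutes, as described in the panel.
FIXED_WAIT_MINUTES = 60


def controllable_minutes(total_minutes: float, fixed_wait: float = FIXED_WAIT_MINUTES) -> float:
    """Return the part of a deploy that is not the fixed wait (here: test runtime)."""
    if total_minutes < fixed_wait:
        raise ValueError("total time cannot be shorter than the fixed wait")
    return total_minutes - fixed_wait


def percent_reduction(before: float, after: float) -> float:
    """Percentage drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


before_total, after_total = 70, 65
before_tests = controllable_minutes(before_total)
after_tests = controllable_minutes(after_total)

total_gain = percent_reduction(before_total, after_total)
test_gain = percent_reduction(before_tests, after_tests)

print(f"end-to-end deploy time: {total_gain:.1f}% faster")
print(f"test runtime only:      {test_gain:.1f}% faster")
```

Under these assumptions, the headline number barely moves while the part the team actually controls is cut in half, which is why a baseline metric can understate real progress.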
Scott Aucoin: You’re going to have many problems beyond your DevOps practices. And Emily, I think the consistency around how you measure it is something we certainly have struggled with. And I would say in some cases, we still wonder if we’re measuring things the right way, even as we’ve tried to set a standard across our org. I’ll add to that, though, and say the complexity of the technology world, in general, is a significant challenge when you’re trying to introduce something that may feel new or different to the team or just like something else that they need to think about…You have to think about it from the standpoint of the priorities of what you’re trying to build, the architecture behind it, security, the ability to just maintain and support your system, your quality, all of the different new technology that we need to consider, ways to experiment, all of that. And then we throw in something else to say, “Okay, make sure you’re looking at this too.” I think just from a time, capacity, and bandwidth perspective, it can be challenging to get folks to focus and think about, okay, how can we improve on this when we have so many other things we need to think about simultaneously?
What are you doing with DORA metrics now?
Scott Aucoin: It’s a broad spectrum. We’re doing all these fantastic things. Some groups are still not 100% familiar with what it means to look at DORA metrics or how to read them.
It’s kind of a map and compass approach. You’re not only looking at a number; you’re able to see from that number what questions you have and how you can learn from it to map out the direction you want to go. So if you’re lagging behind in Deployment Frequency, maybe you want to think more about smaller batches, for example. So within our teams, we’re looking at it through that lens.
And again, it’s not 100% of the teams. In fact, we still have more promotion and adoption to do around that, but we have the data for the entire organization. So we also look at it at the executive level, in the monthly operational reports that go to the global CIO. And I have someone in the back of my mind as I say this, someone I’ve gotten to know over the last few months: Nathen Harvey, a developer advocate for Google’s DORA team, who would say, “The metrics are really for the teams.”
We think about the value of it from the executive level as well. And when we think about the throughput metrics of Deployment Frequency and Lead Time for Changes, it can get a little bit muddy when you roll up thousands of applications to this one number for an exec, especially since many of those applications aren’t being worked on regularly. Some are in more of a maintenance mode. But when we can focus on the ones actively being worked on and think about trends, are we improving our Deployment Frequency or not? It can lead the CIO or any of the CIOs in the organization to ask the right questions and think about, “What can I do to help this?” Especially when it comes to stability, regardless of whether an application is getting worked on actively today or not, we need stability to be there. So we really are looking at them at multiple levels and trying to be thoughtful about the types of questions that we ask based on the information we’re seeing.
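The roll-up problem Scott describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample data is invented, the `App` type and function names are hypothetical, and "actively worked on" is approximated as "had at least one deployment in the window," which is not necessarily how Liberty Mutual defines it.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class App:
    name: str
    deploy_dates: list[date]


def deploys_per_week(app: App, start: date, end: date) -> float:
    """Deployments per week for one app in the window [start, end)."""
    weeks = (end - start).days / 7
    count = sum(1 for d in app.deploy_dates if start <= d < end)
    return count / weeks


def rollup(apps: list[App], start: date, end: date, active_only: bool) -> float:
    """Mean deploys/week across apps.

    Assumption: an app counts as "active" if it deployed at least once in the window.
    """
    rates = [deploys_per_week(a, start, end) for a in apps]
    if active_only:
        rates = [r for r in rates if r > 0]
    return sum(rates) / len(rates) if rates else 0.0


# Invented sample data: a four-week window, two active apps, two in maintenance mode.
start = date(2024, 1, 1)
end = start + timedelta(weeks=4)
apps = [
    App("quotes-api", [start + timedelta(days=3 * i) for i in range(8)]),
    App("claims-ui", [start + timedelta(days=7 * i) for i in range(4)]),
    App("legacy-batch", []),
    App("archived-portal", []),
]

all_apps_rate = rollup(apps, start, end, active_only=False)
active_rate = rollup(apps, start, end, active_only=True)
print(f"all apps:    {all_apps_rate:.2f} deploys/week")
print(f"active only: {active_rate:.2f} deploys/week")
```

With this sample data, the maintenance-mode apps pull the all-apps average down to half of the active-only figure, so the single roll-up number says more about portfolio mix than about how the active teams are deploying. Stability metrics would not be filtered this way, since, as noted above, stability matters for every application whether or not it is being worked on.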
Emily Nakashima: My example is the backward and wrong way to do this. I started by basically just implementing these myself for myself. And the first thing I did with them was to show the stakeholders that I was trying to paint this story to. And I think if you can start with getting practitioners to work with them, getting your software engineers to work with them first, tune them a little bit, and find them relevant, I honestly think that’s the best approach in the organization if you can do it. That wasn’t the situation I happened to be in, but I started with that, used them to radiate these high-level status numbers to other folks on the exec team and the board, and then started to roll them out team by team to allow for that customization.
So we’re still in that process now, but I started to pull managers in one by one and go, hey, these metrics that I’m tracking, this is what they mean to me. Let’s sit down together and figure out what’s gonna be meaningful for your engineering team and how to build on this baseline here…Let’s build on top of it together.
And we’re hoping this quarter to have teams start working with them more directly and get more active in tuning and adding their own metrics. We think about observability for systems, and we always want people to be adding instrumentation to their systems as they go. Each time you deploy a feature, add instrumentation that tells you whether or not it’s working. And we wanna bring that same approach to our engineering teams where we have these baseline metrics. If you don’t think they’re that good and they don’t tell you that much, then you go ahead and tell us what metric we should add, and we’re gonna work together to build this higher-fidelity picture that makes sense to you, and then also have that shared baseline across teams.
To hear more from our panelists, watch the full webinar here.