
Engineering teams are more distributed than ever. Nearly 70% of active engineering positions are open to remote applicants, and many companies have a mix of remote, hybrid, in-person, and contracted employees. So, how do engineering leaders measure performance uniformly across all of these teams? By creating a framework for understanding performance, asking the right questions, and using data to answer them.
Engineering leaders want to mitigate surprises before they impact software delivery or the business. It’s not enough to make decisions based on gut feel or anecdotal performance reviews — especially when an engineering organization is made up of multiple teams with unique working styles and deliverables. To truly understand performance across teams, leaders must establish which metrics are important to their company and create a framework to measure them. Establishing a performance framework ensures that leaders are measuring engineering teams in a consistent and equitable way so they can identify and resolve bottlenecks faster to optimize the flow of work.
Using a common framework like DORA is a great starting point, but leaders must tailor measurement to the needs of their unique team. Traditional engineering metrics, and even frameworks like DORA, can overemphasize the quantity of code that’s produced and underemphasize the quality of that code, how efficiently it was written, or how effectively it solves a specific problem. Solely measuring quantity can result in bloated, buggy code because engineers may prioritize simple features they can get out the door quickly rather than spending time on more complex features that can move the needle for the business.
Adding metrics and context that apply to your specific team can provide a more accurate look at engineering performance. For example, to understand team productivity, leaders may look at engineering metrics like Mean Lead Time for Change (MLTC) alongside Cycle Time; a high MLTC often goes hand in hand with a high Cycle Time. These metrics can be viewed in tandem with other metrics like Time to Open, Time to Merge, and Time to First Review to understand where changes need to be made. They can then be compared across teams to understand which teams are performing well and establish best practices across the organization.
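As an illustration, here is a minimal Python sketch of that kind of decomposition, assuming PR records that expose first-commit, open, first-review, and merge timestamps. The field names are hypothetical rather than any specific platform’s API:

```python
from datetime import datetime

# Hypothetical PR records; field names are illustrative, not a specific API.
prs = [
    {
        "first_commit_at": datetime(2024, 1, 2, 9, 0),
        "opened_at": datetime(2024, 1, 2, 15, 0),
        "first_review_at": datetime(2024, 1, 3, 11, 0),
        "merged_at": datetime(2024, 1, 4, 10, 0),
    },
    # ... more PRs
]

def hours(delta):
    return delta.total_seconds() / 3600

def avg(values):
    return sum(values) / len(values)

# Decompose Cycle Time (first commit -> merge) into its stages.
time_to_open = avg([hours(p["opened_at"] - p["first_commit_at"]) for p in prs])
time_to_first_review = avg([hours(p["first_review_at"] - p["opened_at"]) for p in prs])
time_to_merge = avg([hours(p["merged_at"] - p["first_review_at"]) for p in prs])
cycle_time = avg([hours(p["merged_at"] - p["first_commit_at"]) for p in prs])

print(f"Cycle Time: {cycle_time:.1f}h = open {time_to_open:.1f}h "
      f"+ first review {time_to_first_review:.1f}h + merge {time_to_merge:.1f}h")
```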
Data-driven insights can provide engineering leaders with objective ways to evaluate developer competency, assess individual progress, and spot opportunities for improvement. While quarterly KPIs and annual performance reviews are great goalposts, managers are constantly thinking about how their teams are progressing toward those targets. Reviewing engineering metrics on a monthly basis is a good way to assess month-over-month progress and performance fluctuations on an individual level and a team level. Which metrics a team considers depends on its defined framework and overall company goals. Here are a few to consider:
- Looking at authoring and reviewing metrics together can show how the two key responsibilities of writing and reviewing code are spread across a team.
- Pairing review-depth metrics with their outcomes helps leaders understand what level of Code Review thoroughness produces the desired result.
- To understand the effect that back-and-forth cycles in Code Review have on shipping speed, leaders can look at Review Cycles vs. Cycle Time.
- Comparing Impact and Rework shows which teams are making the most significant changes to the codebase and how efficiently they are doing so.
Understanding and communicating engineering team performance is an effective way to ensure teams are aligned and that all requirements are understood and met. Making this a standard across the engineering organization — especially in a distributed or hybrid environment — is essential to its success. How leaders communicate their findings is just as important as gathering the information. When feedback is a fundamental part of a blameless team culture, team members understand that feedback is critical to growing as a team and achieving key goals, and will likely feel more secure in sharing ideas, acknowledging weaknesses, and asking for help. Leaders can tailor the questions listed above to meet the unique needs of their organizations and use engineering metrics as a way to understand, communicate, and improve team performance.

To deliver innovative products and experiences, engineering teams must work efficiently without compromising quality. Over the years, the software development lifecycle (SDLC) has evolved to include code reviews to ensure this balance. But, as engineering teams grow, so can the complexity of the review process. From understanding industry benchmarks to improving alignment across teams, this article outlines strategies that large engineering organizations can use to optimize Review Cycles.
The Review Cycles metric measures the number of times a Pull Request (PR) goes back and forth between an author and a reviewer. The PR review process is an essential component of PR Cycle Time, which measures the time from when the first commit in a PR is authored to when it’s merged. Leaders use this data to understand how long it takes to deliver innovation and establish baseline productivity for engineering teams.
Consider a PR for a new feature. Before the PR gets merged, it must be reviewed by a member of the team. If the PR gets approved and merged in a single cycle with no further interaction from the author, then the Review Cycle count is one. If the PR is not approved and requires changes, then the author must make an additional commit. The reviewer then checks the new version before it’s approved. In this scenario, the number of Review Cycles is two. This number increases as the PR is passed back and forth between the author and the reviewer.
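The counting logic is simple enough to sketch. The snippet below is an illustrative simplification that assumes a PR’s history is available as an ordered log of author commits and reviewer passes:

```python
# Hypothetical, simplified event log for one PR, ordered by time.
# Each entry is ("commit", actor) or ("review", actor).
events = [
    ("commit", "author"),
    ("review", "reviewer"),   # cycle 1: changes requested
    ("commit", "author"),
    ("review", "reviewer"),   # cycle 2: approved
]

def review_cycles(events):
    """Count reviewer passes that follow fresh commits from the author."""
    cycles = 0
    pending_changes = False  # are there commits the reviewer hasn't seen?
    for kind, _actor in events:
        if kind == "commit":
            pending_changes = True
        elif kind == "review" and pending_changes:
            cycles += 1
            pending_changes = False
    return cycles

print(review_cycles(events))  # -> 2, matching the scenario above
```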
By evaluating engineering metrics across enterprise companies, Code Climate identified a pattern in high-performing teams. Research found that the top 25% of organizations have an average of 1.1 Review Cycles, whereas the industry average is 1.2 cycles. When Review Cycles surpass 1.5, it’s time to investigate why.
A high number of Review Cycles in engineering might stem from a combination of challenges that hinder the efficiency of the process. These include differing interpretations of what constitutes "done," misalignment between the expected changes and the actual changes resulting from the review, or conflicting views on the best approach to implement a solution. If there are anomalies where Review Cycles are high for a particular submitter, it could indicate they’re struggling with the codebase or aren’t clear about the requirements. This presents an opportunity for leadership to provide individualized coaching to help the submitter improve the quality of their code.
The first step in addressing a high number of Review Cycles is to identify the reason PRs are being passed back and forth, which requires both quantitative and qualitative information. By looking at Review Cycles alongside other PR metrics, leaders can look for correlations. For example, Review Cycles tend to be high when PR Size is high. If this is true in your organization, it might be necessary to re-emphasize coding best practices and encourage keeping PRs small.
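To check whether that relationship holds in your own data, a quick sketch like this can help. The per-PR tuples of lines changed and Review Cycles are invented sample values:

```python
# Hypothetical per-PR data: (lines changed, review cycles).
prs = [(40, 1), (85, 1), (220, 2), (480, 3), (900, 4), (120, 2)]

sizes = [s for s, _ in prs]
cycles = [c for _, c in prs]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

print(f"PR Size vs. Review Cycles correlation: {pearson(sizes, cycles):.2f}")
```

A strongly positive coefficient on your own data would support re-emphasizing small PRs as described above.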
Leaders might also want to do a closer review of PR data to understand which PRs have the highest Review Cycles. They can bring this information to the teams working on those PRs to uncover what exactly is causing the PRs to bounce around in review. Maybe there’s a misalignment that can be worked through, or requirements are shifting while the project is in progress. Leaders can work with teams to find solutions to limit the number of times PRs are volleyed back and forth by establishing expectations for reviews, how solutions should be implemented, and when a review is complete. Best practices for the PR review process should be documented and referenced by all team members.
For large engineering organizations, paying attention to Review Cycles is essential. Keeping Review Cycles low can boost efficiency, productivity, and innovation by minimizing delays and facilitating swift project progression. In addition, when Review Cycles are high, it can be a signal of a bigger issue that needs to be addressed, like a misalignment within a team, or a failure to maintain best practices.

Google Cloud’s DevOps Research and Assessment (DORA) team’s 2023 Accelerate State of DevOps report examines the relationship between user-facing strategies, process enhancements, and culture and collaboration, and their impact on engineering performance.
The DORA team re-emphasizes the four key DORA metrics, essential benchmarks for gauging the speed and stability of an engineering organization. These metrics are a baseline for any engineering team looking to improve, and a gateway to a more data-driven approach to engineering leadership. Pairing DORA metrics with other engineering metrics can unlock critical insights about a team’s performance.
However, the 2023 report makes significant strides in broadening its approach to measurement. It recognizes that the four foundational metrics are an essential starting point, but also highlights additional opportunities for enhancing engineering performance. As teams continue on their data-driven journey, there are more dimensions of team health to explore, even in areas that don’t initially seem to lend themselves to measurement.
Two such areas highlighted in this year’s report are code review — an important window into a team’s ability to communicate and collaborate — and team culture.
Notably, the report’s most significant finding indicates that accelerating the code review process can lead to a 50% improvement in software delivery performance. Many development teams are disappointed with their code review processes, yet they recognize how important those processes are. Effective code reviews foster collaboration, knowledge sharing, and quality control. And, according to the report, an extended gap between code completion and review adversely affects developer efficiency and software quality.
At Code Climate, we’ve identified a few key strategies for establishing an effective code review process. First, it’s important for teams to agree on the objective of review. This ensures they know what type of feedback to provide, whether it’s comments pertaining to bug detection, code maintainability, or code style consistency.
It’s also important for leaders to create a culture that prioritizes code review. Help your teams understand that in addition to safeguarding quality, code review facilitates knowledge sharing and collaboration. Rather than working in a silo to ship code, developers work together and help each other. Outlining expectations — developers are expected to review others’ code in addition to writing their own — and setting targets around code review metrics can help keep review a priority.
Code Review Metrics
Leaders at smaller companies may be able to understand the workings of their code review process by talking to team members. However, leaders at enterprises with large or complex engineering teams can benefit from using a Software Engineering Intelligence (SEI) platform, like Code Climate, to act on DORA’s findings by digging into and improving their code review processes.
An SEI platform offers essential metrics like Review Speed, which tracks the time it takes from opening a pull request to the first review submission, and Time to First Review, which represents the average time between initiating a pull request and receiving the first review. These metrics can help leaders understand the way code is moving through the review process. Are PRs sitting around waiting for review? Are there certain members of the team who consistently and quickly pick PRs up for review?
Reviewing these metrics with the team can help leaders ensure that team members have the mindset — and time in their day — to prioritize code review, and determine whether the review load is balanced appropriately across the team. Review doesn’t have to be completely equally distributed, and it’s not uncommon for more senior team members to pick up a greater proportion of PRs for review, but it’s important to ensure that the review balance meets the team’s expectations.
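As a rough sketch of such a check, independent of any particular platform, the snippet below computes an average Time to First Review and the spread of first reviews across reviewers from hypothetical records:

```python
from collections import Counter
from datetime import datetime

# Hypothetical review records: reviewer plus PR open / first-review times.
reviews = [
    {"reviewer": "ada", "opened_at": datetime(2024, 3, 1, 9), "first_review_at": datetime(2024, 3, 1, 13)},
    {"reviewer": "grace", "opened_at": datetime(2024, 3, 1, 10), "first_review_at": datetime(2024, 3, 2, 10)},
    {"reviewer": "ada", "opened_at": datetime(2024, 3, 2, 9), "first_review_at": datetime(2024, 3, 2, 11)},
]

# Average Time to First Review, in hours.
waits = [(r["first_review_at"] - r["opened_at"]).total_seconds() / 3600 for r in reviews]
print(f"Avg Time to First Review: {sum(waits) / len(waits):.1f}h")

# How the first-review load is spread across the team.
load = Counter(r["reviewer"] for r in reviews)
for reviewer, count in load.most_common():
    print(f"{reviewer}: {count} first reviews ({count / len(reviews):.0%} of load)")
```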
A Note About Bottlenecks
The DORA report noted that even if code reviews are fast, teams are still unlikely to improve software delivery performance if speed is constrained in other processes. “Improvement work is never done,” the report advises. “Find a bottleneck in your system, address it, and repeat the process.”
Data from an SEI platform can help leaders continue the work of identifying and removing bottlenecks. With the right information, leaders gain visibility and can make informed decisions, detecting bottlenecks in the software development pipeline and empowering developers to collaborate on effective solutions. They can also validate assumptions, track changes over time, identify improvement opportunities upstream, scale successful processes, and assist individual engineers in overcoming challenges.
Though the DORA team highlights the importance of effective processes, it also found that culture plays a pivotal role in shaping employee well-being and organizational performance. The researchers found that cultivating a generative culture that emphasizes belonging drives a 30% rise in organizational performance. Addressing fair work distribution is also crucial: underrepresented groups and women face higher burnout rates due to repetitive work, underscoring the need for more inclusive work cultures. To retain talent, encourage innovation, and deliver more business value, engineering leaders must prioritize a healthy culture.
Just as they can provide visibility into processes, SEI platforms can give leaders insight into factors that shape team health, including leading indicators of burnout, psychological safety, and collaboration, and opportunities for professional development.
It’s fitting that the DORA report identifies code review as a process with a critical link to team performance: it’s a process, but it also provides insight into a team’s ability to collaborate. Metrics like Review Speed, Time to First Review, and Review Coverage all send signals about a team’s attitude toward, and facility with, collaboration.
Other data can raise flags about team members who might be headed towards burnout. The Code Climate platform's Coding Balance view, for example, highlights the percentage of the team responsible for 80% of a team’s significant work. If work is uneven — if 10% of the team is carrying 80% of the load — it can indicate that some team members are overburdened while others are not being adequately challenged.
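A concentration check along these lines can be approximated from raw contribution counts. The sketch below, using made-up numbers, finds the smallest share of the team responsible for 80% of the significant work:

```python
# Hypothetical significant-work counts per team member (e.g., substantial PRs).
work = {"dev_a": 46, "dev_b": 31, "dev_c": 9, "dev_d": 8, "dev_e": 6}

def share_doing_80_percent(work):
    """Smallest fraction of the team responsible for 80% of the work."""
    counts = sorted(work.values(), reverse=True)
    total, running = sum(counts), 0
    for i, c in enumerate(counts, start=1):
        running += c
        if running >= 0.8 * total:
            return i / len(counts)
    return 1.0

print(f"{share_doing_80_percent(work):.0%} of the team does 80% of the work")
```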
Data is the Key to Acting on DORA Findings
The findings from the DORA report are clear: even those teams that are successfully using the four DORA metrics to improve performance should look at other dimensions as well. Prioritizing process improvements like code reviews and promoting a healthy team culture are instrumental to performance — and data can help leaders learn more about these aspects of their team. Request a consultation to find out more about using an SEI platform to action the 2023 DORA findings.
EverQuote prides itself on having a data-driven culture. But even with this organization-wide commitment, its engineering team initially struggled to embrace metrics that would help them understand and improve performance. Many leaders would give up after a failed attempt, but Virginia Toombs, VP of Engineering Operations, took a step back to understand what went wrong so they could try again — and succeed.
Along the way, the EverQuote team learned what to avoid when implementing engineering metrics and how to successfully roll them out. For them, it was all about empowering team members, collecting actionable data in the right platform, and correlating metrics for a more holistic view of what was happening across the organization.
When EverQuote decided to measure engineering productivity a few years ago, it started by purchasing a tool, as many organizations do. But it ran into problems because it hadn’t considered what to measure or how to leverage those insights to improve performance. Because EverQuote didn’t know which engineering metrics best suited its unique team and processes, it ended up with a tool that didn’t have mature throughput or flow metrics — two things it would learn were core to its success. The result? Virginia and her team saw detailed engineering metrics but lacked a comprehensive view of the organization’s performance.
This issue caused a domino effect. Measuring only granular metrics made team members feel that individual performance was being judged, rather than the process itself, and enthusiasm for the program came to a halt. That’s when the engineering operations team decided to rethink the approach and start from scratch.
EverQuote’s engineering operations team is a central function within engineering whose main goal is to create an environment where engineers can thrive. This team optimizes processes, encourages collaboration, and coaches on agile techniques. For them, it’s essential to understand how engineering processes are performing so they can make data-driven decisions to improve. This team made two important decisions when rolling out engineering metrics for the second time.
First, they took the time to understand which engineering metrics applied to their organization. Rather than starting with granular metrics, they decided to lead with the big picture, adopting the four original DORA metrics: Deployment Frequency (DF), Mean Lead Time for Changes (MLTC), Mean Time to Recover (MTTR), and Change Failure Rate (CFR). From these high-level metrics, they would still be able to identify bottlenecks or issues and drill down into more granular metrics as needed.
To support DORA, and to provide visibility into its corresponding metrics, EverQuote adopted Code Climate. With Code Climate's Software Engineering Intelligence platform, they could identify organizational trends, look at data by teams or applications, and dig into specific DORA metrics. For example, if they see that MLTC is high, they can click into it to see exactly where the holdup is — maybe a long Time to Open or Time to First Review is preventing the PRs from getting to production as expected. Starting at a high level helps them understand their systems holistically, and then they can drill down as needed, which is more efficient and saves team members from metric fatigue.
Second, they empowered teams to own their metrics by educating them on how to read and interpret the data and by creating processes to discuss performance at the end of a sprint. They held these conversations as a team, not during one-on-ones, and focused on how they could better collaborate to improve as a unit. This strategy exemplifies one of EverQuote’s core principles: If you work as a team, you succeed as a team.
The EverQuote journey to measurement has come full circle. Now, engineers embrace engineering metrics as a tool for continuous improvement. After two iterations of implementing metrics, the team has learned three major lessons for successful adoption:
- Empower team members to own their metrics.
- Collect actionable data in a platform that supports both high-level and granular views.
- Correlate metrics for a holistic view of what’s happening across the organization.
Combining DORA DevOps metrics with other engineering metrics in Code Climate's insights platform has helped EverQuote nurture its data-driven culture. To learn more about successfully rolling out engineering metrics within your organization, request a consultation.

Objective data is a necessary complement to engineering leadership. By digging into the key metrics of the Software Development Life Cycle (SDLC), leaders can better assess and improve engineering processes and team performance.
Each organization may choose to focus on a different set of engineering metrics based on their goals. Our collaboration with thousands of organizations has revealed a key set of metrics that have proven valuable time and again, including Review Cycles, a key factor in Pull Request (PR) Cycle Time.
In this blog, we’ll dig into the importance of the Review Cycles metric, what it can tell you about the health of your teams and processes, and how to improve it.
The Review Cycles metric refers to Pull Request reviews, and measures the number of times a PR goes back and forth between an author and reviewer.
The Pull Request review process is an essential component of PR Cycle Time, which measures the time from when the first commit in a pull request is authored to when that PR is merged. Looking at Pull Request data can help leaders understand teams’ Time to Market and baseline productivity over time.
Whether the PR is created to ship a new feature or fix a bug, for example, the proposed change needs to be reviewed by a member of the team before it is merged. If that PR gets approved and merged with no further interaction from the author, then the Review Cycle count is 1 — the changes were reviewed and approved in a single cycle.
If a PR does not get approval and requires changes, then the author must make an additional commit. The reviewer then checks the new version before it is approved. In this scenario, the number of Review Cycles is 2. Of course, this number increases as the PR is passed back and forth between author and reviewer.
Reviewing Pull Requests requires software engineers to context switch and focus on one particular line of work. When this happens often, and Review Cycles are high, the PR review process can spread engineers’ attention too thin. It can also become a bottleneck to shipping.
There are various reasons why Review Cycles may be high:
- Differing interpretations of what constitutes “done”
- Misalignment between the changes a reviewer expects and the changes the author actually makes
- Conflicting views on the best approach to implement a solution
If the Review Cycle metric is high for a particular submitter, it could mean that they’re struggling with the codebase or dealing with unclear requirements.
While data can offer a concrete number of Review Cycles for a specific PR, it does not tell the whole story. If a specific developer has a high number of Review Cycles tied to their work, engineering leaders should open a dialogue with both the developer and the reviewer to pinpoint the potential cause.
Sure, they may be struggling with the codebase because they are new to it, but it’s also possible that their teammates are unfairly scrutinizing their work. There are a number of potential biases that could be skewing perception of ICs and their work. One engineering leader was able to use data from Code Climate's platform to uncover that a woman engineer’s PRs were moving through review disproportionately more slowly than those of her male counterparts, and concluded that bias was a problem within the team.
To identify what’s affecting your teams’ Review Cycles and PR review process overall, start by examining the data. It will give you a starting point for a conversation with the team and the ICs involved so you can align on processes.
When a developer first joins a team, it may take time for them to get up to speed. Looking at Review Cycles in a Software Engineering Intelligence (SEI) platform allows leaders to observe changes and progress over time. With these insights, you can measure the ramp time for newly onboarded engineers by observing whether their Review Cycles decrease over time. If Review Cycles for new hires are not decreasing at the expected rate, leaders may want to further investigate the efficacy of onboarding processes and ensure that new developers have the tools they need to excel in their roles.
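One lightweight way to track this, sketched here with invented numbers, is to average a new hire’s Review Cycles month over month and confirm the trend is decreasing:

```python
from statistics import mean

# Hypothetical Review Cycle counts for a new hire's PRs, grouped by month.
monthly_cycles = {"month 1": [3, 4, 3], "month 2": [2, 3, 2], "month 3": [1, 2, 1]}

ramp = {month: mean(cycles) for month, cycles in monthly_cycles.items()}
for month, avg_cycles in ramp.items():
    print(f"{month}: {avg_cycles:.1f} avg Review Cycles")

# A steadily decreasing trend suggests the engineer is ramping as expected.
values = list(ramp.values())
print("Ramping as expected:", all(a >= b for a, b in zip(values, values[1:])))
```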
When you use a Software Engineering Intelligence (SEI) platform like Code Climate, you can gain visibility into the entire PR review process. The Analytics module in Code Climate's platform is a good place to investigate PR review processes. You’ll want to run a query for Review Cycles, or the number of times a Pull Request has gone back and forth between the author and reviewer. Here’s how:
Click on the arrow next to the Review Cycle number to see all of Hecate’s individual PRs from the selected timeframe. Sort the number of Review Cycles from high to low and start with the ones that have been open the longest. In this case, the top two PRs, which have both undergone 4 Review Cycles and are still open, are worth bringing to a standup, retro, or 1-on-1.
Prepare for a standup, retro, or 1-on-1 with developers by taking a look at Pull Request data. This will allow you to be more informed ahead of a meeting, and be able to focus specifically on units of work rather than developers or teams themselves.
Ask your team questions about specific PRs with high Review Cycles to uncover where the misalignment is happening. Work with the team to find solutions to limit the number of times a PR is volleyed back and forth by establishing what is expected in a review, how solutions should be implemented, and when a review is complete. Document best practices for the Pull Request review process to use as a reference in the future.
Code Climate has produced proprietary benchmarks that engineering leaders can use. We have found that the top 25% of organizations have an average of 1.1 Review Cycles or less, whereas the industry average is 1.2 cycles. If Review Cycles are above 1.5, it’s time to investigate why.
Review Cycles are one of many critical metrics that engineering leaders can measure to understand and improve team health and processes. By looking at Review Cycles alongside other Pull Request-related metrics, you can uncover the cause of a slowdown and make informed decisions towards improvement. The data is only a starting point, however — it’s essential that leaders speak directly to teams in order to find a sustainable solution.
Request a consultation to learn how measuring engineering metrics can lead to faster software delivery and healthier teams.

The DORA (DevOps Research and Assessment) research group, now part of Google Cloud, identified four key software engineering metrics that its research showed have a direct impact on a team’s ability to improve deploy velocity and code quality, which in turn directly impacts business outcomes.
The four outcomes-based DORA metrics include two incident metrics, Mean Time to Recovery (MTTR) (also referred to as Time to Restore Service) and Change Failure Rate (CFR), and two deploy metrics, Deployment Frequency (DF) and Mean Lead Time for Changes (MLTC).
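For teams computing these outside a dedicated platform, the sketch below shows one plausible way to derive all four from deploy and incident records. The record shapes and the 30-day window are assumptions for illustration:

```python
from datetime import datetime
from statistics import mean

# Hypothetical deploy and incident records over a 30-day window.
deploys = [
    {"deployed_at": datetime(2024, 5, 1, 12), "first_commit_at": datetime(2024, 4, 29, 9), "caused_failure": False},
    {"deployed_at": datetime(2024, 5, 3, 15), "first_commit_at": datetime(2024, 5, 2, 10), "caused_failure": True},
]
incidents = [
    {"reported_at": datetime(2024, 5, 3, 16), "resolved_at": datetime(2024, 5, 3, 20)},
]
window_days = 30

# Deployment Frequency (DF): deploys per day over the window.
df = len(deploys) / window_days

# Mean Lead Time for Changes (MLTC): first commit to production, in hours.
mltc = mean((d["deployed_at"] - d["first_commit_at"]).total_seconds() / 3600 for d in deploys)

# Change Failure Rate (CFR): share of deploys that caused a failure.
cfr = sum(d["caused_failure"] for d in deploys) / len(deploys)

# Mean Time to Recovery (MTTR): reported to resolved, from real incident data.
mttr = mean((i["resolved_at"] - i["reported_at"]).total_seconds() / 3600 for i in incidents)

print(f"DF: {df:.2f}/day  MLTC: {mltc:.1f}h  CFR: {cfr:.0%}  MTTR: {mttr:.1f}h")
```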
Gaining visibility into these metrics offers actionable insights to balance and enhance software delivery, so long as they are considered alongside other key engineering metrics and shared and discussed with your team.
The Mean Time to Recovery metric can help teams and leaders understand the risks that incidents pose to the business, since incidents can cause downtime, performance degradation, and bugs that make an application unusable.
Mean Time to Recovery is a measurement of how long it takes for a team to recover from a failure in production, from when the failure was first reported to when it was resolved. To improve this metric and prevent future incidents, we suggest calculating MTTR from actual incident data rather than proxy data, which can be error-prone. While the team may experience other kinds of incidents, MTTR should only look at the recovery time of incidents that cause a failure in production.
Even for high-performing teams, failures in production are inevitable. MTTR offers essential insight into how quickly engineering teams respond to and resolve incidents and outages. Digging into this metric can reveal which parts of your processes need extra attention; if you’re delivering quickly but experiencing frequent incidents, your delivery is not balanced. By surfacing the data associated with your teams’ incident response, you can begin to investigate the software delivery pipeline and uncover where changes need to be made to speed up your incident recovery process.
Recovering from failures quickly is key to becoming a top-performing software organization and meeting customer expectations.
Each year, the DORA group puts out a State of DevOps report, which includes performance benchmarks for each DORA metric, classifying teams as high, medium, and low-performing. One of the most encouraging and productive ways to use benchmarking in your organization is to set goals as a team, measure how you improve over time, and congratulate teams on that improvement, rather than using “high,” “medium,” and “low” to label team performance. Additionally, if you notice improvements, you can investigate which processes and changes enabled teams to improve, and scale those best practices across the organization.
More than 33,000 software engineering professionals have participated in the DORA survey in the last eight years, yet the approach to DORA assessment is not canonical and doesn’t require precise calculations from respondents; different participants may interpret the questions differently and offer only their best assumptions about their teams’ performance. That said, the DevOps report can provide a baseline for setting performance goals.
The results of the 2022 State of DevOps survey showed that high performers had a Mean Time to Recovery of less than one day, while medium-performing organizations were able to restore normal service between one day and one week, and low-performing organizations took between one week and one month to recover from incidents. For organizations managing applications that drive revenue, customer retention, or critical employee work, being a high performer is necessary for business success.
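Expressed as code, classifying a team against those tiers is just a matter of encoding the thresholds quoted above; this sketch does exactly that:

```python
from datetime import timedelta

def mttr_tier(mttr: timedelta) -> str:
    """Classify a team's MTTR against the 2022 tiers cited above."""
    if mttr < timedelta(days=1):
        return "high"
    if mttr <= timedelta(weeks=1):
        return "medium"
    if mttr <= timedelta(days=30):
        return "low"
    return "slower than the low tier"

print(mttr_tier(timedelta(hours=6)))  # -> high
print(mttr_tier(timedelta(days=4)))   # -> medium
```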
Visibility into team performance and all stages of your engineering processes is key to improving MTTR. With more visibility, you can dig into the following aspects of your processes:
- A long MTTR could indicate that developers have too much WIP and lack adequate resources to address failures.
One of the benefits of using a Software Engineering Intelligence (SEI) platform is that you can add important context when looking at your MTTR. An SEI platform like Code Climate, for example, allows you to annotate when you made organizational changes — like adding headcount or other resources — to see how those changes impacted your delivery.
You can also view DORA metrics side by side with other engineering metrics, like PR Size, to uncover opportunities for improvement. Smaller PRs can move through the development pipeline more quickly, allowing teams to deploy more frequently. And when PRs are smaller, it’s easier to pinpoint the change that caused an outage. For example, is debugging taking up a lot of time for engineers? Looking at other data like reverts or defects can help identify wasted effort or undesirable changes that are affecting your team’s ability to recover, so you can improve the areas of your process that need it most.
What did you learn from assessing your team’s incident response health? Documenting an incident-response plan that can be used by other teams in the organization and in developer onboarding can streamline recovery.
To improve your team’s incident response plan, it’s helpful to use an automated incident management system, like Opsgenie or PagerDuty. With an SEI platform like Code Climate, you can push incident data from these tools, or from our Jira incident source, to calculate DORA metrics like MTTR. In Code Climate's platform, users can set a board and/or issue type that tells the platform what to consider an “incident.”
We spoke with Nathen Harvey, Developer Advocate at DORA and Google Cloud, for his perspective on how to best use DORA metrics to drive change in an organization. Harvey emphasized learning from incident recovery by speaking with relevant stakeholders.

Looking at DORA metrics like Mean Time to Recovery is a key starting point for teams that want to improve performance and ensure faster, more stable software delivery. By looking at MTTR in the context of organizational changes and alongside other engineering metrics, speaking with your team after an incident, and documenting and scaling best practices, you can improve MTTR overall and ultimately deliver more value to your customers.
Learn how you can use these metrics to enhance engineering performance and software delivery by requesting a consultation.

Engineering teams know that technical debt, or “tech debt,” is an inevitable, and often necessary, part of software development. Yet it can be difficult to explain the significance of tech debt to stakeholders and C-suite leadership. While stakeholders might want to prioritize constant innovation over paying down tech debt, letting tech debt build up can ultimately slow down an engineering team. When that happens, it can be challenging to show that the resulting delays and complications are not a reflection of the engineering team’s skill.
What are the reasons an engineering team might accrue tech debt, and how can they overcome it before it impacts delivery?
Technical debt is a term used to describe the implications of immature code being pushed through the software development pipeline to expedite delivery. Because the code was merged prematurely, or was a quick fix to a complex problem, it often needs to be refactored or redone, resulting in a backlog of work that will need to be taken on at some point in the future.
The term "technical debt" was first coined by Ward Cunningham, who posited that "a little debt speeds development so long as it is paid back promptly with refactoring. The danger occurs when the debt is not repaid."
Tech debt can be thought of as similar to financial debt. Taking out a loan for a large purchase makes it possible to expedite the purchase, rather than waiting to save up a large sum of cash. In exchange, you must repay the loan plus interest, which compounds over time.
With technical debt, the interest is not only the extra developer time spent refactoring the code, but also the consequences of not addressing that refactoring early on. As the work builds up and other work is prioritized, going back to deal with the technical debt becomes increasingly costly and difficult. In this sense, the time needed to address tech debt grows, much like interest.
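The analogy can be made concrete with a toy model. The sketch below assumes, purely for illustration, that the cost of a deferred fix grows 10% per sprint, compounding like loan interest; the rate and starting cost are invented numbers, not measured figures:

```python
# Illustrative only: model deferred refactoring cost compounding like interest.
initial_fix_cost = 5.0   # developer-days to refactor now (assumed)
growth_rate = 0.10       # assumed extra cost per sprint deferred

for sprints_deferred in (0, 4, 8, 12):
    cost = initial_fix_cost * (1 + growth_rate) ** sprints_deferred
    print(f"after {sprints_deferred:>2} sprints: {cost:.1f} developer-days")
```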
First, it’s important to note that technical debt is an inevitable part of remaining competitive in the industry, and doesn’t necessarily imply that an engineering team has done something “wrong.”
Similar to financial debt, there are reasons for intentionally racking up technical debt. The marketplace today moves lightning fast and, to stay afloat, you might opt for shortcuts that lead to technical debt in order to ship new features quickly and bring in revenue.
The associated tech debt you take on might be worth it when you compare it against the downsides of waiting to bring your features to the market. This is completely normal — the danger arises when, as Cunningham said, the debt isn't properly repaid.
Instead of developing new features, engineers are often left to work through technical debt, further slowing innovation and impacting a slew of business outcomes.
Even though there are good reasons why organizations accrue tech debt, the earlier it’s addressed, the better. It’s vital for engineering leaders to pay attention to tech debt and be aware of the issues it can pose to the organization.
To minimize or overcome tech debt, start by investigating the source.
Engineering leaders can take a cue from one Code Climate customer, and use a Software Engineering Intelligence (SEI) platform — sometimes known as an Engineering Management Platform (EMP) — to demonstrate how tech debt can limit deployment. The engineering team at a popular crowdsourcing platform often worked with legacy code, and had nearly a decade’s worth of tech debt.
The company’s VP of Engineering had a relatively easy time getting developers on board to prioritize the backlog of tech debt. When it came to getting executive buy-in, however, the VP of Engineering needed concrete data to present to stakeholders in order to justify dedicating resources to refactoring the legacy code.
Using Code Climate's solutions, the engineering leader was able to demonstrate, in real time, how many Pull Requests (PRs) were left open for longer than is ideal while authors and reviewers sent comments back and forth. Code Climate's insights showed this as a lasting trend with high-risk PRs stacking up. They used this as evidence to executives that legacy code was significantly impacting deployment.
Once you outline how to tackle your current tech debt, think about how you can manage new debt going forward. Team leaders might decide to monitor high-risk PRs over time to ensure that tech debt does not become insurmountable, or have developers take turns refactoring legacy code while others put their efforts toward innovation. Use concrete evidence from an SEI platform to request additional resources. Once you find what works, you can scale those best practices across the organization.
Technical debt is inevitable, and even mature engineering teams will need a strategy for mitigating the debt they’ve accrued. Communicate with your company leadership about tech debt and its implications, work to find the root cause within your teams, and adopt a slow-but-steady approach towards a resolution.
You will never be able to address and solve all technical debt at once, but you can prioritize what to tackle first and move toward a more efficient future.
A Software Engineering Intelligence platform can provide the visibility leaders need to refine engineering processes. Request a consultation to learn more.

Performance reviews in engineering are often tied to compensation. End-of-year and mid-point check-ins can be great opportunities to discuss individual goals and foster professional growth, but too often they are used as ways to assess whether direct reports are eligible for raises or promotions based on their ability to hit key metrics.
For engineering leaders, a more valuable way to use performance reviews is as a structured opportunity to have a conversation about a developer’s progress and work with them to find ways to grow in their careers through new challenges. This benefits both developers and organizations as a whole — developers are able to advance their skills, and companies are able to retain top engineers who are continually evolving. Even senior engineers look for opportunities for growth, and are more likely to stay at an organization that supports and challenges them.
The key to achieving this is to focus on competency, rather than productivity measurements, when evaluating compensation and performance. But how do we define competency in engineering?
Where productivity might measure things like lines of code, competency looks at how an engineer approaches a problem, collaborates with their team to help move things forward, and takes on new challenges.
Questions to consider when evaluating a developer’s competency might include:
- How does the engineer approach a problem?
- How do they collaborate with their team to move work forward?
- Do they take on new challenges?
Engineering leader Dustin Diaz created a reusable template for evaluating competency in engineering, which he links to in this post. The template borrows from concepts in The Manager’s Path, by author and engineering leader Camille Fournier, is modeled after the SPACE framework, and outlines levels of competency for engineers at different levels of seniority. The matrix can be helpful for leaders looking to home in on areas like collaboration, performance, and efficiency/quality. It includes markers of competency for different tiers of engineers, including anticipating broad technical change, taking end-to-end responsibility for projects, and taking initiative to solve issues.
We’ve addressed how performance reviews during a difficult year can be especially challenging. Yet no matter the circumstances, there are principles of a successful performance review that will always be relevant.
When performance reviews are based on hitting productivity benchmarks that are implicitly linked to compensation, developers might be less focused on ambitious goals and more on checking boxes that they believe will earn them a raise; rather than challenging themselves, they will likely be incentivized to play it safe with easy-to-achieve goals.
Encouraging a focus on competency invites engineers to make decisions that have more potential to move the needle in an organization, and to take risks, even if they could lead to failure.
During our webinar on data-driven leadership, Sophie Roberts, Director of Engineering at Shopify shared why rewarding productivity over growth could have adverse effects: “You end up in a situation where people want to get off projects that they don’t really think they have executive buy-in, or try and game the work they’re doing,” Roberts said. “I’ve canceled people’s projects and promoted them the next month, because how they were approaching the work was what we expect from a competency level of people who are more senior…They may try to get work that is a more sure shot of moving a metric because they think that’s what’s going to result in their promotion or their career progression.”
An emphasis on competency can improve goal-setting, risk-taking, and retention alike.
Data-driven insights can provide engineering leaders with objective ways to evaluate developer competency, without tying metrics directly to compensation. They can help you assess a developer’s progress, spot opportunities for improvement, and even combat common performance review biases.
One effective way to use metrics in performance reviews is to quantify impact. In our webinar on data-driven performance reviews, Smruti Patel, now VP of Engineering at Apollo, shared how data helps ICs on her team recognize their impact on the business during self-evaluations.
“It comes down to finding the right engineering data that best articulates impact to the team and business goals. So if you think about it, you can use a very irrefutable factor, say, ‘I shipped X, which reduced the end-to-end API latency from 4 or 5 seconds to 2.5 seconds. And this is how it impacted the business,’” she said.
In the same discussion, Katie Wilde, now Senior Director of Engineering at Snyk Cloud, shared how metrics explained a discrepancy between one engineer’s self-evaluation and her peer reviews. This engineer gave herself a strong rating, but her peers did not rate her as highly. When Wilde dug into the data, she found that it was not a case of the individual being overconfident, but a case of hidden bias — the engineer’s PRs were being scrutinized more heavily than those of her male counterparts.
In both instances, data helped provide a more complete picture of a developer’s abilities and impact, without being directly tied to performance benchmarks or compensation.
Overall, metrics are able to provide concrete data to counteract assumptions, both on the part of the reviewer and the engineers themselves. By taking a holistic approach to performance reviews and contextualizing qualitative and quantitative data, including having one-on-one conversations with developers, leaders can make more informed decisions about promotions and compensation for their teams.
Keep in mind that performance reviews should be opportunities to encourage growth and career development for engineers, while gaining feedback that can inform engineering practices.
Most importantly, rewarding competency is an effective way to align developer goals with business goals. This way, leaders are invested in the growth of valuable members of their team who make a significant impact, while engineers are recognized for their contributions and able to grow in their careers.

What’s the practical value of DORA metrics, and how are real engineering organizations using them? To find out, we invited a panel of engineering leaders and industry experts to share their experiences.
Code Climate Senior Product Manager Madison Unell moderated the conversation, which featured:
- Scott Aucoin of Liberty Mutual
- Emily Nakashima, VP of Engineering at Honeycomb
- Karthik Chandrashekar, Senior VP of the Customer Organization at Code Climate
Over 45 minutes, these panelists discussed real-world experiences using DORA metrics to drive success across their engineering organizations.
Below are some of the key takeaways, but first, meet the panelists:
Scott Aucoin: I work at Liberty Mutual. We’ve got about 1,000 technology teams around the world. We’ve been leveraging DORA metrics for a while now. I wouldn’t say it’s perfect across the board, but I’ll talk about how we’ve been leveraging them. The role that I play is across about 250 teams working on our Agile enablement, just improving our agility, improving things like DevOps and the way that we operate in general; and our portfolio management work…from strategy through execution; and user experience to help us build intuitively-designed things, everything from the frontend of different applications to APIs. So finding ways to leverage DORA throughout those three different hats, it’s been awesome, and it’s really educating for me.
Emily Nakashima: I’m the VP of Engineering at a startup called Honeycomb. I manage an engineering team of just about 40 people, and we’re in the observability space. So basically, building for other developers, which is a wonderful thing. And we had been following along with DORA from the early days and have been enthusiasts and just made this switch over to using the metrics ourselves. So I’m excited to talk about that journey.
Karthik Chandrashekar: I’m the Senior VP of our Customer Organization at Code Climate. I have this cool job of working with all our engineering community customers solving and helping them with their data-driven engineering challenges. DORA is fascinating because I started out as a developer myself many years back, but it’s great to see where engineering teams are going today in a measurement and management approach. And DORA is central to that approach in many of the customer organizations I interact with. So I’m happy to share the insights and trends that I see.
Why did your organization decide to start using DORA?
Emily Nakashima: I first came to DORA metrics from a place of wanting to do better because we’re in the DevOps developer tooling space ourselves. Our executive team was familiar with the DORA metrics, and we had used them for years to understand our customers, using them as a tool to understand where people were in their maturity and how ready they would be to adopt our product…we had this common language around DORA…[At the same time,] our engineering team was amazing, and we weren’t getting the credit for it that we deserved. And by starting to frame our performance around the DORA metrics and show that we were DORA Elite on all these axes, I think it was a really valuable tool for helping to paint that story in a way that felt more objective rather than just me going, “We’ve got a great team.” And so far, honestly, it’s been pretty effective.
Scott Aucoin: Liberty Mutual being a 110-year-old insurance company, there are a lot of metrics. There are some metrics that I think we might say, “Okay, those are a little bit outdated now.” And then there are other ones that the teams use because they’re appropriate for the type of work the teams are doing. What we found to be really valuable about DORA metrics is their consistency…and the ability to really meet our customers and their needs through leveraging DORA metrics.
Karthik Chandrashekar: Speaking with a lot of CTOs and VPs of different organizations, I think there’s a desire to be more and more data-driven. And historically, that has been more around people, culture, teams, all of that, but now that’s transcended to processes and to data-driven engineering.
How did you go about securing buy-in?
Scott Aucoin: This has been pretty grassroots for us. We’ve got about 1,000 technology teams across our organization. So making a major shift is going to be a slow process. And in fact, when it’s a top-down shift, sometimes there’s more hesitancy or questioning like, “Why would we do this just because this person said to do this?” Now, all of a sudden, it’s the right thing to do. So instead, what we’ve been doing and what’s happened in different parts of our organization is bringing along the story of what DORA metrics can help us with.
Emily Nakashima: The thing I love about Scott’s approach is that it was a top-down idea, but he really leveraged this bottom-up approach, starting with practitioners and getting their buy-in and letting them forge the way and help figure out what was working rather than dictating that from above. I think that it’s so important to really start with your engineers and make sure that they understand what and why. And I think a lot of us have seen the engineers get very rightly a little nervous about the idea of being measured. And I think that’s super-legitimate because there’s been so many bad metrics that we’ve used in the past to try to measure engineering productivity, like Lines of code, or PRs Merged. I think we knew we would encounter some of that resistance and then just a little bit of concern from our engineering teams about, what does it mean to be measured? And honestly, that’s something we’re still working through. I think the things that really helped us were, one, being really clear about the connection to individual performance and team performance and saying, we really think about these as KPIs, as health metrics that we’re using to understand the system, rather than something we’re trying to grade you on or assess you on. We also framed it as an experiment, which is something our culture really values.
DORA’s performance buckets are based on industry benchmarks, but you’re all talking about measuring at the company level. How do you think about these measures within your company?
Emily Nakashima: This was absolutely something that was an internal debate for us. When I first proposed using these, actually, our COO Jeff was a proponent of the idea as well. So the two of us were scheming on this, but there was really resistance that people pointed out that the idea of these metrics was about looking at entire cohorts. And there was some real debate as to whether they were meaningful on the individual team or company level. And we are the engineering team that just likes to supplement disagreements with data. So we just said, that might be true, let’s try to measure them and see where it goes. And I will say they are useful for helping us see where we need to look in more detail. They don’t necessarily give you really granular specifics about what’s going wrong with a specific team or why something got better or worse. But I do think that they have had a value just for finding hotspots or seeing trends before you might have an intuition that the trend is taking place. Sometimes you can start to see it in the data, but I think it was indeed a valid critique, ’cause we’re, I think, using them in a way that they’re not designed for.
Something important about the DORA metrics that I think is cool is that each time they produce the report, the way they set the Elite and High and other tiers can change over time. And I like that. And you also see a lot of movement between the categories…And to me, it’s a really good reminder that as engineering teams, if we just keep doing the same thing over and over and don’t evolve our practices, we fall behind the industry and our past performance.
Scott Aucoin: I look at the DORA metrics with the main intent of ensuring that our teams are learning and improving and having an opportunity to reflect in a different way than they’re used to. But also, because of my competitive nature, I look at it through the lens of how we are doing, not just against other insurance companies, which is critical, but setting the bar even further and saying, technology worldwide, how are we doing against the whole industry? And it’s not to say that the data we can get on that is always perfect, but it helps to set this benchmark and say, how are we doing? Are we good? Are we better than anyone else? Are we behind on certain things?
Karthik Chandrashekar: One thing I see with DORA as a framework is its flexibility. So to the debate that Emily mentioned that they had internally, it’s a very common thing that I see in the community where some organizations essentially look at it as an organizational horizontal view of how the team is doing as a group relative to these benchmarks.
What pitfalls or challenges have you encountered?
Karthik Chandrashekar: From a pure trend perspective, best practice is a framework of “message, measure, and manage.” And not doing that first step of messaging appropriately with the proper context for the organization means that it actually can cause more challenges than not. So a big part of that messaging is psychological safety, bringing the cultural safety of, “this is to your benefit for the teams.” It empowers. The second thing is we all wanna be the best, and here’s our self-empowered way to do that. And then thirdly, I think, “how do we use this to align with the rest of the organization in terms of showcasing the best practices from the engineering org?”
So the challenges would be the inverse of the three things I mentioned. When you don’t measure, people look at it as, “Oh, I’m being measured. I don’t wanna participate in this.” Or when you measure, you go in with a hammer and say, “Oh, this is not good. Go fix it.” Or then you do measure, and everything is great, but then when you are communicating company-wide or even to the board, then it becomes, hey, everything’s rosy, everything is good, but under the hood, it may not necessarily be…Those are some of the challenges I see.
Emily Nakashima: To me, the biggest pitfall was just, you can spend so much time arguing about how to measure these exact things. DORA has described these metrics with words, but how do you map that to what you’re doing in your development process?
For us in particular, we have an hour-long wait for various reasons because things roll to a staging environment first and get through some automated tests. Our deployment process is an hour. We will wait for 60 minutes plus our test runtime. So we can make incredible progress, making our tests faster and making the developer experience better. And we can go from 70 to 65 minutes, which doesn’t sound that impressive but is incredibly meaningful to our team.
And people could get focused on, “Wait, this number doesn’t tell us anything valuable.” And we had to just say, “Hey, this is a baseline. We’re gonna start with it.” We’re gonna just collect this number and look at it for a while and see if it’s meaningful, rather than spend all this time going back and forth on the exact perfect way to measure. It was so much better to just get started and look at it, ’cause I think you learn a lot more by doing than by finding the perfect metric and measuring it the perfect way.
Scott Aucoin: You’re going to have many problems beyond your DevOps practices. And Emily, I think the consistency around how you measure it is something we certainly have struggled with. And I would say in some cases, we still wonder if we’re measuring things the right way, even as we’ve tried to set a standard across our org. I’ll add to that, though, and say the complexity of the technology world, in general, is a significant challenge when you’re trying to introduce something that may feel new or different to the team, or just like something else that they need to think about…You have to think about it from the standpoint of the priorities of what you’re trying to build, the architecture behind it, security, the ability to just maintain and support your system, your quality, all of the different new technology that we need to consider, ways to experiment, all of that. And then we throw in something else to say, “Okay, make sure you’re looking at this too.” Just from a time capacity and bandwidth perspective, it can be challenging to get folks to focus and think about, okay, how can we improve on this when we have so many other things we need to think about simultaneously?
What are you doing with DORA metrics now?
Scott Aucoin: It’s a broad spectrum. We’re doing all these fantastic things. Some groups are still not 100% familiar with what it means to look at DORA metrics or how to read them.
It’s kind of a map and compass approach. You’re not only looking at a number; you’re able to see from that number what questions you have and how you can learn from it to map out the direction you want to go. So if you’re lagging behind in Deployment Frequency, maybe you want to think more about smaller batches, for example. So within our teams, we’re looking at it through that lens.
And again, it’s not 100% of the teams. In fact, we still have more promotion and adoption to do around that, but we have the data for the entire organization. So we also look at it from the levels of the global CIO and monthly reports that are monthly operational reports that go to the global CIO. And while I can think about someone who I’ve gotten to know over the last few months, Nathen Harvey, who’s a developer advocate for Google’s DORA team, I have him in the back of my mind as I say this, as he would say, “The metrics are really for the teams.”
We think about the value of it from the executive level as well. And when we think about the throughput metrics of Deployment Frequency and Lead Time for Changes, things can get a little bit muddy when you roll up thousands of applications into one number, especially since many of those applications aren’t being worked on regularly. Some are in more of a maintenance mode. But when we can focus on the ones actively being worked on and think about trends, asking whether we are improving our Deployment Frequency or not, it can lead the CIO or any of the CIOs in the organization to ask the right questions and think about what they can do to help. Especially when it comes to stability, regardless of whether an application is getting worked on actively today or not, we need stability to be there. So we really are looking at them at multiple levels and trying to be thoughtful about the types of questions that we ask based on the information we’re seeing.
Emily Nakashima: My example is the backward and wrong way to do this. I started by basically just implementing these myself for myself. And the first thing I did with them was to show the stakeholders that I was trying to paint this story too. And I think if you can start with getting practitioners to work with them, getting your software engineers to work with them first, tune them a little bit, and find them relevant, I honestly think that’s the best approach in the organization if you could do it. That wasn’t the situation I happened to be in, but I started with that, used them to radiate these high-level status numbers to other folks on the exec team and the board, and then started to roll them out team by team to allow for that customization.
So we’re still in that process now, but I started to pull managers in one by one and go, hey, these metrics that I’m tracking, this is what they mean to me. Let’s sit down together and figure out what’s gonna be meaningful for your engineering team and how to build on this baseline here…Let’s build on top of it together.
And we’re hoping to get into this quarter to have teams start working with them more directly and get more active in tuning and adding their metrics. We think about observability for systems, and we always want people to be adding instrumentation to their systems as they go. Each time you deploy a feature, add instrumentation that tells you whether or not it’s working. And we wanna bring that same approach to our engineering teams where we have these baseline metrics. If you don’t think they’re that good and they don’t tell you that much, then you go ahead and tell us what metric we add, and we’re gonna work together to build this higher fidelity picture that makes sense to you, and then also have that shared baseline across teams.
To hear more from our panelists, watch the full webinar here.