The DORA Four Key Metrics are the empirically derived measures of software delivery performance that emerged from the state-of-devops-report-series and were codified in accelerate-book (2018). They represent the field's most rigorous answer to the question: what should we measure to know whether our DevOps practices are working?
The Four Metrics
Deployment Frequency (DF): How often an organization deploys code to production. Elite performers deploy on-demand, multiple times per day. Low performers deploy monthly to every six months. DF is a proxy for batch size — high frequency means small batches, which means faster learning and lower risk per deployment.
Lead Time for Changes (LT): The time from a code commit to that code running in production. Elite performers achieve lead times of less than one hour; low performers measure in months. Lead time captures the speed of the entire delivery system — testing, approval processes, deployment steps, environment constraints.
Change Failure Rate (CFR): The percentage of changes to production that result in a degraded service requiring remediation (rollback, hotfix, patch). Elite performers maintain 0-15% CFR; low performers see 46-60% failure rates (approximate figures based on DORA cluster analysis; exact thresholds have shifted across report years).
Mean Time to Restore (MTTR): The time to restore service when a production incident occurs. Elite performers restore in less than one hour; low performers take one to six months. MTTR measures the effectiveness of monitoring, incident response, and rollback capability.
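The four definitions above can be sketched as straightforward computations over delivery records. The data shapes and field names below are illustrative assumptions, not a standard schema:

```python
# Sketch: computing the four DORA metrics from deployment and incident
# records. The tuple layouts here are assumptions for illustration.
from datetime import datetime, timedelta

deployments = [
    # (commit_time, deploy_time, caused_failure)
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 45), False),
    (datetime(2024, 1, 1, 14, 0), datetime(2024, 1, 1, 14, 30), True),
    (datetime(2024, 1, 2, 10, 0), datetime(2024, 1, 2, 10, 20), False),
    (datetime(2024, 1, 3, 11, 0), datetime(2024, 1, 3, 11, 50), False),
]
incidents = [
    # (start_time, restore_time)
    (datetime(2024, 1, 1, 14, 35), datetime(2024, 1, 1, 15, 5)),
]

period_days = 3  # length of the observation window

# Deployment Frequency: deployments per day over the window.
df = len(deployments) / period_days

# Lead Time for Changes: mean commit-to-production duration.
lt = sum((dep - com for com, dep, _ in deployments), timedelta()) / len(deployments)

# Change Failure Rate: share of deployments that needed remediation.
cfr = sum(1 for _, _, failed in deployments if failed) / len(deployments)

# Mean Time to Restore: mean incident duration.
mttr = sum((end - start for start, end in incidents), timedelta()) / len(incidents)

print(f"DF: {df:.2f}/day, LT: {lt}, CFR: {cfr:.0%}, MTTR: {mttr}")
```

In practice the inputs would come from a CI/CD system and an incident tracker; the point is that all four metrics reduce to counts and durations over the same event streams.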
Why These Four
The selection of these four metrics is deliberate: they capture both throughput and stability dimensions of delivery performance, preventing local optimization on one dimension at the expense of the other.
A team could theoretically score well on speed by deploying broken code frequently — CFR and MTTR expose this. A team could theoretically score well on stability by deploying rarely — DF and LT expose this. The four metrics together are harder to game than any single metric.
The Central Finding: Speed and Stability Are Not Tradeoffs
The single most important finding from the DORA research is that high-performing organizations outperform low performers on ALL four metrics simultaneously. Elite performers deploy more frequently AND have lower change failure rates AND restore faster. This directly contradicts the intuitive assumption (and conventional IT management practice) that speed and stability trade off against each other.
This finding has the character of a paradigm shift: if the intuitive model were true, organizations managing the speed-stability tradeoff would be doing the right thing by going slowly. The DORA evidence shows the intuitive model is wrong — the practices that increase speed also increase stability, because they reduce batch sizes, increase feedback, and build quality in earlier.
Evidence Base
The metrics emerged from four-plus years of the state-of-devops-report-series, encompassing more than 23,000 survey responses from technology professionals worldwide. nicole-forsgren led the statistical methodology, using structural equation modeling (SEM) to test directed, predictive relationships rather than mere correlation — an unusually rigorous approach for practitioner-facing research.
The dora-research program was acquired by Google in 2018 (google-acquires-dora-2018), continuing as an independent research program within Google Cloud.
Cluster Analysis: Performance Tiers
The DORA research uses cluster analysis to group survey respondents into distinct performance tiers — elite, high, medium, and low — with the elite and low thresholds for each metric as given above.
The gap between elite and low performers is measured in orders of magnitude — 973x more frequent deployments and 6,570x faster lead times were among the figures reported in specific research iterations (approximate; figures vary by year).
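As a toy illustration only — DORA derives its tiers from statistical cluster analysis over survey data, not fixed cutoffs — a team's lead time could be bucketed against the elite and low bands quoted earlier:

```python
# Toy sketch: bucketing lead time for changes against the elite/low
# bands quoted in this note. The cutoffs are illustrative assumptions,
# not DORA's actual method (which is cluster analysis, and whose
# thresholds have shifted across report years).
HOURS_PER_MONTH = 30 * 24

def lead_time_tier(lead_time_hours: float) -> str:
    if lead_time_hours < 1:
        return "elite"          # under one hour
    if lead_time_hours >= HOURS_PER_MONTH:
        return "low"            # measured in months
    return "high or medium"     # intermediate bands vary by report year

print(lead_time_tier(0.5))    # elite
print(lead_time_tier(2000))   # low
```

The same bucketing could be repeated for the other three metrics; a team's overall tier reflects its position on all four at once.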
Relationship to the 24 Capabilities
accelerate-book identifies 24 technical, architectural, process, and cultural capabilities that drive the four key metrics. The capabilities most strongly predictive of high performance include version control for all production artifacts, deployment automation, continuous integration, trunk-based development, test automation, a loosely coupled architecture, and a generative (Westrum) organizational culture.
This means the metrics are not merely measurement artifacts — they are outcome measures of a coherent set of causal practices.
Comparison to Lean Metrics
The four metrics have direct lean manufacturing analogues: Deployment Frequency maps to batch size (high frequency implies small batches); Lead Time for Changes maps to lean lead time, the time from order to delivery; Change Failure Rate maps to quality measures such as defect or rework rate; and Mean Time to Restore maps to how quickly a stopped production line can be restarted.
The DORA metrics are, in this sense, the application of lean production measurement to software delivery. This connection is explicit in accelerate-book's subtitle: "The Science of Lean Software and DevOps."
Comparison to DeMarco and Lister
accelerate-book positions itself as the "Coding War Games" of the DevOps movement — a reference to Tom DeMarco and Timothy Lister's work in the 1980s, which showed that developer productivity varied by factors of 10 and that environment (not individual talent) was the primary driver. The DORA research does for delivery systems what the Coding War Games did for individual developers: it shows that practices and organization matter, and that the variance between organizations is enormous and explainable.