WIP limits — constraints on the maximum number of work items permitted in any workflow stage at one time — are the core operational mechanism of Anderson's kanban-method. Limiting Work in Progress is what transforms a visualization tool into a pull system; it is the mechanism that forces flow, exposes bottlenecks, and creates the organizational pressure that drives improvement.
The Mechanism
A WIP limit specifies that a workflow column (e.g., "In Development," "In Review") may hold no more than n items simultaneously. When the limit is reached, no new work can enter that column until existing work completes and exits. This creates two effects:
1. Pull behavior. Downstream stages pull work only when they have capacity. Work is not pushed into a stage regardless of whether it can absorb it.
2. Bottleneck exposure. When a stage consistently sits at its WIP limit while downstream stages sit empty (and upstream stages back up behind it), the limit is revealing a constraint. The system is telling you where the problem is.
The second effect is as important as the first. Anderson frequently emphasized that WIP limits are a diagnostic instrument, not merely a traffic control mechanism. They make dysfunction visible in a way that large batches and full queues obscure.
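The mechanism can be sketched in a few lines of Python. This is a minimal illustration, not Anderson's formulation: the `Column` class, the `pull` and `at_limit` helpers, and the limits of 3 and 2 are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    """A workflow column with a WIP limit (hypothetical model)."""
    name: str
    wip_limit: int
    items: list = field(default_factory=list)

    def has_capacity(self) -> bool:
        return len(self.items) < self.wip_limit


def pull(upstream: Column, downstream: Column) -> bool:
    """Move one item downstream only if the downstream column has room."""
    if upstream.items and downstream.has_capacity():
        downstream.items.append(upstream.items.pop(0))
        return True
    return False


def at_limit(columns: list) -> list:
    """Names of columns currently holding as many items as their limit allows."""
    return [c.name for c in columns if not c.has_capacity()]


dev = Column("In Development", wip_limit=3, items=["A", "B"])
review = Column("In Review", wip_limit=2, items=["C", "D"])

moved = pull(dev, review)        # review is full, so nothing moves
review.items.pop(0)              # "C" completes and exits review
moved_after = pull(dev, review)  # capacity was freed, so "A" is pulled
full_now = at_limit([dev, review])
```

A column that keeps appearing in the `at_limit` result across many snapshots is a candidate constraint, which is the diagnostic use of the limit described above.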
Intellectual Sources
Reinertsen's Queueing Theory
don-reinertsen's analysis of product development economics, drawing on M/M/1 queueing models from operations research, provided Anderson with the economic foundation for WIP limits. The key insight: as utilization approaches 100% of capacity, queue length and wait time grow steeply and nonlinearly (as ρ/(1 − ρ) in the M/M/1 model, where ρ is utilization), not linearly. In that model, a system running at 80% utilization holds on average four times as many items as one running at 50%. The cost in lead time, and therefore in economic value, is enormous.
Reinertsen showed that this was not intuitive to most organizations: people equate idle capacity with waste, but idle capacity in a knowledge work system is what enables fast flow and short queues. WIP limits enforce a utilization ceiling that keeps queues manageable.
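The 80% versus 50% comparison can be checked against the standard M/M/1 result for the mean number of items in the system, L = ρ/(1 − ρ). The sketch below assumes that textbook formula; the function name `mm1_items_in_system` is hypothetical, and real knowledge-work arrivals and service times will not match M/M/1 assumptions exactly.

```python
def mm1_items_in_system(utilization: float) -> float:
    """Mean number of items in an M/M/1 system: L = rho / (1 - rho)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (1 - utilization)


# Print how the mean number of items grows as utilization rises.
for rho in (0.5, 0.8, 0.9, 0.95):
    items = mm1_items_in_system(rho)
    print(f"{rho:.0%} utilization -> {items:.1f} items in system")
```

The formula gives a mean of 1 item at 50% utilization, 4 at 80%, and 9 at 90%: each step toward full utilization costs more than the last, which is why a utilization ceiling pays off.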
Goldratt's Theory of Constraints
eliyahu-goldratt's TOC framework (developed in The Goal and applied to projects in Critical Chain) provides a complementary logic. TOC argues that every system has one binding constraint and that improving anywhere other than the constraint is local optimization that cannot improve global throughput. WIP limits operationalize this at the workflow level: by forcing focus onto the constraint (the stage that backs up when limited), teams are directed toward the work that matters most for throughput.
Goldratt's distinction between the "throughput world" and the "cost world" is directly relevant: a cost-world mindset treats idle developers as waste and pushes more work into the system; a throughput-world mindset treats long lead times as waste and protects flow by constraining WIP.
Anderson's Contribution
Anderson's synthesis was to apply these manufacturing and operations research concepts to knowledge work and to package them in a form palatable to software organizations. The early application at microsoft (2004, see microsoft-xit-kanban-2004) demonstrated that WIP limits worked for software feature development. The corbis-kanban-experiment (2007) refined the practice with column-specific limits and the discovery of classes-of-service as a mechanism for managing different urgency levels within limits.
The key adaptation from manufacturing: in knowledge work, WIP limits must account for the variability of work items (some take days, some take weeks) and the invisible nature of work (unlike physical inventory, software tasks in progress are not self-evidently visible without a visualize-workflow practice first).
WIP Limits and Classes of Service
classes-of-service interacts directly with WIP limits: an Expedite lane typically has a WIP limit of 1 (no more than one expedite item at a time, because expedite items preempt normal flow and create coordination overhead). Standard items have the main column WIP limits. This creates a multi-tiered pull system where urgency is handled structurally rather than through ad hoc escalation.
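A per-lane limit can be sketched as follows. The lane names, the `LANE_LIMITS` table, the `try_admit` helper, and the standard limit of 3 are hypothetical values for illustration; only the Expedite limit of 1 comes from the text above.

```python
from collections import defaultdict

# Hypothetical limits: one expedite item at a time, three standard items.
LANE_LIMITS = {"expedite": 1, "standard": 3}


def try_admit(in_progress: dict, item: str, lane: str) -> bool:
    """Admit an item into its lane only if that lane is under its WIP limit."""
    if len(in_progress[lane]) >= LANE_LIMITS[lane]:
        return False
    in_progress[lane].append(item)
    return True


in_progress = defaultdict(list)
first = try_admit(in_progress, "prod-outage", "expedite")   # lane is empty
second = try_admit(in_progress, "vip-request", "expedite")  # lane already holds one
third = try_admit(in_progress, "feature-42", "standard")    # separate limit applies
```

Because each lane checks its own limit, a second urgent request is refused structurally rather than negotiated case by case.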
Common Misapplications
Anderson documented several failure modes in applying WIP limits:
The kanban-book addresses each of these through case study analysis, connecting them back to the queueing theory foundations that explain why the failure modes are costly.