
The Test Pyramid Was a Lie (Or at Least an Oversimplification)

  • Writer: Phil Hargreaves
  • 2 hours ago
  • 4 min read

For years, we’ve been told to follow the “Test Pyramid.”



Lots of unit tests. Fewer integration tests. Even fewer end-to-end tests.


It’s been presented as an almost unquestionable truth - a universal law of good engineering.


But what if the test pyramid was never a law?

What if it is a simplification that became dogma?

And more importantly: What testing actually matters?


Where the Test Pyramid Came From


The concept, popularised by Mike Cohn in Succeeding with Agile, was meant to guide teams away from slow, brittle, UI-heavy test suites.


The idea was simple:


  • Unit tests are fast and cheap → have many.

  • Integration tests are slower → have fewer.

  • End-to-end tests are slow and fragile → have very few.


As a principle of feedback speed and maintainability, this made sense.

But somewhere along the way, a heuristic became a rigid rule.


And that’s where the problems began.


The Pyramid Optimises for Cost - Not Risk


The pyramid is fundamentally an economic model.


It optimises:


  • Execution speed

  • Maintenance cost

  • Developer productivity


It does not optimise for:


  • Business risk

  • Customer impact

  • Revenue exposure

  • Security threats

  • System complexity


You can have thousands of unit tests and still ship a catastrophic production defect.


Why?


Because the pyramid optimises for test quantity by layer, not risk coverage.


And risk is what actually matters.


The Illusion of Safety


Many teams proudly report:


  • 90%+ unit test coverage

  • Fast CI pipelines

  • A “healthy” pyramid shape, so to speak


And yet:


  • Payment systems fail in production

  • Authentication breaks after refactors

  • Critical user journeys silently degrade

  • Integrations collapse under real-world conditions


The pyramid never promised safety. It promised speed and maintainability.


Did we just assume safety came with it?
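
To make the gap concrete, here is a minimal Python sketch of how a fully covered unit can still fail once it meets a real dependency. Everything in it is hypothetical and invented for illustration: the charge_customer function, the FakeGateway and StrictGateway classes, and the "amount" and "amount_minor" field names. It is not modelled on any particular payment provider.

```python
# Hypothetical example: a unit test that passes while the integration is broken.

def charge_customer(gateway, amount_pence):
    """Code under test: sends a charge request to a payment gateway."""
    # Bug: the (hypothetical) real gateway expects the key "amount_minor".
    return gateway.charge({"amount": amount_pence, "currency": "GBP"})


class FakeGateway:
    """Test double used by the unit test; it accepts any payload."""

    def charge(self, payload):
        return {"status": "ok"}


class StrictGateway:
    """Stand-in for the real provider, which validates the payload."""

    REQUIRED = {"amount_minor", "currency"}

    def charge(self, payload):
        missing = self.REQUIRED - set(payload)
        if missing:
            return {"status": "rejected", "missing": sorted(missing)}
        return {"status": "ok"}


# With the permissive double, the unit test is green and every line of
# charge_customer is covered.
unit_result = charge_customer(FakeGateway(), 1999)

# The same code against a gateway that enforces its contract is rejected.
integration_result = charge_customer(StrictGateway(), 1999)
```

The coverage number is identical in both runs. Only the second one says anything about whether a customer can actually pay.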


Modern Systems Don’t Fit the Pyramid Model


When the pyramid was introduced, systems were:


  • More monolithic

  • Less distributed

  • Less dependent on third parties

  • Less API-driven

  • Less cloud-native


Today’s systems are:


  • Microservice-heavy

  • Event-driven

  • Dependent on external APIs

  • Continuously deployed

  • Highly integrated


Where do you put:


  • Contract testing?

  • Observability validation?

  • Data pipeline verification?

  • Infrastructure-as-code validation?

  • Machine Learning model validation?


That’s before we start thinking about how and where we leverage AI to make our approach smarter, faster, and more predictive.


These activities don’t stack neatly into a pyramid. The model feels increasingly artificial.


The Real Question: What Testing Reduces Risk?


Instead of asking:


“Do we have enough unit tests?”


We should be asking:


“What could hurt the business most, and how are we validating that it won’t?”


That shifts the conversation from structure to impact.


Testing that truly matters often includes:


1. Critical Path Validation

Does the primary revenue-generating (or otherwise critical) workflow work under realistic conditions?


2. Integration Confidence

Do your services behave correctly together - not just in isolation?


3. Contract and Schema Protection

Are changes breaking downstream consumers?


4. Resilience and Failure Testing

What happens when dependencies fail?


5. Security Testing

How could this system be exploited?


6. Production Monitoring as Testing

Are you detecting real-world failures quickly?


There are, of course, others, e.g., usability.


None of these are about the test pyramid. They’re about risk management, and there is a lot to consider.
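
Of the items above, contract and schema protection is probably the easiest to show in code. The sketch below is a hand-rolled, consumer-side check, not a real contract-testing library. The ORDER_CONTRACT fields and the check_contract helper are hypothetical names chosen for illustration.

```python
# Hypothetical consumer-side contract: the fields and types a downstream
# consumer relies on in a provider's "order" response.
ORDER_CONTRACT = {
    "order_id": str,
    "total_pence": int,
    "status": str,
}


def check_contract(response, contract):
    """Return a list of human-readable contract violations (empty if none)."""
    violations = []
    for field, expected_type in contract.items():
        if field not in response:
            violations.append(f"missing field: {field}")
        elif not isinstance(response[field], expected_type):
            actual = type(response[field]).__name__
            violations.append(
                f"{field}: expected {expected_type.__name__}, got {actual}"
            )
    return violations


# A provider response before a refactor, and after one that changed the type
# of order_id and renamed total_pence. Extra fields are allowed; only the
# fields the consumer relies on are checked.
before = {"order_id": "A-1", "total_pence": 1999, "status": "paid", "note": ""}
after = {"order_id": 1001, "total": "19.99", "status": "paid"}

violations_before = check_contract(before, ORDER_CONTRACT)
violations_after = check_contract(after, ORDER_CONTRACT)
```

Run in the provider's pipeline, a check along these lines would flag the breaking change before any downstream consumer sees it, regardless of how many unit tests the provider has.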


The Pyramid Encouraged the Wrong Metric


The most damaging side effect of the test pyramid wasn’t structure. It was measurement.


Teams started tracking:


  • Test count

  • Coverage percentage

  • Layer distribution


Instead of:


  • Risk exposure

  • Defect leakage

  • Incident severity

  • Customer impact


Coverage is easy to measure. Risk reduction is harder. So we optimised for the easiest metric.


In a post about OKRs, I described how output measures tend to dominate reporting because they are simple, familiar, and readily available, even though outcomes matter more to any business. The same applies here.
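
Outcome measures are not impossible to compute, only less convenient. As a minimal sketch, defect leakage is commonly defined as the share of known defects that were found in production rather than before release. The defect_leakage helper is hypothetical and the release figures are invented for illustration.

```python
def defect_leakage(found_before_release, found_in_production):
    """Share of all known defects that escaped to production (0.0 to 1.0)."""
    total = found_before_release + found_in_production
    if total == 0:
        return 0.0  # no defects recorded at all
    return found_in_production / total


# Invented numbers: both releases report 90%+ coverage, yet they differ
# sharply in how many defects reached customers.
releases = {
    "release_a": {"coverage": 0.92, "pre": 18, "prod": 2},
    "release_b": {"coverage": 0.94, "pre": 6, "prod": 6},
}

leakage = {
    name: defect_leakage(r["pre"], r["prod"]) for name, r in releases.items()
}
```

In this made-up data the release with the higher coverage figure leaks half of its defects to production, which is exactly the kind of signal a coverage dashboard hides.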


What Actually Matters in Testing


If we strip away the pyramid metaphor, what remains?


Testing that matters is:


Risk-Focused

It protects what would hurt most if it failed.


Behaviour-Driven

It validates real user workflows, not just code paths.


Change-Aware

It increases scrutiny where code is changing or behaving unpredictably.


System-Level Conscious

It acknowledges distributed complexity.


Economically Rational

It balances the cost of testing against the cost of failure.


Notice what’s missing? Any one-size-fits-all shape.
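
The "economically rational" point can be sketched as simple arithmetic: compare the expected loss a piece of testing avoids with what that testing costs. This is a rough model under stated assumptions, not a formula from any standard. The function names, probabilities, and monetary figures are all invented.

```python
def expected_loss(failure_probability, failure_cost):
    """Expected cost of a failure: probability times impact."""
    return failure_probability * failure_cost


def net_benefit(p_before, p_after, failure_cost, testing_cost):
    """Expected loss avoided by the testing, minus what the testing costs."""
    avoided = (
        expected_loss(p_before, failure_cost)
        - expected_loss(p_after, failure_cost)
    )
    return avoided - testing_cost


# Invented figures: the same testing effort, and the same assumed drop in
# failure probability, applied to two areas with very different impact.
checkout = net_benefit(
    p_before=0.10, p_after=0.02, failure_cost=200_000, testing_cost=5_000
)
admin_banner = net_benefit(
    p_before=0.10, p_after=0.02, failure_cost=1_000, testing_cost=5_000
)
```

With these numbers the checkout investment comes out positive and the admin banner one comes out negative. The inputs are guesses, but the exercise forces the conversation about impact that a layer count never does.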


A Better Mental Model: The Risk Radar


Instead of a pyramid, imagine a radar.


Each release introduces potential exposure in different directions:


  • Revenue

  • Security

  • Performance

  • Reliability

  • Compliance

  • Reputation


Testing effort expands outward where risk signals are strongest.


Some releases may need deep unit validation. Others may need heavy integration scrutiny. Others may need chaos testing or load testing.


Your approach changes release to release. Because risk changes.
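
One way to picture the radar is as a per-release allocation of testing effort in proportion to risk scores. The sketch below is only an illustration of that idea: the 0 to 5 scoring scale, the allocate_effort helper, and the two example releases are assumptions of mine, not an established method.

```python
RISK_DIMENSIONS = (
    "revenue", "security", "performance",
    "reliability", "compliance", "reputation",
)


def allocate_effort(risk_scores, total_hours):
    """Split a testing budget across dimensions in proportion to risk score."""
    total_risk = sum(risk_scores.get(d, 0) for d in RISK_DIMENSIONS)
    if total_risk == 0:
        return {d: 0.0 for d in RISK_DIMENSIONS}
    return {
        d: total_hours * risk_scores.get(d, 0) / total_risk
        for d in RISK_DIMENSIONS
    }


# Invented scores (0 = no exposure, 5 = severe) for two different releases.
pricing_change = {"revenue": 5, "compliance": 3, "reputation": 2}
login_rewrite = {"security": 5, "reliability": 3, "reputation": 2}

plan_pricing = allocate_effort(pricing_change, total_hours=40)
plan_login = allocate_effort(login_rewrite, total_hours=40)
```

The same 40 hours land in different places for each release: mostly on revenue for the pricing change, mostly on security for the login rewrite.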


So… Was the Pyramid a Myth?


Not entirely.


It was useful. It corrected an over-reliance on UI tests. It improved feedback loops. But it was never a universal blueprint.


The mistake wasn’t the pyramid. The mistake was treating it as a rule instead of guidance.


The Future of Testing Isn’t One Shape


As systems grow more complex and interconnected, rigid models become less helpful.


What matters is:


  • Understanding where failure hurts most

  • Aligning validation effort with impact

  • Measuring real-world outcomes

  • Adapting continuously


Testing isn’t about building the right shape.

It’s about reducing meaningful risk.

And risk doesn’t stack neatly into layers.


Lastly


If your team stopped using the phrase “test pyramid” tomorrow, what would change?


If the answer is “nothing,” then it was never your strategy. It was just a diagram.


Diagrams don’t protect production systems. Intentional, risk-aligned validation does.

 
 
 

© 2026 Evolve Software Consulting Ltd.
