What is wrong with 5 Whys and root cause thinking?

In brief

5 Whys can be useful as a teaching exercise, but it is too weak to be a full accident analysis method.
The main problem is not asking “why”; the problem is forcing a complex event into a narrow chain.
Different starting points can produce different “root causes”, each plausible and each incomplete.
Multiple 5 Whys chains are better than one, but they still struggle to show interactions, feedback, timing, trade-offs and recovery.
The aim of investigation should be to understand how the system allowed the event to develop and how it can be made safer, not to stop once one or more “root causes” have been named.

Why this topic matters

After an accident or serious incident, people naturally want an answer. What caused it? Who made the mistake? What should be fixed? These are understandable questions, especially when harm has occurred or nearly occurred. But if the analysis is too eager to find a simple answer, it can miss the very conditions that need to change.

Root cause analysis is often presented as a structured way to learn from incidents. Used carefully, some forms of RCA can support disciplined investigation, evidence gathering and improvement. The problem is not structured incident analysis; it arises when “root cause” becomes a promise of a single deep explanation. In complex socio-technical systems, accidents rarely develop that way.

Healthcare, aviation, rail, process safety and digital systems are not simple machines with one broken part. They involve people, equipment, procedures, interfaces, regulators, suppliers, staffing models, information systems and local conditions. A serious event may only become possible when several ordinary features of the system line up in an unfortunate way.

The problem with 5 Whys is that it looks deceptively reasonable. It encourages us to keep asking why until we reach something that sounds deep enough to fix. But the method gives little help with deciding where to start, which path to follow, when to stop, what evidence is strong enough, which causes matter most, or how multiple causes interacted.

The core idea

The question “why did this happen?” is useful. The assumption that the answer will form a single neat chain is not.

A 5 Whys analysis usually begins with a problem statement and then asks “why?” repeatedly. The fifth answer is often treated as the root cause, and a corrective action is then attached to that answer. This can be helpful for a simple, local, well-bounded problem. For example, it may help a small team notice that a repeated equipment problem is linked to a maintenance process rather than to the person who last used the equipment.

But serious safety events are different. They usually contain several interacting pathways. Some are technical. Some are organisational. Some concern information design, fatigue, supervision, training, regulation, workload or the physical environment. Some concern recovery: why did the event not become worse? A single chain cannot show all of this.

A better investigation question is therefore not “what is the root cause?” but “what system conditions made this outcome possible, and what controls would make a similar outcome less likely or less severe?”

A case example: Air Canada Flight 759

The Air Canada Flight 759 event is a useful teaching case because the official investigation identified several interacting safety issues rather than one simple failure.

In July 2017, Air Canada Flight 759 was cleared to land on Runway 28R at San Francisco International Airport. The parallel Runway 28L was closed and dark. Four aircraft were waiting on Taxiway C. The Air Canada aircraft lined up with Taxiway C instead of Runway 28R and descended to very low altitude before the crew initiated a go-around. The aircraft overflew the first aircraft on the taxiway at about 100 feet and the second at about 60 feet before climbing away.

The investigation discussed many relevant conditions: runway closure information was present but not effectively retained; backup lateral guidance was not available because the ILS frequency was not manually tuned; the visual scene was ambiguous; the crew were fatigued; expectation bias affected interpretation of cues; tower alerting did not detect the taxiway alignment in time; and runway closure cues were not sufficiently conspicuous for this configuration.

This is exactly the kind of event where a narrow 5 Whys chain can become dangerous. It may produce a clean answer, but that answer may look clean because much of the system has been left out.

What a single 5 Whys can do

A single 5 Whys analysis of the Air Canada event might look like this:

Single chain example

Problem: The aircraft aligned with Taxiway C instead of Runway 28R.
Why? The crew misidentified Taxiway C as the intended landing runway.
Why? They were not sufficiently aware, at the moment of approach, that the parallel runway was closed and dark.
Why? The runway closure information in preflight and arrival information was not effectively retained and used during the approach.
Why? The information was embedded in operational documents and messages in a way that did not make the runway closure sufficiently salient.
Why? The information-presentation system did not reliably support pilot review, memory and threat formation for a non-standard night configuration.

Possible root cause: ineffective information design for safety-critical operational changes.

This chain is not absurd. It points toward a real issue. But it can easily create the impression that improving NOTAM or ATIS presentation would be the answer. That would be an important improvement, but not a sufficient explanation of the event.

The same incident can be analysed from several equally plausible starting points. Each path leads to a different “root”, and that is the danger: the choice of starting point, rather than the evidence, ends up selecting the cause.

Multiple 5 Whys: same event, different roots

Practitioners often use more than one 5 Whys chain, and that is certainly better than relying on a single chain. But even multiple chains can still fragment the system into separate lines of reasoning. The following examples show how the same event can generate several different “root causes”, depending on where the analyst begins.

One event, many possible roots. ACA759 aligned with Taxiway C and descended to very low altitude.

Information pathway: Runway closure information was present but not retained. Here, “root” becomes information design.

Navigation pathway: ILS/localiser backup was not tuned. Here, “root” becomes exceptional manual tuning and chart/database design.

Visual pathway: Taxiway C looked runway-like under a dark parallel runway configuration. Here, “root” becomes closure cue conspicuity.

ATC pathway: Misalignment was not detected early enough. Here, “root” becomes surveillance alerting and monitoring support.

Fatigue pathway: Crew performance was vulnerable at the end of a long duty period. Here, “root” becomes fatigue-risk controls and regulation.
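
The dependence on starting point can be sketched in a few lines of Python. This is an illustration only: the dictionary keys, the abbreviated wording and the function name `named_root` are assumptions made for this example, not part of the investigation report or of any RCA tool. The sketch treats a chain the way 5 Whys implicitly does, as an ordered list whose last entry gets labelled the root.

```python
# Illustrative sketch: each pathway is an ordered list of answers,
# abbreviated from the five pathways listed above. The structure and
# names are assumptions for this example.
CHAINS = {
    "information": [
        "closure information present but not retained",
        "information design",
    ],
    "navigation": [
        "ILS/localiser backup not tuned",
        "exceptional manual tuning and chart/database design",
    ],
    "visual": [
        "Taxiway C looked runway-like",
        "closure cue conspicuity",
    ],
    "ATC": [
        "misalignment not detected early enough",
        "surveillance alerting and monitoring support",
    ],
    "fatigue": [
        "crew vulnerable at end of long duty period",
        "fatigue-risk controls and regulation",
    ],
}


def named_root(chain):
    """Return what a 5 Whys analysis would label the root: the last answer."""
    return chain[-1]


# The "root" is fully determined by which pathway the analyst starts on.
roots = {start: named_root(chain) for start, chain in CHAINS.items()}
for start, root in roots.items():
    print(f"start: {start} -> root: {root}")

# Same event, five starting points, five different "roots".
distinct_roots = set(roots.values())
```

Nothing in the data structure says which root is correct, or how the pathways relate to one another. The method leaves that information out.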

Chain 1: information was available, but not usable enough

Problem: The flight crew did not use the runway closure information effectively during final approach.
Why? The closure was not strongly retained as an active threat during the approach.
Why? The information appeared among other operational information before and during the flight.
Why? The presentation did not sufficiently distinguish this runway closure as safety-critical for the approach configuration.
Why? The system relied on pilots to extract, remember and reapply a critical configuration change under workload and fatigue.
Why? Operational information design did not provide enough salience, prioritisation and just-in-time support for non-standard runway configurations.

Possible root cause: safety-critical information was not designed for reliable human use at the point of need.

Chain 2: backup lateral guidance was not available

Problem: The crew did not have localiser guidance available to help verify runway alignment.
Why? The ILS frequency for Runway 28R was not tuned.
Why? The approach required manual tuning, and the first officer missed that step.
Why? This manual tuning requirement was unusual for the crew’s Airbus A320 operation.
Why? The approach database and charting arrangement did not provide consistent automatic tuning or sufficiently salient manual-tuning support.
Why? The design of the approach-support system relied on exceptional manual action for an important verification resource.

Possible root cause: an exceptional manual configuration requirement weakened an important cross-check.

Chain 3: the visual scene supported the wrong interpretation

Problem: Taxiway C was perceived as the intended runway.
Why? The closed parallel runway was dark, while aircraft lights on Taxiway C created a straight-line visual pattern.
Why? Some cues supported the crew’s expectation that they were aligned with Runway 28R, while contradictory cues were not strong enough.
Why? The runway closure marker was not designed to capture the attention of a crew approaching a different runway in a parallel-runway configuration.
Why? Closure-cue design assumed that meeting marking requirements would be sufficient for visual discrimination.
Why? Airport visual-control design did not fully account for how people interpret a changed night-time airfield scene under expectation bias.

Possible root cause: runway closure cues were not sufficiently conspicuous for the actual visual task.

Chain 4: detection and intervention occurred too late

Problem: ATC did not intervene until the aircraft was on very short final.
Why? The developing taxiway alignment was not recognised early enough.
Why? The aircraft’s position appeared compatible with the expected visual approach offset, and the controller’s tools did not alert to a likely taxiway landing.
Why? The airport surface surveillance system was not designed to predict this type of wrong-surface alignment and warn the controller.
Why? The control system relied on visual scan and controller interpretation under a non-standard night configuration.
Why? The monitoring and alerting architecture did not provide a strong enough control for wrong-surface approach detection.

Possible root cause: surveillance and controller-support systems did not detect taxiway-alignment risk in real time.

Chain 5: crew performance was vulnerable

Problem: The crew did not detect and resolve the misalignment until very late.
Why? Their monitoring and cross-checking were degraded by workload, fatigue and expectation bias.
Why? The event occurred late relative to the crew’s body-clock time, after a long period awake and after a demanding flight segment.
Why? The applicable duty and reserve arrangements allowed this operating pattern.
Why? Fatigue controls did not adequately address evening reserve duty extending into the window of circadian low.
Why? The regulatory and organisational fatigue-risk controls did not fully match the operational risk profile.

Possible root cause: fatigue-risk controls were insufficient for this duty pattern and operating context.

Chain 6: learning evidence was lost

Problem: Cockpit voice recorder data were not available to the investigation.
Why? The recording was overwritten before the severity of the event was recognised and preserved.
Why? The event was not escalated quickly enough to trigger immediate data-protection actions.
Why? Reporting and notification pathways did not reliably treat the near-miss as a time-critical learning event.
Why? The system did not provide sufficiently strong triggers for immediate evidence preservation after a serious near miss.
Why? Organisational learning controls were not designed around the short time window for preserving critical evidence.

Possible root cause: evidence-preservation controls were not robust enough for high-severity near-miss learning.

What these chains show

Each chain tells a reasonable story. That is precisely the problem. A team can work carefully, ask “why” several times, and still end up with a partial explanation that feels complete.

The information chain suggests improving NOTAM and ATIS presentation. The navigation chain suggests improving FMS autotuning and approach-chart salience. The visual chain suggests improving runway closure cues. The ATC chain suggests improving wrong-surface alerting and tower procedures. The fatigue chain suggests improving fatigue-risk management. The learning chain suggests improving reporting and evidence preservation.

None of these is wrong. But none is “the” root cause. The event developed through their interaction. The runway closure changed the visual scene. The information about that closure was present but not effectively carried into action. The unusual manual-tuning requirement removed a stabilising cross-check. Fatigue and expectation bias made contradictory cues harder to recognise. The tower did not receive a strong automated alert. Other crews on the taxiway became an informal safety net. The recovery was not the result of one perfect control; it was a late convergence of pilot doubt, third-party warnings, light cues and go-around action.

A good analysis should preserve that interaction, not compress it into one chain.

Why multiple 5 Whys are still not enough

Using multiple 5 Whys is often presented as the solution to the “single root cause” problem. It is certainly an improvement. It encourages analysts to explore more than one line of questioning. But it still has important limitations.

Problem	Why it matters
Starting-point dependence	The answer depends heavily on the first problem statement. “Why did the crew misidentify the taxiway?” leads somewhere different from “why did the system not detect the alignment?”
Linear structure	Each chain implies a sequence, but many safety events involve parallel pathways, feedback loops and mutually reinforcing conditions.
Weak interaction modelling	Separate chains do not easily show how fatigue, information design, visual cues, navigation support and ATC monitoring influenced one another.
Arbitrary stop rule	There is no principled reason why the fifth answer is deeper or more actionable than the third, seventh or tenth.
Evidence fragility	The method does not require each step to be tested against evidence. A plausible statement can become accepted because it fits the story.
Solution bias	The last answer in a chain often becomes the target for action, even when a more effective control may sit earlier, later or elsewhere in the system.

The key issue is not the number of chains. It is the model of causation underneath the method. If the method still treats the event as a set of lines, it will struggle to show the network.
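
One way to see the difference between lines and a network is to write the contributing conditions as a directed graph and look for the places where influences converge. The sketch below is a hedged illustration: the node names, the influence edges and the helper functions are assumptions chosen to mirror the interactions described in this article, not an evidenced model taken from the investigation report.

```python
from collections import defaultdict

# Hypothetical influence edges (source -> target), paraphrased from the
# interactions described in the text. Illustrative only.
INFLUENCES = [
    ("runway closure", "changed visual scene"),
    ("closure information not salient", "closure not held as active threat"),
    ("closure not held as active threat", "expectation bias"),
    ("fatigue", "expectation bias"),
    ("changed visual scene", "taxiway perceived as runway"),
    ("expectation bias", "taxiway perceived as runway"),
    ("ILS not tuned", "no lateral cross-check"),
    ("no wrong-surface alert", "late ATC detection"),
    ("taxiway perceived as runway", "low overflight of Taxiway C"),
    ("no lateral cross-check", "low overflight of Taxiway C"),
    ("late ATC detection", "low overflight of Taxiway C"),
]


def incoming(edges):
    """Map each condition to the set of conditions that influence it."""
    result = defaultdict(set)
    for source, target in edges:
        result[target].add(source)
    return result


def convergence_points(edges):
    """Conditions shaped by two or more others.

    In a linear chain every step has exactly one predecessor, so these
    are the points a single chain cannot represent.
    """
    return {t: s for t, s in incoming(edges).items() if len(s) > 1}


def upstream_ends(edges):
    """Conditions with no recorded upstream influence in this sketch."""
    sources = {s for s, _ in edges}
    targets = {t for _, t in edges}
    return sources - targets


for node, sources in convergence_points(INFLUENCES).items():
    print(f"{node} <- {sorted(sources)}")
print(sorted(upstream_ends(INFLUENCES)))
```

Each 5 Whys chain corresponds to one path through such a graph. The convergence points are where the paths meet, and they are what separate chains fail to show.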

What “root cause” language can hide

Root cause language often hides uncertainty. It can make the final report sound more certain than the evidence allows. It can also move attention toward causes that are easier to write down, easier to assign to a department, or easier to fix within the organisation’s authority.

It can also encourage weak recommendations. If the “root cause” is written as “staff did not follow procedure”, the action may become “retrain staff”. If the “root cause” is “information not reviewed”, the action may become “remind staff to review information”. These actions may be appropriate in some cases, but they are often fragile controls. They depend on people remembering, noticing and compensating in the same difficult conditions that contributed to the event.

A systems-safety view asks a different set of questions. Was the procedure usable? Was the important information visible at the right time? Did the tools make the safe action easier? Did the organisation monitor whether the control was working? Did the regulator, supplier or designer hold part of the control problem? Did the system have enough recovery capacity when the first control failed?

How to analyse more carefully

A stronger analysis does not need to reject every RCA tool. It does need to avoid treating 5 Whys as sufficient. For complex events, the investigation should combine a timeline with a broader system model and a disciplined approach to evidence.

Start with what happened, but do not stop there. A timeline is useful for sequencing events. It helps establish what occurred before what. But a timeline alone cannot explain the organisational, technical and regulatory conditions that shaped those events.
Look for interactions, not just causes. A taxiway-lineup risk was not created by one weakness. It emerged from the interaction between a non-standard airport configuration, information presentation, visual perception, workload, fatigue, automation and monitoring.
Map controls and feedback. Ask which controls were expected to prevent, detect or recover the hazardous state. In the Air Canada case, this includes flight crew cross-checking, navigation guidance, runway closure cues, ATC monitoring, surveillance systems, fatigue controls and reporting mechanisms.
Analyse recovery as well as failure. The event did not become a collision. That matters. Other aircraft crews noticed the developing risk, one crew transmitted a warning, lights were switched on, the Air Canada crew initiated a go-around and the controller issued a go-around instruction. Understanding recovery helps strengthen resilience, not only prevent recurrence.
Connect recommendations to controls. Each recommendation should be traceable to a specific weakness in prevention, detection, response or learning. “Be more careful” is not a system improvement. A stronger recommendation changes information design, tooling, alerting, procedures, staffing, authority, training realism, or feedback.
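
The last steps, mapping controls and connecting recommendations to them, can be made concrete with a small traceability check. This is a minimal sketch under stated assumptions: the `Control` and `Recommendation` classes, the assignment of each control to a function, and the example recommendations are all hypothetical, loosely based on the controls named above rather than on the wording of any official recommendation.

```python
from dataclasses import dataclass

# The four functions named in the text.
FUNCTIONS = ("prevention", "detection", "response", "learning")


@dataclass(frozen=True)
class Control:
    name: str
    function: str  # one of FUNCTIONS (the assignment here is an assumption)


@dataclass(frozen=True)
class Recommendation:
    text: str
    strengthens: tuple  # names of the controls this recommendation targets


# Hypothetical control map for illustration.
CONTROLS = [
    Control("runway closure cues", "prevention"),
    Control("fatigue controls", "prevention"),
    Control("navigation guidance cross-check", "detection"),
    Control("ATC monitoring and surveillance", "detection"),
    Control("go-around procedure", "response"),
    Control("reporting and evidence preservation", "learning"),
]

# Hypothetical recommendations for illustration.
RECOMMENDATIONS = [
    Recommendation(
        "Make closure cues conspicuous from adjacent approaches",
        ("runway closure cues",),
    ),
    Recommendation(
        "Add wrong-surface alerting to tower surveillance",
        ("ATC monitoring and surveillance",),
    ),
    Recommendation("Be more careful", ()),
]


def untraceable(recommendations, controls):
    """Recommendations that do not point at any known control."""
    known = {c.name for c in controls}
    return [r.text for r in recommendations if not set(r.strengthens) & known]


def uncovered_functions(recommendations, controls):
    """Functions that no recommendation strengthens."""
    by_name = {c.name: c.function for c in controls}
    covered = {
        by_name[name]
        for r in recommendations
        for name in r.strengthens
        if name in by_name
    }
    return [f for f in FUNCTIONS if f not in covered]


print(untraceable(RECOMMENDATIONS, CONTROLS))
print(uncovered_functions(RECOMMENDATIONS, CONTROLS))
```

With these example entries, the first check flags “Be more careful” as untraceable, and the second shows that nothing yet strengthens response or learning. The value of such a check lies in the questions it prompts, not in the code.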

Methods that can help

Different methods answer different questions. No method is universally best, and method choice should depend on the purpose of the analysis, the available evidence and the system boundary.

Method or approach	What it helps reveal
Timeline	What happened, when it happened and where key opportunities for detection or recovery occurred.
AcciMap	Contributory factors across system levels, such as regulation, organisation, management, operations, equipment and environment.
HFACS	Human, supervisory and organisational contributors, especially when used as a coding and classification aid rather than as the whole analysis.
CAST	Control actions, feedback, responsibilities, flawed process models and safety constraints across a socio-technical control structure.
FRAM	Everyday work functions, performance variability, couplings and how normal adaptations can combine to produce both risk and recovery.
Bow-tie or barrier analysis	Preventive and mitigative barriers, their degradation and whether controls were designed, implemented and monitored effectively.
Task analysis and human factors review	How tasks, tools, interfaces, workload, environment, communication and team coordination shaped performance.

The point is not to use every method on every event. The point is to choose methods that match the learning need. A serious socio-technical event usually deserves more than a string of whys.

Practical questions for investigators

These questions can help move an investigation beyond root cause thinking:

What exact hazardous state developed, and how did it become possible?
Which controls were expected to prevent the state from developing?
Which controls were expected to detect it once it began?
Which controls were expected to support recovery?
What information was available, missing, ambiguous, delayed or too weak to influence action?
What assumptions did people and organisations appear to be relying on?
Where did procedures assume more certainty, time or attention than the work allowed?
Which parts of the system had authority to change the conditions that mattered?
What would have made the safe action easier, earlier or more reliable?
What evidence supports each finding?
How will the organisation know whether the corrective actions actually worked?

These questions do not make analysis anti-practical. They make it more useful. They help the investigator move from naming causes to improving control.

Common mistakes to avoid

Be careful when an investigation does any of these things:

Uses “human error” as the conclusion rather than the beginning of the analysis.
Chooses the most convenient cause rather than the most evidence-supported explanation.
Stops at the first cause that the local organisation can control.
Treats the final “why” as automatically more important than earlier conditions.
Produces recommendations that rely mainly on reminders, awareness or retraining.
Ignores recovery, near-miss learning and evidence preservation.
Fails to test whether recommended controls are strong enough and actually implemented.

Weak investigations can create a false sense of closure. A report may be completed, a form may be signed, and a training package may be issued, while the system remains largely unchanged.

Limitations and cautions

None of this means that every local problem needs a full systems-theoretic analysis. Some problems are simple, bounded and close to the work area. In those cases, a short 5 Whys discussion may be a useful prompt for reflection, provided that people do not overclaim what it proves.

The caution is strongest for serious incidents, near misses with catastrophic potential, recurrent events, events involving multiple organisations, or events involving technology, regulation, staffing, fatigue or complex decision-making. In these cases, 5 Whys should not be treated as the investigation.

It is also important not to replace one ritual with another. A CAST, FRAM, HFACS or AcciMap analysis can also be weak if it is poorly evidenced, overcomplicated or used mechanically. The value lies in disciplined thinking: clear boundaries, good evidence, attention to interactions, traceable findings and practical improvements.

A better takeaway

Do not ask only: “What was the root cause?”

Ask: “How did the hazardous state develop, why was it not detected earlier, how was it recovered, and what would make prevention, detection, response and learning stronger next time?”

That question is less tidy. It is also more honest. Safety in complex systems is not produced by a single root, and accidents are not prevented by cutting one branch. Safety is produced through many interacting controls, feedback loops, adaptations and design choices. Good investigation should make those relationships visible.

Related publication(s)

Kaya, G.K. (2021). A system safety approach to assessing risks in the sepsis treatment process. Applied Ergonomics, 94, 103408. DOI: 10.1016/j.apergo.2021.103408.
Kaya, G.K., Humphreys, M., Camelia, F. and Chatzimichailidou, M. (2025). Integrating causal analysis based on system theory with network modelling to enhance accident analysis. Ergonomics, 1–28. DOI: 10.1080/00140139.2025.2516060.
Kaya, G.K. (2018). Good Risk Assessment Practice in Hospitals. PhD thesis, University of Cambridge. DOI: 10.17863/CAM.20813.
Kaya, G.K., Ward, J.R. and Clarkson, P.J. (2019). A framework to support risk assessment in hospitals. International Journal for Quality in Health Care, 31(5), 393–401. DOI: 10.1093/intqhc/mzy194.
Losi, E., Kaya, G.K., Camelia, F., Chatzimichailidou, M., Slater, D.H., Patriarca, R. and Sujan, M. (under revision). Systemic safety analysis of complex socio-technical events: insights from applying CAST and FRAM. Reliability Engineering & System Safety. Publication details forthcoming.

Selected references

Card, A.J. (2017). The problem with ‘5 whys’. BMJ Quality & Safety, 26, 671–677. DOI: 10.1136/bmjqs-2016-005849.
Latino, R.J. (2015). How is the effectiveness of root cause analysis measured in healthcare? Journal of Healthcare Risk Management, 35(2), 21–30. DOI: 10.1002/jhrm.21198.
Leveson, N.G. (2011). Applying systems thinking to analyze and learn from events. Safety Science, 49(1), 55–64. DOI: 10.1016/j.ssci.2009.12.021.
Leveson, N.G. (2019). CAST Handbook: How to Learn More from Incidents and Accidents. MIT Partnership for Systems Approaches to Safety and Security.
National Transportation Safety Board. (2018). Taxiway Overflight, Air Canada Flight 759, Airbus A320-211, C-FKCK, San Francisco, California, July 7, 2017. NTSB/AIR-18/01.
Peerally, M.F., Carr, S., Waring, J. and Dixon-Woods, M. (2017). The problem with root cause analysis. BMJ Quality & Safety, 26, 417–422. DOI: 10.1136/bmjqs-2016-005511.
Reason, J. (1990). Human Error. Cambridge University Press.
Salmon, P.M., Cornelissen, M. and Trotter, M.J. (2012). Systems-based accident analysis methods: A comparison of Accimap, HFACS and STAMP. Safety Science, 50(4), 1158–1170. DOI: 10.1016/j.ssci.2011.11.009.
Vincent, C. (2004). Analysis of clinical incidents: A window on the system not a search for root causes. Quality & Safety in Health Care, 13, 242–243. DOI: 10.1136/qshc.2004.010454.
