How to apply ISO 14971 risk-management (it’s easier than you think)

6 minute read.

In 1986 the space shuttle Challenger exploded, killing the crew and a high-school teacher who had trained with the astronauts.

The explosion originated near an O-ring gasket that sealed rocket fuel. The gasket protected a known design flaw that could expose fuel to fire. NASA’s purchasing department had bought the O-ring based on expectations of warm conditions at the launch-site in Florida. The launch was delayed until January. A rare cold day lowered temperatures below the specified rating of the O-ring, decreasing its ability to seal the design flaw. The incomplete seal led to leaking fuel, which led to an explosion at takeoff.

The explosion is shown in this video. An important aspect of Risk Management is alluded to at 1:30; NASA had been under pressure to launch on that day.

I’ve simplified the situation because I wanted to focus on the need for risk management standards and transparent communication. Engineers had warned of the risk, but NASA didn’t have processes linking risk analysis with purchasing and launch-team decisions. In fairness, it would be hard to see the significance of an O-ring, which emphasizes that risk isn’t just about the part; it’s about what happens if that part fails, and the hazardous situations that would have to align for that failure to occur.

The international standard for risk management is ISO 14971, which was updated in 2007. If it had been available in 1986, the Challenger probably would not have exploded, and the astronauts and the first teacher in space would be alive. If we could travel back in time, this is how modern risk management would have looked for NASA and the space shuttle team, with the relevant ISO 14971 sections in parentheses.

Plan

A risk management plan would have been created before doing any work, when the team is not under pressure (3.4). The team would document what other risk management documents are required and who’s responsible, acceptable levels of risk, priorities for how to reduce risk, how to monitor assumptions, and how to update and improve the plan based on new information.

Brainstorm hazardous situations

A diverse team would have ensured understanding of the intended use of the space shuttle and its parts, including the o-ring (4.2). They would have listed high-level hazards, including “The space shuttle could explode if fire reached the fuel tank” (4.3). The team would brainstorm sequences of events that could lead to hazardous situations (4.4). For example:

1. The launch date could be postponed from a warm month to a colder month
2. The weather in Florida, where launch occurs, could have an unusually cold day where the temperature drops below o-ring specifications
3. The o-ring could “shrink” due to the cold weather, becoming unable to seal the pathway
4. The launch team could not know the risk of cold weather
5. After takeoff fuel could be exposed to fire, causing an explosion

The hazardous situations would be documented and updated based on new information (4.4). I believe it’s advantageous to document the sequence of events, too.
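As a rough sketch of what documenting a sequence of events could look like, here is one way to keep a hazardous situation and its sequence together in a single record. The class name, field names, and revision note are hypothetical, chosen only for illustration; ISO 14971 does not prescribe a format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HazardousSituation:
    """One entry in a hypothetical hazard log (field names are illustrative)."""
    hazard: str
    sequence_of_events: List[str]
    harm: str
    revision_notes: List[str] = field(default_factory=list)

    def add_event(self, event: str, note: str) -> None:
        """Append a newly identified event and record why the entry changed."""
        self.sequence_of_events.append(event)
        self.revision_notes.append(note)


o_ring = HazardousSituation(
    hazard="Fire reaches the rocket fuel",
    sequence_of_events=[
        "Launch date is postponed from a warm month to a colder month",
        "Temperature at the launch site drops below the o-ring specification",
        "O-ring shrinks and no longer seals the pathway",
        "Launch team does not know the risk of cold weather",
    ],
    harm="Explosion after takeoff",
)

# New information arrives, so the entry is updated rather than rewritten.
o_ring.add_event(
    "Fuel is exposed to fire after takeoff",
    note="Added during brainstorming review (hypothetical note)",
)

for step, event in enumerate(o_ring.sequence_of_events, start=1):
    print(f"{step}. {event}")
```

Keeping the revision notes next to the sequence makes it easier to see later why an entry changed.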

Estimate risk levels

Risk would be quantified for each hazardous situation (5). Risk is the combination of the severity of an event and the probability of that event occurring (2.16).

Risk = Severity X Probability

Teams would assign numbers to severity and probability, multiply them together, and the result would be compared to pre-determined acceptable risk levels in the plan (3.4). In this example the severity of an explosion is so high that risk control would have been required.
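Here is a minimal sketch of that calculation. The 1 to 5 scales, the category names, and the acceptance threshold are all assumptions made up for this example; a real risk management plan would define its own.

```python
# Hypothetical 1-5 ordinal scales; a real plan would define its own categories.
SEVERITY = {"negligible": 1, "minor": 2, "serious": 3, "critical": 4, "catastrophic": 5}
PROBABILITY = {"improbable": 1, "remote": 2, "occasional": 3, "probable": 4, "frequent": 5}

# Hypothetical acceptance threshold that would come from the risk management plan.
MAX_ACCEPTABLE_RISK = 4


def risk_score(severity: str, probability: str) -> int:
    """Risk = Severity x Probability, using the ordinal scales above."""
    return SEVERITY[severity] * PROBABILITY[probability]


def needs_risk_control(severity: str, probability: str) -> bool:
    """True when the score exceeds the pre-determined acceptable level."""
    return risk_score(severity, probability) > MAX_ACCEPTABLE_RISK


# An explosion is catastrophic, so even an "improbable" rating needs control.
print(risk_score("catastrophic", "improbable"))
print(needs_risk_control("catastrophic", "improbable"))
print(needs_risk_control("minor", "remote"))
```

With this particular threshold, any catastrophic severity exceeds the acceptable level regardless of probability, which mirrors the explosion example.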

Risk levels would be documented and continuously improved based on new information.

Control risks

The team would have prioritized risk control (6.2):

1. Improve the design
2. Add protective measures
3. Add warning labels, instructions, or procedures

For this example, options could have included:

1. Improve the space shuttle design, such as eliminating the path to rocket fuel.
2. Add safeguards to reduce risk, such as changing the o-ring specifications to be resistant to cold weather.
3. Create procedures to reduce risk, such as a policy for the launch team to delay a launch if the temperature is below 40 degrees Fahrenheit.

Risk controls would be verified and documented (6.3), and the hazards document would be revisited to ensure all hazards were addressed (6.7) and new hazards were not introduced. Any residual risks would be evaluated to ensure the benefits outweigh risks (6.4, 6.6).
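The priority order for risk control can be sketched as a simple ranking. The enum and its member names are hypothetical labels for the three priorities described above, not terms taken from the standard.

```python
from enum import IntEnum


class ControlType(IntEnum):
    """Hypothetical labels; a lower number means a higher priority."""
    DESIGN = 1              # improve the design
    PROTECTIVE_MEASURE = 2  # add protective measures or safeguards
    INFORMATION = 3         # warning labels, instructions, or procedures


# Candidate controls from the space shuttle example, in no particular order.
options = [
    ("Policy to delay a launch in cold weather", ControlType.INFORMATION),
    ("O-ring specified to be resistant to cold weather", ControlType.PROTECTIVE_MEASURE),
    ("Eliminate the path to rocket fuel", ControlType.DESIGN),
]

# Sort so that design changes are considered before safeguards and procedures.
ranked = sorted(options, key=lambda option: option[1])

for description, control_type in ranked:
    print(f"{control_type.value}. {control_type.name}: {description}")
```

Ranking does not mean picking only one option; a team could apply several controls and then re-evaluate the residual risk.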

Monitor for effectiveness

The risk management plan would be continuously improved using real-world data to adjust risk assumptions. For example, probability assumptions would have been adjusted using current, real-world information. There are two probabilities in risk analysis:

P1 = the probability that a sequence of events will occur

P2 = the probability that the sequence of events will result in harm

P1 must be proactively researched and documented because hazardous situations are rarely recognized or reported. For example, it’s possible that other space shuttle launches had the same sequence of events but had not exploded because of slight differences in temperatures or air currents between the leaking fuel and fire. Many other teams of astronauts may have unknowingly come close to a similar explosion, but this wouldn’t be known if we weren’t monitoring P1 assumptions. P1 is proactive risk management that reduces P2.

P2 is reactionary, resulting from deaths or catastrophes.

Please focus on P1.
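To see why monitoring P1 matters, here is a small sketch. It relies on a convention the text above does not spell out, so treat it as an assumption: the overall probability of harm is often estimated as P1 multiplied by P2. All numbers are invented purely for illustration and are not real launch data.

```python
def probability_of_harm(p1: float, p2: float) -> float:
    """Assumed convention: probability of harm = P1 x P2."""
    if not (0.0 <= p1 <= 1.0 and 0.0 <= p2 <= 1.0):
        raise ValueError("probabilities must be between 0 and 1")
    return p1 * p2


def estimate_p1(sequences_observed: int, total_uses: int) -> float:
    """Estimate P1 from monitoring data as a simple observed frequency."""
    if total_uses <= 0:
        raise ValueError("total_uses must be positive")
    return sequences_observed / total_uses


# Invented numbers: the plan assumed the sequence of events was rare,
# but monitoring found it in 3 of 20 uses.
p1_assumed = 0.01
p1_observed = estimate_p1(sequences_observed=3, total_uses=20)
p2 = 0.1  # invented chance that the sequence results in harm

print(f"Assumed:  {probability_of_harm(p1_assumed, p2):.4f}")
print(f"Observed: {probability_of_harm(p1_observed, p2):.4f}")
```

In this invented case, the sequence of events turns out to be far more common than planned, and that shows up in the estimate before any harm has occurred.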

Link risk management to all department policies

All departments would fall under risk management, and the risk plan would be referenced for all decisions. In the space shuttle example, information from design engineers would be fed into a risk management policy used by purchasing, manufacturing, and the launch-team.

Links within an organization and monitoring and improving a plan are known as the “process approach” to risk management. Modern quality system standards require risk-based decisions in a process of continuous improvement.

Documents

All risk management work would have been documented (3.5). The final document would have been a trace matrix ensuring all hazards are addressed by risk control, and that work was carried out according to a plan. This would have allowed subsequent teams to continuously reduce risk by adjusting probabilities based on new information, adding newly identified hazardous situations, and using state-of-the-art risk analysis methods.
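A trace matrix can be as simple as a table linking each hazard to its risk controls and whether each control was verified. The sketch below uses a hypothetical layout and made-up entries to show the two checks a reviewer would run.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: hazard -> list of (risk control, verified?) pairs.
TraceMatrix = Dict[str, List[Tuple[str, bool]]]

trace_matrix: TraceMatrix = {
    "Fire reaches the rocket fuel": [
        ("O-ring specified to be resistant to cold weather", True),
        ("Policy to delay a launch in cold weather", False),
    ],
    "Launch team does not know the risk of cold weather": [],
}


def hazards_without_controls(matrix: TraceMatrix) -> List[str]:
    """Hazards that no risk control addresses yet."""
    return [hazard for hazard, controls in matrix.items() if not controls]


def unverified_controls(matrix: TraceMatrix) -> List[Tuple[str, str]]:
    """(hazard, control) pairs where the control has not been verified."""
    return [
        (hazard, control)
        for hazard, controls in matrix.items()
        for control, verified in controls
        if not verified
    ]


print("Not addressed:", hazards_without_controls(trace_matrix))
print("Not verified:", unverified_controls(trace_matrix))
```

Both lists should be empty before the documentation is considered complete.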

Standards

Lessons from the Challenger explosion are part of current, international standards for quality control and risk management. The most common international standard for quality management is ISO 9001, and the standard specific to medical device quality systems is ISO 13485, which is the foundation of a new audit method, the Medical Device Single Audit Program (MDSAP). They require that quality systems function as a risk-driven process of continuous improvement, which is also emphasized by the FDA quality system requirements and the European Union medical device requirements (EU MDR). All use the concepts prescribed by ISO 14971:2007, Risk Management, and the supplemental version for Europe, EN ISO 14971:2012.

Regulatory requirements are emphasized by an MDSAP diagram showing Risk Management as the highest level of guidance for companies.

Practical application

In the case of the Challenger explosion, it’s obvious in hindsight how a series of disconnected processes led to harm. What’s less obvious is how to apply risk-based decisions to your existing quality systems. I give examples of how to apply risk management to decisions in purchasing, vendors, and supply chains in another blog, That’s not a knife! How to make risk-based decisions, which uses another phenomenon from 1986, the film Crocodile Dundee, about an Australian crocodile hunter in New York City, as a fun way to learn risk management techniques.

Risk management requires a corporate culture that understands and applies the concepts. These consulting companies can help your organization continuously improve.

- Oriel STAT-A-MATRIX (I consult with Oriel)
- Maetrics
- LNE G-Med
- MDI Consultants
- Me (Jason) 🙂

Summary

Plan

- Pre-determined acceptable risk levels
- How to monitor risk assumptions
- How to update plan

Hazards analysis

- Sequence of events
- Hazardous situations

Risk analysis

- Risk = Severity X Probability

Risk control

- Improve the design
- Add safeguards
- Create warning labels, instructions, or procedures

Continuously improve

Europe has additional risk requirements in .

Consulting companies can help you continuously improve.

Please share

Risk affects all of society, and the more people who think in big-picture concepts the safer our world becomes for everyone. Please share this article if you think others would benefit.

Parting thoughts

The space shuttle Challenger explosion was a rare event that in hindsight had preventable sequences of events. History has many similar examples, such as the , the , and every . These events are often referred to as “” because of the book “,” which emphasizes that we can’t predict outlier events but we can build robust systems resistant to their impact and able to adapt to changes.

But no amount of quality control and mathematical modeling can replace humans working together and communicating effectively. In the case of the space shuttle example, individuals struggled to have their voices heard. To improve your company, focus on culture, communication, and transparency.
