A Four-Step Problem Solving Approach

In August 1990 the United States started sending military forces to the Persian Gulf with the intent of expelling Saddam Hussein’s forces from Kuwait.  We called the buildup Desert Shield, and when we actually went to war on 16 January 1991, the name transitioned to Desert Storm.  When Desert Storm finally started, the engagement was decisive.  In short order, Kuwait was free of Iraqi forces.  It was the beginning of the end for Saddam Hussein.

Desert Shield (the buildup) lasted a good 6 months. The question in those days was:  Why the delay?  We had our forces and those of allied nations in place relatively quickly. Why did 6 months elapse before we crossed the border into Kuwait to expel Saddam?

The true reasons for the lengthy delay may never be known, but I can tell you that a key component of our smart munitions delivery capability was not ready in August 1990. You all remember the dramatic videos…munitions being dropped directly down chimneys, one-drop hits, etc.  All that was made possible through laser-guided munitions (along with the bravery and skill of our fighting forces).

One of the key laser targeting devices was the Mast Mounted Sight, shown in the photo above.  It’s the thing that looks like a big basketball on top of the helicopter.

The Mast Mounted Sight contained a laser target designator, an infrared sensor, and a television sensor.  All were slaved to the pilot’s helmet.  Wherever the pilot looked, that’s where all three beams were supposed to point.  The Mast Mounted Sight had been in production and deployed on helicopters for years.  Everyone thought everything was fine.

But it wasn’t.

When the Desert Shield buildup started, the Army tested its Mast Mounted Sight systems a bit more rigorously, and it discovered what it and the manufacturer thought was an alignment error in the laser, IR, and television lines of sight.  This could have been disastrous.  It meant that the pilot might launch a missile based on the television or the IR sensor being on target, but the laser beam would guide the munition to the wrong spot.  If a miss occurred, it would alert the bad guys, and they could return fire against the helicopter.  Mind you, this system had been in production and deployed in the field for years.

The manufacturer went into high gear to find and fix the failure cause. The Mast Mounted Sight contains an internal alignment mechanism, which is supposed to align all three instruments (the laser, IR sensor, and the TV sensor).  The manufacturer spent the next 6 months looking for a problem in the MMS alignment subassembly.  They didn’t find anything.

Hold that thought.

Ever hear the joke about the drunk looking for his car keys at night under a street light?

It goes like this: I offered to help the drunk find his keys, and after we both searched for an hour, we came up empty-handed.

“Gee,” I said, “are you sure you dropped them here?”

“Oh, no,” responded the drunk. “I lost them over there, by those bushes in the dark…”

“Then why are you looking here under the street light?” I asked incredulously.

“Because I can see here,” he answered.

Many times when we have a production shutdown, or even a low-level recurring failure, the root cause is elusive. Production shutdowns get a lot of attention.  Recurring nonconformances frequently do not, but they can be just as expensive as a line-stopping failure (and sometimes more so).

So how do we go about finding the root cause of a failure?

Many years ago, the smartest man I ever knew once shared a simple four-step problem solving process with me.  It goes like this:

  • Define the problem
  • Define the causes
  • Define the solutions
  • Select the best solution

Where we usually go south when analyzing failures is with that first step: Defining the problem. Frequently, we start jumping to conclusions about potential causes without taking the time to fully understand the problem. The results are predictable: We spend lots of time chasing our tails, and the problem continues.

Need proof?  Try this exercise:  Tell your staff that you walked into a room, flipped the light switch, and the light did not illuminate.   Then ask them what the problem is.  In most cases, folks will immediately start listing potential failure causes: A broken filament, breaks in the wiring, a defective switch, failure to flip the switch properly, etc.  But those are all incorrect answers.

The question should be:  What is the problem?  That should be our first step.  In this case, the problem is that the light bulb does not illuminate.  All of the other suggestions listed above involved jumping to conclusions about potential causes.
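If it helps to see that discipline written down, here is a minimal sketch of the four steps applied to the light switch exercise. Everything in it is invented for illustration: the class names, the cost numbers, and the second switch fix are assumptions of mine, not part of any real tool. The point is only that the problem statement is recorded first and kept separate from the list of potential causes.

```python
from dataclasses import dataclass, field


@dataclass
class Solution:
    description: str
    fixes_cause: str   # the potential cause this solution addresses
    cost: float        # hypothetical relative cost; lower is better


@dataclass
class FailureAnalysis:
    # Step 1: the problem is what was observed, with no guess about why.
    problem: str
    # Step 2: potential causes, listed only after the problem is stated.
    causes: list[str] = field(default_factory=list)
    # Step 3: potential solutions, each tied to one listed cause.
    solutions: list[Solution] = field(default_factory=list)

    def add_cause(self, cause: str) -> None:
        self.causes.append(cause)

    def add_solution(self, solution: Solution) -> None:
        # Refuse solutions for causes nobody has written down yet.
        if solution.fixes_cause not in self.causes:
            raise ValueError(f"unknown cause: {solution.fixes_cause}")
        self.solutions.append(solution)

    def best_solution(self, confirmed_cause: str) -> Solution:
        # Step 4: among solutions for the confirmed cause, pick the cheapest.
        candidates = [s for s in self.solutions if s.fixes_cause == confirmed_cause]
        if not candidates:
            raise ValueError(f"no solution recorded for: {confirmed_cause}")
        return min(candidates, key=lambda s: s.cost)


analysis = FailureAnalysis(problem="The light does not illuminate when the switch is flipped")

for cause in ("broken filament", "break in the wiring", "defective switch"):
    analysis.add_cause(cause)

analysis.add_solution(Solution("Replace the bulb", "broken filament", cost=1))
analysis.add_solution(Solution("Repair the wiring", "break in the wiring", cost=50))
analysis.add_solution(Solution("Replace the switch", "defective switch", cost=10))
analysis.add_solution(Solution("Rewire around the switch", "defective switch", cost=40))

# Suppose testing confirmed the switch as the cause (an assumption for this example).
best = analysis.best_solution("defective switch")
print(best.description)
```

Notice that "broken filament" never appears in the problem field. It is a potential cause, and it stays on the causes list until the evidence confirms it or rules it out.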

Let’s turn back to the Mast Mounted Sight.  After several months of trying to find a failure cause in the MMS alignment mechanism, the failure analysis team finally decided to take a step back. They reviewed the test data again, and to their amazement, they found that the TV and the laser were aligned.  Only the IR sensor was out of alignment.  The failure analysis team had been solving the wrong problem.  Once the problem came into focus, the team looked outside the alignment mechanism, and they found an IR window heater anomaly.  The fix was a simple software patch.  It was implemented on 15 January 1991, and US troops rolled across the Kuwait border on 16 January 1991.

Would you like to know more about our fault-tree-based Root Cause Failure Analysis training program, or perhaps our book on Systems Failure Analysis?   Check out our Root Cause Failure Analysis page, and give us a call at 909 204 9984 if you would like to know more!
