Companies often use sampling techniques to assess a production lot’s acceptability. You know the drill…you pull a specified sample size, and if all of the samples are acceptable, you buy the lot. If any are unacceptable, you reject the lot. This approach often works for components, assuming the sample represents the rest of the lot. But what about larger subassemblies or complete systems? Does it work for them, too?
Here’s the basic question: Is your acceptance testing approach consistent with your product’s required reliability?
This is an area where a lot of companies (and buying organizations) put themselves in a serious bind without realizing what they are doing. In the munitions game, for example, it’s pretty common to pull a specified sample and buy the lot if all of the samples go bang. The problem is that we think if a product’s reliability is high (say, 95%), we ought to be able to pull a sample and have them all work. That’s not the way it works in the real world, though. We can’t go with our intuition here; we have to evaluate the probability of passing the acceptance test more rigorously to assure that it is consistent with the required reliability of whatever it is we are testing.
I first ran into this at Aerojet when we were building munition fuzes. We were failing most of our lot acceptance tests, and we thought we had a pretty good product. The submunition had a 95% reliability requirement, and in live tests we showed we met that requirement. We routinely dropped bombs and had more than 95% of the submunitions detonate.
We had a lot acceptance requirement on the fuzes, however, that required firing a sample of 32 with no failures. We were only passing about one lot out of every five. What was going on?
What we didn’t realize (at least initially) is that there’s a fundamental difference between demonstrating a product’s reliability and passing a specified-sample-size test with zero failures. There’s a relationship between a product’s reliability and the probability of passing its acceptance test that can be shown with something called an operating characteristic curve. For that test I just described (n = 32, acc/rej = 0/1), the x-y plot below shows it clearly:
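Check the above plot, and you’ll see that with a product reliability of 95%, you’ll only pass the acceptance test about 20% of the time (and that was exactly what we were experiencing). With no failures allowed, all 32 samples have to work, and the probability of that is 0.95 raised to the 32nd power, or about 0.19.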
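For readers who want to check the number, here is a minimal Python sketch of the calculation behind an operating characteristic curve. It assumes each sample works independently with probability equal to the product reliability (the binomial model). The function name `prob_accept` and its parameters are illustrative only.

```python
from math import comb

def prob_accept(reliability, n, accept_max=0):
    """Probability that a sample of n has at most accept_max failures.

    Assumes independent samples, each working with probability `reliability`.
    """
    q = 1.0 - reliability  # chance that a single sample fails
    return sum(
        comb(n, k) * q**k * reliability**(n - k)
        for k in range(accept_max + 1)
    )

if __name__ == "__main__":
    # Tabulate a few points on the curve for the n = 32, accept-on-zero test.
    for r in (0.90, 0.95, 0.98, 0.99, 0.995):
        print(f"reliability {r:.3f}: P(pass) = {prob_accept(r, 32):.3f}")
```

At 95% reliability this gives roughly 0.19, in line with the one-in-five pass rate described above.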
When we explained this to our Air Force customer, they didn’t like what they were hearing, but they recognized and agreed with the mathematics. Ultimately, they modified the fuze acceptance test requirement so that it was consistent with the product’s required reliability. Don’t think that this allowed lower quality munitions to get into the inventory, either. That particular munition system routinely delivered reliability well in excess of its requirements, and during the 1991 Persian Gulf War, it was the munition that took out the bulk of Saddam Hussein’s Republican Guard tanks.
Another manufacturer was not so lucky. They manufactured flares for the US Navy, and they encountered precisely the same problem with precisely the same numbers. The Navy’s reliability requirement was 95% (which the flare met), but they imposed that same lot acceptance requirement (a sample size of 32 flares, accept on 0 failures, reject on 1 or more failures). Predictably, the company failed 80% of their lot acceptance tests. Unfortunately, in this case, neither the Navy nor the manufacturer realized what was happening.
I know about that second situation because I was an expert witness when the manufacturer sued the Navy. When I testified at the Armed Services Board of Contract Appeals, my task was to explain all of the above in a manner that lawyers and the trial judge could understand. In my experience, lawyers and judges don’t grasp probability and statistics concepts easily, so just stating that the situation was governed by the binomial distribution wasn’t going to cut it.
I went shopping the night before I testified and bought two bags of coffee beans (one with white beans, and one with brown beans). I put 5 white beans in a bag (representing unreliable product), and 95 brown beans in the same bag (representing product that would work). I stuck the bag in my pocket the next morning and went to court.
After explaining the binomial distribution, the nature of the relationship between a product’s reliability and the probability of passing a test, and the x-y plot you see above, I could see that the judge (who was a good guy) had glazed over. When I finished, I told the judge I could demonstrate the concept for him. I pulled the bag of coffee beans out of my pocket and explained the contents, and I offered to pull out 32 beans. The lights came on. The judge smiled. He told the Navy’s attorney to pull the beans out of the bag. The 17th bean was a white one, representing a flare that wouldn’t work (and a failed lot acceptance test).
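The bean demonstration can be checked numerically too, with one wrinkle: a bag of 100 beans is a finite population drawn without replacement, so the exact model is hypergeometric rather than binomial. The sketch below illustrates that calculation for the 5 white and 95 brown beans described above. The function name and arguments are assumptions made for the example.

```python
from math import comb

def prob_no_bad_in_draw(total, bad, draws):
    """Chance that `draws` items pulled without replacement contain no bad ones."""
    good = total - bad
    # Ways to pick every drawn item from the good ones, over all ways to draw.
    return comb(good, draws) / comb(total, draws)

if __name__ == "__main__":
    p = prob_no_bad_in_draw(total=100, bad=5, draws=32)
    print(f"P(32 beans, none white) = {p:.3f}")
```

For this bag the chance of pulling 32 beans with no white one comes out near 14%, a little lower than the binomial figure, so a white bean turning up before the 32nd draw was the likely outcome.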
It was a cool display, it was a deciding factor in the manufacturer winning its $25.4 million claim against the Navy, and that little demonstration was cited as one of the best Armed Services Board of Contract Appeals explanations that year.
So, think about this…when you specify (or agree to) a sample-based test, what’s the reliability of the thing you’re testing, and is it consistent with your test? If you are failing sample-based acceptance tests, you may simply have an overly stringent acceptance test. These kinds of evaluations sound complicated, but Excel makes them a lot easier than they used to be. The operating characteristic curve is one of the key concepts we should always consider in such situations, and it’s a key part of the root cause failure analysis training we offer.
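As one way to run that kind of evaluation without a spreadsheet, the sketch below searches for the smallest acceptance number that lets a product at its required reliability pass a fixed-size sample test with a chosen probability. The 90% target and the function names are assumptions for illustration, and this is not a description of how the Air Force revised its requirement.

```python
from math import comb

def prob_accept(reliability, n, accept_max):
    """Binomial probability of at most accept_max failures in n samples."""
    q = 1.0 - reliability
    return sum(
        comb(n, k) * q**k * reliability**(n - k)
        for k in range(accept_max + 1)
    )

def smallest_accept_number(n, reliability, target):
    """Smallest allowed failure count giving P(pass) >= target."""
    for c in range(n + 1):
        if prob_accept(reliability, n, c) >= target:
            return c
    return n  # fallback if rounding keeps the sum just under the target

if __name__ == "__main__":
    c = smallest_accept_number(n=32, reliability=0.95, target=0.90)
    print(f"accept on up to {c} failures in 32")
```

With n = 32, 95% reliability, and a 90% target, the search returns an acceptance number of 3. Loosening the acceptance number also raises the odds of accepting a weaker lot, so it is worth running `prob_accept` at the lowest reliability the buyer is willing to tolerate as well.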