Tuesday, March 29, 2011

What Is Six Sigma (and Why Should I Care)?

I'm an indirect casualty of the Borders bankruptcy. I did a proposal for something called The Manager's Answer Book, done at a publisher's request based on their idea, but by the time the publisher was ready to offer a contract, the standard advance had been cut by 60%, so I declined. I did three samples, and here's the first:

“There’s this guy in the office who keeps going on about Six Sigma and claims to be a green belt. What is Six Sigma and why should I care?”
- MD, Bethesda, Maryland

When quality is poor, you can lose money three ways. It costs money to do it wrong, it costs more money to find out that you did it wrong and have to fix it, and it costs still more money to do it over again. And that’s not to mention the risk of losing a customer in the process. The answer, for most companies, was inspection: catch the mistakes before they get out the door. That kept customers from getting bad products, but it didn’t do anything about the cost of making the bad products in the first place.

Enter Walter Shewhart, a quality control engineer who went to work for Western Electric, the manufacturing arm of the telephone monopoly, in 1918. In 1924, he wrote a revolutionary one-page memo outlining a new approach: using statistical tools to observe the process of manufacturing, with the goal of finding and correcting the causes of potential defects before they occurred. This was known as Statistical Process Control (SPC) or Statistical Quality Control (SQC), and it’s at the root of modern thinking about quality.

Shewhart was a major influence on physicist W. Edwards Deming, who adopted and promoted many of his ideas. After World War II, Deming worked for General Douglas MacArthur in Japan, where he famously trained Japanese engineers, managers, and executives (including Sony co-founder Akio Morita) in these techniques.

One of his students, Kaoru Ishikawa, took the teachings of Deming and another quality guru, J. M. Juran, and added customer satisfaction to the mix, calling the new hybrid Total Quality Management, or TQM. (Ishikawa developed one of the basic tools of TQM, the cause-and-effect analysis diagram, also called a fishbone diagram because it looks like the skeleton of a fish, and an Ishikawa diagram for obvious reasons.)

In 1980, NBC aired a documentary on Deming and his influence on the Japanese, called “If Japan Can… Why Can’t We?” and TQM fever took off in the United States. But as you can tell, quality philosophies don’t stand still. There’s the zero defects approach, total quality leadership (TQL), kaizen, kansei, business process reengineering (BPR), and more. Because any company could claim they had implemented TQM, along came ISO-9000, a standards-based way to certify internationally that you really had a workable quality program.

Enter Six Sigma, developed by Motorola in 1986. It’s been a highly successful and widely adopted strategy — and, of course, it has its detractors and critics.

The goals of Six Sigma are in line with other quality strategies: it uses statistics, it aims to reduce the causes of defects or errors, and it tries to minimize variability. “Six sigma” itself is a statistical term, representing fewer than 3.4 defects per million opportunities.

Six Sigma advocates claim that their approach does a better job of measuring financial return, promotes more passionate leadership, provides a trained cadre of champions (the “green belt” is an example), and makes decisions based on data instead of guesswork.  Critics claim there’s nothing new to see here, that Six Sigma is just TQM with karate belts.

In terms of results, Motorola claims to have saved over $17 billion from Six Sigma. Jack Welch at General Electric was another successful champion. However, a 2006 Fortune magazine article reported that 91 percent of companies that announced Six Sigma programs ended up trailing the S&P 500.

From your point of view, however, the controversy doesn’t matter. Six Sigma critics argue that it’s derivative, not that it’s wrong. If your company’s made an investment in the program, it’s good sense for you to get on board.

You need Six Sigma certification if you’re going to be doing Six Sigma-related projects. If those projects are alongside your regular duties, a green belt is sufficient. If you’re going full-time on Six Sigma projects, shoot for a black belt. If, on the other hand, Six Sigma activities are going on around you, but not in your area of the business, it may be enough for you to pick up the basic vocabulary and concepts — in other words, what you’re doing right now.

Of course, things don’t sit still. Cutting-edge companies now practice “lean six sigma,” combining Six Sigma with ideas of lean manufacturing, a technique to eliminate waste. New iterations are surely on management consulting drawing boards. When in doubt, cite tradition. “It’s all just warmed-over Shewhart, you know.”

Tuesday, March 22, 2011

The Square of Risk

In researching a chapter on risk triage for my book Creative Project Management, I came across a concept known as the PIVOT score. The elements of PIVOT are:

  • Probability — the likelihood a particular risk event will happen
  • Impact — the consequence of the risk event if it happens
  • Vulnerability — the relationship of the threat to core mission, values, and business objectives
  • Outrage — the expectation (E) of how things should be minus the degree of satisfaction (S) with the way things are.
  • Tolerance — the degree of enthusiasm or anger in response to the risk event impact if it happens.

Probability (P), impact (I), vulnerability (V), expectation (E), and satisfaction (S) each get a rating between 0 and 3. The formula for outrage (O) is:

O = E – S

And the formula for tolerance (T) is:

T = (P x (I + V))^O

Outrage, as you can see, is an exponent, so it has a disproportionate effect on the final PIVOT score. Let’s imagine the following:

An event is moderately unlikely (P = 1), has a very high impact (I = 3), the event relates to our core business objectives (V = 3), but it’s unlikely to get much publicity because people aren’t too surprised when it happens, so E – S is only 1. The PIVOT score is (1 x (3 + 3))^1, or 6.

Now imagine that the impact is actually low, but it’s the sort of thing that will be smeared all over the headlines and every commentator will talk about it (O = 3). The PIVOT score is (1 x (1 + 3))^3, or 64! Even though the actual impact in the first instance is three times that of the second case, the PIVOT score of the less serious impact is more than ten times as high as that of the more serious case.
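The scoring is simple enough to put into a few lines of code. Here’s a minimal sketch in Python (the function name is mine, and the E and S values in the two examples are ones I chose to yield O = 1 and O = 3; the text only specifies O = E – S):

```python
def pivot_score(p, i, v, e, s):
    """Compute a PIVOT risk score.

    P (probability), I (impact), V (vulnerability), E (expectation),
    and S (satisfaction) are each rated 0 to 3.
    Outrage O = E - S; the score is T = (P * (I + V)) ** O.
    """
    for name, value in (("P", p), ("I", i), ("V", v), ("E", e), ("S", s)):
        if not 0 <= value <= 3:
            raise ValueError(f"{name} must be between 0 and 3, got {value}")
    return (p * (i + v)) ** (e - s)

# The two cases from the text:
print(pivot_score(p=1, i=3, v=3, e=2, s=1))  # high impact, low outrage: 6
print(pivot_score(p=1, i=1, v=3, e=3, s=0))  # low impact, high outrage: 64
```

Run the two cases side by side and the exponent’s effect is obvious: a one-point swing in outrage matters far more than a two-point swing in impact.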

The impact of outrage on risk decisions tends to be disproportionate, especially when the outrage itself is the result of misinformation. Low impact risks take on catastrophic urgency and objectively more serious risks barely ripple the waters.

The confirmed death toll in Japan as I write is approaching 10,000, with the likely death toll predicted to top 18,000. Serious by any measure, but not outrageous because — hey, it was a huge tsunami and earthquake. Do you really expect all the safety procedures to be sufficient? Low outrage means not only less obsessive coverage, but also less pressure to improve safety.

The latest IAEA report I can find (March 17) lists a total of 44 injuries and no deaths. The UK Telegraph reports five workers dead, but I can’t confirm that, or whether they are part of or in addition to the 44. The level of relative outrage — expectation minus satisfaction — is off the wall.

Using outrage as the square (or higher power) of risk dramatically distorts decision-making. Do 9,000+ real deaths truly mean less than some uncounted but low number of potential deaths? In risk management practice, it often does. 

Where the outrage is, so goes the money and the effort. This is not always in our best interest.

From http://xkcd.com/radiation/.

Tuesday, March 15, 2011

Fukushima Number One

As reported in my article "Homer Simpson: Man of the Atom" in Trap Door magazine, I once got to run a nuclear reactor — admittedly, a low-power one used only for training students. This hardly makes me an authority on nuclear power, but I do know something about risk management.

Like many of you, I'm following the evolving Fukushima Dai-ichi Nuclear Power Station story with great interest. I'm a pro-nuclear safety conscious environmentalist, if that makes any sense. I think a lot of anti-nuclear sentiment is rooted in emotion rather than analysis, and contains the same anti-science bias that I object to so strongly when practiced by the right wing.

That doesn't make the case for nuclear power a slam dunk by any means. The downsides are obvious and substantial, and the tendency to rely on nuclear power generation to supply plutonium for other purposes has led to what seem to me to be false choices. I'm following with interest the discussion of thorium reactors, and I think the investment we're making in fusion is ridiculously low. That doesn't mean I don't like wind and solar as well. But all forms of power impose risks and costs.

The question in risk management isn't whether a proposed solution has drawbacks (technically known as secondary risks). Most proposed solutions, regardless of the problem under discussion, tend to have secondary risks and consequences.

The three questions about secondary risk that matter are:

  1. How acceptable is the secondary risk? The impact and likelihood of secondary risks can vary greatly. Some secondary risks are no big deal. We accept them and move on. Others are far more serious. A secondary risk can indeed turn out to be much greater than the primary risk would have been.
  2. How manageable is the secondary risk? A secondary risk, like a primary one, may be quite terrible if you don't do anything about it. The key word, of course, is "if." What can be done to manage or reduce the secondary risk? 
  3. How does the secondary risk compare to other options? As I've argued elsewhere, the management difference between "bad" and "worse" is often more important than the difference between good and bad. If the secondary risk of this solution is high, and if you can't do anything meaningful to reduce it, you still have to compare it to your other options, whatever they are.

In the case of nuclear power, the unmitigated secondary risk is unacceptably high. But all that does is demonstrate that the risk needs to be mitigated — reduced to some acceptable level. Ideally, that level is zero, but that may not be possible, and it may not be cost-effective to reduce it beyond a certain point. The leftover risk, whatever it is, is known as residual risk. Residual risk is what we need to worry about. As with secondary risk, the three questions of acceptability, manageability, and comparison help us judge the importance of the residual risk.

We make one set of risk decisions at the outset of the project. We decide which projects we want to do; we decide what overall direction and strategy we will follow; and we decide what resources to supply. All these decisions are informed by how people perceive the risk choices.

As the project evolves, the risk profile changes. Some things we worry about turn out to be non-issues, and other times we are blindsided with nasty surprises. Our initial risk decisions are seldom completely on target, so they must evolve over time.

When disaster strikes, suspicion automatically and naturally falls on the risk planning process. Were project owners and leaders prudent? Armed with the howitzer of 20-20 hindsight, the fact of what did happen carries a presumption of incompetent planning for those who failed to anticipate it. Sometimes it's a fair judgment. Other times not so much.

I'm still working out what I think about the Fukushima case, but some initial indications strike me as positive when it comes to evaluating the quality of the risk planning. The basic water-cooled design of Fukushima made a Chernobyl outcome impossible. The partial meltdown didn't rupture the containment vessel, and although the cleanup will be messy and expensive, it's not likely to spread outside the immediate area.

The effects of radiation may not be known for some time, but even those have to be put into perspective. Non-nuclear power plants, however, cost lives too, even though you don't hear about these disasters as often. A quick Google search turned up the following:

  • September 2010: Burnsville, Minnesota, explosion, no deaths
  • February 2010: Connecticut, 5 dead
  • February 2009: Milwaukee, 6 burned
  • June 2009: Mississauga, Ontario

And, of course, several thousand people a year die mining coal.

Tuesday, March 8, 2011

Schrödinger's Cat Walked Into a Bar — And Didn't

The famous story of the boxed cat who is simultaneously dead and alive was first proposed as a thought experiment by Austrian physicist Erwin Schrödinger in 1935. The cat came to quasi-life as part of an argument between Schrödinger and Albert Einstein concerning elements of the Copenhagen interpretation of quantum mechanics.

No cats, of course, were actually injured in the making of this theory.

In his 1935 article, Schrödinger wrote:

"One can even set up quite ridiculous cases. A cat is penned up in a steel chamber, along with the following device (which must be secured against direct interference by the cat): in a Geiger counter, there is a tiny bit of radioactive substance, so small that perhaps in the course of the hour, one of the atoms decays, but also, with equal probability, perhaps none; if it happens, the counter tube discharges, and through a relay releases a hammer that shatters a small flask of hydrocyanic acid. If one has left this entire system to itself for an hour, one would say that the cat still lives if meanwhile no atom has decayed. The psi-function of the entire system would express this by having in it the living and dead cat (pardon the expression) mixed or smeared out in equal parts."

“Ridiculous” is the tip-off. Schrödinger didn’t want us to take the cat — or the argument — seriously. But if you move from the realm of quantum mechanics to the realm of our macro reality, Schrödinger's Cat is far from ridiculous: it’s our everyday experience.

Imagine a call comes in from Cat Rescue HQ. That Schrödinger boy is at it again, locking yet another innocent kitty inside that infernal device. As you load up the van, what do you bring? Well, that depends on the state of the cat. So you bring some food and medicine, or a cat carrier — but just in case, you need to pack a pet-size body bag and some disposable gloves.

Operationally, you treat the cat as alive and dead up until the moment the sad (or happy) truth is revealed.

That’s risk management. You have to plan and prepare for a range of outcomes, treating each as in some sense real until the state collapses and time’s final verdict is rendered. It’s seldom wise to believe in a single deterministic future.

Tuesday, March 1, 2011

Four Dimensions of Risk

As a science fiction reader and alternate history writer, I’ve always lived in the future to some extent. From our time-bound perspective, the future is a wave front of uncertainty. Many things are possible, but ultimately only some things will happen.

Today, we have to make decisions about the future, and those decisions necessarily have to be made under conditions of uncertainty. That’s the domain of risk, the place where philosophy and statistics meet. Yesterday, I sent in the manuscript for my 24th book, Project Risk and Cost Analysis, for AMACOM’s self-study sourcebook line. It’s been a fascinating project.

Risk is future tense, as opposed to problem, which is present tense. Risks are events that have not yet happened. The events can be good for us, or bad for us. They can have great impact, or little impact. They can be more likely or less likely.

The risk environment changes over time. For example, there’s a lot of noise on the issue of climate change. Opponents argue that the science cannot say with certainty that the feared effects of climate change will happen. From a risk management perspective, that’s true, but it’s also completely irrelevant. Hardly anything in the future is really 100 percent (or, for that matter, zero percent) sure to happen. The measurement of a risk today is our estimate of its probability times our estimate of its impact if it happens (usually written R = P x I).
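The R = P x I calculation lends itself to a quick sketch in code. Here’s a minimal example in Python for ranking a risk register (the risks, probabilities, and dollar impacts below are invented for illustration):

```python
# Rank hypothetical risks by R = P x I (all figures invented for illustration).
risks = [
    # (description, probability 0-1, impact in dollars)
    ("key supplier fails",  0.10,   500_000),
    ("minor schedule slip", 0.60,    20_000),
    ("regulatory change",   0.08, 1_000_000),
]

# Sort by expected value, highest first.
for name, p, i in sorted(risks, key=lambda r: r[1] * r[2], reverse=True):
    print(f"{name}: R = ${p * i:,.0f}")
```

Note how the ranking can surprise: the low-probability regulatory change tops the list, because R weighs likelihood and impact together rather than either one alone.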

As time moves forward, our knowledge will change. Our estimate of the probability will increase or decrease. Our estimate of the potential impact will be refined. (Estimates of impact, by the way, tend to be more precise and have more agreement than estimates of probability. In the case of climate change, both sides agree on the claimed impact; what they disagree on is the likelihood of that impact occurring.)

And, by somewhere around the year 2050, the argument will go away completely. By then, it will be incontrovertibly clear what has happened. One or both sides will be proved wrong. Uncertainty will collapse; Schrödinger’s cat will be out of the box, alive or dead.

As we move through the life cycle of a project, our vision changes. All risks on a project eventually go away, either by becoming true (problem or good fortune), or by becoming false (no harm, no foul). At the same time, new risks swim into view as we navigate forward through the rocky stream of time.

The uncertainty of the future inevitably becomes the fact of the present.

Risk can be thought of along several dimensions:

  1. Goodness/Badness. In practice, risk is often used as a synonym for threat. But events can be beneficial or harmful, or a mixture. Sometimes you can choose.
  2. Impact. You can find a dollar bill on the sidewalk, or you can find a hundred dollar bill. Both qualify as opportunity. You can lose a dollar, or you can lose a hundred dollars. Both qualify as threats. The difference, in both cases, is impact.
  3. Probability. Most people carry a lot more ones than hundreds, and are more likely to miss and search for a lost hundred. There’s a greater chance of finding (or losing) the smaller amount.
  4. Time. What we knew yesterday is different from what we know today or will know tomorrow. The risks that should concern us, and the choices we can make, do not remain static.
  5. Remediation. What, if anything, can you do about it? What will it cost? The value of a risk all by itself doesn’t tell us much. Only when you compare the value of the risk with the cost of the risk response do you know the shape of the decision space.

In thinking about risk, put the risk into context — what’s its effect on you and others, and what’s the relative cost of the solution compared to the (risk-based) cost of the problem?