The Space Shuttle Challenger
The loss of the space shuttle Challenger is a
good example of meta-failure. The Challenger was destroyed in 1986 because
an o-ring seal failed in one of its solid fuel boosters. This allowed
flames from the booster to burn into the shuttle’s main fuel tank. The
tank exploded shortly thereafter. All seven crewmembers on board the shuttle
were killed. Basically, four factors played a role in causing the accident.
First, were the higher leaders at NASA aware
of the costs of their decisions?
The leadership at NASA had decided on a tough
launch schedule for the shuttles, perhaps in response to political pressures,
and perhaps without considering the real costs of the schedule. Understand,
in any bureaucracy, requirements and attaboys come down, and yes sirs and
flattery go up. People lower down the ladder adjust data to please their
superiors. And sometimes, the people at the top let their subordinates
know that they want to be pleased. After several levels of this, data can
be remarkably distorted. So, the real costs of the schedule may not have
been considered. Also, there might have been adverse consequences for anyone
bringing up issues that could potentially set back the schedule. Remember,
in any bureaucracy, people who won't “go with the program” tend to be regarded
as “strange,” or even as “disloyal.” In any case, the investigation of
the accident revealed that the schedule had put severe strains on the shuttle
program.
Second, was anyone keeping track of safety
data?
Several years before the accident an investigation
of recovered solid fuel boosters found that the o-ring seals used on the
boosters tended to be burned when the shuttles were launched in cool (less
than 50°F) weather. Understand, the solid fuel boosters used on the
shuttle are composed of several sections connected together. The sections
are sealed by o-rings in their joints to prevent the hot gases inside
the booster from leaking out while the motors burn. After the motors burn
out, they drop off the Shuttle and are recovered by parachute. In the investigation,
the researchers found that the o-rings were leaking because they lost resiliency
when cold. In some cases (prior to Challenger’s ill-fated launch) the o-rings
had burnt through during liftoff. However, in cases where the seals had
failed, it happened late enough in the solid rocket’s “burn” to not endanger
the missions. On the day of the launch of the Challenger no one was aware
of this research. Someone should have been assigned to collect and analyze
this data.
Third, dangerous political snags
Part of Challenger’s last mission was overtly
political: the launch of Christa McAuliffe (the first teacher in space)
into orbit. So, there were politicians and newspeople waiting to chat with
Christa McAuliffe once she was in orbit. And, of course, there were administrators
at NASA determined to make sure this happened without any glitches. Delaying
Mrs. McAuliffe’s flight would have had negative political consequences
for managers at NASA. Understand, there was nothing new about this. For
example, the worst accident in the history of space exploration occurred
on October 24, 1960. About 100 people were killed in the launch pad explosion
of a Soviet rocket (the Soviet Union never released the details; this account
is based on James Oberg’s books, “Red Star in Orbit” and “Uncovering
Soviet Disasters”). What happened was that a glitch in the rocket caused
an abort just before launch. Safety regulations required that the rocket
be de-fueled prior to being (safely) repaired. But that would have put
the launch way behind schedule. The manager at the site, Field Marshal
Mitrofan Nedelin, decided to gamble on leaving the volatile fuel in the
rocket while it was being repaired, hoping this would prevent any serious
delay. He lost (both the rocket and his life). Why did he decide to put
so many lives at risk? Maybe he decided to do this rather than risk
the political consequences of failure. A few weeks earlier, Nikita Khrushchev,
then the leader of the Soviet Union, gave a speech at the United Nations
hinting at new Soviet space exploits in the near future. Back then, satellites
were the latest thing in high technology, and the Soviet Union was out
to show the world that it was at the forefront of this technology. The
plan was to launch two Mars probes shortly after the speech. Both probes
failed shortly after launch due to technical faults. Naturally, there were
political repercussions. Field Marshal Nedelin might have been worried
about the official reaction to his launch delay, coming as it did on top
of two major failures. Could anything have been done to prevent disaster?
Yes (they could have been more careful), and no. Unfortunately, as long
as pioneering space missions are played out on television, they will be
the stuff of politics. As long as the funding comes from politicians, just
“saying no” to politics is not possible. So, Christa McAuliffe was inevitable.
The accident that took her life wasn’t.
Fourth, the launch was outside of the engineering
specifications
The last launch of the Challenger happened during
one of the worst cold snaps to hit Florida on record. I don’t think anyone
anticipated the shuttle would be launched in weather that cold. After all,
the launch pad was located in Florida, and the only alternate was Vandenberg
in southern California. Now, some of the engineers at NASA, and Morton
Thiokol (the contractor who built the solid fuel boosters, now called Thiokol
due to a corporate split up), knew there was a danger in launching the
shuttle in weather so cold. But, no one told the managers at NASA (perhaps
because they had to “go with the program”). Also, it might have been that
the leaders at NASA had let it be known that they didn't want to hear bad
news. But, that aside, did anyone realize that there was a risk simply
because the weather was so cold, and that launching in weather that cold had
never been anticipated? In the end, there was Christa McAuliffe (the first teacher
in space), who had to be launched on time. So, they crossed their fingers
and let the shuttle be launched; and seven people died in an accident that
was totally preventable.
What could have been done to save the Challenger?
There was not any one thing that caused the Challenger
accident. The accident was caused by a combination of factors that were
in place years before the failed launch. While in the very short run the
accident can be blamed on defective o-ring seals, in the long run it was
caused by a series of meta-failures that all came together on the day of
the launch. But, even as late as the day of the launch, there were things
that could have saved the shuttle Challenger, if only someone had been
able to convince the managers at NASA that there was a danger in launching
the Shuttle.
In the short term, emergency action
The launch of the shuttle could have been delayed,
and a plausible excuse invented for the news media. Perhaps the icicles
hanging off the shuttle, tower, gantries, and just about everything else
(this was one of the worst freezes on record) could have been invoked as
a potential danger. For example, what if ice froze to the ceramic tiles
(used as a heat shield during reentry) and caused some of them to break
off during launch? Actually, this was a serious possibility, and there
were other dangers as well. Also, this would have had the right impact
on the news reporters and politicians who were waiting to chat with Christa
McAuliffe in orbit. Or, someone could have brought up, anonymously, the
possibility of rainwater freezing inside the joints of the solid rockets.
While in engineering terms, this may not have been a serious possibility,
it would have provided a simple reason for delaying the launch,
while heading off an embarrassing deeper look. And, a simple reason could
have been given, “While we think this hypothesis is totally implausible,
we have to investigate it just to be safe.” Finally, they could have simply
told the truth, that there was a problem with the temperature response
of the o-rings. And, a simple reason could have been given to explain why
the o-ring problem had been ignored, “We never expected to have a shuttle
launch in weather this cold.” The problem was that the truth might have
embarrassed NASA. So, finding a deceptive explanation might have been more
bureaucratically feasible. Note, while I think the morality of deception
is suspect in all cases, in this case deception would have been necessary
to save lives, because of the culture of the bureaucracy.
In the medium term, an engineering fix
The faulty o-ring seals could have been fixed
years before the accident, if only someone had mentioned (without appearing
“strange”) the problem to a manager with enough seniority to get money
allocated for a fix. The modifications to the seals could have been done
quietly, and relatively inexpensively. And, if anyone asked, the work could
have been explained as, “Making an ‘ultrasafe’ shuttle orbiter even safer.”
In the long run, design and policy
Twenty years before the accident, Morton Thiokol
could have been asked to build a plant either in Florida (next to Kennedy
Space Center), or at a site that allowed the boosters to be shipped by
barge. Why? Because the reason the joints and the o-ring seals were in
the solid fuel boosters in the first place was that the manufacturing
plant was located in Utah.
Because of the plant’s location the boosters had to be broken down into
little sections and shipped to Florida by rail, then assembled prior to
launch. If the Thiokol plant had been located on, say, the Missouri River,
then the boosters could have been built in big pieces and shipped by barge
to the Kennedy Space Center (via the Mississippi, and the Intracoastal
Waterway). Also, there are safety issues in shipping by rail. Sometimes
vagrants ride on railroad cars. Sometimes railroad shipments are vandalized.
And, sometimes fools shoot guns at trains. These sorts of things are much
less likely to happen to a barge in the middle of a river. Ultimately,
the reason a new plant was not built was budgetary. Congress had limited
the funds NASA had available for the Shuttle. And, Morton Thiokol was,
after all, the low bid contractor (and maybe the only
contractor capable of building those boosters). A properly located
shuttle plant would have been expensive, and would have duplicated the
other plants Thiokol already had for making military rockets. Understand,
most of Thiokol’s rocket business was (and is) making military boosters.
So, budgetary logic prevailed, rather than engineering sense.
Conclusion
The Challenger accident cannot be blamed on “pilot
error,” or other forms of human error, or software failure, or corrupt
contractors and politicians. It was caused by a series of meta-failures
that came together in a way that made the accident inevitable. There was
that tough launch schedule that had to be maintained. The faulty o-ring
seals could have been fixed years before. And, there was Christa McAuliffe,
the first teacher in space, who had to be launched on time. So, if you
ask, “Who is to blame for this accident?” it’s hard to say. Choose your
suspect: everyone, no one, the system. Bureaucracy specializes in distributing
responsibility this way. In the last analysis, long-term political and
budgetary decisions, in some cases made decades before the accident, and
short-term bureaucratic behaviors made the shuttle accident inevitable.
Preventing this in the future
I think there is something that can be done to prevent these sorts
of accidents in the future. NASA should create a group of engineers to
independently compile and study safety-related data. The engineers should
have the political clout, independence, and access to get their jobs done
quietly. They should have the power to independently review plans and programs.
They should only be accountable to NASA’s chief administrator. And, they
should have the power to intervene when they find potential problems. Bureaucracy
being what it is, some at NASA will object. Understand, in the government
your career can be blown at any seam, even if you are not really responsible
(it’s called the zero defects mentality). So, people in government have
a natural fear of outsiders coming in and nosing around. Understand, a
bureaucracy is a tight community, like a tribe, so there is a collective
psychology at work. If something is a threat to you, it’s a threat to me.
If you mess up, you bring shame on us all. Bureaucracy is an emotional
amplifier: things like shame, paranoia, or prudery can become dominant
motivators. To make things worse, bureaucracy is a regulator, and tends
to see itself as a collective parent. So, ideally, the bureaucracy should
be an omniscient (all-knowing) parent, and the regulatee should behave
like a three-year-old child. This is why bureaucracies have a hard time
regulating themselves: self-regulation violates the unwritten
rules that guide all human communities. How can you, as a member of the
community, treat other senior members like three-year-olds? It just isn’t
done. And admitting a mistake is admitting that the system (your tribe)
is not omniscient (that brings everyone down). Now, NASA launches space
missions that cost the taxpayers hundreds of millions of dollars. And,
that makes NASA its own safety regulator. And, that is the problem. That’s
why there must be independent safety oversight. And, that’s why faults
have to be found and fixed quietly, if possible. But, whether these things
are found and fixed quietly or not is a secondary issue. These accidents
are very expensive, economically,
politically, and sometimes in human lives. Preventing these accidents must
be a primary goal at NASA.
I hope you enjoyed reading this.
The Pentagon should be famous as a creator of
monopolies. The reasons are simple. First, military technology is so expensive,
and capital intensive, that only the richest firms can afford to compete.
Second, this technology is so esoteric, that only a few people really understand
it. Third, the military’s contracting practices are so competitive, that
the losers are driven out of the business entirely. Now, if you have the
plant, and have the people who can do the technology, and have won the
last few rounds of bidding, then when the military wants something, they
come knocking on your door (and pay your price). That’s right. If they
need the technology, then they need you. Why? Because the military’s system
of “competitive” bidding created a situation where there is no effective bidding
at all. Because, the system let you own the technology. Now, the government
may try to hold a competition for the next big contract, but it will be
a sham. Why? Because your competitors, if they win, still have to build
a plant, and find experienced people, and negotiate with you for the right
to use your patents. So, even if you lose, you win. And, if something goes
wrong with the technology you sold the military, there isn’t an awful lot
they can do about it (as long as the defect isn’t too flagrant). Why? Because,
they still need you (and, of course, have to pay your price) if they want
to replace the broken stuff you sold them. There are two possible solutions
to this problem. First, Congress could nationalize your company (and
pay you and your stockholders). Or, Congress could change the way
contracting is done for big ticket technology. The contracting system could
be redesigned so that no one loses too much. That is, let the winners become
the main contractors, and the losers become the subcontractors. That way,
monopoly can be prevented because several firms would always have a hand
in the technology. Also, firms that do business with the Pentagon might
be required to participate in a patent sharing system with other Pentagon
contractors. That is, firm A gets to use firm B’s patents (for military
work only) provided it pays a standard percentage royalty. That way, firm
B is compensated for its creative work, firm A gets to use the patents
for a reasonable rate, and monopolization is prevented. Some economic pundits
might have fits with these proposals. But the issue here is simple; either
have a contracting system that works at least halfway, or a system that
doesn’t work at all. The choice is between a system that fosters creative
competition, or a system that inevitably creates expensive monopolies of
either the capitalist or the socialist variety.
What if?
What if the engineers had someone to go to, anonymously,
outside of management, with the power to act? What if someone had been
in the position to independently look for possible problems in the months
before the probe was lost? Perhaps the fact that the thruster data from
the probe was being reported in pounds, but newtons were being used to
calculate the course corrections might have been noticed. Perhaps in time
for something to be done. But, that would have required someone empowered
to ask embarrassing questions.
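To make the pounds-versus-newtons mix-up concrete, here is a minimal sketch in Python of one way such a mismatch can be caught: every thrust figure carries its unit, and nothing is added up until it has been converted to newtons. This is only an illustration. The names (Thrust, to_newtons, total_thrust_newtons) are hypothetical and are not taken from any actual NASA or contractor software.

```python
from dataclasses import dataclass

# One pound-force expressed in newtons (exact by definition).
LBF_TO_NEWTON = 4.4482216152605


@dataclass(frozen=True)
class Thrust:
    """A thrust figure that always carries its unit (hypothetical type)."""
    value: float
    unit: str  # expected to be "N" or "lbf"

    def to_newtons(self) -> float:
        # Convert explicitly; refuse to guess when the unit is unknown.
        if self.unit == "N":
            return self.value
        if self.unit == "lbf":
            return self.value * LBF_TO_NEWTON
        raise ValueError(f"unknown thrust unit: {self.unit!r}")


def total_thrust_newtons(readings):
    """Sum thrust readings after converting each one to newtons."""
    return sum(r.to_newtons() for r in readings)


if __name__ == "__main__":
    readings = [Thrust(1.0, "lbf"), Thrust(1.0, "N")]
    print(total_thrust_newtons(readings))
```

A bare number gives no hint about its unit. A number that has to declare its unit gives someone (or some program) the chance to ask the embarrassing question.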
Books
Red Star in Orbit, by James Oberg, published
by Random House in 1981
Uncovering Soviet Disasters, by James Oberg,
published by Random House in 1988
Magazine Articles
“Why The Mars Probe Went Off Course,” by James
Oberg, in Spectrum Magazine, December 1999
Internet
Mark Wade's Encyclopedia Astronautica, at http://www.rocketry.com/mwade/spaceflt.htm
December 3, 1999 - May 30, 2000