Conclusion

The final model assumes a Binomial distribution for the outcome and uses temperature as the single explanatory variable. The model has the form \mathrm{logit}(\hat{\pi}) = \hat{\beta}_0 + \hat{\beta}_1 \mathrm{Temp}, where \hat{\pi} is the estimated probability of at least one O-ring failure per launch when the model is fit to the binary outcome, and the estimated probability of an individual O-ring failure when it is fit to the full outcome.

In the binary model, the intercept \hat{\beta}_0 was about 15 and the temperature coefficient \hat{\beta}_1 was about -0.2. Each 1°F decrease in temperature therefore multiplies the odds of at least one O-ring failure by about 1.3. Likewise, in the full model the intercept \hat{\beta}_0 was about 5.1 and the temperature coefficient \hat{\beta}_1 was about -0.1, so each 1°F decrease in temperature multiplies the odds of a single O-ring failing by about 1.1.
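The odds multipliers quoted above follow from exponentiating the negated temperature coefficient. The short sketch below illustrates that arithmetic. Because this section reports only rounded coefficients, the slopes used here (-0.2322 and -0.1156) are assumed unrounded values chosen to be consistent with the rounded figures in the text; they are not quoted from the fitted model output, and the function name is hypothetical.

```python
import math

# ASSUMPTION: unrounded slopes consistent with the rounded values in the text
# ("about -0.2" and "about -0.1"); not taken from the section itself.
BETA1_BINARY = -0.2322  # model for at least one O-ring failure per launch
BETA1_FULL = -0.1156    # model for an individual O-ring failure


def odds_multiplier_per_drop(beta1: float, degrees_f: float = 1.0) -> float:
    """Factor by which the odds change when temperature falls by `degrees_f`."""
    # A decrease of d degrees changes the log-odds by -beta1 * d.
    return math.exp(-beta1 * degrees_f)


or_binary = odds_multiplier_per_drop(BETA1_BINARY)
or_full = odds_multiplier_per_drop(BETA1_FULL)

print(f"binary model: odds multiplied by {or_binary:.2f} per 1°F drop")
print(f"full model:   odds multiplied by {or_full:.2f} per 1°F drop")
```

Passing a larger `degrees_f` gives the multiplier for a bigger temperature drop, which compounds rather than adds.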

This is a key finding. It underscores the importance of the atmospheric temperature on the day of the launch and shows that lower temperatures increase the odds of an O-ring failure. Dalal et al. support this when they state that the resiliency of an O-ring is highly dependent on temperature.

On the morning of the Challenger’s final launch, the temperature was 36°F. At this temperature, the predicted probability of at least one O-ring failure is 99.9%, and each individual O-ring has a predicted probability of failure of 71.6%. This corresponds to about four of the six primary O-rings being expected to fail. As shown previously, the Wald, LR, and bootstrapped 95% confidence intervals are wide, ranging from only one to all six O-rings failing, because no launch had ever been attempted at such a low temperature.
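The predictions at 36°F can be reproduced by applying the inverse logit to each model's linear predictor. The following is a minimal sketch rather than the original analysis code. The coefficient pairs are assumed unrounded values consistent with the rounded estimates reported earlier, and the names `BINARY_MODEL`, `FULL_MODEL`, and `predicted_probability` are hypothetical.

```python
import math

# ASSUMPTION: unrounded (intercept, slope) pairs consistent with the rounded
# estimates in the text; the section itself gives only "about 15 / -0.2"
# and "about 5.1 / -0.1".
BINARY_MODEL = (15.0429, -0.2322)  # at least one O-ring failure per launch
FULL_MODEL = (5.0850, -0.1156)     # individual O-ring failure
N_PRIMARY_ORINGS = 6


def predicted_probability(coefs: tuple[float, float], temp_f: float) -> float:
    """Inverse logit of beta0 + beta1 * temperature."""
    beta0, beta1 = coefs
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * temp_f)))


launch_temp_f = 36.0
p_any = predicted_probability(BINARY_MODEL, launch_temp_f)
p_each = predicted_probability(FULL_MODEL, launch_temp_f)
expected_failures = N_PRIMARY_ORINGS * p_each  # mean of Binomial(6, p_each)

print(f"P(at least one failure) = {p_any:.3f}")
print(f"P(individual failure)   = {p_each:.3f}")
print(f"expected failures of 6  = {expected_failures:.1f}")
```

These are point estimates only; they say nothing about the width of the confidence intervals, which is the main caveat when extrapolating this far below the observed temperatures.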

The original analysis by Morton Thiokol omitted all launches in which no O-ring failure was observed, under the assumption that they had nothing to offer for understanding the effect of temperature on O-rings. This reduced the number of data points from 23 to only six. More importantly, it is fundamentally impossible to understand why some O-rings fail while others don’t when only the failed O-rings are considered rather than the entire group. Had this been corrected, the relationship between temperature and O-ring damage could have been found with the available data.

Dalal et al. went on to estimate the probability of secondary O-ring failure with additional data. They found that, given failure in a primary O-ring, a secondary O-ring had a Bayesian probability of failure of p \approx 0.286 \times 0.292 \approx 8.4%. This is the conditional probability of erosion multiplied by the conditional probability of blow-by in a secondary O-ring, similar to what could occur in a primary O-ring. Using this value, the full model’s estimate of a catastrophic failure for the Challenger launch would be 0.084 \times 4 \approx 34%. This assumes that only one of the four secondary O-rings behind the failed primary O-rings needed to fail.
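The arithmetic behind these two figures can be sketched as follows. The two conditional probabilities and the count of four failed primary O-rings come from the text; the variable names are hypothetical. The last calculation is an addition not made in the text: it assumes the four secondary O-rings fail independently and computes the probability that at least one does, as a point of comparison for the simple multiplication used above.

```python
# Values quoted in the text (attributed there to Dalal et al.).
P_SECONDARY_EROSION = 0.286               # conditional probability of erosion
P_SECONDARY_BLOWBY_GIVEN_EROSION = 0.292  # conditional probability of blow-by
N_FAILED_PRIMARY = 4                      # expected failed primary O-rings at 36°F

# Chain the two conditional probabilities for one secondary O-ring.
p_secondary_failure = P_SECONDARY_EROSION * P_SECONDARY_BLOWBY_GIVEN_EROSION

# The text's estimate: per-ring probability times the number of exposed rings.
linear_estimate = N_FAILED_PRIMARY * p_secondary_failure

# ASSUMPTION (not in the text): secondary failures are independent, so the
# chance that at least one of the four fails is one minus the chance none do.
independent_estimate = 1.0 - (1.0 - p_secondary_failure) ** N_FAILED_PRIMARY

print(f"P(secondary failure | primary failure) = {p_secondary_failure:.3f}")
print(f"linear estimate of catastrophe         = {linear_estimate:.3f}")
print(f"at-least-one, assuming independence    = {independent_estimate:.3f}")
```

The multiplication acts as an upper bound on the at-least-one probability, so under the independence assumption the estimate would come out somewhat lower, though still of the same order.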

Certain engineers within Morton Thiokol had a similar intuition and suggested that this relationship was important, but their opinions were dismissed. Using the model developed with the company’s method of dropping launches without O-ring damage, the probability of catastrophe would only be estimated at 10.5%. This estimate also has no strong dependence on temperature, as shown in Figure 5, and would be nearly constant for any launch at any temperature. Had this statistical mistake been corrected, it would have been possible to predict that a launch at the temperature of the Challenger disaster was very likely to end in a catastrophic failure compared to previous launches.

Had the launch been postponed to a temperature of 60°F, the average for Cape Canaveral in January 2022, the probability of an individual O-ring failure would have dropped to 13.6% and the expected number of O-ring failures for the launch to about 0.8. Since this value is less than one, no primary O-ring would be expected to fail. In this scenario, the probability of a catastrophe would only have been 0.084 \times 0.8 \approx 6.7%.
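The 60°F scenario combines the full model's prediction with the 8.4% secondary-failure probability quoted earlier. The sketch below bundles those steps into a hypothetical helper, `launch_scenario`, so that other temperatures can be tried. The coefficients are assumed unrounded values consistent with the rounded estimates in the text, and the catastrophe figure follows the text's own multiplication rather than a formal probability model.

```python
import math

# ASSUMPTION: unrounded full-model coefficients consistent with the rounded
# values in the text ("about 5.1" and "about -0.1").
FULL_MODEL_INTERCEPT = 5.0850
FULL_MODEL_SLOPE = -0.1156
N_PRIMARY_ORINGS = 6
P_SECONDARY_FAILURE = 0.084  # quoted in the text


def launch_scenario(temp_f: float) -> dict[str, float]:
    """Full-model predictions and the text's catastrophe arithmetic at temp_f."""
    log_odds = FULL_MODEL_INTERCEPT + FULL_MODEL_SLOPE * temp_f
    p_individual = 1.0 / (1.0 + math.exp(-log_odds))
    expected_failures = N_PRIMARY_ORINGS * p_individual
    return {
        "p_individual": p_individual,
        "expected_failures": expected_failures,
        # Same multiplication the text uses: 0.084 x expected primary failures.
        "catastrophe_estimate": P_SECONDARY_FAILURE * expected_failures,
    }


scenario = launch_scenario(60.0)
for name, value in scenario.items():
    print(f"{name}: {value:.3f}")
```

The catastrophe estimate here differs slightly in the third decimal from the figure in the text, because the text rounds the expected number of failures to 0.8 before multiplying.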

Even at a warmer launch temperature, these estimates could still be considered too high to be an acceptable risk. Had this flaw been addressed earlier, work could have proceeded to make the launch system safer and more reliable. Ultimately, this catastrophe was avoidable had the program analyzed the data appropriately by factoring in launches both with and without O-ring failures, and had it recognized that the O-ring flaw was inherent in the Shuttle design up to this point.

All of these estimates are qualified by the fact that the original dataset only indicated the number of “O-ring distress events” without specifying whether erosion alone or both erosion and blow-by were observed. The latter may be considered a full O-ring failure, while the former is not. Because all indicated O-ring distress events were assumed to be failures, these estimates overstate the probability of catastrophe. Specific information on whether primary O-rings showed erosion alone or both erosion and blow-by would result in a more accurate estimate. In addition, analyzing a past event in hindsight may make the proper course of action seem obvious. However, the simple statistical methods shown here, and the common criteria for evaluating them, should have been the proper approach no matter the time or context.

Challenger's Final Crew