Thank you for that kind introduction, and thank you for inviting me to speak on behalf of the National Transportation Safety Board.
Since most of you are involved in one way or another with NextGen, I thought this would be an excellent opportunity to talk about how the NTSB’s accident investigation experience can help inform the process regarding two major NextGen activities. The first is introducing new technologies into the NAS, and the second, which is broader and far more challenging, is introducing more automation into the NAS.
Before I do that, let me briefly describe what the NTSB does. As many of you know, the NTSB is an independent federal agency that investigates accidents in all modes of transportation, determines what caused those accidents, and makes safety recommendations to prevent recurrences. Our primary output is recommendations that are based upon our accident investigations – recommendations regarding improvements that could help avoid those accidents in the future. Because we are not regulators, however, we cannot require anyone to implement the recommendations. Nonetheless, contrary to what the media would have you believe, recipients of our recommendations respond favorably to them more than 80 percent of the time. That largely favorable response is a tribute to the quality of the work of our investigators and analysts.
So how can we help regarding NextGen, given that we haven’t yet investigated any accidents attributable to NextGen improvements? What we bring to the table is a wealth of experience from our accident investigations regarding both issues I mentioned – introducing advanced technologies into complex systems, and introducing more automation. We know from our investigation experience how challenging it can be to make changes in complex systems – and the NAS certainly qualifies as a complex system.
The aviation industry has a long history of demonstrating that one of the best ways to make changes successfully in complex systems is by using a collaborative process that brings everyone to the table who has a “dog in the fight.” The collaborative process is called CAST, the Commercial Aviation Safety Team, a process in which many of you have participated. The objective of the collaboration is to accomplish what I refer to as “System Think.” In a complex system of subsystems that are coupled together, System Think means understanding how a change in one subsystem will affect the other subsystems.
Many of you probably know how CAST began. In the early 1990s, after the industry’s accident rate had been declining for decades, the rate began to flatten on a plateau. Meanwhile, the FAA was predicting that the volume of flying would double in 15-20 years. The industry became very concerned that if the volume doubled while the accident rate remained the same, the public would see twice as many airplane crashes on the news. That caused the industry to do something that, to my knowledge, has never been done before or since at an industry-wide level in any other industry – they pursued a voluntary collaborative industry-wide approach to improving safety.
This occurred largely because David Hinson, who was then Administrator of the FAA, realized that the way to get off the plateau was not more regulations or a bigger stick for the regulator, but figuring out a better way to improve safety in a complex system.
The voluntary collaborative CAST process brings all of the players – airlines, manufacturers, pilots, air traffic controllers, and the regulator – to the table to do four things: Identify the potential safety issues; prioritize those issues, because more issues would be identified than they had resources to address; develop interventions for the prioritized issues; and evaluate whether the interventions are working without causing unintended consequences.
This process has been an amazing success. It resulted in a reduction of the aviation fatality rate, from the plateau on which it was stuck, by more than 80% in less than 10 years. This occurred despite the fact that the plateau was already considered to be exemplary, and many thought that the rate could not decline much further.
The process improved not only safety but also productivity, which flew in the face of conventional wisdom that improving safety generally decreases productivity, and vice versa. In addition, a major challenge of making improvements in complex systems is the possibility of unintended consequences; yet this process has generated very few unintended consequences. Last but not least, the success occurred largely without generating new regulations.
The moral of this collaboration success story is very simple: Everyone who is involved in a problem should be involved in developing the solution.
I am pleased to note that the NTSB now serves on CAST in a non-voting role. All of our accident reports are public and most CAST participants are generally familiar with them. When we participate in person, however, we can also inform the process about issues that we uncovered during an investigation that may not have been included in the accident report because they played no role in that particular accident. We can also inform the process with trends that we have observed over the years. We can contribute significantly to CAST because we have our eyes on the same prize – improving aviation safety.
Given how successful the CAST collaboration has been, I am pleased to see that NextGen is being implemented using a collaborative process in the form of the NextGen Advisory Committee. What I would like to discuss today is a variation on that collaboration theme that is not industry-wide, but is more narrowly focused at the manufacturer level.
Makers of transport airplanes learned long ago that they could make better airplanes if the end-users, i.e., the pilots, were involved in the development process early in the design phase. In addition, they realized that the airplane must be easy to maintain in order to last long enough to be economically viable, so they also brought in mechanics early in the design phase. Finally, recognizing the importance of the airplane fitting into the air traffic system, the manufacturers included air traffic controllers early in the design phase.
How much are manufacturers other than those who make transport airplanes also using such a collaborative process? To what extent have some of the exciting air traffic control improvement innovations, such as space-based ADS-B, data comm, or remote control towers, had the benefit of a collaborative process involving end-users? I raise this question because the publicity about new types of air traffic control equipment sometimes notes that air traffic controllers don’t like it. That makes me wonder how much the controllers, as the end-users of the equipment, were included in the design process. In addition, in situations in which the new equipment is also going to affect pilots, I wonder how much pilots were brought into the fold during the design, analogous to the way the aircraft manufacturers have included air traffic controllers.
That brings me to how the NTSB can help. We can inform the process with information about accidents we have investigated in which the design of the equipment did not adequately consider the needs and circumstances of the end users.
Now I would like to move from new technologies to the second area in which the NTSB can help – the analogous but much broader area of increasing automation. As stated by Prof. James Reason, world renowned human factors expert:
“In their efforts to compensate for the unreliability of human performance, the designers of automated control systems have unwittingly created opportunities for new error types that can be even more serious than those they were seeking to avoid.”
Stated another way, the good news is that there is more automation; but the bad news is that there is more automation. Airline pilots have a long history of transitioning to increasingly automated operations, and the same trend is occurring in our air traffic control systems. Experience has shown that automation can improve safety, reliability, efficiency, and productivity. Experience has also shown that there can be a downside.
In theory, if automation replaces the human operator, there will be no human error. Removing the human operator would also address at least four issues on the NTSB’s Most Wanted List of Transportation Safety Improvements – fatigue; distractions; impairment; and medical fitness for duty. Unfortunately, the theory does not reflect operational reality in several respects.
The first problem is that the theory assumes that the automation is working as designed. But what if the automation behaves differently than designed or fails altogether? We learned more than 100 years ago to stop saying “That ship can’t sink,” and we know now not to say “That automation can’t fail.” Will it fail in a way that is safe? If it cannot be guaranteed to fail in a way that is safe, will it inform the human operator of the failure in a timely manner, and will that operator then be able to take over successfully? The fatal Metro accident that occurred here in Washington in 2009 resulted from a failure in the automation that the operator was unaware of until it was too late.
Another problem is that the theory fails to address what happens when the automation encounters unanticipated situations. The automated system’s capability is only as complete as its programming. If a situation has never been encountered before, and was not imagined by the programmers, the automation might be unable to respond appropriately.
Last but not least, removing the controller does not remove other sources of human error. Humans are still involved in designing, manufacturing, and maintaining the various systems; and human error in these steps is likely to be more systemic in its effect and more difficult to find and correct. For example, we investigated a collision of a driverless airport people mover that resulted in part from improper maintenance.
The most fundamental lesson that we have learned from our accident investigation experience is that introducing automation into complex human-centric systems can be very challenging. The problems that we have seen thus far from increasing automation include increasing complexity, degradation of skills, complacency, and the potential loss of professionalism.
Taking an example from flight operations, in the 2013 crash of Asiana flight 214 in San Francisco, the pilots became confused by the behavior of the airplane’s automated systems. The pilot’s mode selection caused the autothrottle to disengage, and the pilot incorrectly assumed that the autothrottle would “wake up” and maintain the desired speed. As a result, Asiana 214 approached the runway low and slow while landing and crashed into a seawall.
This crash illustrated not only confusion attributable to the complexity of the automation, but also the degradation of the pilot’s skills to the extent that he was unable to complete a manual approach and landing on an 11,000-foot PAPI-equipped runway on a clear day with negligible wind. Asiana’s automation policy emphasized full use of all automation and discouraged manual flight during line operations.
In addition to automation confusion and reduction of skills, our investigation experience has shown that the challenge of complacency can arise as the automation performs more and more of the functions. The more the human operators become accustomed to automation performing reliably, the more difficult it becomes to keep them engaged.
Can too much automation not only generate complacency but also undermine professionalism? Many subway systems, which are largely automated, offer an example. In many systems the automation takes the train from the station, maintains appropriate speeds, maintains adequate distance from other trains, stops in the next station, and then opens the doors. The operator performs only one function – closing the doors.
When the operator’s only function is to close the doors because everything else is automatic, does that operator love his or her work and enjoy the pride of accomplishment, or is he or she there only to get a paycheck? If the paycheck is the primary objective, what does that do to professionalism? Unlike the problems that occur when the automation fails, this problem occurs when the automation is performing correctly.
And what happens to our human attentional capacity when the norm, for long periods of time, is a reduced demand on attention? Human factors experts say that our attentional capacity shrinks to accommodate a reduced workload. Whether this happens through automation or a more efficient but not automated workflow, it is particularly insidious because, like complacency and de-skilling, it also occurs when things are going right, not when things are going wrong.
Our investigations of automation-related accidents have revealed two extremes. On one hand, the human operator is the least predictable and most unreliable part of the system. On the other hand, the human operator is potentially the most adaptable part of the system when failures occur or unanticipated situations are encountered.
One flight operations example of the human as the most adaptable part of the system is Captain Chesley Sullenberger’s landing in the Hudson River when his airliner suddenly became a glider because both of its engines were taken out by birds.
He and his first officer had never been trained to glide an airliner; they had never been trained to land in the water; and they had never been trained to land without power. Despite that lack of training, they were able to save the day by quickly and calmly assessing the situation, determining that going to the Hudson was the best course of action, and executing the ditching successfully.
On the other hand, the Colgan crash near Buffalo, New York, in 2009, was a case of the human pilot as the most unreliable part of the system.
Due to the pilot’s inattention and lack of situational awareness, he placed the airplane in a situation that caused the stick shaker and stick pusher to activate, whereupon he responded inappropriately and caused an aerodynamic stall and crash. FAA records indicated that this pilot had previously received four certificate disapprovals, one of which he did not disclose to Colgan. Furthermore, Colgan’s training records indicated that while he was a first officer, he needed additional training after three separate checkrides.
In this instance, the pilot should never have been in the front of a transport airplane, but the filters that are intended to remove pilots such as him from the system failed.
Another textbook example of the human as the most vulnerable part of the system, also from flight operations, was Air France Flight 447 from Rio de Janeiro to Paris in 2009. In that case, however, the pilots were largely set up to fail.
After Air France 447 reached its cruise altitude of 37,000 feet at night over the Atlantic and began approaching distant thunderstorms, the captain left the cockpit for a scheduled rest break, giving control to two less experienced pilots.
Airspeed information is so important that, for redundancy, there were three pitot tubes to provide that information, and the pitot tubes were heated to ensure that they were not disabled by ice. At the ambient temperature of minus 50 to minus 60 degrees, and with abundant super-cooled water from the nearby thunderstorms, the pitot tube heaters were overwhelmed, and the pitot tubes became clogged with ice, so the airplane no longer knew how fast it was going.
The loss of airspeed information caused several systems to quit, including the automatic pilot that was flying the airplane and the automatic throttle that was maintaining the selected speed. As a result, the pilots suddenly had to fly the airplane manually. The loss of airspeed information also disabled the automatic protections that prevent the airplane from entering into an aerodynamic stall, in which the wings no longer produce lift. The pilots responded inappropriately to the loss of these systems, and the result was a crash into the ocean that was fatal to all 228 on board.
As with most accidents that we investigate, several factors played a role. To begin with, the redundancy of having three pitot tubes was not effective because all three were taken out by the same cause. In addition, the pilots had not experienced this type of failure before, even in training, where the problem can be simulated in very realistic simulators, and they were unable to figure out what happened.
The error messages that the pilots received did not help them determine what went wrong. Had the pilots known that the cause of the error messages was loss of airspeed information, they might have known to fly the old-fashioned way, using aircraft pitch angle and power.
Crew resource management also failed: The pilot flying did not tell the other pilot that he had pulled the stick back, commanding a climb, and the other pilot did not ask. The airplane was equipped with side sticks rather than yokes; and unlike yokes, which are physically connected, movement of one side stick does not cause the other side stick to move, so the other pilot did not know that the pilot flying was pulling back on the stick.
Finally, when airspace designers increased the number of available air lanes by reducing minimum vertical separation of opposite direction traffic from 2,000 feet to 1,000 feet, they were concerned that human pilots cannot maintain that separation reliably enough, so automatic pilots became mandatory at cruise altitudes. Because autopilots are mandatory, the pilots of Air France 447 had never flown manually at that altitude, even in the simulator, and they had never had any stall recognition or recovery training at that altitude. This is important because the airplane behaves very differently at cruise than it does at low altitudes, such as during takeoff and landing.
This crash also illustrates how automation can result in the loss of basic skills. While many of the new technologies currently being contemplated in air traffic control do not perform tasks that were previously handled by controllers, the workflow of controllers is likely to change either through automation or other adaptation to heavier air traffic, as the goals of these new technologies are met.
Increasing automation unquestionably has the potential to reduce the workload of the human operator. Moreover, when all is going well, automation brings unparalleled safety, reliability, productivity, and efficiency. The challenge is how to reap the benefits of automation while minimizing its potential downsides.
That’s where the NTSB can help. As with advanced technologies that NextGen is bringing, we can inform the process of increasing automation with information about accidents we have investigated, in all modes of transportation, in which an accident resulted wholly or partly from inadequate understanding of the human/automation interface.
You are embarking on a new journey of amazing possibilities. The technologies that you develop today will enable a generation of more efficient flight. If you spread a broad collaborative net, it will also enable a generation of safer flight. The NTSB stands ready to apply accident investigation experience to inform that process and aid that journey.
Thank you again for inviting me to speak today for the NTSB. We look forward to working with all of you to help address NextGen challenges.