March 19, 2019 737 Aircraft Software Technology

Get ready for a long post, folks, with embedded links.  For previous posts, please look in the 737 category.

 

Yesterday’s post gave some details about the 737 MAX issues.  Karl’s post yesterday goes into greater detail:

Let me note up front: I’m not a pilot.  I am, however, a software and hardware guy with a few decades of experience, including writing quite a lot of code that runs physical “things”, some of them being quite large, complex, expensive and, if something goes wrong, potentially dangerous.  Flight isn’t all that complex at its core; it’s simply a dance composed of lift, gravity, thrust and drag.  What makes it complex is the scale and physical limits we wish to approach or exceed (e.g. you want to go how fast, in air how thin, with a distance traveled of how far and with how many people on board along with you as well as with a bunch of other aircraft in the air at the same time?)

The sequence of circumstances that has left the 737MAX to arguably have the worst hull safety rating in the history of commercial jet aviation appears, from what I can figure out reading public sources, to have basically gone something like this:

  • The 737, a venerable design with literal millions of flight hours, a nice, predictable handling paradigm and an excellent safety record (the basic design of the hull is 50 years old!) was running into competition resulting from its older-series engines that bypassed less air (and thus are less efficient in their consumption of fuel.)  Boeing sought to correct this competitive disadvantage to keep selling new airplanes.
  • The means to correct the efficiency problem is to use newer, higher-bypass engines which, in order to obtain their materially lower fuel consumption, are physically larger in diameter.
  • The aircraft’s main landing gear has to fit in the space available.  To make the larger engines fit the landing gear has to be made longer (and thus larger, bigger and stronger) or the engines will hit the ground when taking off and landing.
  • With the larger engines mounted where the original design placed them, the longer landing gear would not fit in the space it has to occupy when retracted.
  • Boeing, instead of redesigning the hull including wings, tail and similar from the ground up for larger engines, which would have (1) taken quite a lot of time and (2) been very expensive, because (among other things) it would require a full, new-from-zero certification, decided to move the engines forward in their mounting point which allowed them to be moved upward as well, and thus the landing gear didn’t have to be as long, heavy and large — and will fit.
  • However, moving the engines upward and forward caused the handling of the aircraft to no longer be nice and predictable.  As the angle of attack (that is, the angle of the aircraft relative to the “wind” flowing over it) increased, the larger, more-forward and higher-mounted engines caused more lift to appear than expected.
  • To compensate for that, Boeing programmed a computer to look at the angle of attack of the aircraft and, transparently and without notice to the pilots, add negative trim as the angle of attack increased.
  • In other words, instead of fixing the hardware, which would have been very expensive since it would have required basically a whole new airplane be designed from scratch, it appears Boeing decided to put a band-aid on the issue in software and, by doing so, act like there was no problem at all when in fact it was simply covered up and made invisible to the person flying the plane by programming a computer to transparently hide it.
  • Boeing had gone to an “everything we can possibly stick on the list is an option at extra cost and we will lease that to you on an hours-run basis, you don’t buy it” model, exactly as has been done with engines and other parts including avionics in said aircraft; that shift is largely responsible for the rocket shot higher in the firm’s stock price over the last several years.  As a result, the standard configuration only included one angle-of-attack sensor.  A second one, and a warning in the cockpit that the two don’t agree, is an extra-cost option and was not required for certification! (Update: There is some question as to whether there is one or two, but it appears if there are two physically present the “standard” configuration only USES one at any given time.  Whether literally or effectively it appears the “standard” configuration has one.)
  • Most of the certification compliance testing and documentation is not done by the FAA any more.  It’s done by the company itself which “self-certifies” that everything is all wonderful, great, and has sufficient redundancy and protections to be safe to operate in the base, certified configuration.  In short there is no requirement that a third, non-conflicted and competent party look at everything in the design and sign off on it — and thus nobody did, and the plane was granted certification without requiring active redundancy in those sensors.
  • Said extra cost option and display was not on either the Lion Air or Ethiopian jets that crashed.  It is on the 737MAX jets being flown by US carriers, none of which have crashed.
  • It has been reported that the jackscrew, which as the name implies is a long screw, in this case the one that sets the trim angle of the horizontal stabilizer, has been recovered from the Ethiopian crash, is intact and was in the full down position.  No pilot in his right mind would intentionally command such a setting, especially close to the ground.  It is therefore fair to presume until demonstrated otherwise that the computer put the jackscrew in that position and not the pilot.
  • Given where the jackscrew was found, and that there is no reasonable explanation for the pilot having commanded it to be there, why is the computer allowed to put that sort of an extreme negative trim offset on the aircraft in the first place?  Is that sort of negative offset capability reasonable under the design criteria for the software “hack-around-the-aerodynamics” issue?  Has nobody at Boeing heard of a thing called a “limit switch”?
  • It has been reported from public information that both Lion Air and the Ethiopian jet had wild fluctuations in their rate of climb or descent and at the time they disappeared from tracking both were indicating significant rates of climb.  For obvious reasons you do not hit the ground if you have a positive rate of altitude change unless you hit a cumulogranite cloud (e.g. side of a mountain or similar), which is not what happened in either case.
  • The data for that public information on rate of climb or descent did not come from radar.  While I don’t have a definitive statement on the data source, public information makes clear it almost certainly came from a transponder found on most commercial airliners, known as ADS-B.  Said transponder is on the airplane itself.  Since you don’t hit the ground while gaining altitude, it’s obvious that the data in question was either crap, materially delayed, or indicating insanely wild fluctuations in the aircraft’s vertical speed (which no pilot would cause intentionally).  If the transponder was sending crap data that ground observers were able to receive, the obvious implication is that the rest of the aircraft’s instruments and computers were also getting crap data of some kind and were acting on it, leading to the crazy vertical speed profile.
  • The Lion Air plane that crashed several months ago is reported to have had in its log complaints of misbehavior consistent with this problem in the days before it crashed.  I have not seen reports that the Ethiopian aircraft had similar complaints logged.  Was this because it hadn’t happened previously to that specific aircraft or did the previous crews have the problem but not log it?
  • The copilot on the Ethiopian aircraft was reported to have had a grand total of 200 hours in the air.  I remind you that to get a private pilot’s license in the US to fly a little Cessna, by yourself, in good weather and without anyone on board compensating you in any way, you must log at least 40 hours.  Few people are good enough to pass, by the way, with only those 40 hours in the air; most students require more.  To get a bare commercial certificate (e.g. you can take someone in your aircraft who pays you something) you must have logged 250 hours in the US, with at least 100 of them as pilot-in-command and 50 of them cross-country.  The “first officer” on that flight didn’t even meet the requirements in the US to take a person in a Cessna 172 single-engine piston airplane for a 15 minute sightseeing flight!
  • The odds that the one pilot in the cockpit of the Ethiopian flight who actually was a commercial pilot under US rules had trained, via simulator time or other meaningful preparation, on the potential for this single-data-source failure of the aircraft and what would happen if it occurred (thus knowing how to recognize and take care of it) are likely zero.  The odds of the second putative first officer having done so are zero; he wasn’t even qualified to fly a single-engine piston aircraft for money under US rules.
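
The list above asks two engineering questions: why was there no cross-check between angle-of-attack sensors, and why was there no “limit switch” on how much trim the computer could add?  As a rough illustration only, here is a minimal Python sketch of both safeguards.  This is not Boeing’s code or logic; the function name, the data structure and every threshold are hypothetical values picked for the example.

```python
from dataclasses import dataclass

# All thresholds are hypothetical illustration values, not real aircraft parameters.
MAX_AOA_DISAGREE_DEG = 5.0   # assumed tolerance between the two sensors
AOA_TRIGGER_DEG = 12.0       # assumed angle of attack above which trim is added
MAX_AUTO_TRIM_UNITS = 2.5    # assumed cap on cumulative automatic trim (the "limit switch")
TRIM_STEP_UNITS = 0.5        # assumed trim added per activation


@dataclass
class TrimDecision:
    new_auto_trim: float   # cumulative automatic nose-down trim after this cycle
    disagree_alert: bool   # True when the sensors disagree and automation stands down


def decide_trim(aoa_left_deg: float, aoa_right_deg: float,
                current_auto_trim: float) -> TrimDecision:
    """Return the next cumulative automatic trim given two AoA readings."""
    # Cross-check: if the sensors disagree, trust neither, add nothing, alert the crew.
    if abs(aoa_left_deg - aoa_right_deg) > MAX_AOA_DISAGREE_DEG:
        return TrimDecision(current_auto_trim, True)

    # Sensors agree: use their average as the working angle of attack.
    aoa = (aoa_left_deg + aoa_right_deg) / 2.0
    if aoa <= AOA_TRIGGER_DEG:
        return TrimDecision(current_auto_trim, False)

    # Software limit: cumulative automatic trim can never exceed the cap.
    capped = min(current_auto_trim + TRIM_STEP_UNITS, MAX_AUTO_TRIM_UNITS)
    return TrimDecision(capped, False)
```

With these made-up numbers, a wildly wrong reading on one side (say 5 degrees versus 25 degrees) adds no trim at all and raises the alert, and repeated activations stop adding trim once the cap is reached, so the automation alone cannot run the trim to its mechanical stop.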

So there are some pilot issues with the Ethiopian crew’s training, at least in terms of the copilot.  However, that doesn’t change the fact that an engine design change without subsequent airframe changes left the aircraft with a large potential to stall itself.  The solution was a software package that was able to effectively say F-U to the pilots and do whatever it wanted.  The A330 had a similar issue: a hidden system designed to prevent stalling that would freak out if it got bad data from one of its two or three sensors.  Please read the linked article at the Market Ticker above from a guy who writes code.  The thinking that software can fix everything is a dangerous concept that is killing folks, and will continue to do so.  Remember, the automation is made by people who are not perfect.  “AI”, a term that is thrown around without any thought to what it means, is designed by flawed people.  It will never be perfect.
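
To make the “bad data from one sensor” point concrete, here is a small Python sketch of mid-value voting, a common way to keep one faulty sensor out of three from driving the result.  It is only an illustration under my own assumptions, not how any Airbus or Boeing system actually works; the function name and the 5 degree tolerance are hypothetical.

```python
from statistics import median
from typing import List, Optional, Tuple


def vote(readings_deg: List[float],
         tolerance_deg: float = 5.0) -> Tuple[Optional[float], List[int]]:
    """Pick the median reading and flag sensors that stray too far from it.

    Returns (value, suspects). value is None when fewer than two sensors
    agree with the median, meaning the data should not be acted on.
    The tolerance is a hypothetical illustration value.
    """
    mid = median(readings_deg)
    # Any sensor far from the median is treated as suspect.
    suspects = [i for i, r in enumerate(readings_deg)
                if abs(r - mid) > tolerance_deg]
    healthy = len(readings_deg) - len(suspects)
    if healthy < 2:
        # Not enough agreement: hand the problem to the humans.
        return None, suspects
    return mid, suspects
```

With three sensors, one reading of 74.5 degrees next to two readings near 5 degrees gets flagged and ignored.  With only two sensors that disagree, the function refuses to produce a value at all, which is the most a two-sensor setup can do; with a single sensor there is nothing to compare against in the first place.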

Karl has posted an article today that pretty much agrees with me: folks need to go to prison over this.  Boeing plainly knew MCAS was an attempt to fix a flawed airframe design.  This led to Boeing rushing the certification (which never should have passed the FAA) to get the 737 MAX out to compete with the A320.  The Seattle Times (which Karl linked to in the above linked post) has some shocking details, and IMO there’s no way folks at Boeing didn’t know that the design of the 737 MAX and the subsequent MCAS system was a dangerous combination that inevitably turned deadly.