Category Archive:Software

It is working out better than I anticipated. Right now the eldest has been installing all of her games on the G drive. I found out one of her games is nearly 200 gigs in size. Holy crap, Batman. She can easily chew through more than 400 gigs of storage just with the games she plays. Here are the ones that take up the most space:

  1. Skyrim
  2. Diablo III
  3. StarCraft II
  4. Lord of the Rings Online
  5. Elder Scrolls Online (this one is the nearly 225 gigabyte monster)

She informed me she has a ton of smaller games that she will be installing now that she has the space. She also asked what happens if she runs out of her 1 terabyte allocation. I told her it is a couple of mouse clicks to add more space. Let’s see if she chews through the whole terabyte. If she does, I have 3.2 terabytes waiting. :)

*Game Cache Update* One thing working in my favor is ZFS’s built-in compression. For VMs you can sometimes get 5-10x compression because VM images are mostly empty space. ZFS does the compression transparently. Right now, as part of the eldest’s game cache, her system thinks it has written about 74 gigabytes of data (the logicalused value below). ZFS compression has reduced that to 52.5G in the background. This is roughly a 1.41x reduction in size just from basic lz4 compression. Normally with my file types (movies, music, mainly stuff that is already compressed) I do not see any real compression. With her Steam apps the compression is much higher. It will be interesting to see if it goes up or down as she loads up the rest of her games.

NAME             PROPERTY                VALUE                  SOURCE
Data/Steamcache  type                    volume                 -
Data/Steamcache  creation                Tue Nov 19 17:11 2019  -
Data/Steamcache  used                    52.5G                  -
Data/Steamcache  available               4.17T                  -
Data/Steamcache  referenced              52.5G                  -
Data/Steamcache  compressratio           1.41x                  -
Data/Steamcache  reservation             none                   default
Data/Steamcache  volsize                 1.00T                  local
Data/Steamcache  volblocksize            128K                   -
Data/Steamcache  checksum                on                     default
Data/Steamcache  compression             lz4                    inherited from Data
Data/Steamcache  readonly                off                    default
Data/Steamcache  copies                  1                      inherited from Data
Data/Steamcache  refreservation          none                   default
Data/Steamcache  primarycache            all                    default
Data/Steamcache  secondarycache          all                    default
Data/Steamcache  usedbysnapshots         0                      -
Data/Steamcache  usedbydataset           52.5G                  -
Data/Steamcache  usedbychildren          0                      -
Data/Steamcache  usedbyrefreservation    0                      -
Data/Steamcache  logbias                 latency                default
Data/Steamcache  dedup                   off                    default
Data/Steamcache  mlslabel                                       -
Data/Steamcache  sync                    disabled               inherited from Data
Data/Steamcache  refcompressratio        1.41x                  -
Data/Steamcache  written                 52.5G                  -
Data/Steamcache  logicalused             74.0G                  -
Data/Steamcache  logicalreferenced       74.0G                  -
Data/Steamcache  volmode                 default                default
Data/Steamcache  snapshot_limit          none                   default
Data/Steamcache  snapshot_count          none                   default
Data/Steamcache  redundant_metadata      all                    default
Data/Steamcache  org.freebsd.ioc:active  yes                    inherited from Data
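
For anyone who wants to sanity-check the ratio by hand, here is a minimal Python sketch that divides the logical size by the size actually stored. The helper names (parse_size, compression_summary) are made up for this example, and it assumes the G/T suffixes are binary units. ZFS works out compressratio itself; this is only back-of-the-envelope arithmetic on the two numbers in the listing.

```python
def parse_size(text):
    """Convert a zfs-style size such as '52.5G' to bytes (assumes binary units)."""
    units = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}
    suffix = text[-1].upper()
    if suffix in units:
        return float(text[:-1]) * units[suffix]
    return float(text)


def compression_summary(logical, physical):
    """Return (ratio, percent saved) for a logical size vs. the size on disk."""
    logical_b = parse_size(logical)
    physical_b = parse_size(physical)
    ratio = logical_b / physical_b
    saved_pct = (1 - physical_b / logical_b) * 100
    return round(ratio, 2), round(saved_pct, 1)


# logicalused and used from the listing
ratio, saved_pct = compression_summary("74.0G", "52.5G")
print(f"{ratio}x compression, {saved_pct}% saved")
```

With the numbers from the listing this lands on the same 1.41x that ZFS reports, which works out to a bit under a third of the space saved.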

I made an earlier post about an experiment I am running. So far so good. The eldest is having to put her games onto the new G drive her computer sees. The magic of iSCSI makes it appear as a local hard drive even though it’s on a network server. I am a HUGE fan of iSCSI and I use it as much as I can, especially when the storage is Linux or UNIX.

I did notice that the transfer was maxing out at 650 megabit/second. I know that the machine can do better; it used to do 2 gigabits/second when it was a backup target. I wondered what had changed over the years, so I did a little bit of digging. ZFS is all about data safety. You have to be extremely determined to make it lose data for it to have a chance to do so. Sometimes that ultimate safety comes at the price of performance. I started looking at the numbers and noticed RAM (32 gigs) was not a problem. CPU usage was less than 20% max. The disks, however, were the bottleneck.

Well, it turns out that ZFS has a ZIL (ZFS Intent Log) that is always present. If there is no dedicated ZIL SSD (a separate log device, often called a SLOG) then it lives on the main drives. I thought that double (or in this case triple) writing to the drives was it, but nope, not there. I had to dig deeper into the actual disk I/O calls. It turns out that the default setting for synchronous writes defers to the application. If the application says you must write synchronously, ZFS will not report back that the write transaction was completed until it has written both of its copies and verified them on the array. Loosely translated into RAID terms, it would be a write-through. Since ZFS is a COW filesystem I am not concerned about data getting corrupted when written. It won’t (again, unless you have built it wrong, configured it wrong, something like that), although writes that are still only in RAM can be lost if the server crashes or loses power. So I found a setting (the sync property) and disabled the forcing of synchronous writes. I effectively turned my FreeNAS into a giant write-back caching drive.

Now the data gets dumped into the FreeNAS server’s RAM and the server says “I have it” and the client moves on to the next task, either another write request or something else. Once I did that, the disks went from topping out at 25% usage to nearly 50% usage and the data transfers maxed out the gigabit connection. That’s how it is supposed to be.
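
To make the write-through versus write-back difference concrete, here is a toy Python model. It is only a sketch of the acknowledgement behaviour described above, not how ZFS is implemented, and the class and function names (ToyStore, demo) are invented for the illustration.

```python
class ToyStore:
    """Toy model of a storage server; not how ZFS is actually implemented."""

    def __init__(self, sync_writes):
        self.sync_writes = sync_writes
        self.ram = []   # blocks acknowledged but not yet on disk
        self.disk = []  # blocks safely on stable storage

    def write(self, block):
        if self.sync_writes:
            # write-through: the block reaches disk before the ack goes out
            self.disk.append(block)
        else:
            # write-back: ack right away, the block sits in RAM for now
            self.ram.append(block)
        return "ack"

    def flush(self):
        """Move everything buffered in RAM onto disk."""
        self.disk.extend(self.ram)
        self.ram.clear()

    def crash(self):
        """Simulate a power loss; return how many acknowledged blocks vanished."""
        lost = len(self.ram)
        self.ram.clear()
        return lost


def demo(sync_writes):
    store = ToyStore(sync_writes)
    for block in ("a", "b", "c"):
        store.write(block)
    return store.crash()


print("lost with sync writes:", demo(True))
print("lost with async writes:", demo(False))
```

The trade is visible in the two counts: the async store answers faster because it never waits on the disk, but whatever was still in RAM at the moment of the crash is gone.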

There are times for forcing synchronous writes, like databases, financials, anything where the data MUST be verified as written before things are released. That’s when you can force synchronous writes and use a ZIL drive. This is (typically) an SSD that holds the writes as a non-volatile cache until the hard disks catch up. The ZIL drive grabs the data, verifies its integrity, and then tells the application the write has been accomplished (because it has), and the writes are then passed to the array as a sequential batch (something hard drives are much better at than random writes). What’s even nicer is that you can set the writing behavior per dataset or per zvol. The entire file system doesn’t have to be one or the other, and it doesn’t hurt the ZFS filesystem performance. More as I figure it out, with the ultimate question being: how do games perform when operated like this? Stay tuned.
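
As a rough sketch of the per-dataset idea, the Python below builds the `zfs set sync=...` command lines for a small policy table without running them. The helper name zfs_sync_command and the dataset Data/ledger are hypothetical; Data/Steamcache is the game cache zvol from the earlier listing (where the setting was actually inherited from the parent pool). The three sync values are the ones ZFS documents: standard, always and disabled.

```python
VALID_SYNC = {"standard", "always", "disabled"}


def zfs_sync_command(dataset, mode):
    """Build (but do not run) the argv for setting the sync property."""
    if mode not in VALID_SYNC:
        raise ValueError(f"unknown sync mode: {mode!r}")
    return ["zfs", "set", f"sync={mode}", dataset]


# Hypothetical policy: fast-and-loose for the game cache,
# forced synchronous writes for something that must not lose data.
policy = {
    "Data/Steamcache": "disabled",
    "Data/ledger": "always",  # made-up dataset name for illustration
}

commands = [zfs_sync_command(ds, mode) for ds, mode in policy.items()]
for cmd in commands:
    print(" ".join(cmd))
```

Keeping the commands as lists means they could be handed to subprocess.run on the NAS later, but printing them first is a safer way to review what would change.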

I came across an interesting use case for FreeNAS. My eldest daughter likes games that are huge. Like 100-250 gigabyte huge. I simply cannot afford to keep adding SSD storage to her machine. I will not do hard disks as main storage; under Windows 10 it’s too painfully slow. What Lawerence had done was take a FreeNAS machine, slice off a portion of the raw storage, and present it to the workstation as a hard drive over his network. His son now runs his large games from the FreeNAS zvol as if it were local. What’s neat is that the game’s initial load time is a bit slower (the NAS is hard drive based), but once it’s loaded there’s no perceptible difference in gaming performance despite a constant stream of data from the server, usually less than 150 megabit/sec. Since I have multiple terabytes of free space I am doing the same thing for my eldest. I am also doing what is called thin provisioning, so it initially starts at zero usage and goes up until she reaches her cap of 1 terabyte. Let’s see how this works, as my quad-core Xeon CPU is light years faster (with 4 times more RAM at 32 gigabytes) than his FreeNAS Mini dual-core Atom with 8 gigs of RAM. If this works, I have a new idea for future computer builds here at the house. <G>
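
Thin provisioning is easier to picture with a toy model. The Python sketch below is only an illustration of the bookkeeping (advertise a cap, consume space only as data is written, raise the cap later); the ThinVolume class and the sizes used are assumptions for the example, not anything FreeNAS exposes.

```python
class ThinVolume:
    """Toy thin-provisioned volume: advertises a cap, uses space only when written to."""

    def __init__(self, cap_gb):
        self.cap_gb = cap_gb   # size the client sees
        self.used_gb = 0.0     # space actually consumed on the pool

    def write(self, gb):
        if self.used_gb + gb > self.cap_gb:
            raise RuntimeError("volume full; grow the cap first")
        self.used_gb += gb

    def grow(self, extra_gb):
        """Raising the cap does not consume any pool space by itself."""
        self.cap_gb += extra_gb


vol = ThinVolume(cap_gb=1024)   # the 1 terabyte cap
vol.write(225)                  # one very large game
print(f"client sees {vol.cap_gb} GB, pool has given up {vol.used_gb} GB")
```

The point of the model is that the remaining pool space stays available to everything else until the games actually land on the volume.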

Get ready for a long post folks with embedded links.  For previous posts please look in the 737 category.

 

Yesterday’s post gave some details about the 737 MAX issues.  Karl’s post yesterday goes into greater detail:

Let me note up front — I’m not a pilot.  I am, however, a software and hardware guy with a few decades of experience, including writing quite a lot of code that runs physical “things”, some of them being quite large, complex, expensive and, if something goes wrong, potentially dangerous.  Flight isn’t all that complex at its core; it’s simply a dance comprised of lift, gravity, thrust and drag.  What makes it complex is the scale and physical limits we wish to approach or exceed (e.g. you want to go how fast, in air how thin, with a distance traveled of how far and with how many people on board along with you as well as with a bunch of other aircraft in the air at the same time?)

The sequence of circumstances that has left the 737MAX to arguably have the worst hull safety rating in the history of commercial jet aviation appears, from what I can figure out reading public sources, to have basically gone something like this:

  • The 737, a venerable design with literal millions of flight hours, a nice, predictable handling paradigm and an excellent safety record (the basic design of the hull is 50 years old!) was running into competition resulting from its older-series engines that bypassed less air (and thus are less efficient in their consumption of fuel.)  Boeing sought to correct this competitive disadvantage to keep selling new airplanes.
  • The means to correct the efficiency problem is to use newer, higher-bypass engines which, in order to obtain their materially lower fuel consumption, are physically larger in diameter.
  • The aircraft’s main landing gear has to fit in the space available.  To make the larger engines fit the landing gear has to be made longer (and thus larger, bigger and stronger) or the engines will hit the ground when taking off and landing.
  • The longer landing gear for where the original design specified the engines to go (but with the larger engines) would not fit in the place where it had to go when it was retracted.
  • Boeing, instead of redesigning the hull including wings, tail and similar from the ground up for larger engines, which would have (1) taken quite a lot of time and (2) been very expensive, because (among other things) it would require a full, new-from-zero certification, decided to move the engines forward in their mounting point which allowed them to be moved upward as well, and thus the landing gear didn’t have to be as long, heavy and large — and will fit.
  • However, moving the engines upward and forward caused the handling of the aircraft to no longer be nice and predictable.  As the angle of attack (that is, the angle of the aircraft relative to the “wind” flowing over it) increased the larger, more-forward and higher mounted engines caused more lift to appear than expected.
  • To compensate for that Boeing programmed a computer to look at the angle of attack of the aircraft and have the computer, without notice to the pilots and transparently add negative trim as the angle-of-attack increased.
  • In other words instead of fixing the hardware, which would have been very expensive since it would have required basically a whole new airplane be designed from scratch, it appears Boeing decided to put a band-aid on the issue in software and by doing so act like there was no problem at all when in fact it was simply covered up and made invisible to the person flying the plane by programming a computer to transparently hide it.
  • Because Boeing had gone to a “everything we can possibly stick on the list is an option at extra cost and we will lease that to you on an hours-run basis, you don’t buy it”, exactly as has been done with engines and other parts including avionics in said aircraft, said shift being largely responsible for the rocket shot higher in the firm’s stock price over the last several years, the standard configuration only included one angle-of-attack sensor.  A second one, and a warning in the cockpit that the two don’t agree is an extra cost option and was not required for certification! (Update: There is some question as to whether there is one or two, but it appears if there are two physically present the “standard” configuration only USES one at any given time.  Whether literally or effectively it appears the “standard” configuration has one.)
  • Most of the certification compliance testing and documentation is not done by the FAA any more.  It’s done by the company itself which “self-certifies” that everything is all wonderful, great, and has sufficient redundancy and protections to be safe to operate in the base, certified configuration.  In short there is no requirement that a third, non-conflicted and competent party look at everything in the design and sign off on it — and thus nobody did, and the plane was granted certification without requiring active redundancy in those sensors.
  • Said extra cost option and display was not on either the Lion Air or Ethiopian jets that crashed.  It is on the 737MAX jets being flown by US carriers, none of which have crashed.
  • It has been reported that the jackscrew, which as the name implies is a long screw that sets the trim angle on the elevator, has been recovered from the Ethiopian crash, is intact and was in the full down position.  No pilot in his right mind would intentionally command such a setting, especially close to the ground.  It is therefore fair to presume until demonstrated otherwise that the computer put the jackscrew in that position and not the pilot.
  • Given where the jackscrew was found, and that there is no reasonable explanation for the pilot having commanded it to be there, why is the computer allowed to put that sort of an extreme negative trim offset on the aircraft in the first place?  Is that sort of negative offset capability reasonable under the design criteria for the software “hack-around-the-aerodynamics” issue?  Has nobody at Boeing heard of a thing called a “limit switch”?
  • It has been reported from public information that both Lion Air and the Ethiopian jet had wild fluctuations in their rate of climb or descent and at the time they disappeared from tracking both were indicating significant rates of climb.  For obvious reasons you do not hit the ground if you have a positive rate of altitude change unless you hit a cumulogranite cloud (e.g. side of a mountain or similar), which is not what happened in either case.
  • The data source for that public information on rate of climb or descent did not come from radar; while I don’t have a definitive statement on the data source public information makes clear it almost-certainly came from a transponder found on most commercial airliners known as ADS-B.  Said transponder is on the airplane itself.  It’s obvious that the data in question was either crap, materially delayed or it was indicating insanely wild fluctuations in the aircraft’s vertical rate of speed (which no pilot would cause intentionally) since you don’t hit the ground while gaining altitude and if the transponder was sending crap data that ground observers were able to receive the obvious implication is that the rest of the aircraft’s instruments and computers were also getting crap data of some kind and were acting on it, leading to the crazy vertical speed profile.
  • The Lion Air plane that crashed several months ago is reported to have had in its log complaints of misbehavior consistent with this problem in the days before it crashed.  I have not seen reports that the Ethiopian aircraft had similar complaints logged.  Was this because it hadn’t happened previously to that specific aircraft or did the previous crews have the problem but not log it?
  • The copilot on the Ethiopian aircraft was reported to have had a grand total of 200 hours in the air.  I remind you that to get a private pilots license in the US to fly a little Cessna, by yourself, in good weather and without anyone on board compensating you in any way you must log at least 40 hours.  Few people are good enough to pass, by the way, with that 40 hours in the air; most students require more.  To get a bare commercial certificate (e.g. you can take someone in your aircraft who pays you something) you must have logged 250 hours in the US, with at least 100 of them as pilot-in-command and 50 of them cross-country.  The “first officer” on that flight didn’t even meet the requirements in the US to take a person in a Cessna 172 single-engine piston airplane for a 15 minute sightseeing flight!
  • The odds of the one pilot who actually was a commercial pilot under US rules in the cockpit of the Ethiopian flight having trained on the potential for this single-data-source failure of the aircraft and what would happen if it occurred (thus knowing how to recognize and take care of it) via simulator time or other meaningful preparation is likely zero.  The odds of the second putative flight officer having done so are zero; he wasn’t even qualified to fly a single-engine piston aircraft for money under US rules.
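
The two safeguards the list keeps coming back to, comparing two angle-of-attack sensors and putting a hard limit on how much trim the computer may add, can be sketched in a few lines of Python. This is purely a hypothetical illustration: it is not Boeing’s MCAS logic, the function name is invented, and every threshold and gain below is a made-up number chosen only to make the example run.

```python
def trim_command(aoa_left_deg, aoa_right_deg, *,
                 threshold_deg=12.0,      # made-up AoA at which trim starts
                 gain=0.5,                # made-up trim per degree of excess
                 max_trim_deg=2.5,        # made-up "limit switch" on authority
                 max_disagree_deg=5.0):   # made-up sensor disagreement limit
    """Return (nose_down_trim_deg, warning) for a toy two-sensor scheme."""
    # Cross-check: if the sensors disagree, do nothing and tell the crew.
    if abs(aoa_left_deg - aoa_right_deg) > max_disagree_deg:
        return 0.0, "AOA DISAGREE"
    aoa = (aoa_left_deg + aoa_right_deg) / 2
    excess = max(0.0, aoa - threshold_deg)
    # Clamp: the computer can never ask for more than max_trim_deg.
    trim = min(gain * excess, max_trim_deg)
    return trim, None


print(trim_command(5.0, 5.0))    # normal flight, no trim
print(trim_command(14.0, 14.0))  # both sensors agree on a high angle
print(trim_command(74.5, 15.0))  # one sensor is clearly feeding garbage
```

In this toy version a single bad sensor produces a warning and no trim at all, and even two agreeing sensors can only ever command a bounded amount.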

So there are some issues with the Ethiopian crew’s training, at least in terms of the copilot. However, that doesn’t change the fact that an engine design change without subsequent airframe changes meant the aircraft had a large potential to stall itself. The solution was a software package that was able to effectively say F-U to the pilots and do whatever it wanted. The A330 had a similar issue: a hidden system designed to prevent stalling that would freak out if it got bad data from one of the two or three sensors. Please read the linked article at the Market Ticker above from a guy who writes code. The thinking that software can fix everything is a dangerous concept that is killing folks, and will continue to do so. Remember, the automation is made by people who are not perfect. “AI”, which is a term that is thrown around without any thought to what it means, is designed by flawed people. It will never be perfect.

Karl has posted an article today that pretty much agrees with me: folks need to go to prison over this. Boeing plainly knew that MCAS was trying to fix a flawed airframe design. This led to Boeing rushing the certification (which never should have passed with the FAA), trying to get the 737 out to compete with the A320. The Seattle Times (which Karl linked to in the above linked post) has some shocking details, and IMO there’s no way folks at Boeing didn’t know the design of the 737 MAX and the subsequent MCAS system was a dangerous combination that inevitably turned deadly.

 

Qantas Flight 72 nearly crashed after one of its sensing computers got inconsistent data from its AOA (angle of attack) sensors. It seems that AOA sensors also cause a freak-out of the MCAS software system aboard the 737 MAX series. The reason the MAX has MCAS is the engines. The engines on the 737 MAX are larger, and because they are unable to fit under the wing they are not only further forward but also in a higher mounting position than on previous-gen 737s. Frankly, an aircraft that requires a hidden, pilot-overriding system to be certified by the FAA needs to be removed from the airspace and never allowed to return. The FAA seriously dropped the ball when it came to certifying this aircraft. That Boeing was allowed to do ANY self-certification is unconscionable. Anyone who knew of this issue should be held personally and corporately liable and suffer fines, loss of revenue and prison time. This is the only thing that will change the behavior that led to these disasters; not even hundreds of deaths will. Here are a couple of videos about the 737 MAX MCAS system: