I made an earlier post about an experiment I am running. So far so good. The eldest is having to put her games onto the new G: drive her computer sees. The magic of iSCSI makes it appear as a local hard drive even though it’s on a network server. I am a HUGE fan of iSCSI and I use it as much as I can…especially when the storage is Linux or UNIX.

I did notice that the transfer was maxing out at 650 megabits/second…I know that the machine can do better..it used to do 2 gigabits/second when it was a backup target. What has changed throughout the years? I did a little bit of digging. ZFS is all about data safety. You have to be extremely determined to make it lose data for it to have a chance of doing so. Sometimes that ultimate safety comes at the price of performance.

I started looking at the numbers and noticed RAM (32 gigs) was not a problem. CPU usage was less than 20% max. The disks, however, were maxed out. Well, it turns out that ZFS has a ZIL (ZFS Intent Log) that is always present. If there is no dedicated ZIL SSD, then it lives on the main drives. I thought that double (or in this case triple) writing to the drives was the culprit…but nope..not there. I had to dig deeper, into the actual disk I/O calls. It turns out that the default setting for synchronous writes defers to the application. If the application says you must write synchronously, ZFS will not report back that the write transaction completed until it makes both of its copies and verifies them on the array. Loosely translated into RAID terms, that would be a write-through. Since ZFS is a COW (copy-on-write) filesystem, I am not concerned about data getting corrupted when written..it won’t (again, unless you have built it wrong, configured it wrong…something like that)…so I found a setting and disabled the forcing of synchronous writes. I effectively turned my FreeNAS into a giant write-back caching drive.
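The change above boils down to a single ZFS property. A minimal sketch of what that looks like from the FreeNAS shell, assuming a pool named tank with the games zvol at tank/games (those names are my placeholders, not from the actual box):

```shell
# Check the current sync behavior. The default is "standard",
# which honors the application's request for synchronous writes.
zfs get sync tank/games

# Acknowledge writes as soon as they land in RAM and flush them
# to disk with the next transaction group -- write-back behavior.
zfs set sync=disabled tank/games
```

Setting `sync=standard` again restores the default write-through behavior for applications that request it.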
Now the data gets dumped into the FreeNAS server’s RAM, the server says “I have it,” and the client moves on to the next task..either another write request or something else. Once I did that, the disks went from maxing out at 25% usage to nearly 50% usage, and the data transfers maxed out the gigabit connection. That’s how it is supposed to be.
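If you want to watch the disks while a transfer is running, ZFS has a built-in way to see per-device utilization. Again assuming a pool named tank (my placeholder):

```shell
# Print per-vdev bandwidth and operations every 5 seconds.
# Useful for spotting whether the disks or the network link
# is the bottleneck during a big transfer.
zpool iostat -v tank 5
```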
There are times for forcing synchronous writes…like databases, financials….anything where the data MUST be verified as written before things are released. That’s when you can force synchronous writes and use a dedicated ZIL drive. This is typically an SSD that holds the writes as a non-volatile cache until the hard disks catch up. The ZIL grabs the data, verifies its integrity, tells the application the write has been accomplished (because it has), and then passes those writes to the array as sequential writes (something hard drives are much better at than random writes). What’s even nicer is that you can set the writing behavior per dataset or per zvol. The entire filesystem doesn’t have to be one or the other, and it doesn’t hurt overall ZFS performance. More as I figure it out, with the ultimate question being…how do games perform when operated like this…stay tuned.
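The per-dataset behavior described above is the same `sync` property applied at a different level, plus one pool command for the dedicated ZIL device. A sketch, assuming a pool named tank with hypothetical datasets tank/db and tank/games and a spare SSD at /dev/ada3 (all names are mine, for illustration):

```shell
# Force every write to the database dataset to be synchronous,
# regardless of what the application asks for.
zfs set sync=always tank/db

# Leave the games zvol in write-back mode.
zfs set sync=disabled tank/games

# Add a dedicated SSD as the ZIL (log) device, so synchronous
# writes land on flash instead of the main array.
zpool add tank log /dev/ada3
```

Each dataset keeps its own setting, so the database gets its write-through guarantee while the games stay fast.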