A near-death experience
My FreeNAS-based NAS has had an interesting life so far, but today I thought it had finally bitten the dust!
After experiencing yet another power outage caused by the main fusebox tripping, my various servers appeared to come back up OK with fans whirring and lights blinking. However, a couple of hours later when I attempted to access some files on my NAS, I was greeted with a “can’t connect” type error. I tried accessing the FreeNAS web GUI – no response. Next I tried pinging the box from the shell – dead. So, as a last resort I connected up the monitor and a keyboard… and found the console full of errors relating to mount failures and unrecoverable errors.
To cut a long story short, it looks like the power failure corrupted the FreeNAS OS install on the flash drive (and after a bit of Googling it sounds like this happens more often than you would hope!). Given that my RAIDZ array lives on the 4 Samsung 2TB drives, separate from the OS install, I was hopeful that I would be able to reinstall FreeNAS on the flash drive and restore the previously configured ZFS volume. Unfortunately I’d not got round to creating a backup of the FreeNAS configuration (tut tut), so I would have to configure it all again by hand.
The phoenix rises from the ashes…
After plugging in a brand new 4GB flash drive and running the CD based installer again, I had a fresh vanilla FreeNAS install configured with the same network settings as previously. I was then able to access the web GUI and start to restore what I could remember of the previous configuration. First I created the users and groups required and then performed an auto-import of the ZFS volume – which worked flawlessly! Very nice.
After reconfiguring the missing CIFS and AFP shares (including the one required for my iMac Time Machine backups), enabling SSH and installing my own SSL and SSH keys I was more or less back to the state I was in before. Woo hoo!
So what have I learnt from this?
Well, several things really, including:
- I must get my NAS box connected to my APC UPS (the reason I haven’t so far is just laziness)
- I must make a backup of my FreeNAS configuration in case I need to do this again
- I have an even greater respect for the resilience of ZFS volumes
- I must get my house electrics sorted once and for all!
It’s been a couple of weeks since I built a home NAS using an HP Microserver N36L with 8GB RAM, FreeNAS 8.0.2-RELEASE and 4 x 2TB Samsung F4 hard drives configured as a RAIDZ2. Apart from a scary incident which resulted in an unexpected real-world test of RAIDZ2 resilience, the NAS has been pretty stable, although I’ve not been blown away by read/write performance over the network. I didn’t really want to get into fine-tuning ZFS this early, as I was hoping the out-of-the-box performance would be good enough, but it looks like I’m going to have to do a bit of investigation to understand why performance is not as good as I had hoped.
It’s worth mentioning that I was also experiencing regular incidents of the NAS dropping off the network and reappearing several seconds later. This was particularly noticeable when SSHing onto the box using PuTTY: the shell would stop responding and the connection would be terminated a few seconds later. At the same time the web GUI would also stop responding and any remote file shares would disappear.
Checking the FreeNAS logs didn’t show anything scary such as disk problems, so I Googled a bit and found many reports of problems with the on-board Broadcom-based NC107i embedded network controller on the HP Microserver N36L. Users report regular network disconnections and reconnections, and many have resorted to installing a separate quality NIC (such as an Intel PRO/1000 server or desktop card) in one of the PCIe slots. This sounded promising and I was all set to order a NIC when it dawned on me that I had been playing about with jumbo frames support on my various network devices. When I couldn’t get it to work reliably, I had forgotten to revert my Windows XP PC’s NIC settings back to the default MTU of 1500! As soon as I did this the NAS network connection was steady again, so I’ve delayed the purchase of a separate NIC… for now at least!
Testing network speed with iperf / jperf
Given the numerous reports of problems with the on-board NIC in the N36L, the first test I wanted to perform was a low level network test using iperf and its GUI front-end jperf. Luckily iperf is bundled with FreeNAS so it was simply a case of starting it in server mode using the command:
Then I fired up jperf on my iMac and ran a few basic tests…
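For anyone curious what a test like this boils down to, here is a rough Python sketch of the idea: one end sinks bytes, the other end sends as fast as it can for a fixed time and reports the rate. This is not iperf and not how iperf is implemented; the function names `run_sink` and `measure_throughput` are made up for illustration, and the demo runs over loopback, so it says nothing about a real NIC.

```python
import socket
import threading
import time


def run_sink(server_sock, result):
    """Accept one connection and count bytes until the sender closes it."""
    conn, _ = server_sock.accept()
    total = 0
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:
                break
            total += len(chunk)
    result["bytes"] = total


def measure_throughput(host, port, seconds=1.0):
    """Send zero-filled data to host:port for `seconds`; return (bytes_sent, Mb/s)."""
    payload = bytes(65536)
    sent = 0
    start = time.monotonic()
    with socket.create_connection((host, port)) as s:
        while time.monotonic() - start < seconds:
            s.sendall(payload)
            sent += len(payload)
    elapsed = time.monotonic() - start
    return sent, sent * 8 / elapsed / 1e6


if __name__ == "__main__":
    # Loopback demo: both ends on one machine, so this exercises the OS, not the NIC.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    result = {}
    sink = threading.Thread(target=run_sink, args=(server, result))
    sink.start()
    sent, mbps = measure_throughput("127.0.0.1", port, seconds=1.0)
    sink.join()
    server.close()
    print(f"sent {sent} bytes, received {result['bytes']} bytes, {mbps:.0f} Mb/s")
```

To try it between two machines you would run the sink half on one box and point `measure_throughput` at its address from the other, but for real measurements iperf is the better tool.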
The results were very positive! After several runs the average TCP transfer rate was around 910 Mb/s (or around 113 MB/s), which must be near the theoretical maximum throughput for a Gigabit network. These were not exhaustive tests, and they weren’t run for long or under sustained load, but the on-board network controller appears to be doing its job at least some of the time, so I don’t think that’s the main cause of poor performance.
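As a back-of-the-envelope check on those numbers, the sketch below converts Mb/s to MB/s and estimates the best-case TCP payload rate on Gigabit Ethernet. It rests on a few assumptions of mine rather than anything measured here: a standard 1500-byte MTU, no VLAN tag, 20-byte IP and TCP headers, and 12 bytes of TCP options (timestamps enabled). The function names are just for illustration.

```python
# Per-frame bytes on the wire that are not part of the IP packet:
# preamble + start delimiter (8), Ethernet header (14), FCS (4), inter-frame gap (12).
ETHERNET_OVERHEAD = 8 + 14 + 4 + 12
IP_HEADER = 20
TCP_HEADER = 20


def mbits_to_mbytes(mbits_per_s):
    """Convert megabits per second to megabytes per second (decimal units)."""
    return mbits_per_s / 8


def max_tcp_goodput(link_mbits=1000, mtu=1500, tcp_options=12):
    """Best-case TCP payload rate in Mb/s for a given link speed and MTU.

    tcp_options=12 assumes TCP timestamps are on; pass 0 if they are off.
    """
    payload = mtu - IP_HEADER - TCP_HEADER - tcp_options
    return link_mbits * payload / (mtu + ETHERNET_OVERHEAD)


if __name__ == "__main__":
    measured = 910  # Mb/s, the average reported by jperf above
    ceiling = max_tcp_goodput()
    print(f"measured: {mbits_to_mbytes(measured):.2f} MB/s")
    print(f"ceiling at MTU 1500: {ceiling:.2f} Mb/s")
    print(f"measured / ceiling: {measured / ceiling:.1%}")
```

Under those assumptions the ceiling works out at roughly 941 Mb/s, so 910 Mb/s is within a few percent of what the link can carry.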
So next I think I need to start drilling down into testing the raw hard drive IO performance and then maybe onto a bit of ZFS tuning. But that will have to wait until another post 🙂