I tried to simulate the failure of one of the disks:
# mfiutil fail 5
The system crashes.
Each disk is configured as a single-disk RAID0 volume.
The ZFS raidz2 pool is built from the resulting mfid devices.
This happens only under heavy load.
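For reference, the layout was put together roughly like this (the pool name and the device numbers here are placeholders, not the real ones):

# mfiutil show volumes
# zpool create tank raidz2 mfid0 mfid1 mfid2 mfid3 mfid4 mfid5 mfid6 mfid7
# zpool status tank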
Please explain "system crashes": is it a kernel panic, and if so, what extra information does it give?
Does the crash also occur when you "fail" a disk that is NOT part of the ZFS pool (for example your hot spare)?
Is it a SAS connection, and do you have multipath enabled?
Please note that running ZFS on top of hardware RAID is not considered good practice: http://serverfault.com/questions/189414/zfs-on-top-of-hardware-mirroring-or-just-mirror-in-zfs.
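To see which physical drive backs which volume before "failing" one, the mfi tools already give you the mapping:

# mfiutil show drives
# mfiutil show config

A drive that shows up in "show drives" but does not back any volume used by the pool (a hot spare, for instance) would be the one to try for the second question above.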
Again, info about the panic would be helpful. My hunch is that you are hitting a ZFS deadman panic. This occurs when ZFS finds it cannot flush its buffers within a reasonable amount of time (ca. 10 seconds); instead of making matters worse by letting the buffers fill up even further, ZFS "pulls the emergency handle". Your symptoms (high load, and the hardware RAID taking its time to reorganize at BIOS speed) point in that direction.
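If you want to confirm that theory: first make sure a crash dump is written on the next panic, then have a look at the ZFS deadman tunables (sysctl names as found on recent FreeBSD releases; the defaults may differ on your version):

# sysrc dumpdev=AUTO
# service dumpon start
# sysctl vfs.zfs.deadman_synctime_ms vfs.zfs.deadman_enabled

After the next crash, running crashinfo(8) against the dump in /var/crash should show whether the panic string mentions the deadman.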
I have seen some cases where switching to the newer mrsas(4) driver resolved instability. This post http://lists.dragonflybsd.org/pipermail/users/2014-July/128703.html and the mfi(4) man page can help you move to mrsas. Your card needs to be supported, but the hardware lists in both the mrsas and mfi man pages are incomplete; i.e. both drivers work with more cards than are shown there.
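A minimal sketch of the switch, assuming your card is picked up by mrsas (the loader tunable is the one documented in mfi(4)/mrsas(4); "tank" is a placeholder pool name):

# echo 'hw.mfi.mrsas_enable="1"' >> /boot/loader.conf
# shutdown -r now

After the reboot the disks will show up as daN instead of mfidN, so the pool may need one manual import:

# zpool import
# zpool import tank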
Also, your card might not support real JBOD (I have seen that on Dell PERCs), but the next best thing is defining a single, unmirrored volume for every disk. This will also prevent the hardware RAID card from going into trickle mode when a disk fails; a sketch follows below.
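With mfiutil that would look roughly like this (the drive numbers are only examples, take the real ones from "show drives"; per mfiutil(8) the jbod type creates a separate single-drive RAID0 volume for each drive listed):

# mfiutil show drives
# mfiutil create jbod 0 1 2 3 4 5 6 7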
There is the port sysutils/megacli, the vendor-provided attempt at a command-line interface, which has a -RemCdev option. I can't vouch for it though; I don't know how, or whether, it works.
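If you want to give it a try anyway, installing it is the easy part (package and port name as in the ports tree):

# pkg install megacli

or, from ports:

# cd /usr/ports/sysutils/megacli && make install clean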