ZFS kernel panic issue

AndyUKG · Sep 1, 2010

Hi,

I have a server running i386 FreeBSD 8.1 p0 RELEASE. It has only 2GB RAM, and I have tuned it for ZFS with the following config in /boot/loader.conf:

Code:

vm.kmem_size_max="330M"
vm.kmem_size="330M"
vfs.zfs.arc_max="40M"
vfs.zfs.vdev.cache.size="5M"
siis_load="YES"

It has a single ZFS pool which is under no load, other than that every hour it has data replicated from another server, which is imported via a "zfs receive" command. It's been up for a couple of weeks but today it crashed.
In the hurry to get it rebooted (it's a production server) we didn't manage to get a screen grab of the console; however, these are the last entries in the messages file just prior to the kernel panic:

Code:

Sep  1 15:31:32 <kern.crit> nu kernel: siisch0: Timeout on slot 12
Sep  1 15:31:32 <kern.crit> nu kernel: siisch0: siis_timeout is 00040000 ss 7fffffff rs 7fffffff es 00000000 sts 80062000 serr 00000000
Sep  1 15:31:32 <kern.crit> nu kernel: siisch0:  ... waiting for slots 7fffefff
Sep  1 15:31:32 <kern.crit> nu kernel: siisch0: Timeout on slot 20
Sep  1 15:31:32 <kern.crit> nu kernel: siisch0: siis_timeout is 00040000 ss 7efffffd rs 7efffffd es 00000000 sts 80062000 serr 00000000
Sep  1 15:31:32 <kern.crit> nu kernel: siisch0:  ... waiting for slots 7eefeffd
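
A note on reading the masks in these messages: the numbers are consistent with a bitmap of command slots, where "waiting for slots" is the running-slot mask (rs) with the slots that have timed out so far cleared. That reading is an assumption based on the arithmetic alone, not on the driver source, and the function name below is made up for illustration. A minimal Python sketch:

```python
from typing import Iterable


def waiting_mask(running_slots: int, timed_out_slots: Iterable[int]) -> int:
    """Clear the bit of every timed-out slot from the running-slot bitmap.

    Assumption: bit N of the mask corresponds to command slot N.
    """
    mask = running_slots
    for slot in timed_out_slots:
        mask &= ~(1 << slot)
    return mask


# First message: rs 7fffffff, timeout on slot 12
print(f"{waiting_mask(0x7FFFFFFF, [12]):08x}")      # 7fffefff
# Second message: rs 7efffffd, slots 12 and 20 have now timed out
print(f"{waiting_mask(0x7EFFFFFD, [12, 20]):08x}")  # 7eefeffd
```

Both results match the "waiting for slots" values logged above, so the second message appears to carry over the slot 12 timeout from the first.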

It all looks like it may be a problem with the siis eSATA controller, but I think it's more likely to just be memory and ZFS related. Previously, when I first put the ZFS data onto this server, I had the loader.conf settings for kmem set to 512M and I could panic the server with similar errors every time I tried to import ZFS data via "zfs receive".
Does anyone have any comments on my config, or things I could change? Or is my only quick and easy fix going to be sticking more memory in the server? I've read that people can run ZFS without problems on a server with 1GB RAM or less, so I thought with 2GB and proper tuning I would be OK....

thanks for any comments, Andy.

graudeejs · Sep 1, 2010

You're running i386 right?
I was compiling OpenOffice packages on FreeBSD-8 i386 for all localizations (29h of intensive CPU work) and I only had 2GB RAM (I added a swap file just to be sure I didn't run out of RAM, because OOo compilation needs A LOT of RAM). Yet everything was fine.

The only thing I did on my PC was add this to the kernel config:

Code:

options     KVA_PAGES=384

This number must be divisible by 4.

This was the maximum KVA_PAGES I could set. Increasing it further caused a panic during boot.

I also added

Code:

vfs.zfs.arc_max=629145600	# 600 MB
vm.kmem_size=1153433600		# 1100 MB
vm.kmem_size_max=1153433600	# 1100 MB
vfs.zfs.vdev.cache.size=8388608	# 8 MB

to my /boot/loader.conf
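
For reference, the byte values above are just the MB figures from the comments multiplied by 1024 * 1024. A small Python sketch that reproduces them (the helper name and the dictionary are only for illustration):

```python
MIB = 1024 * 1024  # bytes in one "MB" as used in the comments above


def mib_to_bytes(mib: int) -> int:
    """Convert a size in MB (binary, 1024 * 1024 bytes) to bytes."""
    return mib * MIB


# Sizes in MB, taken from the comments in the loader.conf lines above.
settings = {
    "vfs.zfs.arc_max": 600,
    "vm.kmem_size": 1100,
    "vm.kmem_size_max": 1100,
    "vfs.zfs.vdev.cache.size": 8,
}

for name, mib in settings.items():
    print(f"{name}={mib_to_bytes(mib)}\t# {mib} MB")
```

This makes it easier to try other sizes without working out the byte counts by hand.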

With these settings I ran for quite some time until 512MB of RAM died. Later I replaced it with 2.5GB of new RAM.

Hope this helps

EDIT:
I got the numbers here from many experiments, until it just worked.

AndyUKG · Sep 1, 2010

Thanks for the info!
Yes, I'd considered setting KVA_PAGES higher, but decided that, as the default on the i386 arch is 256, which equates to 1GB and leaves 1GB for userland on a 2GB system, I'd just leave it as is. But now, having had this problem, I suppose I need to reconsider that. I think I can easily get another 2GB RAM, so I guess I'll add that and set KVA_PAGES to 512....
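
To put numbers on the KVA_PAGES values mentioned in this thread: taking 256 = 1GB as stated above, each unit is 4MB. Note that the split applies to the 4GB virtual address space of a 32-bit kernel rather than to installed RAM. A rough Python sketch, with made-up helper names and assuming a non-PAE i386 kernel:

```python
KVA_PAGE_UNIT = 4 * 1024 * 1024  # bytes per KVA_PAGES unit (256 units = 1 GB), non-PAE assumed
ADDRESS_SPACE = 4 * 1024 ** 3    # 32-bit virtual address space


def kva_split(kva_pages: int) -> tuple:
    """Return (kernel_bytes, user_bytes) of address space for a KVA_PAGES value."""
    if kva_pages % 4 != 0:
        # Constraint as stated earlier in this thread.
        raise ValueError("KVA_PAGES must be divisible by 4")
    kernel = kva_pages * KVA_PAGE_UNIT
    return kernel, ADDRESS_SPACE - kernel


for pages in (256, 384, 512):
    kernel, user = kva_split(pages)
    print(f"KVA_PAGES={pages}: kernel {kernel // 2**20} MB, user {user // 2**20} MB")
```

By this arithmetic a 1100 MB vm.kmem_size, as in the settings posted earlier, does not fit in the default 1024 MB of kernel address space, but does fit in the 1536 MB given by KVA_PAGES=384.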

thanks Andy.

olav · Sep 2, 2010

Once I recompiled the kernel with KVA_PAGES set to 512, ZFS was a lot more stable.

Terry_Kennedy · Sep 3, 2010

AndyUKG said:

Code:

Sep  1 15:31:32 <kern.crit> nu kernel: siisch0: Timeout on slot 12
Sep  1 15:31:32 <kern.crit> nu kernel: siisch0: siis_timeout is 00040000 ss 7fffffff rs 7fffffff es 00000000 sts 80062000 serr 00000000
Sep  1 15:31:32 <kern.crit> nu kernel: siisch0:  ... waiting for slots 7fffefff
Sep  1 15:31:32 <kern.crit> nu kernel: siisch0: Timeout on slot 20
Sep  1 15:31:32 <kern.crit> nu kernel: siisch0: siis_timeout is 00040000 ss 7efffffd rs 7efffffd es 00000000 sts 80062000 serr 00000000
Sep  1 15:31:32 <kern.crit> nu kernel: siisch0:  ... waiting for slots 7eefeffd

thanks for any comments, Andy.

That looks like a hardware or siis driver issue. Of course, the panic may either be completely unrelated to these log entries, or perhaps only triggered when a couple devices go missing at the same time.

Conversely, ZFS tends to stress different parts of a driver - as an example, see the recent 3Ware commit to fix a twa driver issue that only shows up when ZFS is used.

Any chance you got a good crash dump that will have the panic string and other useful information? Personally, I haven't had a useful crash dump since FreeBSD 6.3 - I either get a hang when writing the dump out, or a "Savecore: no core dump found" after the reboot. There are some tricks you can do to get the system to provide more useful info during a crash, but that will require a serial console and something to capture the output. Post a reply if you're interested in that.

AndyUKG · Sep 3, 2010

Hi, yeah, I'd tried getting a crash dump when I originally saw this issue a few weeks back, but unfortunately hit the issue you mentioned: it just hung at the point where it should be writing the dump to disk.
Console-wise, it's a Dell server with a DRAC, which gives you a web console, not a serial console. I'm not physically at the location, so I'm not in a position to attach anything else to it. From the original issue I noted down the info on the screen, which was as follows:

Code:

siisch0: Timeout on slot 20
siisch0: siis_timeout is 00040000 ss 7fffdfff rs 7fffdfff es 0000000 sts 801620
siisch0  ..waiting for slots 7fefdffd

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 06
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc0fb8d64 (always the same)
stack pointer = 0x28:0xc5265be8 (always the same)
frame pointer = 0x28:0xc5265c24 (always the same)
code segment = base 0x0, limit 0xfffff, type 0x1b
	     = DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 12 (irq256: siis0)
trap number = 12
panic: page fault
cpuid = 2
uptime 1d21h36m3s
cannot dump. device not defined or unavailable.
automatic reboot in 15 seconds - press a key on the console to abort
panic: bufwrite: buffer is not busy???
cpuid = 2
Uptime 1d21h36m4s
sleeping thread (tid 100028, pid 12) owns a non-sleepable lock

thanks Andy.

Terry_Kennedy · Sep 4, 2010

AndyUKG said:
Hi, yeah, I'd tried getting a crash dump when I originally saw this issue a few weeks back, but unfortunately hit the issue you mentioned: it just hung at the point where it should be writing the dump to disk.

Yup - one of the cases I've run into is a second panic while the original one is being processed (usually from something like a network packet interrupt on a different CPU).

AndyUKG said:
Console-wise, it's a Dell server with a DRAC, which gives you a web console, not a serial console. I'm not physically at the location, so I'm not in a position to attach anything else to it.

I have several different generations of DRAC card here as well. The problem is that the useful info often scrolls off the screen - particularly if you enable some of the debugging knobs as I mention below. If your DRAC offers SOL (Serial Over LAN), then that might be a way to go.

AndyUKG said:
From the original issue I noted down the info on the screen, which was as follows:

Code:

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 06
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc0fb8d64 (always the same)
stack pointer = 0x28:0xc5265be8 (always the same)
frame pointer = 0x28:0xc5265c24 (always the same)
[...]
cannot dump. device not defined or unavailable.
automatic reboot in 15 seconds - press a key on the console to abort

Since it is always in the same location, it seems probable that you're experiencing a software issue - memory problems and other flakey hardware tend to have random values.

It looks like you don't have a dump device. If you aren't running gmirror, a simple dumpdev="AUTO" in /etc/rc.conf should probably suffice. A # ls -l /dev/dumpdev should tell you where it is currently pointing.

If you can't get the system to do a successful crash dump, you could try building a new kernel with these options, which will at least give some useful info on the console:

Code:

options               DDB                     # For debugging bce crashes
options               KDB                     # For debugging bce crashes
options               KDB_TRACE               # For debugging bce crashes
options               KDB_UNATTENDED          # For debugging bce crashes

If you run into the "panic while panicking" problem that I did, you could try this patch I got from jhb@:

Code:

--- //depot/vendor/freebsd/src/sys/kern/kern_mutex.c    2010/01/23 15:55:14
+++ //depot/projects/smpng/sys/kern/kern_mutex.c        2010/03/10 22:33:24
@@ -348,6 +348,15 @@
    return;
  }
 
+ /*
+  * If we have already panic'd and this is the thread that called
+  * panic(), then don't block on any mutexes but silently succeed.
+  * Otherwise, the kernel will deadlock since the scheduler isn't
+  * going to run the thread that holds the lock we need.
+  */
+ if (panicstr != NULL && curthread->td_flags & TDF_INPANIC)
+   return;
+
  lock_profile_obtain_lock_failed(&m->lock_object,
        &contested, &waittime);
  if (LOCK_LOG_TEST(&m->lock_object, opts))
@@ -664,6 +673,15 @@
  }
 
  /*
+  * If we failed to unlock this lock and we are a thread that has
+  * called panic(), it may be due to the bypass in _mtx_lock_sleep()
+  * above.  In that case, just return and leave the lock alone to
+  * avoid changing the state.
+  */
+ if (panicstr != NULL && curthread->td_flags & TDF_INPANIC)
+   return;
+
+ /*
   * We have to lock the chain before the turnstile so this turnstile
   * can be removed from the hash list if it is empty.
   */

AndyUKG · Sep 6, 2010

Terry_Kennedy said:
Yup - one of the cases I've run into is a second panic while the original one is being processed (usually from something like a network packet interrupt on a different CPU).

I have several different generations of DRAC card here as well. The problem is that the useful info often scrolls off the screen - particularly if you enable some of the debugging knobs as I mention below. If your DRAC offers SOL (Serial Over LAN), then that might be a way to go.

I don't think I've got that option; I think Dell designs their DRACs for Windows users!

Terry_Kennedy said:
It looks like you don't have a dump device. If you aren't running gmirror, a simple dumpdev="AUTO" in /etc/rc.conf should probably suffice. A # ls -l /dev/dumpdev should tell you where it is currently pointing.

Yeah, I disabled it because it wasn't working.

Terry_Kennedy said:
If you can't get the system to do a successful crash dump, you could try building a new kernel with these options, which will at least give some useful info on the console:

Since it's a production system, I think the fastest route to getting it stable is replacing it, so I suspect I'm not going to get time to do any further testing. We will move the eSATA SIIS card and disks over to a new server, so if it's a hardware issue I will probably know about it.
Software-wise, I have a stable system running on the same hardware but with 4GB RAM and running FreeBSD amd64. This is what we will migrate the problem system to....

thanks Andy.

AndyUKG · Sep 9, 2010

OK, the SIIS eSATA card and disks have been moved to a brand new FreeBSD 8.1 amd64 server with 4GB memory, and it still has the same problem.
It is also not able to dump; it freezes at "Dumping xMBs..."

It seems the next stop is probably attempting to recompile the kernel with the patch shown above, if this is likely to help my crash dump issue...

thanks Andy.

AndyUKG · Sep 16, 2010

ZFS kernel panic issue (SOLVED)

For the record, this was in fact a driver bug, not a ZFS bug. One of the FreeBSD devs has provided a patch to the SIIS driver to fix this issue. The patch can be found here:

http://lists.freebsd.org/pipermail/freebsd-fs/2010-September/009438.html

thanks Andy.