Consistent panic using HP DL360 and DL380 G10 servers

Hiya,

Took this to -current but no bites. Hoping someone can point me in the right direction.

I'm currently evaluating two classes of server which we source through NEC; the motherboards for these machines are HP. I can routinely panic both of these machines using 12.0-A4, as well as 11.1-R with a shoehorned-in smartpqi driver and 11.2-R with its native smartpqi driver. NEC seems to think this is a ZFS issue, and they may be correct. If so, I suspect ARC, though as I explain further down, I haven't had a problem on other hardware.

I've managed to get a core dump on 11.1 and 11.2, but on 12.0, when the panic occurs, I can get a backtrace and force a dump, and the system claims it is writing out a core dump, but on reboot there is no core dump to be found.

Machine A: HP ProLiant DL360 Gen10 with a Xeon Bronze 3106, 16 GB of RAM, and three hard drives.

Machine B: HP ProLiant DL380 Gen10 with a Xeon Silver 4114, 32 GB of RAM, and five hard drives.

I install 12.0-A4 using ZFS on root. I install with 8 GB of swap, but otherwise it's a standard FreeBSD install. I can panic these machines rather easily within 10-15 minutes by firing up six instances of Bonnie++ and a few memtesters, three using 2 GB and three using 4 GB. I've done this on the 11.x installs without memtester and gotten panics within 10-15 minutes. Those gave me core dumps, but the panic message is different from the one with 12.0-A4. I have run some tests using UFS2 and did not manage to force a panic.
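For reference, the load looks roughly like this (the paths, user, and sizes below are just placeholders for what I used, so adjust to taste):

Code:
# six bonnie++ instances hammering the pool; the target directories must
# exist and be writable by the user given with -u
for i in 1 2 3 4 5 6; do
    bonnie++ -d /tank/stress$i -u nobody > /tmp/bonnie.$i.log 2>&1 &
done

# three 2 GB and three 4 GB memtester instances to add memory pressure;
# without a loop count they run until killed (or until the box panics)
for size in 2G 2G 2G 4G 4G 4G; do
    memtester $size > /dev/null 2>&1 &
done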

At first I thought the problem was the HPE RAID card, which uses the smartpqi driver, so I put in a recent LSI MegaRAID card using the mrsas driver, and I can panic with that as well. I've managed to panic Machine B while using either RAID card to create two mirrors and one hot spare, and I've also managed to panic it when letting the RAID cards pass the hard drives through so I could create a raidz of four drives plus one hot spare. I know many people immediately think "Don't use a RAID card with ZFS!", but I've done this for years without a problem using the LSI MegaRAID in a variety of configurations.
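For what it's worth, the pool layouts were along these lines (the device names below are only illustrative; the real ones depend on how the controller exposes the disks):

Code:
# behind the RAID card: two mirrors plus a hot spare
zpool create tank mirror da1 da2 mirror da3 da4 spare da5

# with the drives passed through: raidz of four disks plus a hot spare
zpool create tank raidz da1 da2 da3 da4 spare da5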

It really seems to me that the panic occurs when ARC starts to ramp up and there is a lot of memory contention. However, I've been running the same test on a previous-generation NEC server with an LSI MegaRAID using the mrsas driver under 11.2-R, and it has been running like clockwork for 11 days; we use that iteration of server extensively. If this were a problem with ARC, I assume (perhaps presumptuously) that I would see the same problems there. I also have servers running 11.2-R with ZFS and rather large, very heavily used JBOD arrays, and have never had an issue.
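If it would help rule ARC in or out, one experiment I haven't run yet is capping ARC well below physical RAM and repeating the test, along these lines (the 4 GB figure is only an example):

Code:
# cap ARC via /boot/loader.conf, then reboot before retesting
echo 'vfs.zfs.arc_max="4G"' >> /boot/loader.conf

# watch ARC size and free memory while the test runs
sysctl kstat.zfs.misc.arcstats.size vm.stats.vm.v_free_count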

The HPE RAID card info, from pciconf -lv:

Code:
smartpqi0@pci0:92:0:0:  class=0x010700 card=0x0654103c chip=0x028f9005 rev=0x01 hdr=0x00
    vendor     = 'Adaptec'
    device     = 'Smart Storage PQI 12G SAS/PCIe 3'
    class      = mass storage
    subclass   = SAS

And from dmesg:

Code:
root@hvm2d:~ # dmesg | grep smartpq
smartpqi0: <E208i-a SR Gen10> port 0x8000-0x80ff mem 0xe6c00000-0xe6c07fff at device 0.0 on pci9
smartpqi0: using MSI-X interrupts (40 vectors)
da0 at smartpqi0 bus 0 scbus0 target 0 lun 0
da1 at smartpqi0 bus 0 scbus0 target 1 lun 0
ses0 at smartpqi0 bus 0 scbus0 target 69 lun 0
pass3 at smartpqi0 bus 0 scbus0 target 1088 lun 0

However, since I can panic these with either RAID card, I don't suspect the HPE RAID card as the culprit.

Here is an image with the scant bt info I got from the last panic:

https://ibb.co/dzFOn9

This thread from Saturday on -stable sounded all too familiar:

https://lists.freebsd.org/pipermail/freebsd-stable/2018-September/089623.html

I'm at a loss, so I have gathered as much info as I can to anticipate questions and requests for more detail. I'm hoping someone can point me in the right direction for further troubleshooting, or at least help isolate the problem to a specific area.

Thanks for your time,

Dave
 
Hiya,

.... I put in a recent LSI MegaRAID card using the MRSAS driver, and can panic that as well.........

JBOD arrays and have never had an issue.

........
isolation of the problem to a specific area....
Well, does your LSI card (which model?) support real JBOD? If not, you could try flashing it to real IT mode and then test again.
For me, one of the advantages of ZFS over MegaCLI is that you don't have to wait "100 years" to initialize new HDDs in MegaCLI (which is like a stress test for your HBA). So mixing a MegaRAID setup with ZFS may be OK for some machines/HBAs and not OK for others (yours, right now). To isolate the problem, I would first look for a known-working real JBOD HBA. If that panics as well, you'll really know to look for other problems like RAM, HDDs, and so on.
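For example, something like this would show which controller you actually have and whether ZFS is seeing raw disks or controller-made volumes:

Code:
# identify the exact controller model and revision
pciconf -lv | grep -B 3 -i lsi

# list what CAM sees; passed-through disks show up with the bare drive
# model names, RAID volumes show up as the controller's virtual devices
camcontrol devlist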
 
Where is the core dump going? I believe the default is the first swap device. I ask because it looks like ZFS is causing the kernel to request more VM than it can satisfy. If the core dump goes to a zvol, I can't imagine that will have much success. You could try moving swap to a UFS partition and see if that at least gets you a dump file.
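Something along these lines would show where the dump is supposed to land and let you point it at a real partition (the da0p3 device below is only an example):

Code:
# which device is currently configured for crash dumps
sysctl kern.shutdown.dumpdevname
swapinfo

# direct dumps at a plain swap partition, persistently and right now
sysrc dumpdev="/dev/da0p3"
dumpon /dev/da0p3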
 
Ideally you want your swap on a dedicated swap partition, because then the kernel can write the crash dump even under very dire circumstances. Swap files are always iffy, no matter which filesystem is used.
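In /etc/fstab terms that's just a plain partition entry, with nothing but the kernel between the dump and the disk (device name is an example):

Code:
# /etc/fstab
/dev/da0p3    none    swap    sw    0    0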
 
I don't know if this is related: bug 230704 ("All the memory eaten away by ZFS 'solaris' malloc"). There's a proposed patch and a discussion on the freebsd-stable mailing list, but I couldn't find the link. It's a regression after 10.x affecting 11.1 and 11.2.
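If it is that bug, the kernel's 'solaris' malloc zone should be visibly ballooning while the test runs; something like this would show it:

Code:
# memory charged to ZFS's 'solaris' malloc type
vmstat -m | grep -i solaris

# ARC size and wired pages for comparison
sysctl kstat.zfs.misc.arcstats.size vm.stats.vm.v_wire_count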
 