Solved Kernel panic during boot of FreeBSD 12.2 with PRAID CP400i in JBOD mode

Hi!

I have to install FreeBSD (now 12.2) on two Fujitsu Primergy RX2520 servers (dual-CPU board), each with 1 x Intel Xeon Silver 4208 and a PRAID CP400i in JBOD mode. I usually copy a pre-installed system from a USB disk onto the new hardware, but I tried the installation USB too. The kernel boot failed with a panic:
Code:
...
mfisyspd0: mfi0: DJA NA XXX SYSPDIO

 SYSPD volume attached
Fatal trap 12: page fault while in kernel mode
cpuid=2; apic id=02
fault virtual address = 0x0
fault code = supervisor read data, page not present
...
processor flags = interrupt enabled, resume, IOPL = 0
current process = 13 (g_down)
trap number = 12
panic: page fault
...
The panic happens on both servers and I guess it might be some tricky BIOS setting. I've booted one of them with FreeBSD 12.1 installation USB to test if JBOD is working. That was several weeks ago and the system booted with no problems at all. Well, tomorrow I have to solve this somehow so any help would be highly appreciated.

Forgot to mention: I've updated the BIOS of one machine, but after the failure I left the other as it was supplied. The result is the same for both BIOS versions, factory and latest. Sorry, I had no FreeBSD 12.1 media to try to boot today.
 
An update:
After several hours of experiments I found that the kernel panics are related to the storage controller. The controller is a Fujitsu PRAID CP400i and supports JBOD mode:
Code:
mfi0 Adapter:
    Product Name: PRAID CP400i
   Serial Number: 0000000075062044
        Firmware: 24.21.0-0076
     RAID Levels: JBOD, RAID0, RAID1, RAID5, RAID10, RAID50
  Battery Backup: not present
           NVRAM: 32K
  Onboard Memory: 0M
  Minimum Stripe: 64K
  Maximum Stripe: 64K
Code:
# mfiutil show firmware
mfi0 Firmware Package Version: 24.21.0-0076
mfi0 Firmware Images:
Name  Version                          Date         Time      Status
BIOS  6.36.00.3_4.19.08.00_0x06180203  08/14/2018   10:56:09  active
HIIM  03.25.05.10                      Aug 14 2018  10:48:13  active
CTLR  5.19-0603                        Jun 05 2018  12:09:54  active
APP   4.680.01-8418                    Oct 03 2018  02:33:05  active
NVDT  3.1705.01-0012                   Aug 14 2018  09:01:52  active
BTBL  3.07.00.00-0003                  Jul 31 2015  14:47:18  active
Code:
# mfiutil show drives
mfi0 Physical Drives:
8 ( 1118G) JBOD <SEAGATE ST1200MM0009 SG02 serial=WFK7XY1J\000\000??@> SCSI-6 E1:S3
9 ( 1118G) JBOD <SEAGATE ST1200MM0009 SG02 serial=WFK7XX06\000\000??@> SCSI-6 E1:S2
10 ( 1118G) JBOD <SEAGATE ST1200MM0009 SG02 serial=WFK7WTBG\000\000??@> SCSI-6 E1:S0
11 ( 1118G) JBOD <SEAGATE ST1200MM0009 SG02 serial=WFK7XX16\000\000??@> SCSI-6 E1:S1
The problem occurs only when the drives are put in JBOD. If the drives are unconfigured or in any kind of RAID (e.g. each disk in RAID0) the system runs well. If JBOD mode is enabled and the drives are set as JBOD, the kernel panics. An interesting fact is that if the (hot-swappable) disks are removed before boot, the system starts and runs. When a drive is plugged back in, the kernel panics immediately. Whether the drive is empty or has partitions makes no difference.
I think this might be a bug in mfi(4), but I'm not sure enough to file a PR.

I can confirm that FreeBSD 12.0 always boots without problems. The JBOD disks appear as mfisyspd0, mfisyspd1, etc.
 
According to the documentation this is a rebranded card with an LSI (Broadcom/Avago) SAS 3008 chipset. You may want to try and switch to mrsas(4). Note that your disks will show up as da* instead of mfi* when mrsas(4) is used. But as this is a new system that probably isn't going to matter much.
 
Thank you very much! The switch is successful with FreeBSD 12.1 and I'll test it with 12.2 as soon as possible. The changed drive names are not a problem because my partitions are labeled. Most important is that sysutils/smartmontools can work with the drives now.
Do you think it would be a good idea to try to cross-flash the card with IT firmware? The card's BIOS shows "no configuration" in JBOD mode, which suggests that there is no abstraction in this mode, but I could be mistaken.
 
The switch is successful with FreeBSD 12.1 and I'll test it with 12.2 as soon as possible. The changed drive names are not a problem because my partitions are labeled. Most important is that sysutils/smartmontools can work with the drives now.
One downside of using mrsas(4) is that mfiutil(8) doesn't work any more. You can use sysutils/megacli though, that still works.

Do you think it would be a good idea to try to cross-flash the card with IT firmware?
Can't comment on that, I've used quite a few original LSI (Broadcom/Avago) cards but I never used one of the rebranded ones from HP, Dell or Fujitsu. There are others on this board that are much more knowledgeable than I on that subject.

The card's BIOS shows "no configuration" in JBOD mode, which suggests that there is no abstraction in this mode, but I could be mistaken.
I think that's correct. These drives are not configured, at least from the card's point of view. The card's firmware doesn't configure them and just passes them along to the OS.
 
Now I can confirm that with mrsas(4) FreeBSD 12.2 works very well with the Fujitsu PRAID CP400i in JBOD mode! Since JBOD gives direct access to each drive and sysutils/smartmontools is able to monitor it (and it displays a lot more information than for a regular SATA drive), I'm not sure I need mfiutil(8) at all. Switching to the correct driver makes the always tricky and risky reflashing of the board unnecessary.
Thanks again!

For the impatient: to switch to mrsas(4) you have to add the following to /boot/device.hints:
Code:
hw.mfi.mrsas_enable="1"
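
After the reboot it may be worth checking which driver actually claimed the controller. The following is only a minimal sketch: the helper name attached_raid_driver is made up for illustration, and it assumes the driver's boot messages start with the device name (mfi0: or mrsas0:) at the beginning of a line.

```shell
# Hypothetical helper: read boot messages on stdin and print the name of the
# driver (mrsas or mfi) whose device lines appear in them, one name per line.
# Lines such as "mfisyspd0: ..." are deliberately not counted as mfi.
attached_raid_driver() {
    grep -oE '^(mrsas|mfi)[0-9]+:' | sed -E 's/[0-9]+:$//' | sort -u
}

# Typical use on the running system:
#   dmesg | attached_raid_driver
# The loader tunable itself can be inspected with:
#   kenv hw.mfi.mrsas_enable
```

If the helper prints mrsas, the disks should show up as da* devices rather than mfisyspd*.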
 
Yow, this is a nasty boot-time panic. I'll add some detail and keywords that might lead others here. Thanks to SirDice for the easy workaround and to von_Gaden for the initial report.

I hit this problem during the upgrade from 12.1 to 12.2. The server hardware is a SuperMicro model 1028UX-CR-LL1. These came with the Broadcom 3108 SAS controller, running in JBOD mode with ZFS handling the error checking and redundancy. This is recognized as mfi0 through 12.1-RELEASE and booted fine. Beginning with the 12.2-RELEASE kernel, I get the panic.

This was tricky for me to even track down because the panic message only appears briefly before the screen clears as part of the reboot. Taking a video and playing it back frame by frame revealed the traceback, and I spotted that mfi_send_frame() appeared just before the first entries that looked like kernel panic handling for a page fault in kernel mode. That led me here.

I wanted to avoid editing /boot/device.hints because it is read-only. I got the same effect by just adding three lines to /boot/loader.conf:
Code:
# Work around boot panic in 12.2 kernel
hw.mfi.mrsas_enable="1"
mrsas_load="yes"
I had swap configured on each of the drives attached via the controller so I had to edit /etc/fstab and replace mfisyspd with da.
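
For anyone with many entries, the fstab edit can be previewed with sed before touching the real file. This is a rough sketch, not a tested recipe: the helper name rewrite_fstab_devices is invented here, and it assumes that mfisyspdN maps to daN with the same unit number, which is not guaranteed and should be checked against the da* devices that actually appear.

```shell
# Hypothetical helper: rewrite /dev/mfisyspdN device names to /dev/daN in
# fstab-style text read from stdin. Assumes unit numbers carry over unchanged;
# verify that against the real da* devices before editing /etc/fstab.
rewrite_fstab_devices() {
    sed -E 's#/dev/mfisyspd([0-9]+)#/dev/da\1#g'
}

# Preview the result without modifying anything:
#   rewrite_fstab_devices < /etc/fstab
```

Working on a preview first leaves /etc/fstab untouched until the new device names have been confirmed.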

Also, if you tripped on this in the middle of a system upgrade like I did, don't forget to run /usr/sbin/freebsd-update install to finish the install.
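
One way to notice an unfinished upgrade is to compare the installed kernel and userland versions reported by freebsd-version(1). The sketch below is only an illustration: both helper names are hypothetical, and the patch level is ignored on purpose because kernel and userland patch levels can legitimately differ.

```shell
# Hypothetical helper: strip a trailing patch level such as "-p3"
# from a version string like "12.2-RELEASE-p3".
base_release() {
    printf '%s\n' "${1%-p[0-9]*}"
}

# Hypothetical helper: succeed when both version strings share the same
# base release once patch levels are ignored.
same_base_release() {
    [ "$(base_release "$1")" = "$(base_release "$2")" ]
}

# Typical use on FreeBSD (-k is the installed kernel, -u the userland):
#   same_base_release "$(freebsd-version -k)" "$(freebsd-version -u)" \
#       || echo "kernel and userland differ; freebsd-update install may be pending"
```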
 