DELL H755 RAID support

We have been using DELL server systems with FreeBSD since version 7. FreeBSD support for DELL hardware had always been great, up until recently. When DELL released their 14th Gen systems (PowerEdge R740), we found that there was NO support for the H740 RAID board, but we were able to replace the H740 with the H730 in those systems and all was fine.

Now DELL has released the 15th Gen systems with the H755 RAID interface. The H730 PCIe RAID board is also now EOL, which would not matter in any case, since the 12-bay RAID backplane connectivity would not support the PCIe H730.

I cannot find anything on the FreeBSD 13 HCL or forums about the H740/H755 RAID. When I try to boot FreeBSD 13 (BIOS or UEFI, tried both), it starts to load, so it is seeing the RAID1 boot drive pair, but then it fails at 'mount root' and does not show any available drives for 'tank'. ANY SUGGESTIONS?

Thanks for taking the time to read this...

Dale Kline
 
P.S., there DOES seem to be Linux support for this H755 RAID configuration, but we cannot go to a new operating system. After over 10 years, we are tied into FreeBSD.
 
P.P.S. I did try loading (ugh) Windows 10 on the system, and then Ubuntu 20. Both of these worked fine and accessed the other RAID0 drives, so I know the hardware is good, as long as it is not FreeBSD... :-(
 
I don't know about the H755 (but interested because looking at the R6515) but I've got two Dell R640s working fine with RAID-5 on H740 controllers e.g.
Code:
# MegaCli -AdpAllInfo -aAll
                                   
Adapter #0

==============================================================================
                    Versions
                ================
Product Name    : PERC H740P Mini
Serial No       : 0AV024L
FW Package Build: 51.13.1-3662

EDIT: I'm running 13.0-RELEASE on both R640s.
 
Dell R740 here with PERC H740P. Running FreeBSD 12.3 without any issues. H755 isn't listed in mrsas, but then again H740 is missing as well. You should definitely check the mailing lists.
Code:
# MegaCli -AdpAllInfo -aAll

Adapter #0

==============================================================================
                    Versions
                ================
Product Name    : PERC H740P Mini
Serial No       : 7CM019Q
FW Package Build: 51.14.0-3900
 
Did some checking on the DELL H730P and H740P. Both used the LSI SAS3108 chipset, so the RAID support should have been similar, if not the same. We were then using FreeBSD 10.3, which could explain why the H740P did not work for us in the PowerEdge R640 and R740xd systems.
The PowerEdge R450 and R750 use the newer LSI SAS3916 Tri-Mode (whatever that is?) ROC chipset, which is an entirely new architecture.
Windows 10 and Ubuntu installs (for testing hardware) both install and boot up fine on this system...
 
You might need to enable mrsas(4); it's possible that mfi(4) takes precedence but fails to attach to the card. Some LSI chipsets only work with mfi(4), some with both mfi(4) and mrsas(4), and the latest cards only work with mrsas(4).

Code:
     Using /boot/device.hints (as mentioned below), the user can provide a
     preference for the mrsas driver to detect a MR-Fusion card instead of the
     mfi(4) driver.

           hw.mfi.mrsas_enable="1"
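
On a machine that panics before it ever reaches a shell, the same tunable can be tried for a single boot from the loader prompt, without editing any file on the disk. A rough sketch, assuming the standard FreeBSD boot menu (where option 3 is normally "Escape to loader prompt") and the tunable name from the mrsas(4) excerpt above. Typed at the loader's OK prompt:

```
set hw.mfi.mrsas_enable=1
boot -v
```

The -v flag asks for a verbose boot, which should make it easier to see whether mfi(4), mrsas(4), or neither attaches to the controller before the crash. The setting only lasts for that one boot.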
 
Why not simply ignore the hardware RAID and use separate HDD/SSDs with ZFS? Most hardware RAID controllers have one main disadvantage - the controller has no backup. And if you are using e.g. RAID5 configuration and RAID controller fails, your server stops working until you find the same controller.
 
There is a 'non-RAID' configure setting in BIOS, but I am having a hard time figuring it out without wiping the boot drive in the process. :-( Fortunately, these are just 'dupes' of working system drives for testing. Time to try that again...
 
Nothing wrong with sticking to UFS on hardware RAID if that's what you've been using for all those years before.
 
UPDATE: Finally figured out the DELL RAID setup - no more Ctrl-R on boot... RAID configuration is now in BIOS (F2) setup under Device Configuration. I changed the boot drive from RAID0 to JBOD. Boot drive is FBSD 13.0 UEFI boot. (This 'same drive' boots in the previous model DELL PE R740xd with H730 RAID board, without problems. The new PE R450 uses the H755 with LSI SAS3916 Tri-Mode ROC chipset.)

No help. It starts OK, going through POST checks, finds one non-RAID drive, starts booting from Drive C:, shows the FreeBSD splash page, runs another five seconds, and crashes with the "Fatal trap 12 error while in kernel mode". "Fault code = supervisor read instruction, page not present" (picture attached)

I still believe this is a driver problem with the new H755 RAID board. It does not seem that anyone is working on drivers for this chipset. The system boots fine off a Windows 10 drive, and an Ubuntu 20 drive.

Any suggestions would be appreciated. We can't be the only DELL <> FreeBSD users out here? THANKS....
 

Attachments

  • PE R640 error print.jpg
The LSI 3916 chip is used in lots of other boards, sold by Lenovo (IBM) and Supermicro, plus obviously Broadcom's own boards. Those chips have been out for years. It seems implausible that nobody else has used those chips with FreeBSD, in general. On the other hand, for using exactly that Dell H755 board with release 13, you might be the first one.

But let me ask a question for a quick check: Have you tried booting FreeBSD on this machine from something else (like a USB stick or some other port that doesn't use the LSI chip)? That would verify that there isn't something more basic broken here. You can do that both with and without the Dell H755 board inserted, and see that you at least get that up and running.
 
There's lots of mixing up of Dell server codes and RAID controller codes in your messages.

Your text talks of the H755, but your screenshot says PE R640 - so with the H740?

I've got Dell R430s and R640s running H730 and H740 with FreeBSD 13.0 RELEASE using hard drives and SSDs, but all in RAID-1 or RAID-5 mode so haven't used JBOD.

I've not yet got a server with the new H755 - is that what you are really interested in, or something to do with the earlier generation(s) or both?

It could certainly be the case that no-one has written a FreeBSD driver for the newer hardware - I think LSI supplied the FreeBSD driver for the previous RAID generations and maybe they haven't for the H755 etc.
 
No help. It starts OK, going through POST checks, finds one non-RAID drive, starts booting from Drive C:, shows the FreeBSD splash page, runs another five seconds, and crashes with the "Fatal trap 12 error while in kernel mode". "Fault code = supervisor read instruction, page not present" (picture attached)
Did you enable the mrsas(4) driver by adding hw.mfi.mrsas_enable="1" to /boot/loader.conf?
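
One way to add that line without leaving duplicate or conflicting entries behind is sketched below. It assumes the disk can be edited on a machine where it still boots (for example the R740xd mentioned earlier). `set_loader_tunable` is a hypothetical helper, not a FreeBSD tool, and it does nothing more than a plain text rewrite of whatever file it is given.

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): set NAME="VALUE" in a
# loader.conf-style file, replacing any earlier assignment of NAME.
set_loader_tunable() {
    file=$1
    name=$2
    value=$3
    tmp=$(mktemp) || return 1
    # Keep every existing line that does not begin with the literal "NAME=".
    if [ -f "$file" ]; then
        awk -v n="$name=" 'index($0, n) != 1' "$file" > "$tmp"
    fi
    # Append the new assignment in loader.conf syntax.
    printf '%s="%s"\n' "$name" "$value" >> "$tmp"
    # Write back through the original path so owner and mode are preserved.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

As root this would be called as `set_loader_tunable /boot/loader.conf hw.mfi.mrsas_enable 1`. After a reboot, `kenv hw.mfi.mrsas_enable` should print the value the kernel actually received.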
 
This system that is failing is the PE R450 with H755 RAID - 15th Gen DELL just released - our first system at this 15th Gen. I saved that screen shot above with the wrong system name...

"I think LSI supplied the FreeBSD driver for the previous RAID generations and maybe they haven't for the H755 etc." > I really do believe this is the case. :-( How do we convince someone to look at it? DELL has notified us that the H730 is going EOL.

I have PE R730xd 13th Gen, and R740xd 14th Gen, both using the H730 RAID board. (We had trouble three years ago with the H740 RAID, which seems to have been resolved now.) These ran fine in the past with FBSD 10.3, and are running OK now with FBSD 13.0.

I tried running the FBSD 13.0 installer USB on the R450, and it crashes at the same place with exactly the same error.



Checking on that mrsas enable right now, but probably won't have that answer til tomorrow morning. It is now heading for 5:00pm.
 
I suspect that yes, the H755 hasn't got a driver yet but might be more a question for the mailing lists.

Looks like support was added to recognise the controller:


... but I don't know enough to find out if a driver exists yet (you could try 14.0?)

I was days away from ordering an R6515, so your questions about the H755 are very pertinent for me. Might have to see if I can get an R640 instead and stay on the previous generation until there's a definite answer on support for the new RAID controller.
 
This system that is failing is the PE R450 with H755 RAID ...
"I think LSI supplied the FreeBSD driver for the previous RAID generations and maybe they haven't for the H755 etc." ...


No, there is some confusion here. I suspect that a Dell H755 card is nothing but a LSI/Broadcom/Avago 3916 chip. Probably all Dell does is to make the PC board and the connector, buy the chip from Broadcom, minimally modify the firmware of the chip (probably mostly to change the model name and number to be "Dell H755"), and then ship it. That is how most HBA/RAID cards are done. The support question is 99% about FreeBSD support of the LSI 3916.

How do we convince someone to look at it?
FreeBSD is a free operating system, and support is mostly friendly people on the Internet giving you tips (like SirDice above). If you can find a concrete bug in FreeBSD, you could file a bug report ... but saying "it crashes but I don't know where and why" is not a concrete bug. I also suspect (without direct knowledge) that LSI/Broadcom funds the software engineer(s) who write the mrsas driver; if you were a large Broadcom customer, you could contact them for engineering support (BTDT, got the T-shirt). You could in theory contact Dell, but my educated guess is that they'll say "FreeBSD who? not our problem, we support Linux and Windows". And my educated guess is that if you contact Broadcom directly, they'll say "you're not a customer".

I tried running the FBSD 13.0 installer USB on the R450, and it crashes at the same place with exactly the same error.
Was that with or without the LSI-based HBA in the system? If it was without, then the H755 is not the problem. If it was with, then it is possible that the H755 is the cause of the problem, even when not booting from it. In that case, after checking for the mfi versus mrsas question, I would get hold of a kernel with debug symbols and look at the stack traceback when it crashes.

By the way, somewhat unrelated: I always thought that the standard RELEASE kernel has symbol names; when I get stack tracebacks, there are function names there. Or am I hallucinating?
 
Hi Dale,

I have several 15th generation Dell servers in production with that same controller running Windows Server 2019. We had some boot reliability problems as well and it turned out to be an early revision of firmware that shipped with the controller.

I used the embedded Dell Lifecycle Controller (available via F10 at POST time) to update the firmware to a more recent revision. This cured our boot reliability problem.
Windows is not FreeBSD but maybe the same issue applies more generally.

-Don
 
I have avoided the DELL brand ever since I replaced the HDD in a laptop. It was necessary to disassemble nearly all of the parts (including the keyboard and display) to reach the HDD.
 
I have avoided the DELL brand ever since I replaced the HDD in a laptop. It was necessary to disassemble nearly all of the parts (including the keyboard and display) to reach the HDD.
Consumer, off-the-shelf laptops aren't comparable to enterprise server-grade hardware. Those things are typically built like a tank and pretty much all components are easily replaced; you don't even need a screwdriver nowadays. In some cases components can even be replaced without powering down the system.
 
You are saying that DELL engineers for "consumer" laptops are not good but for servers are very good? I am not sure. Equivalent HP servers are better or at least do not have such problems with RAID card support, etc.
 