ZFS boot sequence oddity

I have been testing a replacement chassis (an older Supermicro 1U, SATA2) for a 4x8TB (A,B,C,D) ZFS pool. The drives are mounted in hot-swap caddies. I have tested booting from all combinations of two drives in each of the 4 (0...3) drive slots. For example (slot#/HDDid): 3A,2-,1B,0- ; 3-,2A,1-,0B ; 3B,2C,1-,0- ; and so on. What I have discovered is that with the 3x,2-,1y,0- configuration the BIOS cannot find the OS and the system will not boot. This happens for any value of x and y.

If I simply move either of the two HDDs to another slot then the system boots normally. For example, given 3A,2-,1B,0- I can move 3A to 2A, or 0A, and the system boots. Likewise I can leave 3A in place and move 1B to 2B, or 0B, and the system boots. If I swap 3A with 1B the system will not boot. I can swap either or both of 3A and 1B for C or D and the system still cannot find the OS. With any other configuration of two drives the system boots normally.

The BIOS boot sequence is USB:HDD, IDE:CDROM, IDE:HDD0, IDE:HDD1, IDE:HDD2, IDE:HDD3. I have not tinkered with this sequence. I temporarily removed all the IDE:HDDs from the boot sequence, after which the system would not boot, as expected.

I am curious as to what might explain the effect of the slot3/slot1 configuration. Has anyone else run into something like this?
 
How far does the boot process get when it fails? What messages can you see? Are you using UEFI boot? If so do you have multiple EFI system partitions? If not have you installed boot code on all disks?
 
1. Boot gets to the point where it reports no OS found.

2. See above.

3. No UEFI

4. Yes, boot code is on all partitions.
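For anyone wanting to double-check point 4, here is a minimal dry-run sketch that only prints the gpart(8) commands for writing BIOS/GPT ZFS boot code to each disk. It assumes a GPT layout with a freebsd-boot partition at index 1 and disks named ada0 to ada3; neither is confirmed in this thread, so check with gpart show before running anything for real.

```shell
#!/bin/sh
# Dry run: print (do not execute) the bootcode command for each disk.
# ASSUMPTIONS: GPT scheme, freebsd-boot partition at index 1,
# disks named ada0..ada3. Adjust to match `gpart show` output.

bootcode_cmd() {
    # $1 = disk device name, $2 = index of the freebsd-boot partition
    printf 'gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i %s %s\n' "$2" "$1"
}

for disk in ada0 ada1 ada2 ada3; do
    bootcode_cmd "$disk" 1
done
```

Printing the commands first makes it easy to review them against the real partition layout before running them as root.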
 
It is interesting that the BIOS lists "IDE:HDDx". PATA (IDE) drives needed the master/slave explicitly set to work correctly and some of the first SATA drives also had a master/slave jumper. However, given their size, I expect your drives are modern and do not have a master/slave jumper. Have you updated to the latest available BIOS for the motherboard? Is there a way in the BIOS to switch the hard drives from IDE to AHCI mode?
 
Make sure to use labels and refer to those instead of the drive's actual device name. Because the disks (and controllers) move around, their device names can change (ada0 may become ada4, for example), whereas the labels will always remain the same.

Also keep in mind that additional controller cards may have their own boot ROM and its settings could take precedence over the machine's BIOS settings.

What SuperMicro mainboard does the machine have? As you refer to SATA disks but the controller is detected as IDE there may be a BIOS setting to switch between AHCI (preferred), IDE or RAID.
 
An afterthought: to save your sanity, it might be worth tracking down your motherboard's manual in case it specifies particular requirements for disk controller connections.
 
I will check the AHCI setting. The system I resurrected for this is quite old (c.2008), but I do have the manual for it. There is no BIOS update available from SuperMicro due to the system's age. I am loath to use any of the third-party BIOS updates for this system, but they do exist.
 
The system I resurrected for this is quite old (c.2008), but I do have the manual for it.
We would still like to know the board's model/type, so we know what we're dealing with.
 
Looking at the SuperMicro X7DBU user manual, page 4-4 (page 60 of the PDF), I believe you should adjust the BIOS settings. I would recommend:
  • Set Parallel ATA to disabled
  • Check Serial ATA is set to enabled (this is the default)
  • Set SATA Controller Mode to Enhanced
  • Set SATA AHCI to enabled.
Let me know how you get on. Once you get to the point where FreeBSD is booting, SirDice's recommendation of setting labels is worth noting. Guessing that you're using a GUID Partition Table (GPT), you can set labels with gpart(8). For example, if your disk A is /dev/ada0, you can set the label "DiskAPartition2" for the partition at /dev/ada0p2 with gpart modify -i 2 -l DiskAPartition2 /dev/ada0. When creating your ZFS pool you can then refer to it as /dev/gpt/DiskAPartition2 rather than /dev/ada0p2, which might change to /dev/ada1p2 if you put the disk caddy into a different slot.
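The labelling step above could be repeated for all four disks along these lines. This is only a dry-run sketch that prints the gpart commands. The disk-to-letter mapping (ada0 = A and so on), the partition index 2, and the DiskXPartition2 naming are assumptions carried over from the example, not facts about the actual pool.

```shell
#!/bin/sh
# Dry run: print (do not execute) one gpart label command per disk.
# ASSUMPTIONS: ada0..ada3 map to disks A..D, and the ZFS partition
# is index 2 on every disk. Verify with `gpart show` first.

label_cmd() {
    # $1 = disk device name, $2 = partition index, $3 = GPT label
    printf 'gpart modify -i %s -l %s /dev/%s\n' "$2" "$3" "$1"
}

for pair in ada0:DiskA ada1:DiskB ada2:DiskC ada3:DiskD; do
    disk=${pair%%:*}    # text before the colon
    name=${pair##*:}    # text after the colon
    label_cmd "$disk" 2 "${name}Partition2"
done
```

After running the real commands, gpart show -l lists the labels so you can confirm each one landed on the intended partition.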
 
Just to be clear: the system does boot. It boots if all four drives are inserted, or if only two drives are inserted, provided that the slots occupied are not 1 and 3. It could be that the BIOS will only boot from slots 0 and 2, in which case the contents of 1 and 3 are irrelevant so long as one of 0 or 2 is occupied.
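To check that this guess fits the test results, the six possible two-slot combinations can be enumerated. The sketch below encodes only the unconfirmed hypothesis that the BIOS tries slots 0 and 2 and nothing else; the boots function is a hypothetical model, not anything read from the BIOS.

```shell
#!/bin/sh
# Model of the HYPOTHESIS only: the BIOS boots from slot 0 or slot 2.

boots() {
    # $1, $2 = occupied slot numbers; succeeds if either is 0 or 2
    for s in "$1" "$2"; do
        case $s in
            0|2) return 0 ;;
        esac
    done
    return 1
}

# Enumerate each unordered pair of distinct slots once.
for a in 0 1 2 3; do
    for b in 0 1 2 3; do
        [ "$a" -lt "$b" ] || continue
        if boots "$a" "$b"; then result="boots"; else result="no OS found"; fi
        echo "slots $a+$b: $result"
    done
done
```

Under this model only the 1+3 pair fails, which matches what was observed, although other explanations would fit the same results.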

I will check the BIOS settings at my first opportunity and report.
 