FreeBSD 10.3 ZFS / SAS errors on boot


OS: FreeBSD 10.3
SAS HBA: LSI 9201-16e
FAST-DISKS: Samsung 850 Pro MZ-7KE1TOBW 1TB

Hey All,

I'm having an issue with my FreeBSD build for our homebrew NAS solution here at the office. I recently built it out with just one Supermicro enclosure full of disks and everything went swimmingly: zpool created, replication and snapshotting all functional, boot time very quick.

Today more gear arrived and I installed the second enclosure full of disks. During this install I unfortunately broke the SAS chain while some writes were happening, and of course the disks were none too happy about that (none of this data is production, so it's actually not a huge deal yet). After getting everything cabled back up correctly I investigated the damaged zpool and decided to go for a reboot to see if it would come back up happier. Much to my dismay, on boot the server generates a lot of timeout errors on the console related to the disks.
Example errors are pasted below:

mps0: mpssas_ata_id_timeout checking ATA ID command 0xfffffe0000af4440 sc 0xfffffe0000aa8000
mps0: ATA ID command timeout cm 0xfffffe0000af4440
mpssas_get_sata_identify: request for page completed with error 0
mps0: Sleeping 3 seconds after SATA ID error to wait for spinup

These errors increment upward, and I can only assume one is generated for every one of the 90 disks in the system (perhaps even more than one per disk). There's an associated wait time between each error, so boot takes forever (37 minutes). After it finally boots up, everything appears to be functional and healthy. Midway through the boot I get another type of console message:

mps0: mpssas_add_device: failed to get disk type (SSD or HDD) for SATA device with handle 0x002a
mps0: mpssas_add_device: sending Target Reset for stuck SATA identify command (cm = 0xfffffe0000adf830)
        (noperiph:mps0:0:130:0): SMID 1 sending target reset
        (xpt0:mps0:0:130:ffffffff): SMID 1 recovery finished after target reset
mps0: Unfreezing devq for target ID 130

It seems to go through adding a fair few disks, identified by target IDs, after each one fails the identify command. At a certain point, though, it runs into this type of error:

mps0: _mapping_add_new_device: failed to add the device with handle 0x0038 to persistent table because there is no free space available.

It hits these errors in a big string amidst all the timeouts being generated; I'm not sure what to make of them. After that it goes back to the timeouts, with an occasional target reset, until the target IDs have incremented up from ID 130 (the first one) to around ID 154 (the last one).
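A side note on the "no free space available" message: it suggests the HBA's persistent device-mapping table, sized when only one enclosure was attached, has filled up now that a second enclosure's worth of devices is being mapped. A hedged sketch of how this could be inspected with LSI's sas2ircu utility (available from ports as sysutils/sas2ircu; verify the syntax against its own help output before relying on it):

```shell
# List the LSI controllers the utility can see
sas2ircu LIST

# Show controller, enclosure, and attached-device details for
# controller 0, including how devices are currently mapped
sas2ircu 0 DISPLAY
```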
After this point it grinds through a ton of errors that look like:

mps0: mpssas_add_device: failed to get disk type (SSD or HDD) for SATA device with handle 0x007a
mps0: mpssas_get_sata_identify: error reading SATA PASSTHRU; iocstatus =0x47

After generating a butt ton of those, it finally runs through the disks, turns on the ethernet ports, decrypts all the partitions and boots up to happiness.
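Once it's up, the driver state can at least be inspected. A sketch of what I'd look at, assuming the stock mps(4) driver; the exact tunable name (hw.mps.0.debug_level vs. dev.mps.0.debug_level) varies by release, so check mps(4) on your system before trusting these names:

```shell
# Dump the driver's sysctl tree for the first HBA (names per mps(4))
sysctl dev.mps.0

# Raise driver verbosity for the next boot so the timeouts are easier
# to attribute; debug_level bit values are documented in mps(4)
echo 'hw.mps.0.debug_level=0x3' >> /boot/loader.conf
```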

In summary: With only one enclosure in place, none of these disk errors occurred.
We currently have a production system built from identical parts, including two enclosures, so I know the configuration is theoretically possible. Once the system finally finishes booting (after 36 minutes), the zpools are healthy and all the disks come up fine in camcontrol. It only causes grief while the box is starting. Of course this is no good if we have to patch and reboot some time in the future; the thing shouldn't take this long to spin up.
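For completeness, the post-boot health checks mentioned above are just the standard FreeBSD commands:

```shell
# List every disk the CAM layer sees, with its bus/target/LUN mapping
camcontrol devlist

# Print status only for unhealthy pools; healthy pools report
# "all pools are healthy"
zpool status -x
```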

Any ideas?

Putting SATA disks behind SAS expanders is asking for trouble. At first glance it looks like the driver assumed your disks failed to respond because they weren't ready (spun up to operating RPM) yet, and retried after an annoyingly long delay. Your problem is amplified by the fact that the mps(4) driver probes for disks sequentially.

So I have an update to this. After screwing with it a bunch, reinstalling FreeBSD numerous times, flashing an HBA, compiling custom mps drivers, and generally being a nuisance I have come to another finding.

Our production system DOESN'T EVEN have a true SAS chain to begin with, so comparing this new system to that one was misleading, since they aren't configured the same way.
But this train of thought led me down a new rabbit hole...

I have attached some diagrams that show what I'm trying to do physically with the cabling; the unfortunate result is the errors I pasted in my OP. Please see below:

Soup is the currently functional configuration both in production and in my new system.
Sas CHAIN - Soup.vsdx.png

No Soup is what I originally thought the right thing to do was (because redundancy), but alas... no soup for me.

SAS CHAIN - No soup.png

If I set it up like the Soup configuration and then connect any of the remaining open ports to another one of them (essentially creating a loop that leads back to the SAS HBA) it starts spilling my original spinup errors all over the console. I believe that if I waited the 35 minutes after doing this it would end up being functional but that's obviously not the move.

If anybody knows why this SAS chain doesn't want to be a chain, I'd love to know more.

I'm not 100% sure, but I do believe you're not supposed to loop SAS. Skimming through the enclosure documentation, I believe it should be connected like this:

Not redundant, daisy chained:
HBAa p1 -> EncA p1
EncA p3 -> EncB p1
EncB p3 -> EncC p1, etc.

Or not redundant, star:
HBAa p1 -> EncA p1
HBAa p2 -> EncB p1

Redundant, daisy chained (needs two HBA cards):
HBAa p1 -> EncA p1
HBAb p1 -> EncA p2
EncA p3 -> EncB p1
EncA p4 -> EncB p2
EncB p3 -> EncC p1
EncB p4 -> EncC p2

Or redundant, star:
HBAa p1 -> EncA p1
HBAa p2 -> EncB p1
HBAb p1 -> EncA p2
HBAb p2 -> EncB p2

You may daisy chain SAS JBODs, but IIRC SAS1 was limited to 3 expanders deep. Those JBODs should contain two SAS expanders (one front and one back) with two upstream and two downstream ports each. There are three common ways to wire these JBODs:
  • Connect all four (2 x 2) upstream ports to the four external SFF-8088 ports. This gets you maximum performance per JBOD and multipath I/O but you can't daisy chain JBODs.
  • Daisy chain the two expanders inside the JBOD. This puts one expander behind the other. This setup still supports multipath I/O but reduces the available bandwidth by half.
  • Connect just one upstream and downstream port per expander. This configuration doesn't support multipath I/O.
FreeBSD supports multipath I/O at the GEOM layer with geom_multipath, but error handling is suboptimal because at the GEOM layer the kernel can't tell the difference between disk failure and link failure.
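If you do end up with a dual-path topology, the geom_multipath setup mentioned above looks roughly like this (a sketch; "disk01", da0, and da90 are placeholder names for the two CAM paths to the same physical disk on your system):

```shell
# Load the multipath GEOM class now, and at every subsequent boot
kldload geom_multipath
echo 'geom_multipath_load="YES"' >> /boot/loader.conf

# Write a multipath metadata label to the disk; both paths to the same
# physical disk are then collected under /dev/multipath/disk01
gmultipath label -v disk01 /dev/da0 /dev/da90

# Check which paths are active for each multipath provider
gmultipath status
```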