SATA detaching then reattaching on boot - weird wild stuff

I have a Ryzen motherboard (Gigabyte AX370 Gaming) that has 8 SATA ports and one M.2 SSD slot. Pretty cool, right? Perfect for a server. Start with a Ryzen 7 2600, add 32GB RAM, a 512GB M.2 NVMe SSD, an enterprise 1.92TB SATA SSD, and seven more SATA hard drives, and then watch it... ummm... not work at all.

Here is the dmesg log:
Code:
--snip booting up & bringing up the drives normally--
ada5 at ahcich5 bus 0 scbus5 target 0 lun 0
ada5: <HGST HDN724040ALE640 MJAOA5E0> ATA8-ACS SATA 3.x device
ada5: Serial Number PKXXXXXXX
ada5: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada5: Command Queueing enabled
ada5: 3815447MB (7814037168 512 byte sectors)
ada6 at ahcich6 bus 0 scbus6 target 0 lun 0
ada6: <WDC WD20EARS-00MVWB0 51.0AB51> ATA8-ACS SATA 2.x device
ada6: Serial Number WDXXXXXXX
ada6: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada6: Command Queueing enabled
ada6: 1907729MB (3907029168 512 byte sectors)
ada6: quirks=0x1<4K>
ada7 at ahcich7 bus 0 scbus7 target 0 lun 0
ada7: <HGST HDN724040ALE640 MJAOA5E0> ATA8-ACS SATA 3.x device
ada7: Serial Number PKXXXXXXX
ada7: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada7: Command Queueing enabled
ada7: 3815447MB (7814037168 512 byte sectors)
-- snip everything's going good - maybe too good - it's bringing up the USB, Ethernet, we just booted yay!---
-- snip oh but we're not done yet muhahahah!!! ---
ada6 at ahcich6 bus 0 scbus6 target 0 lun 0
ada6: <WDC WD20EARS-00MVWB0 51.0AB51> s/n WD-WMAZA1164646 detached
(ada6:ahcich6:0:0:0): Periph destroyed
ada6 at ahcich6 bus 0 scbus6 target 0 lun 0
ada6: <WDC WD20EARS-00MVWB0 51.0AB51> ATA8-ACS SATA 2.x device
ada6: Serial Number WD-WMAZA1164646
ada6: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada6: Command Queueing enabled
ada6: 1907729MB (3907029168 512 byte sectors)
ada6: quirks=0x1<4K>
So as you can see this poor peripheral is getting killed and then brought back to life, like a best friend in a zombie movie. How does ZFS get along with zombies? Not great.
Code:
% zpool status

    NAME                      STATE     READ WRITE CKSUM
    Argus                     DEGRADED     0     0     0
      raidz1-0                DEGRADED     0     0     0
        gpt/Argus1            ONLINE       0     0     0
        gpt/Argus2            ONLINE       0     0     0
        gpt/Argus3            ONLINE       0     0     0
        12370386899973975005  REMOVED      0     0     0  was /dev/gpt/Argus4

errors: No known data errors
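
To check whether ada6 is the only device cycling like this, the dmesg output could be scanned for a "detached" line followed by a fresh attach line for the same device. The following is only a minimal sketch, assuming the message format shown in the log above; the function name `find_flapping` and the `SAMPLE` text are made up for illustration and are not part of any FreeBSD tool.

```python
import re

# Matches lines such as:
#   ada6: <WDC WD20EARS-00MVWB0 51.0AB51> s/n WD-WMAZA1164646 detached
DETACH_RE = re.compile(r"^(\w+): <[^>]*> s/n \S+ detached$")
# Matches lines such as:
#   ada6 at ahcich6 bus 0 scbus6 target 0 lun 0
ATTACH_RE = re.compile(r"^(\w+) at \w+ bus \d+ scbus\d+ target \d+ lun \d+$")


def find_flapping(dmesg_text):
    """Return {device: count} for devices that detached and later reattached."""
    pending = set()  # devices seen detaching that have not reattached yet
    flaps = {}
    for raw in dmesg_text.splitlines():
        line = raw.strip()
        match = DETACH_RE.match(line)
        if match:
            pending.add(match.group(1))
            continue
        match = ATTACH_RE.match(line)
        # An attach line only counts if the same device detached earlier.
        if match and match.group(1) in pending:
            pending.discard(match.group(1))
            flaps[match.group(1)] = flaps.get(match.group(1), 0) + 1
    return flaps


# Illustrative input, copied from the log excerpt in this post.
SAMPLE = """\
ada6 at ahcich6 bus 0 scbus6 target 0 lun 0
ada6: <WDC WD20EARS-00MVWB0 51.0AB51> ATA8-ACS SATA 2.x device
ada6 at ahcich6 bus 0 scbus6 target 0 lun 0
ada6: <WDC WD20EARS-00MVWB0 51.0AB51> s/n WD-WMAZA1164646 detached
(ada6:ahcich6:0:0:0): Periph destroyed
ada6 at ahcich6 bus 0 scbus6 target 0 lun 0
ada6: <WDC WD20EARS-00MVWB0 51.0AB51> ATA8-ACS SATA 2.x device
"""

print(find_flapping(SAMPLE))  # {'ada6': 1}
```

The "at ahcich6" line printed just before the detach message is ignored on purpose, since the kernel prints it as part of the detach notice; only an attach line that follows a detach is counted.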

Some more background info:
1. All the drives appear in the BIOS.
2. The drives worked fine in my old server and were exported / imported without any issues.
3. I have successfully used this many drives (and more!) using an LSI SAS PCIe controller in another FreeBSD system.
4. I dimly remember having problems when I hit drive #9 in another system, although I was using a Marvell PCIe card that time.
5. I have a 500W power supply.
6. Argus is the giant with 100 eyes from Greek mythology.

My thoughts:
1. SATA timeouts? (i.e. do I need to give FreeBSD more time to detect the drives?) And yeah, that's a 2TB WD Green, the Rip Van Winkle of SATA drives. And ataidle was deprecated.
2. Motherboard disabling the SATA port? But it is only supposed to do that if I have a SATA SSD in the slot, according to the manual. And this isn't exactly disabled.
3. If you help me fix this problem, will I just stuff more hard drives into my case until I have another problem? I mean there's still a lot of room in there...

Thanks for any and all help!
 
Installation Notices for the M2F_32G and SATA Connectors:
Due to the limited number of lanes provided by the Chipset, the availability of the SATA connectors may be affected by the type of devices installed in the M2F_32G connector. The M2F_32G connector shares bandwidth with the SATA3 7 connector.

 
I read that too, but like I said, there's the little table that says all of the SATA ports should be available, and they do show up in the BIOS.

I'm leaning now towards thinking it's the WD Green drive timing out. Maybe it just doesn't like the SATA controller on this new board.

Is there still a way to set ataidle at boot? I remember once doing something like the following. Is the same thing possible with camcontrol?

ataidle_enable="YES"
ataidle_devices="ada6"
ataidle_ada6="-P 0"
 
How are these eight drives wired for power? I would not trust cheap Chinese Y-cables (I have had lots of trouble with these); I would rather rely on a rugged, industrial-grade solution, like this one.

I have found WDC drives in particular, while respectably fast, to be extremely sensitive to slight fluctuations in the power wiring (which happen when multiple drives draw from a wire where there is some oxidation on the connector).
 
You may be onto something here. I will move around the power connectors and see if that makes a difference. Unfortunately I'm basically stuck with splitters for at least a couple of connections - my PSU only has five SATA power connectors.
 