I've put together a new FreeBSD (14.3-RELEASE-p2) machine to use as (among other things) a backup server--it's an i5-3570K with two mirrored SSDs attached to the on-board SATA, and 12 drives connected to a pair of LSI adapters (a 9211-8i and a 9210-8i, both using the mps driver). The drives are in a ZFS pool which is (or will be) used exclusively as a backup destination, and will be imported prior to each backup and exported afterward (so it's used no more than once a night--I haven't settled on the backup frequency yet).
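For context, the nightly cycle I have in mind is roughly the following (the pool name "backup" and the backup step itself are placeholders--I haven't settled on the tooling yet):

    #!/bin/sh
    # Rough sketch of the planned nightly cycle; pool name is a placeholder.
    zpool import backup || exit 1    # attaching the pool wakes the 12 drives
    # ... run the backup job against the pool here ...
    zpool export backup              # detach so the drives can idle back down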
From cold boot the system works great. I've tested with the 12 drives connected and disconnected; I get about 6 repetitions of "root mount waiting for: CAM" when they're disconnected, and about 10-15 when they're connected. The drives are not the fastest to initialize, so this seems reasonable.
The drives have EPC power management and use the idle_a (0 sec), idle_b (120 sec), idle_c (600 sec), and standby_z (900 sec) states. I don't want to turn off power management on drives that are used no more than once a day (for no more than an hour or two)--the savings are significant, especially if the drives reach the standby_z state.
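For reference, I've been inspecting the EPC timers and power states with camcontrol (da0 here stands in for each of the twelve drives):

    camcontrol epc da0 -c list        # dump the drive's EPC settings and timers
    camcontrol epc da0 -c status -P   # report only the current power state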
Unfortunately that standby_z state is an issue. As long as the drives are in one of the idle states (a/b/c), rebooting the server is fine. It spends more time in the "root mount waiting for: CAM" loop the deeper the idle state, but it eventually brings the drives up and continues the boot. However, if the drives have reached standby_z, it hangs indefinitely at that point, and the only solution is to hard power down the server with the power button. I can't actually tell whether the system is hardlocked or just caught in an infinite loop it never recovers from (the lights on the keyboard still work, but I wasn't able to scroll back to see any other messages).
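The only boot-time knob I know of that might be related is CAM's boot delay, though this is just a guess--I haven't verified it does anything for a drive stuck in standby_z:

    # /boot/loader.conf -- untested guess: give CAM a longer settle window at boot
    kern.cam.boot_delay="60000"    # value is in milliseconds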
However, while FreeBSD is up and running, the drives come back from standby_z just fine. It takes some time for each drive to do so, but if I (for example) import the zpool while the drives are in standby_z, it takes about 38 seconds for the drives to return to ready and for the pool to import (note that because it is a RAIDZ2, it'll import as soon as 10 of the drives become available, which isn't ideal but seems to work). While this is happening, however, CAM spams syslog with errors (command timeout/not ready), starting about 8 seconds after the import command is issued.
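One workaround I'm considering is explicitly waking the drives before the import, so zpool import doesn't race the spin-up. Something like this untested sketch (device names and pool name are placeholders, and I don't know whether it would actually suppress the timeouts):

    #!/bin/sh
    # Untested: wake all 12 drives from standby_z, wait until each answers
    # TEST UNIT READY, then import, to avoid CAM timeouts during spin-up.
    DRIVES="da0 da1 da2 da3 da4 da5 da6 da7 da8 da9 da10 da11"
    for d in $DRIVES; do
        camcontrol epc $d -c goto -p Idle_a &   # request spin-up on each drive
    done
    wait
    for d in $DRIVES; do
        until camcontrol tur $d > /dev/null 2>&1; do   # poll until ready
            sleep 2
        done
    done
    zpool import backup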
I've noticed that the drives report standby_z as their power state when queried ("Current power state: Standby_z(0x00)"), but if they were previously in standby_z and are coming back to idle_a, they instead report "Current power state: PM0:Active or PM1:Idle(0xff)". Once they finish coming back, though, they correctly report "Current power state: Idle_a(0x81)".
Anybody have any ideas of what I can do to fix any of this? The machine serves other functions that need to be available 24/7, so powering it off when the backup isn't running is not an option. Thus far the only solution I've come up with is to use camcontrol to permanently disable the standby_z power state, but there have to be other drives out there with that power state, so it seems like there ought to be some other solution...
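For completeness, the camcontrol fallback I mentioned would be something like this (untested on these particular drives; -s saves the setting so it persists, and da0 again stands in for each drive):

    # Last resort: disable the standby_z power condition entirely and save it.
    camcontrol epc da0 -c state -p Standby_z -d -s

I'd rather not do that, though, since it gives up the deepest power savings entirely.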