Hello Everyone,
We've been using HAST (with ZFS and NFS on top) successfully for years across a number of customer deployments. The typical setup is two HAST volumes with a ZFS mirror on top of them.
Today, under FreeBSD 11.2-RELEASE-p4, we set up an instance with 4 disks in a raidz1 pool and ran into an issue as soon as any significant writing occurred:
Two of the 4 HAST volumes immediately "crashed" and stopped reporting a status:
Code:
>>>> hastctl status
;;Name Status Role Components
zhsubd0 complete primary /dev/ada0p4 nas2
zhsubd1 - init /dev/ada1p4 nas2
zhsubd2 complete primary /dev/ada2p4 nas2
zhsubd3 - init /dev/ada3p4 nas2
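For anyone who wants to catch this condition automatically, here is a minimal sketch that scans text shaped like the `hastctl status` output above and lists the resources that are not in a primary or secondary role. The function name `degraded_resources` is hypothetical, and the parsing assumes the whitespace-separated Name/Status/Role/Components layout shown in this post; it is not taken from any HAST tooling.

```python
def degraded_resources(status_text):
    """Return names of HAST resources whose role is neither primary nor secondary.

    Assumes the whitespace-separated layout pasted above:
    Name Status Role Components...
    """
    degraded = []
    for line in status_text.splitlines():
        fields = line.split()
        # Skip blank or short lines (fewer than three columns).
        if len(fields) < 3:
            continue
        # Skip the pasted prompt line (">>>> hastctl status").
        if fields[0].startswith(">"):
            continue
        # Skip the header row, tolerating the leading ";;" seen in the paste.
        if fields[0].lstrip(";").lower() == "name":
            continue
        name, role = fields[0], fields[2]
        if role not in ("primary", "secondary"):
            degraded.append(name)
    return degraded
```

Fed the output above, this would flag zhsubd1 and zhsubd3, the two resources that fell back to init.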
On further inspection, syslog showed:
Code:
Dec 4 18:30:14 nas1 hastd[1475]: [zhsubd1] (primary) G_GATE_CMD_START failed: Cannot allocate memory.
Dec 4 18:30:14 nas1 devd: Processing event '!system=DEVFS subsystem=CDEV type=DESTROY cdev=hast/zhsubd1'
Dec 4 18:30:14 nas1 devd: Processing event '!system=GEOM subsystem=DEV type=DESTROY cdev=hast/zhsubd1'
Dec 4 18:30:14 nas1 hastd[578]: [zhsubd1] (primary) Worker process exited ungracefully (pid=1475, exitcode=71).
Dec 4 18:30:14 nas1 hastd[578]: [zhsubd1] (primary) Changing resource role back to init.
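In case it helps with triage, the following is a rough sketch for grouping hastd messages by resource from syslog lines like the ones above. The regular expression is an assumption based only on the line format visible in this excerpt (`hastd[pid]: [resource] (role) message`), and the name `hastd_messages` is made up for illustration.

```python
import re

# Assumed line shape, taken from the excerpt above:
#   "... hastd[1475]: [zhsubd1] (primary) G_GATE_CMD_START failed: ..."
HASTD_RE = re.compile(
    r"hastd\[\d+\]: \[(?P<res>[^\]]+)\] \((?P<role>[^)]+)\) (?P<msg>.*)$"
)

def hastd_messages(log_text):
    """Map each HAST resource name to the list of hastd messages logged for it."""
    by_resource = {}
    for line in log_text.splitlines():
        match = HASTD_RE.search(line)
        if match is None:
            # Not a hastd line in the assumed format (e.g. devd events).
            continue
        by_resource.setdefault(match.group("res"), []).append(match.group("msg"))
    return by_resource
```

Lines from other daemons, such as the devd events, are ignored, so the result only contains what hastd itself reported for each resource.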
The only trace of this error I can find is in this mailing list entry: https://lists.freebsd.org/pipermail/freebsd-current/2015-May/055750.html
However, we're not running a custom kernel. This seems to be specific to running more than 2 volumes.
Does anyone have any insight into what limit is being hit and how to fix it?
I can't find much documentation on MAXPHYS and what it does (or did).
I would be grateful for any assistance. Please let me know if there is a better place to post this or if anyone needs more details.
Thank you!