I am having problems with a FreeBSD 14.0-RELEASE-p5 backup system, and I am hoping someone who understands ZFS better than I do can help resolve the issue.
I have a 17TB zpool used for Bacula spooling, and it frequently holds spool files in the 2TB range while backing up specific servers. The server has 96GB of memory.
A couple of times now, during backups, the system has killed the Bacula Director and a PostgreSQL instance after running out of memory.
Code:
Mar 3 14:17:24 drs-02 kernel: pid 13100 (postgres), jid 0, uid 770, was killed: failed to reclaim memory
Mar 3 14:18:54 drs-02 kernel: pid 1110 (bacula-dir), jid 0, uid 910, was killed: failed to reclaim memory
Thankfully, I had Telegraf set up to monitor this server and was able to import a Grafana dashboard that includes ARC sizes.
As the system writes the spool files, the ARC grows, and it had reached 82GB at the time the processes were killed. Total system memory in use shows as 90GB.
I am wondering what my best option is for fixing this issue. I have already tuned PostgreSQL to use less memory, but the ARC simply consumed what that freed up.
I am thinking I could either set primarycache=metadata on the spool pool or cap vfs.zfs.arc_max at 64GB. Which of these would be the better option? Or is there another option I haven't thought of?
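For reference, here is roughly what the two candidate changes would look like. This is only a sketch and I have not applied either one. It assumes "64GB" means 64 GiB expressed in bytes, and that the dataset is named spool as in the listing below. The script only prints the commands rather than running them. The variable names pool and arc_max_bytes are just placeholders of mine.

```shell
#!/bin/sh
# Sketch only: prints the candidate commands instead of executing them.
pool="spool"                                 # dataset name (assumed from the zfs get output)
arc_max_bytes=$((64 * 1024 * 1024 * 1024))   # assumption: 64 GiB, given in bytes

# Option 1: cap the ARC size system-wide (would need root).
echo "sysctl vfs.zfs.arc_max=${arc_max_bytes}"

# Option 2: keep only metadata from this dataset in the ARC (would need root).
echo "zfs set primarycache=metadata ${pool}"
```

As far as I understand, option 1 limits the ARC for every pool on the machine, while option 2 only changes what is cached for this one dataset.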
zfs get all spool:
Code:
NAME   PROPERTY              VALUE                  SOURCE
spool  type                  filesystem             -
spool  creation              Tue Feb 13 15:36 2024  -
spool  used                  31.8M                  -
spool  available             17.2T                  -
spool  referenced            120K                   -
spool  compressratio         1.00x                  -
spool  mounted               yes                    -
spool  quota                 none                   default
spool  reservation           none                   default
spool  recordsize            1M                     local
spool  mountpoint            /bacula/spooling       local
spool  sharenfs              off                    default
spool  checksum              skein                  local
spool  compression           lz4                    local
spool  atime                 off                    local
spool  devices               on                     default
spool  exec                  on                     default
spool  setuid                on                     default
spool  readonly              off                    default
spool  jailed                off                    default
spool  snapdir               hidden                 default
spool  aclmode               discard                default
spool  aclinherit            restricted             default
spool  createtxg             1                      -
spool  canmount              on                     default
spool  xattr                 on                     default
spool  copies                1                      default
spool  version               5                      -
spool  utf8only              off                    -
spool  normalization         none                   -
spool  casesensitivity       sensitive              -
spool  vscan                 off                    default
spool  nbmand                off                    default
spool  sharesmb              off                    default
spool  refquota              none                   default
spool  refreservation        none                   default
spool  guid                  3911512592938036835    -
spool  primarycache          all                    default
spool  secondarycache        all                    default
spool  usedbysnapshots       0B                     -
spool  usedbydataset         120K                   -
spool  usedbychildren        31.7M                  -
spool  usedbyrefreservation  0B                     -
spool  logbias               latency                default
spool  objsetid              54                     -
spool  dedup                 off                    default
spool  mlslabel              none                   default
spool  sync                  standard               default
spool  dnodesize             legacy                 default
spool  refcompressratio      1.00x                  -
spool  written               120K                   -
spool  logicalused           10.6M                  -
spool  logicalreferenced     54.5K                  -
spool  volmode               default                default
spool  filesystem_limit      none                   default
spool  snapshot_limit        none                   default
spool  filesystem_count      none                   default
spool  snapshot_count        none                   default
spool  snapdev               hidden                 default
spool  acltype               nfsv4                  default
spool  context               none                   default
spool  fscontext             none                   default
spool  defcontext            none                   default
spool  rootcontext           none                   default
spool  relatime              on                     default
spool  redundant_metadata    all                    default
spool  overlay               on                     default
spool  encryption            off                    default
spool  keylocation           none                   default
spool  keyformat             none                   default
spool  pbkdf2iters           0                      default
spool  special_small_blocks  0                      default