64GB system not saving crash dump - how large should swap be?

One of my servers crashed shortly after logging a read error on an NVMe drive used for ZFS L2 ARC. I presume the read error caused a panic, although I don't quite understand why that would happen (L2 ARC is expendable data, so ZFS should just fault the device and continue?)

I have dumpdev="AUTO" set in rc.conf but there are no vmcore files in /var/crash.
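For what it's worth, you can sanity-check the dump configuration before the next crash. A sketch (the partition path /dev/gpt/swap0 is a placeholder for your own swap device; check dumpon(8) and savecore(8) on your release for exact options):

```sh
# Confirm rc.conf actually resolved dumpdev to a device
sysrc dumpdev

# List the device(s) the kernel is currently configured to dump to
dumpon -l

# Ask savecore whether the dump device currently holds a saved dump
savecore -C /dev/gpt/swap0
```

If `dumpon -l` shows nothing, the kernel was never told where to dump, which would explain an empty /var/crash regardless of swap size.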

I thought at first it was because it tried to dump the entire 64GB of physical RAM to swap (which is only a 2GB partition), but the FreeBSD Handbook says minidumps "hold only memory pages in use by the kernel (FreeBSD 6.2 and higher)" and that "minidumps are the default dump type as of FreeBSD 7.0".

I'm having some trouble determining how much memory "in use by the kernel" might be under amd64. What should I increase the swap size to in order to reliably dump?

FreeBSD 12.1-RELEASE r354358 (amd64)

Thanks.
 
sysctl -d vm.kmem_size_max
vm.kmem_size_max: Maximum size of kernel memory (1.25 TB on my 12 GB RAM laptop)
Remember that buffers & caches contribute most of the kernel memory in use, but I don't know whether these are included in a minidump. If you don't want to repartition (dump device >= physical RAM), consider enabling a textdump(4) (sysctl knob debug.ddb.textdump.pending) and setting dumpon_flags via sysrc -v to include compression.
EDIT I have a small swap/dump device of 4 GB on this machine (plus 12 GB swap on a ZVOL), and one minidump has succeeded in the time I've had it. It was ~1/2 GB compressed. This is not a server use-case, though.
 
If minidumps include caching, then on any busy server that's going to be almost the same as dumping all memory. According to sysctl vm.kmem_map_size, I'm already up to ~39GB, and that will climb as the server warms up the cache (L2ARC already has 27GB of data after 3 hours uptime). To further complicate things, in a few days I'll be fitting larger sticks, which will increase the total physical memory to 128GB.

I have plenty of disk space so I can certainly allocate 8GB or 16GB to swap (by doing some repartitioning), but I'd rather be more certain about a size estimate...
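As a rough back-of-the-envelope check, here is the arithmetic I'd apply (the numbers come from this thread; the 1.5x headroom factor is my own assumption, not anything from FreeBSD documentation):

```python
def suggested_dump_size_gb(kmem_in_use_gb: float, phys_ram_gb: float,
                           headroom: float = 1.5) -> float:
    """Suggest a dump device size: kernel memory in use plus headroom,
    capped at physical RAM (a dump can never exceed total RAM)."""
    return min(kmem_in_use_gb * headroom, phys_ram_gb)

# vm.kmem_map_size is already ~39 GB on this box, with RAM going to 128 GB
print(suggested_dump_size_gb(39, 128))  # 58.5
```

By that estimate even a 16 GB partition is nowhere near enough if minidumps really include cached data, which supports the "size it like physical RAM" advice below.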

From a read of the man page, it looks like textdump requires a custom kernel?
 
sysctl -da | grep dump
[...]
kern.coredump_pack_vmmapinfo: Enable file path packing in 'procstat -v' coredump notes
kern.coredump_pack_fileinfo: Enable file path packing in 'procstat -f' coredump notes
vfs.zfs.zio.exclude_metadata: Exclude metadata buffers from dumps as well
[...]
 
If I understand it all correctly, it would only dump what the TLB contains (at least that would make minidumps small enough). But I have not checked the code for several years.
 
Does the dump process blindly write as much as possible and abort if it runs out of space, or does it pre-check?

I'm wondering whether the complete absence of any crash files means it failed due to insufficient space (the latter case), or whether for some reason (config, etc.) it didn't attempt a dump at all.

I'll be able to do some quick testing when I install the new RAM, but I'd rather go in prepared. :) In the meantime, mysql is rebuilding a 1TB+ MyISAM database (don't ask...)
 
Does the dump process blindly write as much as possible and abort if it runs out of space, or does it pre-check?
I did not RTSL for you, but I had debug.ddb.textdump.pending="1" in loader.conf(5) and produced a kernel dump with sysctl debug.kdb.panic=1 as described in dumpon(8). It did a normal minidump, not a textdump(4), and the knob is not visible, thus not in effect. So to get a textdump(4), a custom kernel is required (options TEXTDUMP_PREFERRED & TEXTDUMP_VERBOSE). To be safe, size the dump device as large as physical RAM. Having some swap even on a machine with very large RAM (one that will likely never swap) is a safety net anyway. EDIT I wouldn't recommend restricting vm.kmem_size_max to the size of the dump device, which is relatively small in your case.
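Collecting the knobs mentioned above in one place, as a sketch (MYKERNEL is a placeholder kernel config name; everything else is taken from this thread):

```sh
# /usr/src/sys/amd64/conf/MYKERNEL -- custom kernel, rebuild and install:
#   options TEXTDUMP_PREFERRED
#   options TEXTDUMP_VERBOSE

# /boot/loader.conf:
#   debug.ddb.textdump.pending="1"

# /etc/rc.conf:
#   dumpdev="AUTO"

# Then trigger a test panic; the machine reboots and savecore
# collects the dump into /var/crash on the way back up:
sysctl debug.kdb.panic=1
```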
 