Solved: Non-intrusive way to increase the swap space

I spent the whole disk on FreeBSD and kept the FreeBSD installer's default swap size, 2 GB. Now I want to increase swap and found that swap is a fixed partition, not part of the zpool (the zpool is on another fixed partition, so the whole disk is indeed partitioned but not used entirely by the zpool). It seems re-installing (re-partitioning) is the only solution, and I don't want to do that.

Code:
gpart list ada0
Geom name: ada0
modified: false
state: OK
fwheads: 16
fwsectors: 63
last: 234441607
first: 40
entries: 128
scheme: GPT
Providers:
1. Name: ada0p1
   Mediasize: 209715200 (200M)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 20480
   Mode: r0w0e0
   efimedia: HD(1,GPT,96cf85cb-d194-11ea-ba3f-d8cb8a370b18,0x28,0x64000)
   rawuuid: 96cf85cb-d194-11ea-ba3f-d8cb8a370b18
   rawtype: c12a7328-f81f-11d2-ba4b-00a0c93ec93b
   label: efiboot0
   length: 209715200
   offset: 20480
   type: efi
   index: 1
   end: 409639
   start: 40
2. Name: ada0p2
   Mediasize: 524288 (512K)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 209735680
   Mode: r0w0e0
   efimedia: HD(2,GPT,96da25cd-d194-11ea-ba3f-d8cb8a370b18,0x64028,0x400)
   rawuuid: 96da25cd-d194-11ea-ba3f-d8cb8a370b18
   rawtype: 83bd6b9d-7f41-11dc-be0b-001560b84f0f
   label: gptboot0
   length: 524288
   offset: 209735680
   type: freebsd-boot
   index: 2
   end: 410663
   start: 409640
3. Name: ada0p3
   Mediasize: 2147483648 (2.0G)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 210259968
   Mode: r1w1e0
   efimedia: HD(3,GPT,96e2cd9a-d194-11ea-ba3f-d8cb8a370b18,0x64428,0x400000)
   rawuuid: 96e2cd9a-d194-11ea-ba3f-d8cb8a370b18
   rawtype: 516e7cb5-6ecf-11d6-8ff8-00022d09712b
   label: swap0
   length: 2147483648
   offset: 210259968
   type: freebsd-swap
   index: 3
   end: 4604967
   start: 410664
4. Name: ada0p4
   Mediasize: 117676359680 (110G)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 2357743616
   Mode: r1w1e1
   efimedia: HD(4,GPT,96e8ac07-d194-11ea-ba3f-d8cb8a370b18,0x464428,0xdb30760)
   rawuuid: 96e8ac07-d194-11ea-ba3f-d8cb8a370b18
   rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b
   label: zfs0
   length: 117676359680
   offset: 2357743616
   type: freebsd-zfs
   index: 4
   end: 234441607
   start: 4604968
Consumers:
1. Name: ada0
   Mediasize: 120034123776 (112G)
   Sectorsize: 512
   Mode: r2w2e3

On OpenIndiana, swap lives inside the zpool, so resizing the swap volume is very easy. Could I add a ZFS volume and use it as swap? I'm asking before actually trying because I suspect there must be a technical problem, or it would be done that way by default. If it's possible and has no drawbacks at all, I really wonder why the FreeBSD installer doesn't put swap inside the zpool by default. It would be much more convenient and reasonable to do so.

Please help me. Thanks.
 
Yes, you do not need to re-partition & re-install. It's as simple as zfs create -b `sysctl -n hw.pagesize` -V 2G zroot/SWAP, then adding it to your fstab(5):
/dev/zvol/zroot/SWAP none swap sw 0 0
To activate it immediately, do swapon /dev/zvol/zroot/SWAP.
IIRC FreeBSD can not dump to a ZVOL, so the small swap partition still has a reasonable right to exist: kernel crash dumps need a raw partition.
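If you keep that partition around as the dump device, a minimal sketch (assuming the GPT label swap0 from the gpart output above; adjust to your device):
Code:
# in /etc/rc.conf: point kernel crash dumps at the old swap partition
dumpdev="/dev/gpt/swap0"
dumpon(8) is run at boot by the dumpon rc script; dumpon /dev/gpt/swap0 activates it right away.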
 
How about this generic solution:
Code:
dd if=/dev/zero of=/usr/swap0 bs=1m count=512
chmod 0600 /usr/swap0
echo "md99 none swap sw,file=/usr/swap0,late 0 0" >> /etc/fstab
swapon -aL
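To verify it took effect, and to undo it again later, a sketch (md99 is the unit number from the fstab line above):
Code:
swapinfo -h      # list active swap devices with human-readable sizes
swapoff -aL      # release all fstab swap, including "late" entries; needs enough free RAM
rm /usr/swap0    # after removing the fstab line, delete the backing file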
 
Change the count=512 above to 2048 for a 2 GB swap file, and so on up.
 
There's a lot of historical chatter about swap on ZFS (zvol or file) leading to kernel thrashing.
You're such a Weisenheimer... perl is under /usr/local on FreeBSD... Please emphasize the term historical... I copied your script to /root/bin/trashswap.pl and ran it. So what? Swap fills up to the max & the task gets killed. I have none of the special sysctl(8) knobs set that are mentioned in the bug log. IMHO swapping to a ZVOL can be considered safe.
 
mjollnir Do you think I should turn off compression on the swap volume? I think there's no value in compressing a swap volume, but I could be wrong.
 
Don't know. I guess reading is a little bit faster, even on an SSD, with the standard lz4 compression method. Usually, application code & data compress well. I have compression=on (inherited) and did not have any problems for years. What's more important is to exclude any swap volume from scripts that take snapshots. You might want to ask about swap compression on one of the FreeBSD mailing lists.
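For reference, a sketch of properties people often suggest for a swap zvol (using the zroot/SWAP name from above; these are standard zfs(8) properties, but whether they help is workload-dependent):
Code:
zfs set compression=off zroot/SWAP            # pages being swapped out rarely compress usefully
zfs set primarycache=metadata zroot/SWAP      # don't cache swapped-out pages in the ARC again
zfs set sync=always zroot/SWAP                # don't defer writes for pages the VM pushed out
zfs set com.sun:auto-snapshot=false zroot/SWAP  # keep snapshot tools away from the swap volume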
 
Your machines have too much memory.

It is actually very simple: when the machine runs out of memory, it does paging. When paging is on zfs, it needs zfs to do that. And in order to do anything, zfs needs memory (lots of it, and real physical memory).
Somewhere in this is an obvious logical flaw, and I would assume this, while being technically possible, calls for problems when you get into tough waters.

My old server was running without memory. That is, it was using the existing memory for zfs management only. The programs were living in the swap, and the data in the l2arc. With an SSD this is possible, but the thing needed a very careful configuration for how to use the existing memory and not hit any limits.

Now, having moved that thing to newer hardware and running it in 64-bit mode, it still has limited memory installed, but allocation goes far beyond the limits that were there before. The whole heap allocation works completely differently, and seems to invent memory where there actually isn't any.

Under such conditions the swap-on-zfs trick will probably work - but then, you cannot mock reality forever, and at some point the cards have to be shown. On a server that is expected to even cope with DoS situations (to some extent) this may turn out badly.
 
I don't understand your post. And IMHO, 8 GB of RAM is just barely enough. When building a large codebase, it still needs to use swap space. And when it comes to virtualization, 16 GB is the minimum.
 
I don't understand your post. And IMHO, 8 GB of RAM is just barely enough. When building a large codebase, it still needs to use swap space.

It expects to have swap. That is normal. The memory system is designed to have all memory backed by file space; this was always the case. But it does not actually execute that option as long as there is memory available.
You can observe that: the sysctl vm.swap_reserved shows the amount of swap the system has currently requested to back memory-in-use.
This works just like your money: you give your money to the bank to store it, so the bank is expected to have it backed by values. But they do not; they have less than 10% of it. And this usually works - until somebody makes a mistake, like happened in 2008 with the Lehman crisis. What happened then was that they had backed the non-existing money with obligations on other non-existing money. What you're doing with zfs is quite similar, so - it usually works... until something unexpected happens.

What the perl script from above does: it tries to force the system to execute some of those options. But this does not impose a big problem, because perl is a user-space program and can itself be paged out. This will look different if you instead get a sudden demand for real kernel memory (like a huge amount of vnode cache or network buffers) which cannot be put off with nonexistent memory.

But then, it all depends on what you expect a machine to do. I build mine usually so they can run unattended for an indefinite time (i.e. months/years), under any condition.
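You can watch this overcommit accounting with stock tools; a quick sketch (nothing specific to this thread):
Code:
sysctl vm.swap_reserved   # swap the kernel has promised to back current allocations
swapinfo -h               # swap actually paged out right now, per device
sysctl vm.overcommit      # overcommit policy; 0 is the permissive default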
 
I still don't understand anything you said. Please pardon my ignorance, but I decided to remove my swap volume on ZFS and stick to the default 2 GB swap partition. I will adjust the make jobs to lower the memory needed, so the system never runs out of swap space.

My own observation is: GCC with -j3 is as fast as Clang with -j6 on the codebase I usually build, and cc1plus is memory-intensive while c++ is CPU-intensive. As long as the build speed stays roughly the same, lowering the make jobs for GCC is not a big deal at all.
 
I still don't understand anything you said. Please pardon my ignorance, but I decided to remove my swap volume on ZFS and stick to the default 2 GB swap partition. I will adjust the make jobs to lower the memory needed, so the system never runs out of swap space.

Okay, simple version:
The ZFS swap is not a problem as long as you run a machine mainly for compiling (development) tasks, add that swap when the machine needs some more, and consider it a workaround for now.
I think it does become a problem when people start to think it would be "good practice" to do so, and then deploy critical servers in such a design right from the start.

My own observation is: GCC with -j3 is as fast as Clang with -j6 on the codebase I usually build, and cc1plus is memory-intensive while c++ is CPU-intensive. As long as the build speed stays roughly the same, lowering the make jobs for GCC is not a big deal at all.

I thought the memory consumption corresponds to how big your modules are, and that compiling is always CPU-bound.
 
I thought the memory consumption corresponds to how big your modules are, and that compiling is always CPU-bound.
Then Clang consumed much less memory than GCC with make jobs set to -j6. And of course each of the make jobs (cc1plus or c++ in top's report) always eats 100% (or sometimes more than 100%) of my CPU.

BTW, don't put too much weight on my words. I'm an amateur without any CS background. I just say what I see and what I think is happening. No warranty that it's right.
 
Compile tasks are mostly CPU-bound. There might be some rare exceptions for programs where the compiler needs to build up very huge tables in memory, like e.g. building LLVM itself. On a decent modern CPU, compiling the many small files in a build is so fast that parts of the build become I/O-bound (a build run does more than only compiling).
Back on topic: gh_origin, IMHO your amount of swap (2 GB) is at the edge of the reasonable minimum. Yes, in case you have only 1 GB RAM, it's twice that much, which matches the traditional equation swap = 2 x RAM. IIRC there's another rule of thumb that adds a certain minimum & maximum to that equation. Obviously these values depend on your workload/use-case. I would recommend doubling it to 4 GB, and reducing some ZFS & other kernel VM knobs that set minimum values for RAM usage.
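If you went the zvol route from earlier in the thread, doubling is a short procedure; a sketch, assuming the zroot/SWAP volume and enough free RAM to page everything back in during swapoff:
Code:
swapoff /dev/zvol/zroot/SWAP    # release the device; fails if RAM can't absorb it
zfs set volsize=4G zroot/SWAP   # grow the backing volume
swapon /dev/zvol/zroot/SWAP     # re-enable swap at the new size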
 
mjollnir Building LLVM is indeed a bit faster than the codebase I usually build! It took a bit more than 2 hours on my system with this 2 GB swap partition, with only about 198 MB of swap in use, checked right after the build finished. I had to restart the system to reclaim RAM, as only about 1.2 GB of RAM was available and most of it was used by the ARC (uncompressed). The compiler used was the system Clang. I didn't try with GCC, so I don't know how long it would take. Someday I will try it.

Update: I tried, but GCC failed to build it. The LLVM I'm building is mulle-clang (for Objective-C); it's here: https://github.com/Codeon-GmbH/mulle-clang

Error message:

Code:
[3299/5583] Building CXX object projects/compiler-rt/lib/fuzzer/CMakeFiles/RTfuzzer.x86_64.dir/FuzzerSHA1.cpp.o
FAILED: projects/compiler-rt/lib/fuzzer/CMakeFiles/RTfuzzer.x86_64.dir/FuzzerSHA1.cpp.o
/usr/local/bin/g++  -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Iprojects/compiler-rt/lib/fuzzer -I/BUILD/mulle/src/llvm/projects/compiler-rt/lib/fuzzer -Iinclude -I/BUILD/mulle/src/llvm/include -fPIC -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-class-memaccess -Wno-redundant-move -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wno-comment -fdiagnostics-color -ffunction-sections -fdata-sections -Wall -std=c++14 -Wno-unused-parameter -O3 -DNDEBUG    -m64 -fPIC -fno-builtin -fno-exceptions -fomit-frame-pointer -funwind-tables -fno-stack-protector -fvisibility=hidden -fno-lto -O3 -g -Wno-variadic-macros -Wno-non-virtual-dtor -fno-omit-frame-pointer -std=c++14 -MD -MT projects/compiler-rt/lib/fuzzer/CMakeFiles/RTfuzzer.x86_64.dir/FuzzerSHA1.cpp.o -MF projects/compiler-rt/lib/fuzzer/CMakeFiles/RTfuzzer.x86_64.dir/FuzzerSHA1.cpp.o.d -o projects/compiler-rt/lib/fuzzer/CMakeFiles/RTfuzzer.x86_64.dir/FuzzerSHA1.cpp.o -c /BUILD/mulle/src/llvm/projects/compiler-rt/lib/fuzzer/FuzzerSHA1.cpp
/BUILD/mulle/src/llvm/projects/compiler-rt/lib/fuzzer/FuzzerSHA1.cpp:42:11: fatal error: endian.h: No such file or directory
   42 | # include <endian.h> // machine/endian.h
      |           ^~~~~~~~~~
compilation terminated.
[3304/5583] Building CXX object projects/compiler-rt/lib/fuzzer/CMakeFiles/RTfuzzer.x86_64.dir/FuzzerTracePC.cpp.o
ninja: build stopped: subcommand failed.

Regarding the build speed of GCC (-j3) and Clang (-j6), I found it could be caused by the wrong number of make jobs. Too many make jobs don't speed up the build; they can in fact slow it down, or at least make no noticeable difference. BTW, GCC's cc1plus still uses more memory than Clang's c++. Clang's c++ could be slowed down by too many make jobs, but it didn't fail from running out of memory.
 