Solved Swap - Is it still really necessary?

Hello guys,
I have some conceptual questions.

First of all, on average servers, would swap really be necessary?
Or does FreeBSD already handle memory pressure so well that it doesn't need swap?

I don't know why, but FreeBSD server instances on Amazon EC2 have no swap.
On NAS devices, too: no swap.
I believe there must be a reason for this.

I have a machine with 4GB of RAM with the following status:
Mem:
179M Active - 2497M Inactive - 251M Laundry - 677M Wired - 339M Buf - 219M Free

I don't see any apparent problems. I don't see any memory leaks. It's OK for me.
I will increase the machine to 8GB RAM to run more services. But does it make sense to have SWAP?

If so, I've read that it would be better to add swap on a new SSD disk (according to Handbook section 12.12.1), rather than using space on the same disk.
What do you think about it?

Do you use SWAP on servers with 8GB RAM or more?
And while I'm at it: does encrypted swap have a noticeable performance impact?

Thanks in advance,
Grether
 
If I'm recalling correctly, if you enable core dumps, the core dumps are saved to swap.
Lots of opinions, myths, "common knowledge" around swap.
All that said, given the cost per byte of modern storage devices, I like having a swap partition (I always config swap). Why?
On multi GB/TB devices, a partition of about 10% total device size isn't really killing your storage needs.
On a strictly embedded system with very limited resources, not configuring swap could be a good thing.

Encrypted swap: like encrypted tmpfs in RAM. Offers you a bit of security in a powered-off state. Performance impact? Likely "yes", but the real question should be "performance impact that I would notice", and to that I say "on modern hardware, no, you probably won't notice".
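For reference, FreeBSD makes encrypted swap nearly a one-liner: appending `.eli` to the swap device in /etc/fstab makes geli(8) attach it with a one-time key generated at every boot. A minimal sketch (the partition name `ada0p3` is an assumption):

```shell
# /etc/fstab - the .eli suffix tells the rc scripts to attach the device
# through geli(8) with a random one-time key at boot, so swapped-out
# pages are unreadable once the machine is powered off.
# Device          Mountpoint  FStype  Options  Dump  Pass
/dev/ada0p3.eli   none        swap    sw       0     0
```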

Don't do file-backed swap unless you are in an emergency situation.

Above are all my opinions, based on what works for me.
 
Or does FreeBSD already handle memory pressure so well that it doesn't need to be swapped?
The sad truth is: No matter how well you handle memory pressure, once you're just running out, you're running out. And as operating systems these days happily "overcommit" (which means: grant almost any amount an application requests in the hope it won't really get used), applications expect that and might map huge memory areas. So, there's no way for the OS to avoid (or even predict) memory outage.

Once this happens, it's either swap or the OOM killer (randomly killing some memory-hungry process).

Now, swap is slower than RAM ... how much slower depends on your storage devices ... but for typical workloads, this means you don't want too much swap either, because the system might get so slow that the dreaded OOM killer suddenly looks quite friendly ;)

Then, of course, there are probably workloads predictable enough that you won't need swap. I could imagine a NAS typically won't run a lot of software reserving a lot of dynamically allocated memory.
 
Between 12.x and 13.x I notice that 13.x uses more swap, but this is not a bad thing if you don't swap in/out a lot. If you want the behaviour of 12.x, you can set vm.pageout_update_period to a value less than the default. If you set vm.pageout_update_period=0, it will use almost no swap. On my systems, which have 32GB, 64GB or 128GB RAM, I always add 16GB of swap just in case.
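The knob mentioned above can be inspected and changed with sysctl(8); a sketch (the exact default value may differ between releases, so check your own system first):

```shell
# Show the current value (an interval in seconds; 0 disables it)
sysctl vm.pageout_update_period

# Approximate the 12.x behaviour: don't proactively launder idle pages
sysctl vm.pageout_update_period=0

# Persist the setting across reboots
echo 'vm.pageout_update_period=0' >> /etc/sysctl.conf
```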
 
Needed? Maybe not. Advisable? Heck yes.
As mentioned before, swap space can get you the benefit of kernel core dumps, so this may come in handy if things go sideways. The swap space should be as large as your RAM, then.
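To actually get those kernel dumps, the dump device must be pointed at swap; a minimal sketch of the usual rc.conf setup:

```shell
# /etc/rc.conf - write a kernel dump to the configured swap device on panic
dumpdev="AUTO"        # AUTO picks the first suitable swap device

# savecore(8) then extracts the dump into /var/crash on the next boot;
# as far as I know it runs automatically once dumpdev is set.
```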
Another thing is large pages. Those are faster than the smaller ones (fewer MMU table walks), but in order to be allocated, the complete (physical) range must be free. When there is a page already allocated in the range the kernel tries to get for a large page, it may try to drop it and reassign it to the large page mapping. If it is a text page, no problem: those are backed by the code segment of a file and will be reloaded on the next hit. If not, it needs to be saved somewhere, and that will be swap, unless it is a memory-mapped writable file. So without swap, you may run into situations where large pages are not possible. Having a small swap space available can help there.
And the biggest thing is performance under memory pressure. Here FreeBSD is different from Linux. FreeBSD starts syncing memory to swap when it is untouched for some time and the page daemon has time. That makes these pages droppable at once; they are clean. This is what the "laundry" part is about: dirty pages waiting to be cleaned. So when memory becomes tight, the kernel can reclaim these pages at once, without waiting for them to be swapped out. They may also be dropped when idle and be used as cache memory (inactive in FreeBSD parlance). This is what you may miss out on when you do not have swap, without knowing it. My machine has 18 GB of swap set up (compressed zvol), and uses 5 of that at the moment. This is idle browser tabs, login processes on tty1..n, ... you name it. It has 1 GB of laundry (hey, get busy there!) and 1.5 GB cached file data (inactive) + a lot of ARC. Without swap, I'd be short of that inactive memory and a lot of the ARC cache. This would not be visible; the system would work, but the free memory would also be small. And performance would suffer.
 
Sometimes I compile a lot of ports simultaneously, and then swap comes in very handy at certain times.
Mostly my swap usage is zero, so it does not bother me. Unused swap has no bad influence on performance; it just takes a bit of disk space on my SSD.
The situation is different for virtual machines, but someone else might enlighten us.
 
Alain De Vos that's one of the things about system resources; it all depends on the workload.
Building ports/systems from source takes up a lot of resources, mostly memory. So it's not unexpected for ports to dip into swap when needed.

Virtual Machines: I think that's a case by case basis. If one is using the VM to say test out new kernels or new drivers that may cause a crash, then swap is useful for the core files. If you are building ports in a VM, it may be useful. General use, it may not be. But it all depends how much RAM is allocated to the VM itself.

My opinions.
 
Thanks for the explanations!

What do you prefer? Attach a new disk and use it as swap? Or create a swap file on the root volume? It will be 8GB for swap.
Creating a swap file is simpler, and from what I have researched, there is currently not much performance difference between a swap partition and a swap file.
 
Here's my take:

If you use a swap partition, you can do core dumps, which you can't do with a swap file. I use swap partitions on my desktop and laptop mainly because I can do core dumps, as I run CURRENT.

If you run CURRENT or otherwise do kernel work, don't use a swap file.

If you have a regular "stable" desktop/server, don't expect crashes often (or ever), stick with a RELEASE, and don't plan to do kernel development, swap files are just fine.
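For anyone going the swap file route, the Handbook's md(4)-backed setup looks roughly like this (the path `/usr/swap0` and the 8 GB size are assumptions; use dd rather than truncate so the file is not sparse):

```shell
# Create an 8 GB swap file backed by real blocks
dd if=/dev/zero of=/usr/swap0 bs=1m count=8192
chmod 0600 /usr/swap0

# /etc/fstab entry (md99 is just a free md(4) unit number):
#   md99  none  swap  sw,file=/usr/swap0,late  0  0

# Activate "late" swap entries without rebooting
swapon -aL
```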

I'd love to use a swap file on my desktops, but sometimes when I hack on the kernel (not often), I still want a core dump. My TigerLake HP Spectre has been stable with CURRENT for a few months, but when the drivers were new I actively helped.

I don't expect to replace my laptop anytime soon, but when I do I still want to be able to help.
 
The sad truth is: No matter how well you handle memory pressure, once you're just running out, you're running out. And as operating systems these days happily "overcommit" (which means: grant almost any amount an application requests in the hope it won't really get used), applications expect that and might map huge memory areas. So, there's no way for the OS to avoid (or even predict) memory outage.
This sounds correct. But try to explain a real configuration to me:

1) System with 4GB RAM. No swap. At some moment the memory is full and we need swap, but it is missing. I think for 4GB RAM, 2 or 4GB of swap is preferred. OK - we add 2/4GB of swap and the "out of memory" situation is avoided in future.

2) The above system now has 8GB RAM. No swap. Can we discuss the situation when memory is full? 4GB is not enough, but 4GB + 4GB swap is enough. OK - now we have 8GB RAM.

I want to say that swap may be necessary for core dump but for normal usage the system has to be able to work without swap if RAM is enough (enough means normal size, not overprovisioned).
 
One note to keep in mind: a swap file will inherit the considerations of the filesystem used. Since swap is by nature slower than RAM, you're adding a new "problem" by using a swap file. Journaling and metadata will take a toll on that, not to mention fragmentation, etc. That's one of the reasons I'm always against swapfile contraptions.
 
I want to say that swap may be necessary for core dump but for normal usage the system has to be able to work without swap if RAM is enough (enough means normal size, not overprovisioned).
I have 16GB of RAM and 4GB of swap (2 swap partitions of 2GB on two HDDs).
 
For me, I usually use $MEMORY_SIZE GB of swap on my FreeBSD machines, mainly in case I do kernel work (not often). For me it's like fire or flood or robbery insurance, without the damage part.

On a secondary Debian laptop used to build LineageOS, I heavily oversized the swap because building LineageOS uses an absurd amount of RAM, but it's a WhiskeyLake HP Spectre with 16GB RAM. Many LineageOS contributors are in India, where PCs are expensive and old hardware is common; I don't even know how they manage when I could barely get by.

My FreeBSD PCs and servers don't need nearly as much swap; the most RAM I have is 32 GB on desktops/servers, and 16GB on laptops.
 
What do you prefer? Attach a new disk and use it as swap? Or create a swap file on the root volume? It will be 8GB for swap.
Creating a swap file is simpler, and from what I have researched, there is currently not much performance difference between a swap partition and a swap file.
Over the decades, swap space has become less used, and people have become accustomed to snappy response that's hard to get if you are swapping.

So, for me, swap space is a back-stop. Strictly for unexpected use. So it gets provisioned because it will save me from the deadly OOMs. But the use is sufficiently rare that I don't sweat about performance or placement. I do mirror it, to prevent down time from a broken disk. Generally size is about the same as main memory, which is, I think, overly generous. It was not so long ago when the recommendation for swap was at least three times main memory. But main memories were a lot smaller then.
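Mirroring swap as described above can be done with gmirror(8); a sketch, assuming two spare equal-sized partitions `ada0p3` and `ada1p3` (the names are hypothetical):

```shell
# Load the mirror class now, and at every boot
gmirror load
echo 'geom_mirror_load="YES"' >> /boot/loader.conf

# Build a mirror named "swap" from the two partitions
gmirror label -v swap /dev/ada0p3 /dev/ada1p3

# /etc/fstab then points at the mirror device:
#   /dev/mirror/swap  none  swap  sw  0  0
```

With this, a single failed disk degrades the mirror instead of crashing whatever had pages swapped out.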

Cheaper memory has helped a lot. And so have large disk capacities.
 
[begin quote]
As a rule of thumb, the swap partition should be about double the size of physical memory (RAM). Systems with minimal RAM may perform better with more swap. Configuring too little swap can lead to inefficiencies in the VM page scanning code and might create issues later if more memory is added.
On larger systems with multiple SCSI disks or multiple IDE disks operating on different controllers, it is recommended that swap be configured on each drive, up to four drives. The swap partitions should be approximately the same size. The kernel can handle arbitrary sizes but internal data structures scale to 4 times the largest swap partition. Keeping the swap partitions near the same size will allow the kernel to optimally stripe swap space across disks. Large swap sizes are fine, even if swap is not used much. It might be easier to recover from a runaway program before being forced to reboot.
[end quote]
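In fstab terms, the striping advice in the quote above just means one equal-sized swap entry per drive (device names are assumptions):

```shell
# /etc/fstab - equal-sized swap partitions on two separate disks;
# the kernel stripes page-outs across them automatically
/dev/ada0p3  none  swap  sw  0  0
/dev/ada1p3  none  swap  sw  0  0
```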
 
All my machines have swap partitions and all from time to time, depending on what I run on them, use swap.

Even the 64 GB 32 core machines used for buildworld and universe testing use a little swap.

I typically see swap used when ARC grows in response to heavy ZFS I/O. Usually there is very little paging; pages are paged out, never to be paged in again until the next reboot. Typically these would be parts of address spaces outside of the working set.

If the working set is rather large, many times due to poorly written apps which iterate over the y axis of an array instead of the x axis (poor locality of reference), one might see significant paging. I tend to see poorly written Java apps that exhibit this behaviour.

Typically the rule of thumb used to be 4x swap to memory ratio. It was reduced to 2x and now 1x swap to memory ratio or even less is acceptable. This of course depends on your application. For a time sharing system 2x or 4x might be recommended. For a laptop 2x is fine. (I use 1.5x on my laptop.) For a database server any amount of swap used is bad. This doesn't mean not to configure swap. Configure it for DBMS servers but monitor swap usage and tune* if swap is used. It's better to use swap than have OOM kill the largest memory process.

* Tuning a UNIX system might mean add RAM, reduce the number of concurrent processes, limiting ARC or UFS cache, etc.

Let's talk OOM now: Contrary to common belief, it is not some random process that is killed as a result of OOM. BSD OOM and Linux OOM are different animals. BSD OOM kills the largest process. Linux OOM kills the process with the largest badness value. Linux defines badness as a function of RAM used times the amount of time an app has been running. IMO Linux OOM is inferior as it may keep the offending process from being killed while killing innocent processes. Whereas the BSD approach of killing the largest process is IMO better.

Short answer: Yes, swap can be useful even on large memory systems. It all depends on the workload you are running on your computer.
 
As to why EC2 instances have no swap? IMO, disk costs money. (At the datacenter I work at we charge $$$ for disk but not for RAM.)

Why would a NAS device not have swap? NAS is designed to be a file server. File servers do a lot of I/O while using their memory for cache. It would be silly to page out file cache to disk. Or put it another way, it would be silly to do I/O to page out cache which is supposed to reduce I/O by caching data. Doing I/O to save I/O makes no sense. You should not have swap for NAS devices.
 
As to why EC2 instances have no swap? IMO, disk costs money. (At the datacenter I work at we charge $$$ for disk but not for RAM.)

Why would a NAS device not have swap? NAS is designed to be a file server. File servers do a lot of I/O while using their memory for cache. It would be silly to page out file cache to disk. Or put it another way, it would be silly to do I/O to page out cache which is supposed to reduce I/O by caching data. Doing I/O to save I/O makes no sense. You should not have swap for NAS devices.
Disclaimer: I work at Microsoft but not on Azure.

Cloud storage is based on "pay as you go". You pay for all the resources you subscribe to.

Big clouds like AWS, Azure, GCP, IBM, et al. are designed for big companies who need tons of scale. You pay a premium but can scale as big as you want. Household names use big clouds because (a) they have massive scale (b) they are one-stop shops having all the features and (c) a lot of people are trained in these clouds. While Netflix or Uber may pay a premium for AWS, they need the massive scale and would lose more if they tried to use DigitalOcean.

Smaller hosts like DigitalOcean, BuyVM, Vultr, etc. are somewhat more flat rate, but not completely. They aren't designed for your 5000-node Spark cluster or unprofitable Bitcoin mining farm; they're designed for individuals and small apps. They are cheap, but they don't have huge scale. Heck, stock is a big problem with BuyVM; there was a "does BuyVM have stock" website a decade ago, and it's still a problem even in 2022.

In my team at Microsoft, I was involved in data purging projects to clear Azure storage resources, and we have reduced our data use by HALF: I think 16PB to 8PB or something like that. I bet a small mom-and-pop VPS host doesn't even have 1 TB of storage per VPS node.

Despite working at Microsoft, I do have servers at "cheaper" hosts: personal website/email at Vultr, Tor exit relays at Psychz Networks, and a few other servers I easily forget about and aren't updated at all.
 
1) System with 4GB RAM. No swap. At some moment the memory is full and we need swap, but it is missing. I think for 4GB RAM, 2 or 4GB of swap is preferred. OK - we add 2/4GB of swap and the "out of memory" situation is avoided in future.

2) The above system now has 8GB RAM. No swap. Can we discuss the situation when memory is full? 4GB is not enough, but 4GB + 4GB swap is enough. OK - now we have 8GB RAM.

Let me first quote myself on this:
Now, swap is slower than RAM ... how much slower depends on your storage devices ... but for typical workloads, this means you don't want too much swap either, because the system might get so slow that the dreaded OOM killer suddenly looks quite friendly ;)

To sum it up, any amount of RAM could be "too little", depending on your workload. But just adding "infinite" swap will never solve that problem. Swap makes sense to get through situations of memory pressure without needing the OOM killer, to some extent.
 
If only the size of RAM+swap is important then in your case you can add 4GB RAM (20GB total) and remove the swap.
All my memory slots are occupied already (unless I exchange all the RAM sticks, but meh, it's an Ivy Bridge and it's already working as expected), and I don't see a reason to remove the swap, especially because I push my system's limits for several reasons and I don't feel like testing whether the OOM killer works :D (actually, I did already; I had some situations where the OOM killer was called).
 
I am not sure about UFS, never used it.
For ZFS, yes, I suggest a swap of 16GB (a bit overkill; typically 8GB will be enough).
Especially if you use programs that eat RAM (like VirtualBox, and backup programs that load (a lot) onto the ARC).
 