> [...]There's just no way a machine can do anything useful unencumbered while heavily swapping.
I guess the reference is too obscure. Here: https://queue.acm.org/detail.cfm?id=1814327
> [...]There's just no way a machine can do anything useful unencumbered while heavily swapping.
> Therefore, if such a restriction doesn't solve that problem on your machine immediately, something else must be configured in an unusual way.
Yes, and this "unusual configuration" is simply having >= 32GB RAM to make this issue very obvious.
> getting my use case called "ridiculous"
You didn't describe a "use case", as there's no description of how this ridiculous amount of swap should ever be used in a sane way. And yes, it's impossible.
> The sustained random read latency of 3D XPoint today is around the speed RAM had in the mid 90s.
We're talking about access times of at least a few µs, compared to modern RAM in the single-digit ns range, so that's a factor of roughly 1000. Sure, if your swap device is as fast as possible, this helps a bit in situations when swapping is unavoidable. Still, heavy constant swapping kills the performance of any machine with any workload.
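To put the numbers side by side, a tiny sketch; the latency figures are rough order-of-magnitude assumptions taken from the post above, not measurements:

```python
# Rough latency comparison (assumed figures, order-of-magnitude only)
ram_latency_ns = 5          # modern DRAM random access: single-digit ns
xpoint_latency_ns = 5_000   # 3D XPoint sustained random read: a few µs

slowdown = xpoint_latency_ns / ram_latency_ns
print(slowdown)  # 1000.0 -> the "factor of roughly 1000" from the post
```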
> Yes, and this "unusual configuration" is simply having >= 32GB RAM to make this issue very obvious.
That's very unlikely. My server with 64GB, many virtual machines and jails, also used for desktop stuff remotely, doesn't use any swap either unless there's only very little free RAM (around 1GB). I monitor that closely because I sometimes run large builds on it and want to make sure it can stand the load.
> You didn't describe a "use case", as there's no description of how this ridiculous amount of swap should ever be used in a sane way. And yes, it's impossible.
> I want to thank Snurg and Mjölnir for actually addressing my question. I have to say, coming back to the forums after several years and getting my use case called "ridiculous", in the first sentence of the first answer, by a moderator, was not a great experience.
He didn't call your use case "ridiculous", but the configuration with that unusually large amount of swap.
> With no other content in the message.
You could have provided that. Or do it now; we're curious...
Whatever, when I got enough time and brain free and am in the mood, I'll look at the memory/swap management code and try to find out what needs to be patched to implement a swappiness sysctl, or maybe a build option for zero swappiness (probably easier).
sysctl -d vm.swap_idle_{enabled,threshold{1,2}}
vm.swap_idle_enabled: Allow swapout on idle criteria
vm.swap_idle_threshold1: Guaranteed swapped in time for a process
vm.swap_idle_threshold2: Time before a process will be swapped out
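For anyone who wants to experiment with these knobs, a sketch of an /etc/sysctl.conf fragment might look like the following. The threshold values (in seconds) are made-up illustrations only; check sysctl(8) and your system's defaults before copying anything:

```
# Hypothetical /etc/sysctl.conf fragment -- values are illustrative only
vm.swap_idle_enabled=1          # allow swapout based on idle criteria
vm.swap_idle_threshold1=86400   # guarantee a process stays swapped in this long
vm.swap_idle_threshold2=172800  # idle time before a process may be swapped out
```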
> The really short version of the story is that Varnish knows it is not running on the bare metal but under an operating system that provides a virtual-memory-based abstract machine. For example, Varnish does not ignore the fact that memory is virtual; it actively exploits it. A 300-GB backing store, memory mapped on a machine with no more than 16 GB of RAM, is quite typical. The user paid for 64 bits of address space, and I am not afraid to use it.
Given the context you put this quote in, do you understand the article? Because, yes, it is possible to write (special-purpose!) software that actively "exploits" virtual memory. Doing so, it has to make sure swapping in/out is reduced to a minimum. The article describes (among other things) how that is achieved.
Now, this special-purpose software is a high-performance cache for potentially huge amounts of data. Do you want to run that? If not, the conclusion that you could in any way benefit from such a huge swap space is badly flawed.
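For anyone curious what "actively exploiting" virtual memory looks like in miniature, here is a small sketch (Python's mmap module wraps mmap(2); the 1 GiB size is an arbitrary illustration): a sparse backing file much larger than the data actually touched is mapped, and only the touched pages ever consume RAM or disk blocks.

```python
import mmap
import os
import tempfile

# Map a sparse 1 GiB backing file (size chosen arbitrarily for illustration).
# Only the pages we actually touch consume RAM or disk blocks; the rest is
# just address space -- the "64 bits you paid for".
size = 1 << 30
fd, path = tempfile.mkstemp()
os.ftruncate(fd, size)            # sparse: no blocks allocated yet

with mmap.mmap(fd, size) as mm:
    mm[0:5] = b"hello"            # touches only the first page
    mm[size - 5:size] = b"world"  # touches only the last page
    data = bytes(mm[0:5])

os.close(fd)
os.unlink(path)
print(data)  # b'hello'
```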
> sysctl -d vm.swap_idle_{enabled,threshold{1,2}}
The idea might be good: disable idle swapping and set the thresholds to a few days...
EDIT: to work around your issue, you could also reduce your RAM
> The idea might be good: disable idle swapping and set the thresholds to a few days...
The thing is, it's disabled by default. And enabling it should only reduce the priority of idle processes' pages faster, so they can be swapped out earlier. It shouldn't change anything about not swapping out pages unless there's a (foreseeable) need…
> Plus IMHO with these new low-latency NVRAM storage technologies, Optane & NVMe, the BSDs' VM swap implementation could be enhanced to honour a swap device priority to stage swap devices.
Nothing against making swap perform better. But it's impossible to solve a few underlying problems, so (yes, in the absence of a special-purpose program carefully choosing its data structures in a way that minimizes page faults) heavy swapping will always be a performance killer. In fact, the mentioned program's design actually avoids heavy swapping while still allocating huge amounts of virtual memory.
> IMHO you're the perfect candidate to either help with coding or writing tests or review that stuff.
Probably not, cause knowing the theory and having touched some kernel code some time is far from enough to be qualified for such reviews. I'd need to invest a lot of time first to get familiar with the FreeBSD kernel.
> That's a ridiculous amount, completely overkill.
I disagree. It depends on what the application wants to do.
I have 16GB of swap for a machine with 96GB of memory. That's more than enough.
That's by design of the VM system (and not only FreeBSD's): every memory page is logically mapped to some disk space.
Edit:
The maximum swap size was increased recently because of the well-known issue that FreeBSD by default has very high swappiness and likes to swap out literally the whole RAM in some scenarios. For this reason it can be bad if the swap is smaller than RAM.
There are many users who would prefer the traditional simple Unix swap method of only swapping when actually needed (e.g. swappiness = 0, while default FreeBSD behaves like swappiness = 99).
> Your reply is a typical example of the denial of the fact that FreeBSD actually swaps "just because".
> With ARC set to a sensible maximum, say, 1GB, there is still a lot of swapping going on even when, say, 20% of memory is actually used (i.e. not "free memory"). It is just insane when FreeBSD is able to fill a 2GB swap partition when only ~8GB of 48GB RAM has ever been used since boot and is not "free memory".
What are you doing? My desktop has only 8G, no ARC limit, and no swapping. Except when building two llvm in parallel, and then only a few MB.
> You don't help the OP by making up use cases that you think would profit from ridiculous amounts of swap; they won't. Of course, to understand why, you must understand how virtual memory and swap work in general.
On a smaller scale I had these use cases, and they did work.
> Probably not, cause knowing the theory and having touched some kernel code some time is far from enough to be qualified for such reviews.
Pppffhhh... You would be surprised how many beginner's bugs even a noob can catch. You know enough to @least write or outline tests & constraints; the underlying theory did not change for decades, IIUC.
> I'd need to invest a lot of time first to get familiar with the FreeBSD kernel
Not true. VM is VM for decades, and being a noob can even be advantageous, because that noob guy asks nasty questions that the wizard might forget when s/he's in the flow.
> What are you doing? My desktop has only 8G, no ARC limit, and no swapping. Except when building two llvm in parallel, and then only a few MB.
I hate to rebuild my desktop, and so I just sleep the PC instead of starting it up/shutting it down every day.
Only when it has been running for days are a few things moved to swap over time.
> Just leave a few big memory eaters like LibreOffice, several tab-filled Firefox windows and the like idle for a long time, say, overnight.
> Then you can experience the disruptive feeling when your system suddenly starts to swap in gigabytes that the friendly VM swapped out through the night, and you have no idea whether it will become responsive again in seconds, or whether it is better to go smoke one or make a tea, as this swap-in can sometimes take quite a while.
Ah, now it becomes clearer. (I don't leave the desktop on overnight anymore.)
> Whatever, when I got enough time and brain free and am in the mood, I'll look at the memory/swap management code and try to find out what needs to be patched to implement a swappiness sysctl, or maybe a build option for zero swappiness (probably easier).
Wow, have fun with that. (If you make it tunable in both directions, I'm interested.)
> I hate to rebuild my desktop, and so I just sleep the PC instead of starting it up/shutting it down every day.
> So there are quite long uptimes in which swap used and inactive/laundry memory grows to insane amounts.
> I usually reboot only after updates.
Aha. That whole suspend/resume stuff is not 100% sound (very likely also due to broken ACPI BIOSes). I have numerous issues; unfortunately you guys keep posting interesting stuff here & I don't find the time to write half-way qualified bug reports (I also have the pride to @least try to find a fix or workaround)... So would you agree to just reboot once a week?
> Aha. That whole suspend/resume stuff is not 100% sound (very likely also due to broken ACPI BIOSes). I have numerous issues; unfortunately you guys keep posting interesting stuff here & I don't find the time to write half-way qualified bug reports (I also have the pride to @least try to find a fix or workaround)...
Yes, it's sometimes hard to find all the suspend/resume breakers.
> So would you agree to just reboot once a week?
Uh-uh. Question: how often should one update for a reasonably safe machine?
% uptime
7:45PM up 35 days, 5:54, 8 users, load averages: 0.36, 0.66, 0.64
%
> Ok, so we moved away from "ridiculous" and "impossible" to "huge" and "if not"? Progress! ;-)
> I think my use case is as old as the hills. I have a big data set and a program that works by loading it, doing processing for a long time, and then writing the result.
I smell FORTRAN...
> We could re-write the program to use mmap, but that is not a trivial change and will be more expensive than paying $134 for 1 TB of NVMe.
Do you read it all at once? If yes, mmap will not change much. If you re-read it multiple times, a tmpfs may help.
-Cristian
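As a minimal sketch of the mmap route (hypothetical file and record layout; Python's mmap stands in for mmap(2) here): the program walks the mapping page by page instead of read()ing the whole data set, so only the pages currently being processed need to be resident.

```python
import mmap
import os
import tempfile

PAGE = 4096

# Stand-in for a big data set: four "pages" of 0x01 bytes, purely illustrative.
fd, path = tempfile.mkstemp()
os.write(fd, b"\x01" * (PAGE * 4))

total = 0
with mmap.mmap(fd, 0, access=mmap.ACCESS_READ) as mm:
    # Process one page at a time; the kernel pages data in as we go and can
    # evict pages we are done with, instead of the program holding it all.
    for off in range(0, len(mm), PAGE):
        total += mm[off]        # mm[off] is an int (the byte value)

os.close(fd)
os.unlink(path)
print(total)  # 4
```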
> Ok, so we moved away from "ridiculous" and "impossible" to "huge" and "if not"? Progress! ;-)
No: the fact that a single application working in a huge virtual address space can perform acceptably if it is written in a way that is aware of the problem (simplest possible case: "sequential processing") just doesn't invalidate the generic reasoning. Add some other memory-hungry processes to the picture and you're in the "heavy swapping" scenario again. And the use cases made up in this thread still just won't work.
> I think my use case is as old as the hills. I have a big data set and a program that works by loading it, doing processing for a long time, and then writing the result. The problem is the opposite of "embarrassingly parallel", meaning it can't work by processing pieces of the input at a time. We could re-write the program to use mmap, but that is not a trivial change and will be more expensive than paying $134 for 1 TB of NVMe.
TBH, reading the article about the design of that web cache, mmap(2) was the very first thing I had in mind as a means to make something like this work without a very weird system configuration, taking advantage of the 64-bit address space. Just maybe there might be a small performance penalty for having to go through the filesystem (although I don't think it would be relevant; you typically have to test such things to be sure…).