Other [GELI] encrypted swap unsafe/unstable?

zirias@

Developer
I've had several occasions of a system running into a "livelock" (well, probably a deadlock for a few processes) during "massive" poudriere builds, that is, with a lot of disk I/O, a lot of CPU load and some swap in use. Maybe the problem has been solved in the meantime; I'm not sure about that and will have to observe. So far, this is only background info.

Anyway, while looking for help on IRC, I was also told that encrypted swap could be troublesome, because GELI sometimes needs to allocate memory dynamically, and this could lead to deadlocks when no RAM is available (which sounds very plausible, given that GELI is needed to access swap in the first place).

My question is: is this indeed (and still) an issue? And if so, does anyone know of any efforts to solve it? IMHO, encrypted swap makes sense, and swapon(8) explicitly supports it by automatically creating devices given to it with a .eli suffix, so using this should be safe?
 
For details you probably have to ping the GELI developers on the mailing list.

But: People who do storage / file system development have long known that having to allocate memory in the page-out or swap path will lead to deadlock. I find it hard to imagine that a competent developer would make a mistake such as "having to allocate memory". The problem is that many of the deadlock-avoidance techniques (such as pre-allocated emergency buffers, or background/foreground swap, or single-tracked operation) can be very slow. So it is possible that GELI encrypted swap becomes so slow in a low-memory situation that an outside observer cannot distinguish it from a deadlock.
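To illustrate the "pre-allocated emergency buffers" idea, here is a toy sketch in Python. It is not how GELI works internally (I have not checked its code); the `ReservePool` class and `encrypt_page` function are made-up names, and the XOR stands in for real encryption. Python itself allocates behind the scenes, so this only shows the control flow: all buffers are created up front, and the I/O path waits for a free buffer instead of asking for new memory.

```python
import queue


class ReservePool:
    """Hypothetical fixed pool: every buffer is allocated at start-up."""

    def __init__(self, count, size):
        self._free = queue.Queue()
        for _ in range(count):
            self._free.put(bytearray(size))  # the only place buffers are created

    def acquire(self, timeout=None):
        # Waits for a buffer to be released; raises queue.Empty on timeout.
        return self._free.get(timeout=timeout)

    def release(self, buf):
        self._free.put(buf)

    def available(self):
        return self._free.qsize()


def encrypt_page(pool, page):
    """Toy 'encryption' (XOR with 0x5A) using a buffer from the reserve."""
    buf = pool.acquire()
    try:
        for i, b in enumerate(page):
            buf[i] = b ^ 0x5A
        # A real kernel path would hand buf to the disk; here we copy it out.
        return bytes(buf[:len(page)])
    finally:
        pool.release(buf)
```

With a small pool, every request beyond the pool size has to wait for an earlier one to finish. That cannot deadlock on memory, but it serializes the work, which is the slowness described above.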

My suggestion: In addition to asking on the mailing list, reduce the parallelism, and thereby the memory usage, of poudriere a little bit, and see whether that makes it run FASTER (not slower!) by avoiding the use of swap. Why? Because swap is slow, and the speed gained from higher parallelism may be eaten up by swapping. This is a case where tuning the system for better performance can also make it more reliable (less variability in the performance).
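As a rough starting point for that experiment, a back-of-the-envelope estimate like the following can suggest a job count that should stay out of swap. The function name and all numbers are assumptions for illustration; the per-job memory figure in particular has to come from watching your own builds. The result could then be tried as `PARALLEL_JOBS` in poudriere.conf or with `-J` on the command line.

```python
def max_jobs_without_swap(ram_mib, reserve_mib, per_job_mib, cpu_count):
    """Estimate how many builders fit in RAM (hypothetical helper).

    ram_mib      total physical memory
    reserve_mib  memory kept back for the base system, ZFS ARC, tmpfs, ...
    per_job_mib  assumed peak memory of one builder
    cpu_count    upper bound, since more jobs than CPUs rarely helps
    """
    usable = ram_mib - reserve_mib
    if usable <= 0 or per_job_mib <= 0:
        return 1  # nothing sensible to compute; fall back to a single job
    return max(1, min(cpu_count, usable // per_job_mib))


# Example with made-up numbers: 32 GiB RAM, 4 GiB reserved, 4 GiB per builder.
jobs = max_jobs_without_swap(32768, 4096, 4096, 16)
```

With these example numbers, 28672 MiB are usable and 7 builders fit. If the run with fewer jobs finishes sooner than the run with one job per CPU, swapping was costing more than the extra parallelism gained.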
 
A/B testing should be easy, because temporarily using the swap partition without encryption involves no loss of data (just of the feeling of security ;)).
 
Sure, very easy, if you experience this kind of lockup very rarely (I had it a total of 4 times in 2 years), and (see first post!) don't assume encrypted swap is the problem here.

To make it explicit: I saw the same behavior once with unencrypted swap as well.

I am asking here just for information, whether someone knows some background, whether this can indeed be unsafe.
 
I don't have an answer to your question. But to track the problem down, maybe you can devise a workload/configuration that would more easily provoke the problem. Using a very small encrypted swap and some resource-hungry operation that is more predictable than a large poudriere run could help.
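One way to get such a predictable memory hog is a small script that allocates a fixed amount of memory and writes to every page, so the pages are really in use and have to be paged out under pressure. This is only a sketch: `fill_memory` is a made-up name, the 4096-byte page size is an assumption, and the amount to allocate depends on your RAM and swap size.

```python
def fill_memory(total_mib, chunk_mib=64, page_size=4096):
    """Allocate total_mib MiB in chunks and dirty every page.

    Returns the list of chunks; keep a reference to it for as long as
    the memory should stay allocated.
    """
    chunks = []
    allocated = 0
    while allocated < total_mib:
        size = min(chunk_mib, total_mib - allocated)
        chunk = bytearray(size * 1024 * 1024)
        # Write one byte per page so the memory is actually backed,
        # not just reserved.
        for offset in range(0, len(chunk), page_size):
            chunk[offset] = 1
        chunks.append(chunk)
        allocated += size
    return chunks
```

Calling it with a value somewhat above the free RAM (for example `hold = fill_memory(20000)` on a hypothetical 16 GiB machine) and then starting some disk-heavy job should push the system into swap far more reproducibly than waiting for a large poudriere run to do it.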
 