Poudriere memory exhaustion

ports-mgmt/poudriere-devel is running here on an 8-CPU system with 8 GB RAM and 8 GB of swap on ZFS.
For the past few weeks Poudriere has not completed its run. It makes the system unresponsive at the local console, and only a hard reset brings it back.

The Poudriere log files give no clue about the problem. Poudriere fails every time in the same i386 jail set, but when and how it happens is not predictable. Completing a bulk run of some 600 ports takes 3-5 manual system restarts.

I have seen swap usage peak at about 60%, but I have also seen it fail at lower usage.

https://github.com/freebsd/poudriere/wiki/todo said:
Stability

There is substantial risk that large ports build at once and consume all RAM/swap and cause a OOM or panic. Either need to make the queue wait on these known large ones or monitor the amount of remaining memory and current CPU load and delay builds while high. Note that hidden in this task is reworking the queue to allow delaying builds. This is not possible currently and conflicts with detecting a stuck/deadlocked queue. A more flexible queue would allow retrying fetch failures or failed builds (due to memory constraints).

So the question is: how can Poudriere be configured so that it does not trigger this problem?
Currently it cannot be run unattended here, as it fails every time and leaves the system unusable.
 
To me it looks like the machine is locking up due to memory exhaustion. I see you're running swap on ZFS, which probably isn't the best idea. I believe most people create a swap partition instead to keep this from happening. It's what I do.
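
One way to check the memory-exhaustion theory would be to record swap usage while the bulk run is going. The following is only a sketch: swap_used_pct is a made-up helper name, and the sample input just imitates the column layout swapinfo(8) prints, with an invented device name and numbers.

```shell
#!/bin/sh
# Print the "Capacity" column (without the % sign) from the last line of
# swapinfo(8)-style output read on stdin. With several swap devices the
# last line is the "Total" row.
swap_used_pct() {
    awk 'NR > 1 { pct = $5 } END { sub(/%/, "", pct); print pct }'
}

# Sample input in the layout swapinfo prints; the device name and the
# numbers are made up for illustration.
swap_used_pct <<'EOF'
Device          1K-blocks     Used    Avail Capacity
/dev/ada0p3       8388608  5033164  3355444    60%
EOF
```

On the build host the function would be fed from the real command, for example `swapinfo -k | swap_used_pct` inside a loop that sleeps a few seconds and appends a timestamped line to a log file. After a lockup, the last entries show how full swap was shortly before the machine stopped responding.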
 
Which USE_TMPFS setting are you using?

Code:
# Use tmpfs(5)
# This can be a space-separated list of options:
# wrkdir    - Use tmpfs(5) for port building WRKDIRPREFIX
# data      - Use tmpfs(5) for poudriere cache/temp build data
# localbase - Use tmpfs(5) for LOCALBASE (installing ports for packaging/testing)
# all       - Run the entire build in memory, including builder jails.
# yes       - Only enables tmpfs(5) for wrkdir
# EXAMPLE: USE_TMPFS="wrkdir data"

Do you have any idea which port is causing the memory exhaustion? I've noticed that java/openjdk, for example, can consume huge amounts of memory if you're using the USE_TMPFS option; even without it, it's quite a memory hog.
 
In /usr/local/etc/poudriere.conf I use
Code:
USE_TMPFS="all"

Now I remember having changed that! What setting do you recommend?

Do you use TMPFS_LIMIT or MAX_MEMORY? Maybe those need to be set too?
 
First try without it, so that no tmpfs(5) is used at all. The builds will be slower, but if it makes no difference, that's one variable eliminated.
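
As a sketch of what that could look like in the config file: the first line turns tmpfs off entirely, and the commented-out lines show one possible later step using the TMPFS_LIMIT and MAX_MEMORY options asked about above. Both limits are in GiB and apply to each builder. The numbers are placeholders chosen for illustration, not tested recommendations.

```shell
# /usr/local/etc/poudriere.conf (fragment)

# Step 1: rule tmpfs out entirely.
USE_TMPFS="no"

# Step 2 (alternative, once the box is stable again): bring tmpfs back
# for the work directories only and cap it per builder.
# Placeholder values in GiB, per builder; adjust to the machine.
#USE_TMPFS="wrkdir"
#TMPFS_LIMIT=2
#MAX_MEMORY=3
```

If the lockups stop with tmpfs disabled, that points at the tmpfs-backed builds eating RAM rather than at swap on ZFS alone.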
 
What's wrong with swap on ZFS?
Historically, the reason was ZFS needs some memory to do internal book-keeping and under extreme memory pressure it would have trouble with swap on ZFS. I'm not sure how relevant that is today or if it's changed over the years. Additionally, if you need to do core dumps in the event of a kernel panic the kernel cannot write to a real file system as it could cause corruption. You would need a swap partition to capture the core dump.
 
I just recently got started with Poudriere. Creating a jail kept failing at different parts of the job, or just crashing the machine. The machine is a 4-core AMD 64-bit CPU with 16GB of RAM (and a pretty large ZFS pool). I tuned down the number of parallel builds to 1 and created a 32GB swap file and then it stopped failing. Will start to tune it back up from here.
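
Assuming the knob meant here is PARALLEL_JOBS in poudriere.conf, the starting point described would look roughly like this:

```shell
# /usr/local/etc/poudriere.conf (fragment)

# Start with a single builder, then raise the number step by step
# while watching memory and swap usage.
PARALLEL_JOBS=1
```

The same limit can be given per run with the -J flag of poudriere bulk, which is handy for trying different values without editing the file.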

From the handbook...

The number of processor cores detected is used to define how many builds should run in parallel. Supply enough virtual memory, either with RAM or swap space. If virtual memory runs out, compiling jails will stop and be torn down, resulting in weird error messages.
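
The handbook's point can be turned into a rough calculation: instead of one builder per core, use only as many builders as the available RAM plus swap can carry. The helper below is hypothetical (builders_for_memory is not part of poudriere), and the 4 GiB per-builder budget is an assumption for illustration, not a measured figure.

```shell
#!/bin/sh
# Hypothetical helper, not part of poudriere: pick a builder count that
# fits into the available virtual memory (RAM + swap).
# Usage: builders_for_memory NCPU TOTAL_GIB PER_BUILDER_GIB
builders_for_memory() {
    ncpu=$1
    total_gib=$2
    per_builder_gib=$3

    # Integer division: how many builders fit into memory.
    fit=$(( total_gib / per_builder_gib ))

    # Never go below one builder or above the CPU count.
    if [ "$fit" -lt 1 ]; then
        fit=1
    fi
    if [ "$fit" -gt "$ncpu" ]; then
        fit=$ncpu
    fi
    echo "$fit"
}

# First poster's machine: 8 CPUs, 8 GiB RAM + 8 GiB swap,
# with an assumed budget of 4 GiB per builder.
builders_for_memory 8 16 4
```

With those inputs the result is 4 rather than the default of 8 builders. The value would then go into PARALLEL_JOBS in poudriere.conf; on FreeBSD the CPU count can be read with `sysctl -n hw.ncpu`.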
 