Troubleshoot of "a thread waited too long to allocate a page"

epopen · Jan 31, 2023

Hi All.

I got a system problem since FreeBSD 13 as follows (Prior 12, I have no impression about the issue).

Code:

Jan 30 11:22:47 test kernel: [116123] pid 50340 (mysqld), jid 1, uid 88, was killed: a thread waited too long to allocate a page
Jan 30 11:22:47 test kernel: [116134] pid 32665 (php-fpm), jid 2, uid 80, was killed: a thread waited too long to allocate a page
...
Jan 30 11:39:07 test kernel: [117080] swap_pager: cannot allocate bio
...
Jan 30 14:06:11 test kernel: [125979] watchdog pre-timeout, WD_SOFT_LOG
Jan 30 14:06:11 test kernel: [125979] watchdog pre-timeout, WD_SOFT_PRINTF
Jan 30 19:30:33 test kernel: [127986] pid 61858 (pickup), jid 3, uid 125, was killed: failed to reclaim memory

System freeze until hardware reset.

Version of FreeBSD.

Code:

FreeBSD epopen.com 13.1-RELEASE-p3 FreeBSD 13.1-RELEASE-p3 releng/13.1-n250169-c3c13035ef2 EpopenKernel_13_1-RELEASE_2022-05-08 amd64

My system installed physical memory size: 8GB and SWAP size: 2GB
Using (root on) ZFS and ARC

And monitor system resource continue and trying to find out which one exhaust memory by top as follows

Code:

last pid: 13312;  load averages:  0.26,  1.02,  1.00                                                                                                                               up 0+14:53:11  10:23:30
102 processes: 1 running, 100 sleeping, 1 zombie
CPU:  0.8% user,  1.5% nice,  0.7% system,  0.8% interrupt, 96.3% idle
Mem: 228M Active, 1532M Inact, 964M Wired, 40K Buf, 5113M Free
ARC: 214M Total, 68M MFU, 83M MRU, 163K Anon, 2216K Header, 61M Other
     58M Compressed, 197M Uncompressed, 3.38:1 Ratio
Swap: 2048M Total, 2048M Free

  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
   49 root         52  20    0   103M    53M select   0   2:53   0.00% python3.9
55271 mysql        57  20    0   926M   612M select   1   2:15   0.00% mysqld
   30  11211       10  20    0    79M    32M kqread   1   0:15   0.00% memcached
23190 www           1  24    0   232M    73M accept   0   0:11   0.00% php-fpm
   32 www          54  20    0    97M    34M piperd   0   0:09   0.00% httpd
....

After google, I got some information about tuning system variable, but I don't understand, therefore known current value list as follows.

Code:

vfs.zfs.arc.max: 268435456
vm.pfault_oom_attempts: 3
vm.pageout_oom_seq: 12
vm.pfault_oom_wait: 10

Can help me debug/suggestion/tip for troubleshoot?

Thanks very much.

richardtoohey2 · Jan 31, 2023

MySQL can chew through memory - so that might be the culprit.

I'd make a script that appends top's output every minute to a log file and see if you can find the culprit after the crash. Or just keep top up and running and see where there seems to be any pressure.

For a large and busy MySQL database on top of ZFS then 8GB of RAM might not be enough without tuning - especially if you are also running a web server with PHP as well. And other daemons like memcached. But if all of those aren't very busy and the database is small, then might be something else.

epopen · Jan 31, 2023

richardtoohey2 said:
MySQL can chew through memory - so that might be the culprit.

I'd make a script that appends top's output every minute to a log file and see if you can find the culprit after the crash. Or just keep top up and running and see where there seems to be any pressure.

For a large and busy MySQL database on top of ZFS then 8GB of RAM might not be enough without tuning - especially if you are also running a web server with PHP as well. And other daemons like memcached. But if all of those aren't very busy and the database is small, then might be something else.

Thanks you.

My MySQL aren't busy and I had tried to configure MySQL to using minimum amount of memory before several year and work fine with FreeBSD 12 and earlier version.
Therefore I trying to find the culprit in entire system.

Behavior of kill process is keep system running I think.
But actual result is system freeze, it is not make sense.

richardtoohey2 · Jan 31, 2023

Yes, you are right - hard freeze would be unlikely for out-of-memory issue.

Is this new hardware or old? Is it the same machine you used for 12.x?

If it was working and has stopped, what has changed?

Possibly more likely hardware? Drives all OK?

epopen · Jan 31, 2023

richardtoohey2 said:
Yes, you are right - hard freeze would be unlikely for out-of-memory issue.

Is this new hardware or old? Is it the same machine you used for 12.x?

If it was working and has stopped, what has changed?

Possibly more likely hardware? Drives all OK?

Hardware never change...
New version of the software(e.g: MySQL and other) consumes more memory than before maybe.

Monitor continue

SirDice · Jan 31, 2023

I'd check the drive(s) with sysutils/smartmontools. Even if it's just to rule them out as a possible cause. Run smartctl -a /dev/da0 (may need to adjust the device name). It might be much to copy/paste, so smartctl -a /dev/da0 | nc termbin.com 9999 would upload the output to termbin.

Erichans · Jan 31, 2023

epopen said:
[...] System freeze until hardware reset.
My system installed physical memory size: 8GB and SWAP size: 2GB
Using (root on) ZFS and ARC

What is your ZFS setup? Using mirrors I presume?

epopen said:
[...] Behavior of kill process is keep system running I think.
But actual result is system freeze, it is not make sense.

When the system freezes, can you access it remotely (ssh)?

When the system looks as being in a freezed state, a severely memory starved ZFS system (also with a relative small swap space of 2GB) may take a while to get responsive again.
How much time (approximately) is there between when you notice the system freeze and you do a hardware reset?

Alain De Vos · Jan 31, 2023

5G free memory. That's alot of free memory.

richardtoohey2 · Jan 31, 2023

I think that's after the reboot - so it doesn't show much.

cracauer@ · Jan 31, 2023

Why only 2 GB swapspace when running mysql and memcached?

I mean the error message is not the OOM killer, but the log reports just fit a full swapspace very well.

Alain De Vos · Jan 31, 2023

I have 64G of swap-space.

epopen · Feb 1, 2023

SirDice said:
I'd check the drive(s) with sysutils/smartmontools. Even if it's just to rule them out as a possible cause. Run smartctl -a /dev/da0 (may need to adjust the device name). It might be much to copy/paste, so smartctl -a /dev/da0 | nc termbin.com 9999 would upload the output to termbin.

Hi SirDice

Please reference https://termbin.com/n2al

I'm surprise about attribute of Power_Cycle_Count and Power-Off_Retract_Count
Because the driver using for server and 24/7 power-up, power cycle times is low, will be damage?

System not use smartctl frequency due kern_securelevel="2"

EDIT:
Follows richardtoohey2 information, attribute value is weird,
I regenerate report by smartctl -x ... (Prints all SMART and non-SMART information about the device.)
Please reference new report https://termbin.com/8cvj7

Thanks a lot.
Neko

richardtoohey2 · Feb 1, 2023

Intel 330s - looks like they were released 10 years ago - how old are the drive(s)?

Is this a custom kernel?

epopen · Feb 1, 2023

Erichans said:
What is your ZFS setup? Using mirrors I presume?

When the system freezes, can you access it remotely (ssh)?

When the system looks as being in a freezed state, a severely memory starved ZFS system (also with a relative small swap space of 2GB) may take a while to get responsive again.
How much time (approximately) is there between when you notice the system freeze and you do a hardware reset?

What is your ZFS setup? Using mirrors I presume?
>> Only one driver.

When the system freezes, can you access it remotely (ssh)?
>> I can not access remotely, therefore ask my brother to press the reset button

When the system looks as being in a freezed state, a severely memory starved ZFS system (also with a relative small swap space of 2GB) may take a while to get responsive again.
How much time (approximately) is there between when you notice the system freeze and you do a hardware reset?
>> Within the first few hours could be ping my server, but later could not(no reply), after 12 hours(approximately) of continue ping fail, ask my brother help to hard reset.

Thanks a lot.
Neko

epopen · Feb 1, 2023

richardtoohey2 said:
Intel 330s - looks like they were released 10 years ago - how old are the drive(s)?

Is this a custom kernel?

It bought at 2014, but storage(no power) 2~3 years I remember.
With mentioned Power_Cycle_Count and Power-Off_Retract_Count, the Power_On_Hours_and_Msec can not believe

Yes, I using custom kernel (device ada included).

Thanks a lot.

epopen · Feb 1, 2023

Alain De Vos said:
5G free memory. That's alot of free memory.

Yes, but do not know amount of free memory during problem occurring.
Keep going monitor.

Thanks a lot.

epopen · Feb 1, 2023

Alain De Vos said:
I have 64G of swap-space.

It is higher swap space

In my case, my server does not busy, almost personal use.
Therefore my swap-space free almost.

Thanks a lot.

richardtoohey2 · Feb 1, 2023

There was a bug in a similar IBM SSD to do with the power-on-time looking odd - so maybe similar for your SSD: https://www.smartmontools.org/ticket/289

epopen · Feb 1, 2023

cracauer@ said:
Why only 2 GB swapspace when running mysql and memcached?

I mean the error message is not the OOM killer, but the log reports just fit a full swapspace very well.

Ans: 2 GB swapspace is not use almost (Configuration of all daemon from 256MB RAM machine and few modified to fit higher memory machine)

Yes, I think so full swapspace problem, It caused by problem software (memory leak?) possible.
Therefore monitoring to find which one causes.

Thanks a lot.

epopen · Feb 1, 2023

richardtoohey2 said:
There was a bug in a similar IBM SSD to do with the power-on-time looking odd - so maybe similar for your SSD: https://www.smartmontools.org/ticket/289

Follows your information, I regenerate report by smartctl -x ... (Prints all SMART and non-SMART information about the device.)
Please reference https://termbin.com/8cvj7

Thanks a lot.

epopen · Feb 25, 2023

Hi All

I tried to capture result of top per minute and noteworthy point found as follows.

2023/2/25 Sat 03:15:34 CST
last pid: 48; load averages: 1.16, 1.85, 1.28 up 23+16:03:05 03:15:34
128 processes: 1 running, 119 sleeping, 3 stopped, 5 zombie
CPU: 0.8% user, 0.5% nice, 0.9% system, 0.7% interrupt, 97.2% idle
Mem: 247M Active, 1499M Inact, 100M Laundry, 3790M Wired, 104K Buf, 2206M Free
ARC: 348M Total, 119M MFU, 65M MRU, 4832K Anon, 2624K Header, 157M Other
39M Compressed, 216M Uncompressed, 5.52:1 Ratio
Swap: 2048M Total, 625M Used, 1423M Free, 30% Inuse

PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
39131 mysql 57 20 0 1132M 510M select 1 86:41 0.00% /usr/local/libexec/mysqld --defaults-extra-file=/usr/local/etc/mysql/my.cnf --basedir=/usr/local --datadir=/home/sqldb/data --plugin-dir=/usr/local/lib/mysql/plugin -
-user=mysql --log-error=sql.epopen.com.err --open-files-limit=12K --pid-file=/var/run/mysql-server/pid --socket=/tmp/mysql.sock --port=3306
36 www 1 22 0 232M 70M accept 1 1:35 0.00% php-fpm: pool neko_push_down (php-fpm)
37 www 1 25 0 232M 66M accept 1 0:24 0.00% php-fpm: pool neko_push_down (php-fpm)
92231 www 1 20 0 232M 65M accept 1 0:11 0.00% php-fpm: pool neko_push_down (php-fpm)
96204 root 52 20 0 106M 44M select 0 98:16 0.00% /usr/local/bin/python3.9 /usr/local/bin/fail2ban-server --async -b -s /var/run/fail2ban.sock -p /var/run/fail2ban.pid --loglevel INFO --logtarget /var/log/fail2ban.lo
g --syslogsocket auto
57515 root 21 31 0 278M 42M uwait 0 0:02 0.00% /usr/local/libexec/git-core/git merge --ff-only FETCH_HEAD
2023/2/25 Sat 03:16:35 CST
last pid: 78361; load averages: 1.06, 1.69, 1.25 up 23+16:04:06 03:16:35
125 processes: 1 running, 116 sleeping, 3 stopped, 5 zombie
CPU: 0.8% user, 0.5% nice, 0.9% system, 0.7% interrupt, 97.2% idle
Mem: 235M Active, 1500M Inact, 100M Laundry, 5832M Wired, 104K Buf, 176M Free
ARC: 375M Total, 143M MFU, 52M MRU, 4807K Anon, 2778K Header, 173M Other
41M Compressed, 243M Uncompressed, 5.86:1 Ratio
Swap: 2048M Total, 625M Used, 1423M Free, 30% Inuse

PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
39131 mysql 57 20 0 1132M 510M select 1 86:41 0.00% /usr/local/libexec/mysqld --defaults-extra-file=/usr/local/etc/mysql/my.cnf --basedir=/usr/local --datadir=/home/sqldb/data --plugin-dir=/usr/local/lib/mysql/plugin -
-user=mysql --log-error=sql.epopen.com.err --open-files-limit=12K --pid-file=/var/run/mysql-server/pid --socket=/tmp/mysql.sock --port=3306
36 www 1 22 0 232M 70M accept 1 1:35 0.00% php-fpm: pool neko_push_down (php-fpm)
37 www 1 25 0 232M 66M accept 1 0:24 0.00% php-fpm: pool neko_push_down (php-fpm)
92231 www 1 20 0 232M 65M accept 1 0:11 0.00% php-fpm: pool neko_push_down (php-fpm)
96204 root 52 20 0 106M 44M select 1 98:16 0.00% /usr/local/bin/python3.9 /usr/local/bin/fail2ban-server --async -b -s /var/run/fail2ban.sock -p /var/run/fail2ban.pid --loglevel INFO --logtarget /var/log/fail2ban.lo
g --syslogsocket auto
57515 root 19 31 0 278M 42M uwait 0 0:03 0.00% /usr/local/libexec/git-core/git merge --ff-only FETCH_HEAD
2023/2/25 Sat 03:17:38: CST
last pid: 48; load averages: 11.30, 3.75, 2.01 up 23+16:05:13 03:17:42
125 processes: 6 running, 111 sleeping, 3 stopped, 5 zombie
CPU: 0.8% user, 0.5% nice, 0.9% system, 0.7% interrupt, 97.2% idle
Mem: 18M Active, 35M Inact, 68M Laundry, 1245M Wired, 104K Buf, 6475M Free
ARC: 219M Total, 151M MFU, 40M MRU, 4305K Anon, 8139K Header, 17M Other
151M Compressed, 2801M Uncompressed, 18.58:1 Ratio
Swap: 2048M Total, 1108M Used, 940M Free, 54% Inuse

PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
57515 root 1 20 0 278M 32M RUN 0 0:03 0.00% /usr/local/libexec/git-core/git merge --ff-only FETCH_HEAD
96204 root 52 20 0 106M 23M select 1 98:16 0.00% /usr/local/bin/python3.9 /usr/local/bin/fail2ban-server --async -b -s /var/run/fail2ban.sock -p /var/run/fail2ban.pid --loglevel INFO --logtarget /var/log/fail2ban.lo
g --syslogsocket auto
39131 mysql 57 20 0 1132M 19M select 1 86:41 0.00% /usr/local/libexec/mysqld --defaults-extra-file=/usr/local/etc/mysql/my.cnf --basedir=/usr/local --datadir=/home/sqldb/data --plugin-dir=/usr/local/lib/mysql/plugin -
-user=mysql --log-error=sql.epopen.com.err --open-files-limit=12K --pid-file=/var/run/mysql-server/pid --socket=/tmp/mysql.sock --port=3306
29 root 1 -52 r0 13M 13M nanslp 1 0:05 0.00% /usr/sbin/watchdogd -t 180 --pretimeout 90 --softtimeout --pretimeout-action log,printf --softtimeout-action log,panic,printf
36413 nobody 1 20 0 43M 6068K RUN 0 1:45 0.00% postfwd3:olicy (perl)
63872 bind 6 52 0 66M 4128K sigwai 1 0:28 0.00% /usr/local/sbin/named -u bind -c /usr/local/etc/namedb/named.conf

Item 1 -> 2: 3790M Wired -> 5832M Wired, therefore 2206M Free -> 176M Free
Item 2 -> 3: 30% Inuse of Swap-> 54% Inuse of Swap.

Daemon using memory same, Wired using memory increase.
But Wired using for system, no idea which one using.

Thanks a lot.
Neko

rowan194 · Apr 12, 2024

Was this ever resolved?

I am having the same issue after upgrading from 12.4 to 13.3

User processes which were memory hungry, but worked, were fine on the former install; they're now being killed off with errors "failed to reclaim memory" and "a thread waited too long to allocate a page".

Occasionally other unrelated/system processes, like getty, or even init, are also killed, leaving the system in an unstable state.

There's 8GB of swap, and no swap errors logged, so it appears it's not just simple OOM.

Did not have this issue until after the upgrade.

Edit: One idea I've had is mmap, as there's multiple processes accessing a couple of read-only files. Has any behaviour, such as default flags, changed there? Is it possible that for some reason the data is being cached individually rather than collectively? Or perhaps, given the more specific memory errors rather than just being absolutely OOM, it's unrelated.

Code:

      bitvector_mptr = mmap(NULL, bitvector_filesize, PROT_READ, MAP_PRIVATE | MAP_NOCORE | MAP_PREFAULT_READ, bitvector_fd, 0);
...
      mptr = mmap(NULL, hashindex_filesize, PROT_READ, MAP_PRIVATE | MAP_NOCORE, fd, 0);

rowan194 · Apr 13, 2024

Absolute carnage here... 30 mins after a reboot the system has killed off all applications, as well as the "screens" they run under, and for some reason it also killed unbound.

This never happened under 12.x. The server is currently unusable. I'm not sure what to do next...

Code:

Apr 14 00:06:29 softdev kernel: pid 9238 (<application_1>), jid 0, uid 0, was killed: a thread waited too long to allocate a page
Apr 14 00:11:48 softdev kernel: pid 10030 (<application_1>), jid 0, uid 0, was killed: a thread waited too long to allocate a page
Apr 14 00:12:29 softdev kernel: pid 9653 (<application_2>), jid 0, uid 0, was killed: a thread waited too long to allocate a page
Apr 14 00:12:59 softdev kernel: pid 9654 (<application_1>), jid 0, uid 0, was killed: a thread waited too long to allocate a page
Apr 14 00:27:16 softdev kernel: pid 10106 (sort), jid 0, uid 0, was killed: a thread waited too long to allocate a page
Apr 14 00:27:16 softdev kernel: pid 3321 (php7), jid 0, uid 0, was killed: a thread waited too long to allocate a page
Apr 14 00:27:16 softdev kernel: pid 3281 (php7), jid 0, uid 0, was killed: a thread waited too long to allocate a page
Apr 14 00:27:16 softdev kernel: pid 10088 (php7), jid 0, uid 0, was killed: a thread waited too long to allocate a page
Apr 14 00:27:16 softdev kernel: pid 10084 (php7), jid 0, uid 0, was killed: a thread waited too long to allocate a page
Apr 14 00:27:16 softdev kernel: pid 10083 (php7), jid 0, uid 0, was killed: a thread waited too long to allocate a page
Apr 14 00:27:16 softdev kernel: pid 10082 (php7), jid 0, uid 0, was killed: a thread waited too long to allocate a page
Apr 14 00:27:16 softdev kernel: pid 10091 (php7), jid 0, uid 0, was killed: a thread waited too long to allocate a page
Apr 14 00:27:16 softdev kernel: pid 10089 (php7), jid 0, uid 0, was killed: a thread waited too long to allocate a page
Apr 14 00:27:16 softdev kernel: pid 622 (local-unbound), jid 0, uid 59, was killed: a thread waited too long to allocate a page
Apr 14 00:27:16 softdev kernel: pid 3330 (screen-4.9.1), jid 0, uid 0, was killed: a thread waited too long to allocate a page
Apr 14 00:27:16 softdev kernel: pid 3314 (screen-4.9.1), jid 0, uid 0, was killed: a thread waited too long to allocate a page
Apr 14 00:27:16 softdev kernel: pid 3277 (screen-4.9.1), jid 0, uid 0, was killed: a thread waited too long to allocate a page

richardtoohey2 · Apr 13, 2024

See if you can get back to 12.x (but it’s EOL now).

Could try 14.0 but might just cause more issues,

Once you are back on 12.x set up mirror test environment where you can upgrade to 13.x and see if you can find the culprit.

What do you mean by “no swap errors”? No swap usage or something else?

epopen · 2024-04-30T07:26:06+0100

The issue disappeared in FreeBSD 14-Release.

Troubleshoot of "a thread waited too long to allocate a page"

Administrator