High load (while the system is idle) after replacing HDDs

I have replaced 4 HDDs (bad sectors) with 6 new ones. The replacement went without problems: the system sees the disks and there are no unusual messages in the logs.
No other hardware, software, or configuration was changed. Since the replacement the system load is about 0.95+ when idle.
The mysterious load is in the kernel (~12% according to top). The top 5 processes shown by top:
Code:
242.5H 730.32% [idle]
47.6H 93.21% [kernel]
36:56 0.00% [geom]
34:46 0.00% [intr]
9:00 0.00% [zfskern]

Before the HDD replacement the powerd daemon changed the CPU frequency without problems. Now it is always in turbo mode...
The load powerd calculates (at the highest frequency) is about 94-106%, so powerd cannot lower the frequency.
Why? What should I look for?
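For reference, the load that powerd computes can be watched directly by running it in the foreground (stop the daemon first):
Code:
service powerd stop
powerd -v     # verbose mode: stays in the foreground and prints the measured load and chosen frequency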
 
Slower disks, perhaps? Did you change 7200 RPM disks for 5400 RPM ones?
 
No, I replaced them with 7200 RPM disks.
IMHO disk speed has no connection to high load when nothing is being read from or written to them.
I have no clue what it could be. I am seeing this behavior for the first time...
 
Maybe you put the new disks into existing ZFS pools, and the workload you're seeing is the resilvering of those pools? Try "zpool status" if you are using ZFS.
 
The new disks are in a new pool and there are no errors. No resilvering is needed in any pool...
 
Without knowing your pool layouts (before and after), all we can do is guess. "Replaced 4 HDDs with 6 new ones" and "the new disks are in a new pool" is not enough information.

Try and post at least the full output of zpool status, and describe what changed with the HDD replacement.
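For example, something along these lines (standard zpool subcommands):
Code:
zpool status -v      # per-vdev state, error counters, and any scrub/resilver in progress
zpool list           # size, allocation and health summary per pool
zpool iostat -v 5    # per-vdev I/O every 5 seconds, to see whether any pool is actually busy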
 
No hardware RAID. Plain ZFS on an HBA.
Old pools: 1x HDD not used, 3x HDD in raidz1, 6x HDD in raidz2, 6x HDD in raidz2
New pools: 6x HDD in raidz2, 6x HDD in raidz2, 6x HDD in raidz2

The disks in bold were removed. The new ones form a new pool. The other pools are unchanged.
zpool status shows all pools and disks online. Read 0, write 0, cksum 0 on all disks in all pools.
 
Yes, there is no logical reason I can see for ZFS to be doing resilvering. It might be doing a scrub, but that would be obvious and visible in zpool status. And the CPU time is being used by process [kernel], not by process [zfskern], so it probably isn't ZFS in the first place.

I honestly have no idea. Debugging ideas: look with iostat; maybe whatever process is doing this is also doing I/O which you cannot otherwise explain. And look at top for user processes. Maybe there is a user process that uses "a little bit" of CPU time in user space all the time; that could give us a hint. Look at /var/log and find log files that are growing.
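A rough sketch of those checks, using plain base-system commands:
Code:
iostat -x 5             # extended per-device I/O statistics every 5 seconds
top -S                  # include system (kernel) processes in the display
ls -lt /var/log | head  # most recently modified log files; re-run later and compare sizes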
 
iostat shows no activity on the disks (I have stopped outside communication for the tests).
CPU stats are: kernel 12, idle 88. The same as previously observed in top - no hint there.
There are no user processes. This "magic" is present even with no user activity.
?...
 
Did you change anything else? Or literally JUST put the new drives in? It looks like there were some changes you made to the ZFS set-up (ZFS is not something I know much about), so it was not just a case of plugging in hot spares - there were some ZFS changes (but then the point is - why can't you see what ZFS is doing?)

Reboots?

Did you make any previous configuration changes or updates that might have kicked in after a reboot?

If you boot off a live "CD", does the kernel still show as busy? I would expect it not to, but if it still does, maybe that points to hardware?
 
I know I did not change anything else. Why does nobody believe me? :)
No previous changes (!) to the config, hardware, or software.

I boot from the same disk with the same system.
That is why this case is so strange...

I just took out the old drives and put in the new ones. Of course I had to make some small changes, but only the obvious ones needed to use the new drives.
I destroyed the 3x HDD raidz1 and created a 6x HDD raidz2. Nothing else.

I do see what ZFS is doing. Why do you think I cannot?
Outside communication was stopped for the tests. There is none of the "hidden" activity ralphbsz suggested.

Reboots? Plenty. No changes.
Later, I disabled unused devices to change the IRQ mapping, but it did not change anything. Reverted.
 
You need to start with uname output, dmesg output, rc.conf etc. There's just too little information provided other than "I have a problem, solve it for me!"

Have you looked at systat -vmstat?

From the information provided, the new disks look like a red herring that has nothing at all to do with the CPU usage. Maybe you've got an interrupt storm?

What are your settings for sysctl -a | grep "kern.event"? What cron jobs are running, both system and user?

It could also just be a bug.

Edit: I want to clarify, are we talking load averages here, or what?
 
I have written about something strange I found. All my years of using BSD did not prepare me to solve this mystery.
I have a problem; have you seen something similar? I am asking for suggestions on where to look, because probably nobody has solved this before...
The question is: why did it work flawlessly before?

I do not know why you want the uname output, but here it is: FreeBSD :)
Maybe you were thinking of something else? "uname -a" perhaps? :) FreeBSD 10.0-RELEASE #0 r260789. No, there is no chance to upgrade to 13 for at least a year.

This is not an interrupt storm, because there are no messages about a storm in the logs.
The top stats (for idle) show: CPU: 0.0% user, 0.0% nice, 13.3% system, 0.0% interrupt, 86.7% idle. 0.0% interrupt means no storm here. Am I wrong?
Of course I have looked at systat -vmstat, but it is no help here. The most suspicious entry is acpi, which is my only suspect. Why was it OK earlier?
Code:
8938 total
5427 acpi0 9
2 ehci0 16
3 ehci1 23
881 cpu0:timer
mps0 264
mps1 265
xhci0 266
::::
17 ahci0 277
264 cpu1:timer
178 cpu4:timer
1072 cpu3:timer
485 cpu6:timer
150 cpu2:timer
197 cpu7:timer
255 cpu5:timer

rc.conf:
Code:
ifconfig_igb0="inet XXX.XXX.XXX.XXX netmask XXX.XXX.XXX.XXX description LAN0"
defaultrouter="XXX.XXX.XXX.XXX"
sshd_enable="YES"
powerd_enable="YES"
powerd_flag="-a adp"
dumpdev="NO"
update_motd="NO"
zfs_enable="YES"
pf_enable="YES"
openntpd_enable="YES"
openntpd_flags="-s"
samba_enable="YES"
syslogd_flags="-4 -ss"
moused_enable="NO"
moused_ums0_enable="NO"
moused_ums1_enable="NO"
performance_cx_lowest="C2"
economy_cx_lowest="C2"
rpcbind_enable="YES"
nfs_server_enable="YES"
mountd_flags=""
Setting performance_cx_lowest="Cmax" or economy_cx_lowest="Cmax" did not change anything.
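For reference, C-state support and actual usage can also be checked directly with sysctl, e.g.:
Code:
sysctl hw.acpi.cpu.cx_lowest
sysctl dev.cpu.0.cx_supported dev.cpu.0.cx_usage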

sysctl -a | grep "kern.event"
Code:
kern.eventtimer.et.LAPIC.flags: 7
kern.eventtimer.et.LAPIC.frequency: 50001119
kern.eventtimer.et.LAPIC.quality: 600
kern.eventtimer.et.HPET.flags: 7
kern.eventtimer.et.HPET.frequency: 14318180
kern.eventtimer.et.HPET.quality: 550
kern.eventtimer.et.RTC.flags: 17
kern.eventtimer.et.RTC.frequency: 32768
kern.eventtimer.et.RTC.quality: 0
kern.eventtimer.et.i8254.flags: 1
kern.eventtimer.et.i8254.frequency: 1193182
kern.eventtimer.et.i8254.quality: 100
kern.eventtimer.choice: LAPIC(600) HPET(550) i8254(100) RTC(0)
kern.eventtimer.singlemul: 2
kern.eventtimer.idletick: 0
kern.eventtimer.timer: LAPIC
kern.eventtimer.periodic: 0

Timecounter "TSC-low" frequency 1650036800 Hz quality 1000

System crontab is the only one used.
Code:
*/5     *       *       *       *       root    /usr/libexec/atrun
11      11      *       *       *       operator /usr/libexec/save-entropy
0       *       *       *       *       root    newsyslog
1       1       *       *       6       root    periodic daily
15      2       *       *       6       root    periodic weekly
30      3       1       *       *       root    periodic monthly
1,31    0-5     *       *       *       root    adjkerntz -a

This is simple storage. There are no CPU-time-consuming services.

Any ideas about what went wrong?
Maybe someone has had a similar issue?
 
You still haven't stated what load you are talking about?
FreeBSD 10 is as old as my granny & she can't take much load either.
 
You did not read the first message. Read the first message...
Everything is clearly described there.

The fact that the system is old does not change that it was working and stopped working after replacing only the drives.
The calculated load is high and is preventing powerd from lowering the frequency.
This was working perfectly before.
 
So, as I understand, the system is consuming CPU, and we do not know why? We do know however that the load is accumulated on PID 0 (the kernel). Correct?

So the next step of in-vivo analysis is to see what piece of the kernel is consuming the load: ps axH.
This gives the processes separated into their individual threads, plus the accumulated compute time - and some of these times increase over time. There are a bunch of threads on PID 0 (I don't remember exactly how that looked on Rel.10, it has changed a lot over time), so just compare the figures with those taken a minute later.
(There is also an option to see these in top - but I won't search the manpage for how that worked in Rel.10.)
In any case we should get the name of a thread - and that should give a clue as to which subsystem is eating the compute time.
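A minimal sketch of both (exact flags and output may differ slightly on Rel.10):
Code:
ps axHww | grep '\[kernel'   # one line per PID 0 thread, with the thread name shown in braces
top -SH                      # live view: -S includes system processes, -H splits them into threads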
 
FreeBSD 10.0-RELEASE #0 r260789. No, there is no chance to upgrade to 13 for at least a year.
You don't need to upgrade to 13.0, 12.2 is also supported. You could update to 11.4 too but I won't recommend that as it will be EoL at the end of September. Still a viable quick upgrade though, even if you have to do another upgrade to 12.2 soon after it. Although I believe by that time 12.3 might come out. The fact remains, as others have noted, that 10.0 is old (support ended in 2015).
The fact that the system is old does not change that it was working and stopped working after replacing only the drives.
The act of replacing a disk may have triggered a bug or issue that has long since been fixed. There's been a lot of developments in the past 6 years. You can't just dismiss that.
 
Even if you want to stick to an unsupported release, 10.4 is still a better shot at getting things working properly than 10.0....

Seriously though, upgrade to 12.2 or 13.0. It's well worth the time.
 
You did not read the first message. Read the first message...
Everything is clearly described there.

The fact that the system is old does not change that it was working and stopped working after replacing only the drives.
The calculated load is high and is preventing powerd from lowering the frequency.
This was working perfectly before.
And still you have not explained what the load is.
Are you referring to top's output, such as this:
Code:
last pid: 86750;  load averages:  0.15,  0.07,  0.04
If so, then you need to learn about load and how FreeBSD operates. I've had loads up around 300 and still got a responsive machine.
 
The top stats (for idle) show: CPU: 0.0% user, 0.0% nice, 13.3% system, 0.0% interrupt, 86.7% idle. 0.0% interrupt means no storm here. Am I wrong?
Could be. Not clear. The problem here is that interrupt processing is so darn efficient. I'm looking at my home router, which acts as a NAT box and firewall, and even though there is heavy network traffic (probably 5 MByte/s; two housemates are watching videos over the network and saturating our DSL), the interrupt rate is at 0.0%, in spite of the fact that thousands of packets per second are being routed. I've been glancing at top for several minutes now, and I have yet to see it climb to 0.1%. At the same time, my system fraction of the CPU seems to be 2-3% typically, with jumps to 8% a few times (I have quite a few processes that do something every few seconds or once a minute).

So interrupt time being 0.0% doesn't prove much; even significant interrupt activity might not register as 0.1%.
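The actual per-source counts are easier to read from vmstat than from top's rounded percentages; a quick sketch:
Code:
vmstat -i                          # total count and average rate per interrupt source since boot
vmstat -i; sleep 10; vmstat -i     # compare the totals over a fixed interval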

The fact that the system is old does not change that it was working and stopped working after replacing only the drives.
But it is quite likely that this old system has bugs which have been fixed long ago. And that people may not even remember. I don't know exactly when I upgraded to 11, but it was several years ago.

So the next step of in-vivo analysis is to see what piece of the kernel is consuming the load: ps axH.
THIS!

To debug this, you need to break it down further. Something in the kernel (process ID 0) is using CPU time, but we don't know what, or why, or what monsters are hidden in this old kernel. All we know is that it started after a disk replacement, but that might be a red herring.
 
So, as I understand, the system is consuming CPU, and we do not know why? We do know however that the load is accumulated on PID 0 (the kernel). Correct?
But be careful. The load averages you see involve some POTENTIAL load on the CPU, not all of it actual. In other words, the CPU has a queue of runnable processes plus those already running. Some of the runnable processes (probably a lot?) are waiting on interrupts, so they don't consume CPU resources at all; they're sleeping.

High CONSISTENT (and that means in the 15 minute zone) loads means you have a problem. High count in the 1 minute zone just means you've got a lot of running/runnable processes (threads actually) waiting to complete. If they're higher in the 5 minute zone than the 1 minute zone, then you're building up to a problem.

The scheduler uses this load average to schedule processes/threads. See kern_synch.c and sched_ule.c.
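(The same averages are exported to userland and can be read directly, e.g.:)
Code:
sysctl vm.loadavg    # the 1, 5 and 15 minute load averages
uptime               # same figures, plus uptime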

The only time I can recall seeing high (over 200) in the 15 minute zone was when a SCSI disk array pack decided to die.
 