Occasionally I noticed that the system would not promptly process the tasks I need done, but would instead favour other, long-running tasks. I figured it must be related to the scheduler, and decided it hates me.
A closer look shows the behaviour as follows (single CPU):
Let's run an I/O-active task, e.g. a postgres VACUUM that continuously reads from big files (while doing compute as well [1]):
Code:
pool    alloc   free   read  write   read  write
cache       -      -      -      -      -      -
ada1s4  7.08G  10.9G  1.58K      0  12.9M      0
Now start an endless loop:
# while true; do :; done
And the effect is:
Code:
pool    alloc   free   read  write   read  write
cache       -      -      -      -      -      -
ada1s4  7.08G  10.9G      9      0  76.8K      0
The VACUUM gets almost stuck! This is consistent with the WCPU column in "top":
Code:
  PID USERNAME PRI NICE   SIZE    RES STATE  TIME   WCPU COMMAND
85583 root      99    0  7044K  1944K RUN    1:06 92.21% bash
53005 pgsql     52    0   620M 91856K RUN    5:47  0.50% postgres
Lowering kern.sched.quantum makes it quite a bit better:
Code:
# sysctl kern.sched.quantum=1
kern.sched.quantum: 94488 -> 7874

pool    alloc   free   read  write   read  write
cache       -      -      -      -      -      -
ada1s4  7.08G  10.9G    395      0  3.12M      0

  PID USERNAME PRI NICE   SIZE    RES STATE  TIME   WCPU COMMAND
85583 root      94    0  7044K  1944K RUN    4:13 70.80% bash
53005 pgsql     52    0   276M 91856K RUN    5:52 11.83% postgres
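To put a number on "quite a bit better", the read bandwidth figures from the three samples above can be compared. This is only a small sketch: the helper names to_kb and percent_of are made up for illustration, and it assumes that the last "read" column is bandwidth and that M means 1024K.

```sh
#!/bin/sh
# to_kb VALUE: convert a figure such as 12.9M or 76.8K to kilobytes.
# ASSUMPTION: M means 1024K, as is usual for this kind of output.
to_kb() {
    case "$1" in
        *M) awk -v v="${1%M}" 'BEGIN { printf "%.1f\n", v * 1024 }' ;;
        *K) awk -v v="${1%K}" 'BEGIN { printf "%.1f\n", v }' ;;
        *)  echo "unsupported unit: $1" >&2; return 1 ;;
    esac
}

# percent_of BASE VALUE: VALUE as a percentage of BASE, one decimal place.
percent_of() {
    awk -v base="$(to_kb "$1")" -v val="$(to_kb "$2")" \
        'BEGIN { printf "%.1f\n", 100 * val / base }'
}

# Figures copied from the samples in this post:
# 12.9M = VACUUM alone, 76.8K = with the endless loop and the default
# quantum, 3.12M = with the endless loop and the lowered quantum.
echo "busy loop, default quantum: $(percent_of 12.9M 76.8K)% of baseline"
echo "busy loop, lowered quantum: $(percent_of 12.9M 3.12M)% of baseline"
```

By these figures the VACUUM drops to well under 1% of its undisturbed read rate with the default quantum, and recovers to roughly a quarter of it with the lowered one.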
Now, as usual, the "root-cause" questions arise: What exactly does this "quantum" do? Is this solution a workaround, i.e. is something else actually wrong, and does it have trade-offs in other situations? Or otherwise, why was such a default value chosen, which appears to be ill-conceived?
The docs for the quantum parameter are a bit unsatisfying: they say it's the maximum number of ticks a process gets, but what happens when they're exhausted? If by default the endless loop is actually allowed to continue running for 94k ticks (or 94 ms, more likely) uninterrupted, then that explains the perceived behaviour, but that's certainly not what a scheduler should do when other processes are ready to run.
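The two values printed by sysctl above allow a rough sanity check of the units. This is a minimal sketch with two assumptions taken from the numbers in this post rather than from any documentation: that the value is reported in microseconds, and that 7874 (what the kernel stored after writing 1) corresponds to exactly one scheduler tick. The function names are made up for illustration.

```sh
#!/bin/sh
# Values copied from the sysctl output in this post.
default_quantum=94488   # value reported before the change
min_quantum=7874        # value reported after writing 1

# ASSUMPTION: both values are microseconds and min_quantum is one tick.

# ticks_per_quantum QUANTUM TICK: how many ticks of length TICK fit in QUANTUM.
ticks_per_quantum() {
    echo $(( $1 / $2 ))
}

# tick_rate_hz TICK: ticks per second implied by a tick of TICK microseconds.
tick_rate_hz() {
    echo $(( 1000000 / $1 ))
}

echo "default quantum: $(ticks_per_quantum "$default_quantum" "$min_quantum") ticks"
echo "implied tick rate: $(tick_rate_hz "$min_quantum") Hz"
echo "default quantum: $(( default_quantum / 1000 )) ms (rounded down)"
```

Under these assumptions the default works out to 12 ticks of about 7.9 ms each, i.e. roughly 94 ms. The implied tick rate of about 127 Hz does not match kern.hz=200, which may mean the quantum is counted against a different clock than hz.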
Setup: 11.1-RELEASE-p7, kern.hz=200. Switching tickless mode on or off makes no difference, and neither does starting the endless loop with "nice".
[1] A pure-I/O job without compute load, like "dd", does not show this behaviour. Also, when other tasks are running, the unfair behaviour is not so strongly pronounced.