Perform a CPU stress test on the O.S. using make or poudriere or synth ?

ralphbsz

Son of Beastie

Reaction score: 2,189
Messages: 3,146

Compiles will probably not do it; for most programming languages, make is IO-limited, not CPU-limited. With a very good file system and very fast disks (something like NOVA or Strata on top of Optane DIMMs) you might have a chance, but the kind of CPUs that have the IO horsepower to support those storage devices tend to have lots of cores too, so you need a build environment in which you can run lots of parallel make steps, and that's rare.

If you really want to stress test ... you need to figure out what you want to stress. There are simple CPU-intensive benchmarks that are all numeric. On a modern CPU you'll need to run dozens of copies of them to soak all the cores, but that only exercises a small part of the CPU. To make it more intense, add floating-point and vector instructions. Even that is just the CPU. Where it gets interesting is memory-intensive benchmarks, in particular those capable of simultaneously overstressing the internal caches and the memory subsystem itself. Adding the IO subsystem to an intense benchmark is possible too.
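A minimal sketch of the "dozens of all-numeric copies" idea, as a POSIX sh script that spawns one integer busy loop per core (the sysctl fallback value and the iteration count are arbitrary assumptions, not tuned values):

```shell
#!/bin/sh
# Minimal all-numeric CPU soak: one shell busy loop per core.
# hw.ncpu is FreeBSD's core count; fall back to 4 elsewhere.
NCPU=$(sysctl -n hw.ncpu 2>/dev/null || echo 4)
ITERS=100000

burn() {
    i=0
    while [ "$i" -lt "$ITERS" ]; do
        i=$((i + 1))    # pure integer arithmetic, no IO
    done
}

j=0
while [ "$j" -lt "$NCPU" ]; do
    burn &              # one background worker per core
    j=$((j + 1))
done
wait                    # block until every worker finishes
echo "soaked $NCPU cores"
```

As the post says, this only exercises the integer units; a real stress test would add floating-point, vector, and memory traffic on top.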

Anecdote: Many years ago, our son was in a soccer tournament, which must have been played in the winter, because it rained hard and he got soaking wet. Unfortunately, he was his team's goalkeeper, so there was much more wet gear. Fortunately, the place where he was playing was close to my office. So we went to the lab, I turned on a stress-test benchmark that exercises all the disk drives in one disk enclosure (our disk enclosures were large and power-hungry, with 384 drives inside, each consuming about 10 W under full load), and hung his soccer gear in front of the 4U enclosure, using a chair and a few ethernet cables to improvise a drying rack. Worked well.
 
OP
Alain De Vos

Daemon

Reaction score: 548
Messages: 1,893

I've set the number of make jobs to 15 in make.conf
Code:
MAKE_JOBS_NUMBER?=15
And to 12 in poudriere.conf.
Code:
ALLOW_MAKE_JOBS=YES
PARALLEL_JOBS=12
But ps aux shows only 4 make jobs. Maybe it spawns threads, or the build can't be parallelised?
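One way to see what the ports framework will actually pass to make is asking it directly (a sketch; the port path is just the example from this thread and assumes a standard ports tree outside poudriere's jail):

```shell
# Ask the ports framework which job count it resolved for this port
cd /usr/ports/www/qt5-webengine
make -V MAKE_JOBS_NUMBER
```

Inside a poudriere jail the effective value can differ, since poudriere overrides it unless ALLOW_MAKE_JOBS is set.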
 
OP
Alain De Vos

There is only one c++ process running.
Code:
/bin/sh -e -c (cd /wrkdirs/usr/ports/www/qt5-webengine
/usr/local/libexec/ccache/c++
/usr/bin/c++
c++ --version
Code:
FreeBSD clang version 11.0.1 (git@github.com:llvm/llvm-project.git llvmorg-11.0.1-0-g43ff75f2c3fe)
Target: x86_64-unknown-freebsd13.0
Thread model: posix

Is it a limitation of poudriere or did I do something wrong ?
 

Jose

Daemon

Reaction score: 860
Messages: 1,048

Not sure. I had no trouble maxing out all my cores running Poudriere jobs. What's the load average during your runs?
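For reference, the 1-, 5- and 15-minute load averages can be checked like this during a run (portable; the sysctl line is FreeBSD-specific and skipped elsewhere):

```shell
# Print uptime plus the three load averages
uptime
# On FreeBSD, the raw values are also available via sysctl
sysctl -n vm.loadavg 2>/dev/null || true
```

On an 8-thread machine, a load average near 8 during the build means the cores are saturated.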
 
OP
Alain De Vos

50%; it spawned 4 processes, building 4 different ports in parallel, on an 8-core CPU.
In make.conf I have:
MAKE_JOBS_NUMBER=15
In poudriere.conf :
PARALLEL_JOBS=15
PREPARE_PARALLEL_JOBS=23
ALLOW_MAKE_JOBS=YES
 

Zirias

Daemon

Reaction score: 1,342
Messages: 2,363

Compiles will probably not do it; for most programming languages, make is IO-limited, not CPU-limited.
I have some doubts about that from what I observe on my builder with these specs:
  • CPU: Xeon E3-1240L v5 @ 2.10GHz (4 cores, therefore 8 threads)
  • RAM: 64GB
  • Storage: RAID-Z on 4* 4TB SATA-3 HDD(!)
With poudriere running 8 builders, the CPU stays at 100% most of the time. As soon as one of the builders is building one of these huge ports I have listed in ALLOW_MAKE_JOBS_PACKAGES, the CPU stays at 100% constantly. The only thing that can sometimes make it wait for I/O is a very good cache hit rate from ccache.
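For completeness, the selective variant mentioned above looks like this in poudriere.conf (a sketch; the package globs are just examples of typically huge ports, not a recommendation):

```shell
# poudriere.conf: allow parallel make jobs only for selected big ports
ALLOW_MAKE_JOBS_PACKAGES="llvm* rust qt5-webengine"
```

This keeps small ports from over-subscribing the CPU while still letting the giants use all cores.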
 

ralphbsz


Zirias: You have enough RAM, so the intermediate files are probably being kept in the file system cache. And you have a reasonably fast file system, with the speed of 3 hard disks for writes and 4 for reads, and not very many cores. Still, I'm surprised. My C++ compiles are usually IO-limited once you get past a handful of cores. That's why throwing a really fast file system at the problem (either a cluster with lots of disks and lots of RAM to hold temporary files and output, or very fast devices) works better for me.
 

garry

Active Member

Reaction score: 99
Messages: 111

Not sure. I had no trouble maxing out all my cores running Poudriere jobs. What's the load average during your runs?
Same with synth, with it set to use RAM for the build environment and for the workdir. When it decides to build several big packages in parallel with -j4, I see the load hit 13 (on a four-core i5). Strangely, the system is still completely responsive, even in fluxbox/Xorg.
 

Zirias


I see the load hit 13 (on a four-core i5). Strangely, the system is still completely responsive even in fluxbox/Xorg.
Modern CPU schedulers are black magic for sure, but most use adaptive priorities to ensure "fairness". This means scheduling priorities change based on the previous behavior of the process. A very simple (but already somewhat effective) idea is to look at how often a process uses the full time slice it gets: a process using full time slices will be prioritized down. This is true for your typical compiler. In an initial parsing phase, reading all the required input files, it might block quite often waiting for I/O, but once it has everything it needs, the actual compilation just uses CPU time, so it's full time slices for a while. In contrast, the processes driving your desktop UI will almost never use a full time slice, as they spend most of their time waiting for some input (such as a mouse click), blocking until something happens. So this simple heuristic would already lead to the scheduler preferring your "fluxbox" process over the "clang" process ;) There are probably far more elaborate ideas implemented as well.

In my experience, the FreeBSD scheduler is doing a pretty good job. You can help it further by setting a "bias" for the priority of a process with nice(1) – or even a strong offset with either rtprio(1) (to prioritize up) or idprio(1) (to prioritize down). This is rarely needed, but I do use nice -n 20 for my poudriere builds. This way, even loads of 30 to 40 (on a quad-core with hyperthreading) most of the time don't disturb other, more interactive, processes on the same machine.

Edit: note a higher number means a lower priority – so adding a bias of 20 with nice(1) lowers the priority.
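On the command line, the bias described above would look like this (a sketch; the jail name 13amd64 and the package list file are hypothetical):

```shell
# Run the whole bulk build at the lowest regular priority
nice -n 20 poudriere bulk -j 13amd64 -f pkglist.txt

# Or take it further: only run when the CPU would otherwise be idle
idprio 31 poudriere bulk -j 13amd64 -f pkglist.txt
```

With idprio(1), 31 is the lowest idle priority; such a build effectively yields to every interactive process on the machine.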
 
OP
Alain De Vos

With FreeBSD 12, I was able to disturb the interactive responsiveness of my PC during builds of big ports like rust, i.e. YouTube would start to stutter.
With FreeBSD 13 I no longer can; YouTube just continues to play fine.
 

Zirias


With FreeBSD 12, I was able to disturb the interactive responsiveness of my PC during builds of big ports like rust, i.e. YouTube would start to stutter.
With FreeBSD 13 I no longer can; YouTube just continues to play fine.
This is in line with my observations. It seems the scheduler's "intelligence" improved in 13.
 
OP
Alain De Vos

With poudriere you can easily use "32 builders" to build your packages.
 