Solved What is the ideal make jobs formula for multicore systems?

  • Thread starter Deleted member 63539

Deleted member 63539

Guest
My CPU has 4 physical cores. Ninja's default make jobs value is 6, and it caused an out-of-memory problem when the compiler is GCC. By trial and error I found the suitable make jobs value for GCC is 3.

So I came up with this formula: make jobs = physical cores - 1
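
As a sketch (assuming FreeBSD's sysctl(8); kern.smp.cores reporting the physical core count is an assumption, so it falls back to hw.ncpu where the OID is missing):

Bash:
# physical cores - 1, never less than 1
cores=$(sysctl -n kern.smp.cores 2>/dev/null || sysctl -n hw.ncpu)
jobs=$((cores - 1))
[ "$jobs" -lt 1 ] && jobs=1
make -j "$jobs"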

Recently I built mulle-clang to play with Objective-C (I don't like the language much and am about to remove it). I found that the logic the developer used in his install-mulle-clang script is:

Bash:
if [ "${OPTION_PARALLEL}" = "NO" ]
then
   MAKE_FLAGS="-j 1"
else
   local cores
   local loadavg

   # cap the load average at cores - cores/4, i.e. 3/4 of the cores
   cores="`get_core_count`"
   loadavg="`expr $cores / 4`"
   loadavg="`expr $cores - ${loadavg}`"
   if [ "${loadavg}" -gt 0 ]
   then
      # note: -l limits make by load average; it does not set a job count
      MAKE_FLAGS="-l ${loadavg}"
   fi
fi
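
For a 4-core CPU the arithmetic works out like this (a quick sketch of the same computation; get_core_count is the script's own helper, replaced here with a literal):

Bash:
cores=4                            # get_core_count on my machine
loadavg=`expr $cores / 4`          # 4 / 4 = 1
loadavg=`expr $cores - $loadavg`   # 4 - 1 = 3
echo "-l $loadavg"                 # => MAKE_FLAGS="-l 3"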

So the value for my system is 3. Do you think that is the ideal make jobs formula for multicore systems? And what is your own formula? :-/
 
There is none. It also depends on available RAM and on what exactly you are building. Small things should benefit from more jobs, while for large software more jobs can make things worse: more jobs need more memory, and if that makes the build touch swap you lose performance.

In other words, trial and error.
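
A rough sketch of that memory constraint (the 2 GB-per-job figure is an assumption; heavy C++ or LTO jobs can need far more):

Bash:
# allow ~one compile job per 2 GB of RAM, capped at the CPU count
ncpu=$(sysctl -n hw.ncpu)
mem_gb=$(( $(sysctl -n hw.physmem) / 1073741824 ))
jobs=$(( mem_gb / 2 ))
[ "$jobs" -gt "$ncpu" ] && jobs=$ncpu
[ "$jobs" -lt 1 ] && jobs=1
make -j "$jobs"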
 
I think you are right. Trial and error is the way to go.
 
I play a YouTube video. When the sound gets distorted, I reduce the number.
A very practical approach ;) I like that one!

Usually I use MAKE_JOBS=$(( $(sysctl -n hw.ncpu) + 1 )) (sh(1) syntax, not csh(1)). That's fine on a dual core with SMT (4 threads), because many compile tasks on small source files are very fast, and the build might get I/O (disk) bound. It's not good when the application to build involves long compile tasks, e.g. LLVM. Then MAKE_JOBS = number of physical cores is more appropriate.
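
Side by side (kern.smp.cores reporting physical cores is an assumption; it is not exposed on every kernel, hence the fallback):

Bash:
# small, fast compile units: oversubscribe slightly
MAKE_JOBS=$(( $(sysctl -n hw.ncpu) + 1 ))
# long compile tasks (e.g. LLVM): stick to physical cores
MAKE_JOBS=$(sysctl -n kern.smp.cores 2>/dev/null || sysctl -n hw.ncpu)
make -j "$MAKE_JOBS"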
 
It also depends on the I/O system: how good the throughput and latency of the file system are. Compilation is very I/O intensive, as each step reads many files (source and header files, or objects and libraries), does I/O to temporary files, then writes a file. Caching is very important. So you need to find out what the bottleneck is, and make sure that bottleneck is kept completely fed.

On a single-node C++ compile with a good file system and sufficient memory, my rule of thumb used to be 4-8x the number of cores; that pretty much guarantees that there is always at least one runnable process for each core, and the bottleneck (CPU) is kept 100% busy. I actually measured it: 1-2 compiles per core had lower performance, while more than 8 didn't help much. For distributed compiles on a large cluster (with a cluster file system), the bottleneck is the network, and I go down to 1-2 compiles per host (not per core!); any more just creates large queues of jobs on some nodes that clog the file system.
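
For the single-node case that rule is simply (a sketch under the assumptions above: fast file system, enough RAM):

Bash:
# 4x oversubscription keeps every core fed while some jobs wait on I/O
make -j $(( $(sysctl -n hw.ncpu) * 4 ))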

I'm sure there is no single formula; it depends on the source code layout, the fanout of files, what fraction of files are intermediate and need to be written and immediately read, and obviously on memory and I/O hardware. Trial and error is the correct answer, as is monitoring the bottlenecks.
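
On FreeBSD the usual tools make that monitoring easy while a build runs:

Bash:
top -S        # CPU: idle near 0% means you are CPU-bound
gstat -p      # disk: %busy near 100 means you are I/O-bound
swapinfo -h   # any swap in use means too many jobs for your RAM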
 