ASX
Guest
Hello,
I've modified the 'mkuzip' utility to make use of multithreading where possible, essentially parallelizing the compression tasks.
The problem I ran into is that the compressed uzip file is a sequence of compressed blocks, and that sequence must remain unchanged, which limits the level of parallelization that can be achieved.
I used a relatively simple approach: read and compress N blocks in parallel (where N is the number of simultaneous threads), wait until all N threads have completed, then write the N blocks preserving the sequence, and loop around.
That approach prevents full use of the CPU cores, since every thread must wait at the batch barrier for the slowest block, but it is still a good performance improvement over the single-threaded version.
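For illustration, here is a minimal sketch of the batching scheme, assuming zlib and POSIX threads. It is not the actual mkuzip code: the uzip header and the table of block offsets are omitted, and error handling is reduced to the bare minimum.
Code:
/*
 * Sketch of the batched approach: read up to N blocks, compress each
 * one in its own thread, join all threads, then write the results in
 * the original order.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <zlib.h>

#define NTHREADS	4
#define BLKSZ		65536

struct job {
	unsigned char	in[BLKSZ];
	unsigned char	out[BLKSZ + BLKSZ / 16 + 64 + 3]; /* upper bound */
	size_t		insz;
	uLongf		outsz;
};

static void *
compress_block(void *arg)
{
	struct job *j = arg;

	j->outsz = sizeof(j->out);
	if (compress2(j->out, &j->outsz, j->in, j->insz,
	    Z_BEST_COMPRESSION) != Z_OK)
		exit(1);
	return (NULL);
}

int
main(int argc, char *argv[])
{
	struct job jobs[NTHREADS];
	pthread_t tid[NTHREADS];
	FILE *in, *out;
	int i, n;

	if (argc != 3 || (in = fopen(argv[1], "rb")) == NULL ||
	    (out = fopen(argv[2], "wb")) == NULL)
		exit(1);
	for (;;) {
		/* Read the next batch of up to N blocks. */
		for (n = 0; n < NTHREADS; n++)
			if ((jobs[n].insz = fread(jobs[n].in, 1, BLKSZ,
			    in)) == 0)
				break;
		if (n == 0)
			break;
		/* Compress the whole batch in parallel. */
		for (i = 0; i < n; i++)
			pthread_create(&tid[i], NULL, compress_block,
			    &jobs[i]);
		/* Barrier: wait for every block of the batch. */
		for (i = 0; i < n; i++)
			pthread_join(tid[i], NULL);
		/* Write the batch preserving the block sequence. */
		for (i = 0; i < n; i++)
			fwrite(jobs[i].out, 1, jobs[i].outsz, out);
		if (n < NTHREADS)	/* short read: end of input */
			break;
	}
	fclose(in);
	fclose(out);
	return (0);
}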
On a Core i3 (dual core + HT) the speed gain is around 2.5x-3x, and on CPUs with more cores the gain is more noticeable.
On single-core CPUs, on the other hand, the threaded implementation has practically no effect.
The compressed files will be (and in my tests are) bit-for-bit identical to those produced by the original version.
Of course, other strategies could have been implemented, but they would have been significantly more complex, with the risk of introducing bugs; I preferred the safer route.
Primarily for test purposes, I've added a "-t N" option to force the use of N threads.
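For example, to compress the test image with 8 threads and 64k blocks:
Code:
/usr/bin/time ./mkuzip -t 8 -s 65536 test/kernel-x10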
src: https://github.com/zBSD/mkuzip
Below are some tests using 4k, 16k and 64k block sizes, on a test file made up of 10 kernels concatenated together.
Code:
CPU : Intel(R) Core(TM) i3-2120 CPU @ 3.30GHz
K Arch: amd64 i386
Cores : 4
Copying some files from various locations ...
Filesystem Type Size Used Avail Capacity Mounted on
tank/ROOT/initial zfs 478G 44G 434G 9% /
/usr/bin/time ./mkuzip -t N -s X test/kernel-x10
01 thr -s 65536 28.17 real 27.93 user 0.10 sys
02 thr -s 65536 15.63 real 28.11 user 0.16 sys
04 thr -s 65536 10.68 real 35.23 user 0.18 sys
06 thr -s 65536 10.96 real 34.41 user 0.15 sys
08 thr -s 65536 10.06 real 35.26 user 0.25 sys
12 thr -s 65536 9.88 real 35.55 user 0.29 sys
16 thr -s 65536 9.85 real 35.83 user 0.30 sys
24 thr -s 65536 9.62 real 35.87 user 0.27 sys
32 thr -s 65536 9.57 real 35.97 user 0.25 sys

01 thr -s 16384 15.38 real 14.94 user 0.44 sys
02 thr -s 16384 8.65 real 15.09 user 0.48 sys
04 thr -s 16384 6.25 real 20.05 user 0.50 sys
06 thr -s 16384 6.59 real 19.43 user 0.53 sys
08 thr -s 16384 6.01 real 20.29 user 0.69 sys
12 thr -s 16384 5.81 real 20.44 user 0.63 sys
16 thr -s 16384 5.76 real 20.29 user 0.56 sys
24 thr -s 16384 5.63 real 20.42 user 0.56 sys
32 thr -s 16384 5.73 real 20.42 user 0.76 sys

01 thr -s 4096 12.28 real 10.96 user 1.38 sys
02 thr -s 4096 7.26 real 11.37 user 1.80 sys
04 thr -s 4096 5.26 real 15.28 user 2.17 sys
06 thr -s 4096 5.67 real 14.57 user 2.23 sys
08 thr -s 4096 5.03 real 15.28 user 2.28 sys
12 thr -s 4096 5.05 real 15.34 user 2.25 sys
16 thr -s 4096 4.90 real 15.41 user 2.19 sys
24 thr -s 4096 5.02 real 15.32 user 2.32 sys
32 thr -s 4096 4.89 real 15.48 user 2.11 sys