Shell Symlinking for lbzip2 and parallel compression

Hi!

I would like to use parallel bzip2 compression, with lbzip2 as the default, on an 8-core FreeBSD 10.3 machine.

My question is what happens if I do the following:

Code:
cd /usr/local/bin
ln -s /usr/bin/lbzip2 bzip2
ln -s /usr/bin/lbzip2 bunzip2
ln -s /usr/bin/lbzip2 bzcat

Will all applications that call bzip2 then use lbzip2 automatically? Does anyone have any experience with this? Are there any scenarios where I could expect breakage? Also, how would I revert to the original defaults if it turns out to be a nightmare?

Thanks!
 

Only applications that find one of those three commands through PATH, and that have /usr/local/bin ahead of /usr/bin in PATH, will use the symlinks. By default that is nothing, because /usr/bin comes first in the search path, as it should (this prevents ports from overriding system behaviour). Also, not all applications call those commands; many use libbz2 directly, which certainly cannot be swapped for a threaded alternative implementation without very careful engineering.
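To see which copy of a command would win on a given machine, it can help to list every match in PATH in search order. The following is only a minimal sketch: path_matches is a hypothetical helper name, not a standard utility, and it ignores empty PATH entries.

```shell
#!/bin/sh
# path_matches NAME
# Print every executable called NAME found in PATH, in search order.
# The first line printed is the one a PATH lookup would run.
# (Hypothetical helper; the body runs in a subshell so IFS is not
# changed for the caller.)
path_matches() (
    IFS=:
    for dir in $PATH; do
        if [ -x "$dir/$1" ] && [ ! -d "$dir/$1" ]; then
            printf '%s\n' "$dir/$1"
        fi
    done
)

# Show the candidates for bzip2 under the current PATH.
path_matches bzip2
```

If a symlink in /usr/local/bin is listed after /usr/bin/bzip2, applications that search PATH will never reach it.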

As far as breakage goes, things may break if lbzip2 does not implement exactly the same command-line options and exactly the same external behaviour as the commands it replaces (i.e. it doesn't matter what happens inside the running binary, as long as the inputs and outputs are identical). Looking at lbzip2 on the web, it claims to be essentially identical on the command line, but I wouldn't know for certain that there are no subtle differences which could break something.
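One cheap sanity check is a round trip: compress with one tool, decompress with the other, and compare against the original. This is a rough sketch only; roundtrip_ok is a hypothetical helper, /etc/services is just an assumed sample file, and passing this test says nothing about the less common options.

```shell
#!/bin/sh
# roundtrip_ok COMPRESSOR DECOMPRESSOR FILE
# Compress FILE to stdout with COMPRESSOR (-c), decompress the stream
# with DECOMPRESSOR (-dc), and succeed only if the bytes match FILE.
# (Hypothetical helper; it exercises only the -c and -d options.)
roundtrip_ok() {
    "$1" -c "$3" | "$2" -dc | cmp -s - "$3"
}

# Try both directions, but only if both tools are installed.
sample=/etc/services   # assumed sample file; any readable file works
if command -v lbzip2 >/dev/null 2>&1 && command -v bzip2 >/dev/null 2>&1; then
    if roundtrip_ok lbzip2 bzip2 "$sample" && roundtrip_ok bzip2 lbzip2 "$sample"; then
        echo "round trip matched in both directions"
    else
        echo "round trip mismatch"
    fi
fi
```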

Reverting is simple: just remove the symlinks from /usr/local/bin.
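A slightly defensive way to do that is to delete only entries that really are symlinks, so a regular file with the same name is never touched. This is a sketch; remove_links is a hypothetical helper name, and the three names are the ones from the question.

```shell
#!/bin/sh
# remove_links DIR
# Remove the bzip2, bunzip2 and bzcat entries from DIR, but only if
# they are symlinks. Regular files are left alone.
# (Hypothetical helper.)
remove_links() {
    for name in bzip2 bunzip2 bzcat; do
        if [ -L "$1/$name" ]; then
            rm "$1/$name"
        fi
    done
}

# Usage (as root), followed by clearing the shell's cached command
# locations so the current session stops using the old paths:
#   remove_links /usr/local/bin
#   hash -r
```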

Personally, I would recommend quite strongly against what you suggest. For miscellaneous cases, just let everything use the default system bzip2 tools. For general usage, particularly by system tools, speed is mostly unimportant: when newsyslog(8) uses bzip2, it makes absolutely no difference whether it takes 1 minute or 10 minutes, and the threaded lbzip2 will use more system resources than the normal version, potentially slowing other things down needlessly. For the specific cases that use it so heavily that speed actually matters, or where a user is waiting interactively on large input files, engineer solutions specific to those cases (i.e. arrange for those applications to call lbzip2 explicitly). With that approach, you only need to worry about whether it all works cleanly for the narrow set of applications that use it.
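If the application in question finds bzip2 through PATH, one way to arrange this is a private directory of symlinks that is put at the front of PATH for that application only. The sketch below assumes exactly that; make_tool_dir and run_with_tools are hypothetical helpers, and the directory and application names in the usage comment are placeholders.

```shell
#!/bin/sh
# make_tool_dir DIR TARGET
# Create DIR and fill it with bzip2, bunzip2 and bzcat symlinks that
# all point at TARGET (for example the full path to lbzip2).
make_tool_dir() {
    mkdir -p "$1"
    for name in bzip2 bunzip2 bzcat; do
        ln -sf "$2" "$1/$name"
    done
}

# run_with_tools DIR COMMAND [ARGS...]
# Run COMMAND with DIR at the front of PATH. The body runs in a
# subshell, so the caller's PATH is left unchanged.
run_with_tools() (
    tools=$1
    shift
    PATH="$tools:$PATH"
    export PATH
    exec "$@"
)

# Usage (placeholder names):
#   make_tool_dir "$HOME/lbzip2-bin" "$(command -v lbzip2)"
#   run_with_tools "$HOME/lbzip2-bin" some_app --its-options
```

Everything else on the system keeps using the stock bzip2, and undoing the change is a matter of deleting the private directory. An application that hard-codes /usr/bin/bzip2 or links against libbz2 would not be affected by this at all.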
 
This is quite solid advice, and very informative as well. Thank you! In fact, it is one specific application that I wanted to call these tools. Given what you have pointed out, it makes better sense to find a way to have that specific application call lbzip2 rather than rewiring the entire system to use lbzip2 by default.
 