Distributed Build vs. Build Server

I recently tested distcc with a setup of 1 master + 2 slave nodes for ports building (slaves are diskless, booting off master). Some observations and questions about this:

OBSERVATIONS:
- The nodes have 1 GB of RAM and a small HDD attached for swap. The HDD is not necessary, as 1 GB is more than enough for a node to do its job.
- My /usr/local/etc/distcc/hosts file has 127.0.0.1 as the last entry with 2 jobs allocated; the nodes get 4 jobs each and use LZO compression for communication. /usr is exported via NFS, so the nodes see the same ports tree as the master.
Code:
node1/4,lzo node2/4,lzo 127.0.0.1/2
However, the most noticeable performance improvement came when I mounted a separate HDD onto /usr/obj. This second HDD was considerably slower than my main HDD, but the separation freed the master node's HDD to pass code to the slaves much faster.
- One node was enough, and even then it was very often idle. Many ports do not build with parallel make jobs, and some ports have bootstrap layers, so I guess they cannot distribute from the "chrooted" build environment they create for themselves (like lang/gcc46 and the Java ports). Others just don't like distcc (devel/boost-libs).
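
As a rough sketch, the job limits in a hosts line like the one above can be added up, for example to get a starting point for a parallel job count. The function name total_slots is made up for illustration, and entries without an explicit "/N" limit are simply skipped here rather than counted with distcc's defaults.

```sh
#!/bin/sh
# total_slots is a hypothetical helper, not part of distcc.
# It sums the explicit per-host job limits ("/N") in a distcc hosts line.
total_slots() {
    # Put one host entry per line, split each entry on "/", strip options
    # such as ",lzo" from the limit, and add the limits up.
    # Entries without "/N" are skipped.
    printf '%s\n' "$1" | tr -s ' \t' '\n' | awk -F/ '
        NF >= 2 { sub(/,.*/, "", $2); total += $2 }
        END     { print total + 0 }'
}

total_slots "node1/4,lzo node2/4,lzo 127.0.0.1/2"   # prints 10
```

Whether the full total is a sensible job count depends on how many ports in the build actually parallelize, which, as noted above, many do not.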

PROBLEM: This leaves the master node completely tied up during non-distributable builds while the nodes sit idle.

NEW SCENARIO: Dedicated build node employing distcc.
- Make node1 do the builds (run portmaster from node1), place an HDD on node1 and point WRKDIRPREFIX there.
- Run portmaster with the "-g" option and from a list of ports (not portmaster -a) to create packages. Later, upgrade all ports on /usr/local through packages from PKGREPOSITORY (portmaster -PP).
- Run distcc but change the host entry order to "node2 node1 master". This way the master only serves the distfiles and ccache shared over NFS, and collects the created packages.

QUESTION: Any better solutions to this problem? What other approach can be taken?
- The only major problem with this is that FreeBSD requires the built port to be installed before it creates a package for it (same under # make package-noinstall).
- If the ports built on node1 are to be installed before package_create, /usr will have to be exported with r/w access as well as /var/db/ports and /var/db/pkg (very ugly).
- One workaround would be to have node1 run all the builds (just make, no install), then export the folder /usr/obj (or wherever WRKDIRPREFIX has been mounted on node1) through NFS. The master node can then run portmaster with "-C" (don't clean before building) and install from the ready-made builds.

EDIT ON LAST PARAGRAPH: Some ports try to install certain things in /usr even if you just run "make" (mostly KDE stuff, I guess). But one strange thing I keep running into is that the configure step cannot find the installed devel/libtool and tries to install the "missing" libtool?
 
I wanted to try distcc's "pump" mode, but my system does not recognize pump, nor does it have a man page for pump, even though distcc(1) refers to such a page!

For some long builds (e.g. www/libxul) I let node1 do the build with make. I then exported the folder (WRKDIRPREFIX) on node1 and mounted it (rw) on master, where it installed perfectly with:
# portmaster -C www/libxul
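
The two steps can be sketched roughly as below. This is only an illustration: the function names, the run wrapper, and the node1:/mnt/obj export and /mnt/obj mount point are assumptions, so substitute your own WRKDIRPREFIX. By default the script only prints the commands; set DRY_RUN=0 to actually run them.

```sh
#!/bin/sh
# Sketch only. All names and paths below are placeholders.
# Prints each command unless DRY_RUN=0 is set, in which case it runs it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1, on node1: build the port but do not install it.
build_on_node() {
    run make -C "/usr/ports/$1" build
}

# Step 2, on master: mount node1's work directory read-write, then let
# portmaster install without cleaning first, so the finished build is reused.
install_on_master() {
    run mount -t nfs -o rw node1:/mnt/obj /mnt/obj
    run portmaster -C "$1"
}

build_on_node www/libxul
install_on_master www/libxul
```

The work directory has to appear under the same path on both machines, otherwise the master will not find the finished build where it expects it.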

And now something strange: the build for print/hplip broke on master but ran OK on node1, and it also installed without errors.

Does anyone know why builds on node1 would be unable to find devel/libtool and would then try to install the port when it's already installed? # libtool --config gives all correct output...

Re: my first post: How are the FreeBSD build servers structured for the task anyway? Is there any concept there that could be useful for my question?
 
Hey Beeblebrox, interesting stuff. Your problem regarding this:
builds on node1 are unable to find devel/libtool ... and install the port
It does that because the port looks in /var/db/pkg for the dependency's registration record. Since you are running this on a thin client, the folder mounted as /var does not have that record, although the necessary libraries exist in /usr.

To correct this, just copy the relevant folder from the host's /var/db/pkg to the diskless client's /var/db/pkg. The dependency check of whatever you are trying to build will then be satisfied.
To be perfectly clear, and to solve your specific problem, do this (from the host's command line):
# cp -av /var/db/pkg/libtool* /wherever/pxeboot/var/db/pkg/
This will copy the necessary package registration data to the diskless chroot.

Now you can build the port in a "slave" environment.
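
If several dependencies are reported as missing, the same copy can be wrapped in a small helper, roughly like this. copy_pkg_records and DISKLESS_ROOT are made-up names for illustration, not part of the ports tools, and the patterns are whatever records your build complains about.

```sh
#!/bin/sh
# copy_pkg_records SRC_DB DST_DB PATTERN...
# Hypothetical helper: copies every directory in SRC_DB whose name matches
# one of the glob patterns into DST_DB, preserving modes and timestamps.
copy_pkg_records() {
    src=$1
    dst=$2
    shift 2
    for pattern in "$@"; do
        # $pattern is left unquoted on purpose so the shell expands the glob.
        for dir in "$src"/$pattern; do
            # An unmatched glob stays literal, so skip non-directories.
            [ -d "$dir" ] || continue
            cp -Rp "$dir" "$dst"/
        done
    done
}

# Example, run on the host (DISKLESS_ROOT is wherever the client's root lives):
# copy_pkg_records /var/db/pkg "$DISKLESS_ROOT/var/db/pkg" 'libtool*'
```

Quote the patterns when calling it, so they are expanded against the source directory rather than your current directory.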
 
Thanks a lot, Beeblebrox. I tried to thank you, but could not for some odd reason.

I can say that your solution works. In fact, it now works so well that the PXE-booted node1 completely carries the build load, while the boot server can keep doing its job without slamming into the wall.

The second reason that it works so well (and this really makes no sense) is that ports that do NOT build on the main system do build on the distcc farm.
I just built firefox & seamonkey, both of which broke when building on master. I am sure the source of the problem is some difference between the environment on master and the one on the nodes.

Installing the pre-built Mozilla app is a different problem altogether, because you get this error:
Code:
 Your source tree contains these files:
*         /mnt/obj/asp/ports/www/firefox/work/mozilla-release/Makefile
*         /mnt/obj/asp/ports/www/firefox/work/mozilla-release/config/autoconf.mk
*   This indicates that you previously built in the source tree.
*   A source tree build can confuse the separate objdir build.
*
*   To clean up the source tree:
*     1. cd /mnt/obj/asp/ports/www/firefox/work/mozilla-release
*     2. gmake distclean
***
*** Fix above errors and then restart with               "gmake -f client.mk build"
Yeah, I don't want to "clean the source tree" and "re-gmake" everything! There's a reason I built the app off-platform.
 