Speeding compile or boot time

sidetone · Mar 11, 2015

For the kernel, removing hardware from the KERNCONF file that your system doesn't use, greatly reduces compile time. Also, upgrading the cpu from 32 to 64 bit does too. Both of these together reduced my kernel compile time from 2 hrs to 15 minutes. Using make with the -j option cut it down to 9 minutes.

Make options

Code:

echo 'CPUTYPE=<myprocessor>' >> /etc/make.conf
<directory>/make -j<# of cores>

The commands dmesg with grep can help find hardware

There are also other options like funroll-loops, but this may speed compile time, at the cost of performance.

There is also mounting ~~/var and~~ /tmp to memory by fstab and conf, by use of the programs tmpfs(5), and/or mdmfs(8), which should work for performance. Perhaps, I didn't mount the work directories to here.

I read that an SSD harddisk can improve boot times to a 10s of seconds. As for compiling, it had a negligible effect. Also, its function is not suited for it anyway, considering SSD isn't made for the wear and tear of constant rewrites.

I was wondering if it were possible to use FPGA's (Field-programmable gate array) to speed up compiling, and/or other processes. ASIC (Application-specific integrated circuit) use for this may seem doubtful.

Even for reduced kernel compile times at 9 minutes, compiling programs still takes hours.

There was also a thread somewhere, where they loaded /boot to memory to speed up boot time in NanoBSD. Does anyone have any ideas they'd like to add?

sidetone · Mar 12, 2015

This is good. http://www.bsdnow.tv/tutorials/ccache - Faster compiles with ccache and tmpfs

rusty · Mar 12, 2015

I found ports-mgmt/poudriere very nice for speeding up compile time, especially with the USE_TMPFS=all setting.
More info here - https://forums.freebsd.org/threads/pkgng-package-repository-using-ports-mgmt-poudriere-no-zfs.38859/

sidetone · Mar 12, 2015

I got the ports and the kernel work directories to compile in memory, along with /tmp ~~and /var~~ using tempfs, ~~varmfs~~ and mdmfs. It didn't speed up compile time, however the hard disk made less noise, and the first stages of the build seemed to fly. I'm not going to dismiss memory filesystems. anyway, there's probably a setting to make it work better. Aside from being organized, using tmpfs and ccache, I don't know if ports-mgmt/poudriere can further improve performance.

Moving the -j option higher than the number of cores didn't make improvements.

Video cards can be used for certain types of processing, and are efficient in certain tasks where cpu's aren't. It's the opencl program and devel/libclc library that allow the gpu to do computing tasks other than graphics. One logical question would be, how to enable libclc and opencl for use in recompiling (the base system program) clang? Will rebuilding it with opencl and libclc support do it?

kpa · Mar 12, 2015

Most of the compilation tasks are heavily CPU intensive with very little time spent on I/O so the only way you'll get improvements is either better parallelization or getting a more poweful CPU.

ANOKNUSA · Mar 12, 2015

sidetone said:
Aside from being organized, using tmpfs and ccache, I don't know if ports-mgmt/poudriere can further improve performance.

Depends on what you mean by "improve performance." ports-mgmt/poudriere automatically builds ports in parallel, with the number of builds based on the number of available CPU threads the system sees (so my hyperthreaded i7 quad-core builds eight ports simultaneously, for example). On one hand this means that each build process only has one process thread to work with, which means that large ports like www/firefox will take considerably longer to build. On the other hand, it also means that one large port won't hold up the entire build process. By the time that one large port is done, many (most?) other smaller ports will have been built alongside it.[1] It also simplifies the update process by letting users write out and save a complete list of ports to be built and installed, then set all options for the ports on that list, and then run a single command to build, install and update them. You can check out the homepage and BSD Now tutorial for a rundown on it.

[1]: I have no idea how or even if Poudriere organizes a prioritized build queue. Build dependencies are always built first regardless of size, of course, but it may or may not be the case that smaller/simpler ports are set to build before larger/complex ones.

sidetone · Mar 12, 2015

Poudriere is a nice tool, but I don't think it will speed up compile time, more than what can already be done with filesystem RAM, ccache, and organization.

kpa
It seems like using a RAM filesystem would speed things up, by not having the CPU wait. Perhaps filesystem RAM helps, but to an insignificant amount; I'm still unsure of this.

About GPUs, which any GPU can be used as a General Purpose GPU (GPGPU), these can be used for processing, not only limited to display output. There are some tasks where GPGPUs can processes information thousands of times faster than a CPU, yet a GPU can't do much of what a CPU can do. One of GPU's uses is in scientific calculation, and/or math which there isn't decision making involved. The CPU makes decisions, but it can't do math anywhere as efficiently as a GPU. So both of these used together can give use and maximum performance.

---
In basic terms, as the name implies, GPU is a processing unit too; and its uses go beyond graphics for output display. I'm curious if a GPU can be used to for typical FreeBSD compiling (through CLang, libclc and Opencl).

These links show a little bit about the type of (math) processing GPUs can do: About GPGPU.org, Do more with graphics: power up with GPU, Using GPGPU for water simulation

kpa · Mar 12, 2015

sidetone said:
kpa
It seems like using a RAM filesystem would speed things up, by not having the CPU wait. Perhaps filesystem RAM helps, but to an insignificant amount; I'm still unsure of this.

It depends on what type of source files are compiled. Lots of many short source files with plain C code would benefit quite a bit from RAM filesystem because the overhead of having to read many small files would be then nicely overcome and C code is often quickly compiled. If the source code is more complex, let's say long and complex C++ code the time taken to compile and optimize the code starts to dominate and in that case a RAM filesystem wouldn't help much.

sidetone · Mar 12, 2015

That makes sense. RAM and filesystems on RAM couldn't hurt. Perhaps I mounted /tmp on RAM, before timing the differences; and then little room for speed was to be made when mounting build directories on RAM too, which it seemed like there was some improvement in the first stages of compiling. Some ports with complex code may benefit from it. I'll stick with mounting filesystems on RAM anyways; it seems to work better, even if/when it does little for time.

If only they had 128bit CPU's in the mass market (which is a probably a decade away).

kpa said:
If the source code is more complex, let's say long and complex C++ code[,] the time taken to compile and optimize the code starts to dominate

Imagine what a 64, 128, 256, or 512bit GPU could do for computations, if not for general purpose compiling. Maybe useful for trigonometry, or algorithms; it would run through the math of a bookshelf full of textbooks many times quicker than a CPU by itself.

wblock@ · Mar 13, 2015

buildworld is almost entirely CPU-dependent. There are two practical ways to really speed it up. First, use NO_CLEAN. This does not delete the contents of /usr/obj, so most things will not need to be compiled at all. This depends on how much has changed in source, so the build time is somewhat variable. Of course, /usr/obj must be kept around, taking up around 3G of space for an amd64 system.

devel/ccache is the other method. Now that it is effective with Clang, it can help to about the same degree as NO_CLEAN. However, it also needs cache space, the more the better. When I used ccache alone, I gave it 4G. Recently, I tried a few tests, but did not notice much difference with just one or the other or both. In either case, a normal buildworld on my system takes about twenty minutes. With NO_CLEAN that is often under five minutes, sometimes under two.

sidetone · Mar 13, 2015

My guess is that NO_CLEAN can muddy the code; until a new make clean is done. CCache looks like a better option of the two.
--
Note: I found out the hard way, mounting /var unto RAM, will make the database's package listings not work. pkg info will turn up nothing.

Code:

varmfs=

is an option in rc.conf. Now I see that varmfs is made for read only installs. Thread how-to-make-freebsd-8-0-filesystem-read-only.14014. I'm thinking of symlinking /var/tmp and other subdirectories of tmp/ to /tmp.

wblock@ · Mar 13, 2015

sidetone said:
My guess is that NO_CLEAN can muddy the code; until a new make clean is done. CCache looks like a better option of the two.

They really are very similar in effect. With NO_CLEAN, it is make(1) that determines what needs to be rebuilt. It's not just the presence of an object file, but whether the files that section of code depends on have changed. ccache is actually called before the compiler.

I've used NO_CLEAN for several years with no serious problems. At most, rm /usr/obj first causes a full build. Before that, I used devel/ccache for several years, with no serious problems either. Immediately after the switch to Clang, it stopped helping, a problem that seems now to be fixed.

Both together might be the best option, but my initial tests did not make it appear to be worth using another 4G for cache space. Benchmarking this is a bit difficult, because it depends on the amount of changes in the source.

sidetone · Mar 13, 2015

Installing devel/ccache, and enabling it in make.conf for the kernel didn't do anything, possibly consistent with what you said. Maybe after updating Clang, it might work better. I'm still skeptical of using NO_CLEAN.

hukadan · Mar 14, 2015

devel/ccache is not used with clang by default. Be sure to build devel/ccache with CLANGLINK and LLVMLINK enabled (I am not 100% sure you need the latter).

sgeos · Mar 14, 2015

wblock@ said:
Benchmarking this is a bit difficult, because it depends on the amount of changes in the source.

You need to compare apples to apples. Compile the same somewhat arbitrary revision A followed revision B to benchmark.

wblock@ · Mar 14, 2015

sgeos said:
Compile the same somewhat arbitrary revision A followed revision B to benchmark.

Right, but the amount of dependencies can vary wildly from just a few small changes in the source, and that can unfairly bias a benchmark for one method or the other. The easiest test is just a best-case situation of building from scratch, reboot, then building again. But that is just best-case, and it is hard to generalize from that to an average improvement.

sgeos · Mar 14, 2015

wblock@ said:
But that is just best-case, and it is hard to generalize from that to an average improvement.

Can this be taken as an assertion that neither solution is clearly better? Are the different solutions clearly geared toward different circumstances?

sidetone · Mar 14, 2015

hukadan said:
devel/ccache is not used with clang by default. Be sure to build devel/ccache with CLANGLINK and LLVMLINK enabled (I am not 100% sure you need the latter).

That's a good idea. I tried it, but perhaps I'm missing something. I'll just keep looking at it, and recheck make.conf.
---

sgeos said:
Can this be taken as an assertion that neither solution is clearly better? Are the different solutions clearly geared toward different circumstances?

Comparing kernel builds or program builds within two similar jails are good benchmarks. Other ways would take up way too much time, which in most cases is not worth it, where it's just better to notice whether it's considerably faster.

wblock@ · Mar 15, 2015

sgeos said:
Can this be taken as an assertion that neither solution is clearly better? Are the different solutions clearly geared toward different circumstances?

No, just pointing out that the only sure way to know if they are different involves benchmarking those different situations. Some things are easy to test, but still take a while. For example, do best- and worst-case times for buildworld with each of these conditions. Best-case is the source not having changed at all and all of the code already being present in /usr/obj or in ccache cache, worst-case is /usr/obj having been deleted and ccache cache cleared.

plain
NO_CLEAN
ccache
NO_CLEAN and ccache

The system should be reset between each test, and really each test should be run two or three times for averages. Nothing else should be running during the test.

sidetone · Mar 23, 2015

Removing

Code:

gmake

from a port's Makefile makes it so gmake's bulky dependencies don't get installed, when FreeBSD already has Clang. This is >30 minutes of compiling redundancy, not to mention confusing options, that can break programs.

Code:

USES+=     gmake

I've also found out that removing gnu options can break the compile.

wblock@ · Mar 23, 2015

This is confusing. Clang is a compiler, not a make(1) utility. gmake is not quite compatible with the FreeBSD make(1), and most ports that say they need gmake actually do need it. Maybe you mean gcc, but Clang is not quite the same as various versions of gcc, and assuming that a port which claims to need gcc does not really need it will lead to problems.

gcc does take a while to build, but like most build dependencies, it does not change all that often. Build it and leave it installed to avoid self-induced rebuilding pain.

sidetone · Mar 23, 2015

Thanks for the clarification. So can both gmake and make use CLANG for building? Now I don't think the problem is gmake in itself, I think the problem is how it was imported from a linux perspective, how Makefile dependencies can be improved, and it needs cleaning up to suit FreeBSD 10.1. Also, removing gmake, (along with cmake, imake) does break some ports. It seems like ports specific to gnu and a few others need gmake; many other ports build fine by using make instead of gmake.

When I mess with GCC and gmake, I want to set the options to be compatible for overall use, and the configuration options seem to go on forever, then there's the circular dependency issue that can happen when I'm using make config-recursive. It takes all day, then when I leave and come back there is a chance the compile broke, this is before I tried to hack the Makefiles. I tried hacking the Makefiles with wine, and it compiles cleanly, no daylong hassle, and easy options for full features. It seems like emulators/i386-wine is updated with these, and it installs much smoother.

wblock@ · Mar 24, 2015

The choice of make utility does not affect the choice of compiler. Some Makefiles written for gmake are compatible with FreeBSD's make(1). Otherwise, gmake needs to be used or the Makefile must be patched.

It's not clear exactly what is causing your problems, or even what you are attempting to fix. After such hackery with ports, I would suggest restoring the default files and then rebuilding everything that was modified. And start a new thread, because this has nothing to do with the subject of this thread.

Speeding compile or boot time

hukadan

Guest