Is CLANG so highly recommended for FreeBSD?

I don't understand your question completely.

CLANG is part of the CLANG+LLVM compiler set for C-like languages (C, C++, Objective-C, and a few bizarre variants). As far as I know, the only other complete set of such compilers available on FreeBSD is GNU gcc.

Both work. I've used them both, sometimes on the same project. In my humble opinion, CLANG tends to output clearer warning and error messages, which helps developers. I think both are reasonably up-to-date and capable of compiling newish code (for example C++14). I'm sure both have bugs, although very few; it is surprisingly rare to find compiler bugs. I don't know which one produces better (faster) object code, but the difference is probably small in most cases; the generally accepted wisdom is that LLVM is better at that, but I haven't measured it to be sure.

As of a few years ago (I forget which exact version), CLANG+LLVM is the default compiler on FreeBSD, but gcc is still available as a port.

For most people, it should make very little difference.

What alternatives exist other than CLANG+LLVM and gcc? The Portland Group compilers used to be really good (incredibly fast code), but after Nvidia bought them, I have not seen them used. Intel sells compilers too, and again they have a very good reputation for fast code, but they are also expensive. On Windows, there is obviously Microsoft, not relevant to FreeBSD. There are various vendors of compilers for embedded systems (Green Hills, Hi-Tech), again not relevant to hobbyist use on FreeBSD.
 
There is the Intel lang/icc, which I assume is used by default on Clear Linux. However, as expected, that compiler is focused entirely on Intel hardware, in the same sense as IBM's XL compilers for POWER processors.
 
I didn't know that Intel's icc is now available free or open source. Color me surprised. I had heard about Clear Linux a while ago: it is a Linux distribution made by Intel, intended to make Intel CPUs look good in the IoT and AI market, where Intel is worried about competition from Arm (which has a different instruction set) and from ASICs (like Google's Tensor hardware, and the new AI chips made by the Chinese company that previously sold Bitcoin mining chips). Perversely, Clear Linux is also pretty fast for normal server and desktop use, where Intel's competitor is mostly AMD: the perverse part is that Clear Linux runs perfectly fine on AMD chips, and makes them run faster too. So Intel is (perhaps inadvertently) helping one competitor while trying to hurt other competitors.

And obviously IBM (and HP and Sun) also still sell C and C++ compilers for their own CPUs (POWER, Itanium, and SPARC), but they are not relevant to the vast majority of users.
 
Well, icc did show up with some "for-your-safety" tweaks, where the generated code checked the cpuid instruction and, on non-Intel CPUs, took another code path that was not as heavily tuned.

But on topic, yes, clang can really be recommended. Just play a bit with the scan-build tooling giving you a detailed roadmap to the problem points. This was one strong point I brought to the table in my current job when I was yelling about the code quality.

End of story: we now run with three compilers and three static analyzers in the build system, and your code MUST build with -Wall -Werror. Without clang, code quality would still suck a golf ball through a garden hose.
 
Is the CLANG compiler so highly recommended for FreeBSD?

From the perspective of a common FreeBSD user, Clang is the most practical C/C++/Objective-C compiler option because it ships with the system, while any other compiler must be installed separately, perhaps using Clang for bootstrapping.

As to why FreeBSD switched from GCC to Clang, the short answer is the license: GPLv3 vs. BSD.

In 2014, David Chisnall, the port maintainer of devel/gnustep and the developer of devel/libobjc2, wrote down the whole story; it also answers everything you always wanted to know about BSD & LLVM/Clang but were afraid to ask: https://www.cl.cam.ac.uk/research/security/ctsrd/pdfs/201403-asiabsdcon2014-llvmbsd.pdf

There are many other possible alternatives to CLANG.

Please feel free to name a few alternatives which are BSD licensed and for this reason may be legally shipped within the BSD licensed FreeBSD base system.
 
The Portable C Compiler (PCC), the TenDRA compiler (also listed in that link), and one that Minix used, the Amsterdam Compiler Kit (ACK). But they aren't suited for C++.

PCC could be made to work with the base system by leaving out the little that requires C++: groff (GNU) documentation, certain modules and /dev devices. That way, the base install can stay slim and compile quickly (saving time and being simpler to troubleshoot), and Clang/LLVM can be installed and upgraded as needed. PCC would also have to be able to build LLVM/Clang and other relevant compilers.

When toolchains of the same name in the base system and in ports are used together, it seems that programs get compiled with components from both, with less repeatability, or with users unsure of which components from which compiler are being used when builds succeed. For instance, Clang will call on utilities from LLVM and from the ELF toolchain, depending on what an instruction set calls for, in what looks like a matter of chance.

The last time I checked, there was no make command-line target to build just the [kernel] modules. Modules can be compiled on their own [by going to the subdirectory, or by editing make.conf to compile modules along with the world or kernel], but there was no make target to be used directly from /usr/src/, nor a simple way to do it.
 
From the perspective of a common FreeBSD user, Clang is the most practical C/C++/Objective-C compiler option because it ships with the system, while any other compiler must be installed separately, perhaps using Clang for bootstrapping.

As to why FreeBSD switched from GCC to Clang, the short answer is the license: GPLv3 vs. BSD.

In 2014, David Chisnall, the port maintainer of devel/gnustep and the developer of devel/libobjc2, wrote down the whole story; it also answers everything you always wanted to know about BSD & LLVM/Clang but were afraid to ask: https://www.cl.cam.ac.uk/research/security/ctsrd/pdfs/201403-asiabsdcon2014-llvmbsd.pdf



Please feel free to name a few alternatives which are BSD licensed and for this reason may be legally shipped within the BSD licensed FreeBSD base system.

License ok, but it is quite strange that you moved from GCC to Clang. Clang is not a great improvement compared to GCC. Why not other possible compilers?
 
GCC used gigabytes of bloat to get one usable instruction set.

40 hours of compile time, for what takes 15 minutes to compile everything necessary.*

* [At one point in time, that is relative to Clang already being available in base, or about 3 hours of LLVM/Clang compile time, compared to compiling GCC as an extra or as a forced dependency compiler, which also forces unrelated dependencies to be compiled.]
 
But on topic, yes, clang can really be recommended. Just play a bit with the scan-build tooling giving you a detailed roadmap to the problem points. This was one strong point I brought to the table in my current job when I was yelling about the code quality.

Thank you for pointing that out; I had thought about it last night, and then failed to write it down.

To begin with, CLANG documents the internal format of its intermediate files in a useful fashion. That means that pieces of CLANG can be used to build project-specific parsing tools, code analyzers, intermediate languages, and so on. With CLANG this is reasonably easy (still a lot of work, but doable); with the gcc internal formats it was horribly hard. For large projects with large code bases, it makes it possible to have project-specific quality improvement; for example one can automate lock acquisition/release tracing, and build deadlock detectors into the compilation this way.

More importantly, CLANG's warning messages are way better than gcc's. They find lots of problems in the code that gcc can't detect; they are not quite as good as very expensive commercial solutions (Coverity still finds way more problems), but it is a great compromise between cost (CLANG is free), ease of use (installing it takes either nothing or an hour; no negotiations, no purchase order), and power in finding bugs (good but not great). For an amateur without deep pockets who is programming in C or C++, I would always suggest running your code through CLANG with -Wall and taking its messages very seriously (a professional team really has to go much further in code quality).

And even more importantly: CLANG's warning messages are not only better, they are also to some extent orthogonal to those of gcc and other compilers: it finds problems that no other compiler finds; conversely, other compilers also find problems (although fewer) that CLANG ignores. At my previous job, we used to run our code through three compilers (g++, CLANG in C++ mode, and IBM AIX xlC), plus a few commercial or home-brew static analysis tools. All found problems.

The unfortunate side effect is cost. A complex project with a few million lines of code is always toolchain specific. It took us several weeks to get it to build with CLANG, and that's after the infrastructure and code had already been set up to switch compilers. On the other hand, a project of that size represents an investment of hundreds of man-years, so a few weeks of work is a good investment in improving code quality.

In summary, I was pretty happy when CLANG showed up as the default in FreeBSD. And right away I recompiled some things for fun (home projects), just to see how it went. And then spent an evening fixing bugs, or drinking because I discovered how crappy the code was I had written under gcc.

(Obnoxious side remark: An even better solution is to not program in a language that creates so many problems. For example, there are languages where "comparison of signed and unsigned" isn't an issue, where memory leaks from forgetting to free what has been malloc'ed are impossible, where locks don't need to be used so much because one uses co-routines or the "synchronized" keyword, where variables don't get implicitly converted to the wrong data type in expressions, making pi == 3, and so on. But if one has to program in C++ or C, one has to use all the help one can get.)
 
LLVM/Clang doesn't take that long to compile. Maybe 2 to 3 hours.

GCC taking so many (roughly 20) hours of compile time had to do with its forced unrelated dependencies, which weren't as well sorted out then; it's better now.

The advantage GCC has is that it supports instruction sets for architectures that Clang doesn't, though those would have to be fished out of bloated code. Another advantage of GCC is that the LLVM utilities and the ELF toolchain utilities don't yet have fully functional equivalents for all of GCC's binutils components. For now, Clang and many of GCC's binutils have to be used together to get the best, or only, available performance.
 
GCC uses gigabytes of bloat to get one usable instruction set.

40 hours of compile time, for what takes 15 minutes to compile everything necessary.

If you are upset about large compile times, you should try the Go language sometime. The compiles are amazingly fast.

If you are stuck within C and C++, here's a good trick to reduce compile times: try to get rid of unnecessary #includes. First, make sure every .h file you wrote has an overall "#ifndef _THIS_H ... #endif" guard around it. Then make sure every .h file can be compiled on its own: create a 1-line .c file that includes just that single .h file; it must compile without errors. Repeat for every .h file. If a .h file doesn't compile, add the necessary #include statements to the top of that .h file, inside the #ifndef guard. At this point, the order of #include statements no longer matters. Finally, go into all .c files and remove all unnecessary #include statements. That step can be automated with a simple script which removes them one at a time, attempts the compile, and, if compile errors show up, puts them back in.

I did that once for a medium-size project (about 500KLOC). Took me several days (much of that was running the above-described script over night, and checking the results in the morning). It knocked down our typical compile time by a factor of 3. Mostly because files ended up having fewer (fake) dependencies, so the make files didn't have to recompile things needlessly. Originally, I didn't tell my manager that I was going to work on this (because he would have gotten upset at wasting my time on needless housekeeping and cleanup); after I succeeded and everyone in our department loved it, my manager was happy that (a) it got done, and (b) he didn't get the opportunity of making the obvious mistake of trying to stop a worthwhile project.
 
If you are stuck within C and C++, here's a good trick to reduce compile times: Try to get rid of unnecessary #includes. First, make sure every .h file you wrote has an overall "#ifndef _THIS_H ... #endif" around it.

I actually avoid this for my own projects for the following reasons:

1) If your header file has been opened (to read the #ifndef) then it has already wasted time on file access.

2) We should only be #include'ing a file if we really need it. For 90% of things, we should be forward declaring. I.e., #includes in C++ header files are often only needed when inheriting or when using non-pointer members. In .cpp files, it should be the first time that a header is included anyway, so the #ifndef is pointless.

3) We want to see an error on a cyclic include. Yes, it may look ugly in the compiler log and be a bit of a mind-fsck to follow, but in most cases it means that you have erroneously #included a header rather than forward declaring.

For example:

Code:
#include <memory>

class Enemy;

class Player
{
  std::weak_ptr<Enemy> enemy;
};

Instead of:

Code:
// Wrong: we don't need to know Enemy's internals for a pointer.
// A wasted file access.
#include "Enemy.h"

// Note: we still need this, because we need to know the size:
// a std::weak_ptr<T> object is used here, not just a pointer.
#include <memory>

class Player
{
  std::weak_ptr<Enemy> enemy;
};

For public libraries, I do still use include guards, because not all developers quite understand how C / C++ works.
 
Clang is not a great improvement compared to GCC.

If you dig deep into Gentoo's infrastructure, you will find some very nasty bugs where it took ages to discover the cause, because everything seemed perfectly fine in the code; after some serious debugging, the culprit turned out to be gcc simply generating bizarre code.

I don't know what is going on with Gentoo these days, but until a few years ago they (at least part of them) were looking to switch to Clang by default, using gcc only to compile what is hard-coded to require it (basically, the kernel and GNU stuff).
 
1) If your header file has been opened (to read the #ifndef) then it has already wasted time on file access.
That's true, but is actually not a major factor. During a compile on a reasonable machine (where the source mostly fits in RAM cache), the header files will remain in the file system buffer cache. Having to do open(), read() of a few lines, parsing very superficially to find the #endif, and close() again is pretty cheap, compared to having to parse everything through the full language parser. But you are right, it is not free. Not even having to open the file is definitely better.

2) We should only be #include'ing a file if we really need it. For 90% of things, we should be forward declaring. I.e #includes in C++ header files are often only needed when inheriting or non-pointer objects. In .cpp files, it should be the first time that a header is included anyway so the #ifndef is pointless.
Definitely true. Well-designed C++ code to some extent follows the rules you propose; they can't always be strictly followed (for example, if one has to take sizeof() an object, one needs the header file). Refactoring code in this direction is a great idea, but is also a lot of work; the changes I described above can be done quickly and mechanically, without changing the design very much.

3) We want to see an error on a cyclic include.
Absolutely. Cyclic includes mean that someone wasn't thinking. But this can also be done by analyzing the dependency graph that make spits out.
 
LLVM was chosen to "finally" replace GCC. One reason mentioned was the license. I'm not a Mac user (yet), but I understand this compiler comes from that commercial side of the world. I have used clang because I understand the license allows commercial program development, and I'm happy it should be compatible with commercial development on Macs. Actually, I chose FreeBSD to be code-compatible with Solaris and other commercial Unices. LLVM was a professional choice and was already available.

 
Actually, it looks like they do still provide it - see eg Intel® System Studio 2018 Update 1 for FreeBSD* Release Notes.
In that case I apologize, but I am still not entirely sure. If you click on their download link, it only lists Windows, macOS, Linux, and Android.

Perhaps it is on life support. Also, I note that the page you linked to is "the Release Notes for Intel® VTune™ Amplifier 2018 component of Intel® System Studio 2018 Update 1 for FreeBSD". Perhaps the compiler itself is gone and all that is left is the profiler.

I am tempted to contact them to find out its current state. You might need a specific support contract with them.
 
License ok, but it is quite strange that you moved from GCC to Clang. Clang is not a great improvement compared to GCC. Why not other possible compilers?
Changing the license of GCC from GPLv2 to GPLv3 was the main driver behind migrating to another system-level compiler. Having said that, a mature, industry-grade BSD-licensed compiler would have replaced GCC in the long term eventually anyway.

Compared to other BSD-licensed compilers, llvm is backed and used by many companies, and hence promises a fast path of evolution and improvement (this is controversial, given that llvm's licensing terms do not require a consumer or contributor to make its contributions publicly available).

Since each stage of compilation in llvm (front end, optimizer, back end) is clearly separated, it is easier to implement and adopt new and advanced compiler technology. As far as I know, the code separation between compilation stages, and some of the data structures used in those phases, is not as clean in gcc as in llvm (but remember, gcc was written 30 years ago, when there were far fewer resources and less ability to reach them), which makes adding new features a big task.

There definitely are other BSD-licensed compilers, but I think the commercial support behind llvm and clear design played a major role in making llvm/clang as the default system compiler.

The statements above do not mean GCC is not a good-quality compiler; it definitely is, and the FreeBSD project used it for years as the default compiler. In fact, having competition between the major open-source compilers is good for the consumers of these products.
 
GCC is of course a good-quality compiler, since it drives many systems. It does tend to produce bloat, and it takes ages to compile, and there is more one could say. It is OK, but not perfect (like everything).

On the other hand, have you ever tried to compile CLANG for your system? Good luck, really. It is genuinely difficult to get it to compile.
https://github.com/llvm-mirror/clang

There is perhaps an urgent need for a small base C compiler for the *BSD operating systems (core/base system).
 
On the other hands, have you ever tried to compile CLANG for your system? Good luck really.
Yes, I have had that challenge, and it ended in frustration. I used a clean jail to compile llvm from source, but the task was far more challenging and full of pitfalls than I initially thought.

I did not have time to troubleshoot (family, kids, full time job :/ ), but somebody with some spare time can do it obviously.
 
I read your post, and will try to reformulate: long, perilous, difficult... to compile.
 