make -jX buildworld fails - CPU problem?

Hello all,

I'm compiling 9-STABLE r255205 on two PCs. The first one is an Intel(R) Pentium(R) 4 CPU 3.00 GHz with an ASRock ConRoe1333-D667 motherboard and 4 GB RAM, while the second one is an Intel(R) Pentium(R) 4 CPU 2.80 GHz with an ASUS P5G41T-M, also with 4 GB RAM. Both machines use ZFS on root and are connected with HAST.

The problem I'm facing is that when using -jX, machine1 will always fail (at the same point) whereas machine2 will not. This has happened in the past as well (same point of failure) and things were identical: machine1 fails and machine2 doesn't.

I'm thinking maybe the CPU might have problems (or the motherboard or the RAM maybe?).

What are your opinions?

PS: The build fails with -j2 and -j4 and always at:
Code:
===> usr.bin/ypwhich (all)
cc -O2 -pipe  -std=gnu99 -Qunused-arguments -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -Wno-empty-body -Wno-string-plus-int -Wno-tautological-compare -Wno-unused-value -Wno-parentheses-equality -Wno-unused-function -Wno-conversion -Wno-switch -Wno-switch-enum -c /usr/src/usr.bin/ypwhich/ypwhich.c
gzip -cn /usr/src/usr.bin/ypwhich/ypwhich.1 > ypwhich.1.gz
cc -O2 -pipe  -std=gnu99 -Qunused-arguments -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -Wno-empty-body -Wno-string-plus-int -Wno-tautological-compare -Wno-unused-value -Wno-parentheses-equality -Wno-unused-function -Wno-conversion -Wno-switch -Wno-switch-enum  -o ypwhich ypwhich.o
1 error
*** [everything] Error code 2
1 error
*** [buildworld] Error code 2
1 error
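
With -j, output from the parallel jobs is interleaved, so the lines just above `1 error` are not necessarily from the job that failed, and the excerpt here shows no diagnostic at all. A minimal sketch for pulling the earliest failure-looking lines out of a saved build log is below. The function name `first_errors`, the log path in the comment, and the list of patterns are assumptions for illustration, not something the build system provides.

```shell
# first_errors LOGFILE
# Print (with line numbers) the first few lines of a build log that look
# like a real failure: compiler diagnostics ("error:"), make's "*** "
# failure lines as seen in the excerpt above, and crash messages.
# The pattern list is a guess; extend it as needed.
first_errors() {
    grep -n -E 'error:|\*\*\* |Segmentation fault|Bus error' "$1" | head -n 5
}

# Hypothetical usage (log path is an assumption):
#   cd /usr/src && make -j4 buildworld > /var/tmp/buildworld.log 2>&1
#   first_errors /var/tmp/buildworld.log
```

The earliest match is usually more informative than the trailing `1 error` lines, which only report that some sub-make failed.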
 
Are you using the -j option directly on the command line? That's not a smart idea, because it forces the same number of jobs on every Makefile in the source tree, including those that are not safe to build in parallel. As far as I know, the build system is already optimized to take advantage of multiple cores; there are no user-serviceable parts inside.
 
If you are referring to make -j2 buildworld, how is that not a smart idea?

And buildworld should be safe for parallel builds.
 
That is exactly what I mean: the options you provide on the command line get passed down to the sub-make(1)s in the source tree, forcefully overriding any restrictions on how many parallel jobs are allowed.

Buildworld is parallel build safe as long as you don't try to override the number of parallel jobs manually. There are many parts in the source tree that are not safe to build with -j n because of historical baggage that is just too hard to clean up to support building with multiple jobs.
 
I wasn't aware that the handbook actually says that. I wonder whether that promises a bit too much, because the ports tree already has its own problems with parallel builds.
 
kpa said:
I wasn't aware that the handbook actually says that.
It's also in Warren's guide. But now that @da1 mentions it, I've also seen the usage of -j break buildworld (on a system running 9.1-RELEASE-p6/amd64 using Clang). I simply stopped using -j.
 
The ports tree is one thing; it is currently being prepared for parallel builds. I'm talking about buildworld :)
 
@fonz: Not using the -j option is counterproductive because one cannot take advantage of a multi-core CPU.

However, my problem may be with the CPU/motherboard/RAM because so far, this is the only system where I'm experiencing this behavior.

@fonz: Can you reproduce this issue on more than one system?
 
-j with values of 2 to 8 for buildworld and kernel (buildkernel plus installkernel) has worked fine for me for years. No problem with FreeBSD 9.x. That's with gcc, not Clang.
 
I'm using gcc on the machine that has the problem, but I too have been using the -j flag for years without any trouble.

Hence the question in my original post.
 
da1 said:
@fonz: Not using the -j option is counterproductive because one cannot take advantage of a multi-core CPU.
Sure. But if using -j breaks the build, what are you going to do? (Until you have it figured out, that is.)

da1 said:
@fonz: Can you reproduce this issue on more than one system?
I'll check. But it will be a while.

wblock@ said:
No problem with FreeBSD 9.x. That's with gcc, not Clang.
I know that GCC is the default compiler for 9.x, but it caused internal compiler errors. Hence the switch to Clang.
 
fonz said:
Sure. But if using -j breaks the build, what are you going to do? (Until you have it figured out, that is.)
I totally agree, and the answer is simply: don't use it until, as you also said, the problem is fixed.

wblock@ said:
9-STABLE amd64, r255240 just built for me with -j8.
Same here, on five machines now.

A bit off-topic, @wblock, since you have doc power :D. I'm not sure how relevant the following still is (taken from /usr/src/UPDATING):
Code:
COMMON ITEMS:

        General Notes
        -------------
        Avoid using make -j when upgrading.

Do you think this should be maybe removed since the handbook states that it's safe to use the -j flag?
 
da1 said:
Do you think this should be maybe removed since the handbook states that it's safe to use the -j flag?
I think I might be able to answer this. Just because it's safe to use -j doesn't automatically make it appropriate in all situations. By that reasoning, I think this comment is fine the way it is: it may be safe, but the use of -j is still discouraged when performing an upgrade.

Just my 2 cents on the matter though.
 
What you are referring to are the following sentences:
Code:
While generally safe, there are
        sometimes problems using -j to upgrade.  If your upgrade fails with
        -j, please try again without -j.  From time to time in the past there
        have been problems using -j with buildworld and/or installworld.  This
        is especially true when upgrading between "distant" versions (eg one
        that cross a major release boundary or several minor releases, or when
        several months have passed on the -current branch).

However, I think that the first sentence (from my previous post) directly contradicts the handbook, and this can be (or is, at least for me) confusing.
 
It's vague about what it means by an "upgrade". But the point is clear enough: if it fails with -j, try without.
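
That advice can be sketched as a small shell function. This is only an illustration under stated assumptions: the name `build_with_fallback` and the `MAKE_CMD` override (there so the logic can be dry-run with a stub) are made up here, and it assumes you run it from /usr/src.

```shell
# build_with_fallback JOBS TARGET
# Try "make -jJOBS TARGET" first; if that fails, retry the same target
# without -j, as UPDATING suggests.
# MAKE_CMD is a hypothetical hook for testing; it defaults to plain "make".
build_with_fallback() {
    njobs=$1
    target=$2
    make_cmd=${MAKE_CMD:-make}
    if $make_cmd -j"$njobs" "$target"; then
        echo "parallel build of $target succeeded"
    else
        echo "parallel build of $target failed; retrying without -j" >&2
        $make_cmd "$target"
    fi
}

# Hypothetical usage:
#   cd /usr/src && build_with_fallback 4 buildworld
```

If the serial retry also fails, the problem is probably not -j itself, and the first real error in the log is the thing to look at.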
 
da1 said:
@fonz: Can you reproduce this issue on more than one system?
FWIW, my dual-core netbook just successfully completed make -j3 buildworld for 9-STABLE/i386. I really should stop doing that to the poor little thing and look into cross-compiling instead. But anyway, in short: I can't reproduce the problem on another machine at the moment. So it might just be that particular box.
 
That's my impression too. Seems I might need to migrate the resources to the second machine and start some hardware stress tests or something.
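
One cheap first check, before reaching for a dedicated tester, is to run the same deterministic job many times and see whether the output ever changes. On sound hardware it should not. The sketch below is an assumption-laden illustration: the name `repeat_and_compare` is made up, it runs serially so it loads the machine far less than a -j build, and it is no substitute for a proper memory or CPU test.

```shell
# repeat_and_compare RUNS COMMAND [ARGS...]
# Run COMMAND the given number of times, checksum its combined
# stdout/stderr with cksum(1), and report the first run whose checksum
# differs from run 1. Returns non-zero on a mismatch.
repeat_and_compare() {
    runs=$1
    shift
    ref=""
    i=1
    while [ "$i" -le "$runs" ]; do
        sum=$("$@" 2>&1 | cksum)
        if [ -z "$ref" ]; then
            ref=$sum
        elif [ "$sum" != "$ref" ]; then
            echo "run $i differs from run 1"
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $runs runs identical"
}

# Hypothetical usage: compile one file to assembly on stdout 50 times.
#   repeat_and_compare 50 cc -O2 -S -o - /usr/src/usr.bin/ypwhich/ypwhich.c
```

A mismatch, or a compiler crash on only some runs, would point toward the box rather than the source tree.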
 