C++ program behavior and core dump with g++ and clang

Hi,

uname -a on box 1 gives
FreeBSD xxxxx 12.1-RELEASE-p6 FreeBSD 12.1-RELEASE-p6 GENERIC amd64

and on box 2 gives
FreeBSD xxxxx 12.1-RELEASE-p8 FreeBSD 12.1-RELEASE-p8 GENERIC amd64

When the C++ code is compiled on box 1 using g++8, the resulting binary runs fine. When the same code is compiled on box 2 using g++8, the resulting binary dumps core immediately on startup. Analyzing the core file with gdb points to a function in unused code; I mean that no function or code calls it, but it is part of the library. No change was made to the code. When the same code is compiled with clang, however, the resulting binary doesn't dump core.
So box 2 was upgraded to 12.2, and uname -a now gives
FreeBSD xxxxx 12.2-RELEASE FreeBSD 12.2-RELEASE r366954 GENERIC amd64

The same pattern is seen, i.e. it works with clang-compiled code and dumps core with g++8.

g++7 was also used and the result is the same - dumps core.

Another problem: the code contains socket-related system calls. When compiled with g++, the htons used to assign the port works fine; 443 is shown as 443 in both netstat and sockstat. But when compiled with clang, the ports are different. I understand the randomness of the ports is due to the fact that we have given permission only for 443 via sysctl.conf. The question is, why is there a difference between clang and g++? How do I overcome the ports issue? It is critical for running the program.

--Thanks
 
OK, most likely explanation: The code is invalid, and relies on undefined (perhaps random) behavior. For example, it might be using uninitialized memory.

I don't understand your comment about socket and ports. Is the 443 calculated in the program, and the calculation gives different answers? How do you disable/enable specific ports in sysctl.conf? What you're saying here makes no sense to me, so there must be a lot of details missing, and without those details I can't understand it.

How do you overcome this? You debug the program. That starts by reading it line by line. Do you know which line of your code eventually causes the crash? Do you have a stack trace? I would modify the program to add tracing, printing progress reports as you go. Then see where exactly it crashes, and then work down, adding more and more details to your traces, until you find the problem.

Obviously, the real answer depends crucially on who wrote the code, and whether the author is reachable.
 
I'm afraid your description is just too vague to even attempt a resolution.
When you dump core, you say it points to an unused function? By virtue of it pointing to it, it's not unused.

Regardless, do you understand the code and the point of the crash? Your core dump will show this. Also, one determining factor could be the level of optimisation you're using; whether that differs between GNU and clang is for you to discover.

I think you need to post specific code, textualised core dump info, etc.
 
When C++ code is compiled on box 1 using g++8, the resultant binary runs fine. The same code when compiled on box 2 using g++8, the resultant binary core dumps, that too upon start. Analyzing the core file using gdb points to a function in unused code - i mean no function/code calls it but is part of the library.
This is nonsense. If it is dumping core there then obviously something is calling it.

The only thing that I can guess is that you have ill-defined dependencies between the initialization of your static/global objects. The most likely thing is that this is to do with the C++ constructors (potentially also C with things like __attribute__((constructor)) but I've very rarely seen that in the wild).

The order in which the C++ constructors of static objects are called within a translation unit (source code file) _is_ defined: it is the order in which they appear in that file. However, it is not defined _between_ translation units. There is no guarantee that g++ and clang++ (or any other compiler) will use the same order.

If you can't resolve the problem with GDB and your core dump, then try either Valgrind or address sanitizer.
 
Thanks for the responses. My bad if the post was vague, and my apologies for the late response. The issue is now resolved, but I have no clue how it was resolved. The only difference is that the box was shut down and left unused for a few days; now that it is in use again, the problem is gone. I understand this may sound cryptic or like magic, but it is working now. All parts have been tested and seem to be working fine. No change was made to the sources except a fresh compile, with the same compile flags/options as earlier.

--Thanks.
 
That doesn't sound very reassuring. One of the things about Undefined Behaviour (bugs) is that it is undefined, and can be non-deterministic. So if there is a bug, then your software might crash again tomorrow.

It's also possible that you have some broken dependencies in your build system, and doing a clean build 'fixed' things.
 
Just to be sure, may I enquire whether the code was compiled using the -g option?

If not, the backtrace and the gdb output might not be helpful.
 
Yes, the -g option was used during compilation. The clean build could have solved it, but no changes were made to the system. While I am cautiously monitoring, for now I shall consider this closed.
 