Details on CPUTYPE? Flags

dcbdbis · Mar 23, 2015

Good Morning All,

This post contains an email trail that may enlighten others on the nature of "CPUTYPE?=" flags.

This post comprises an email received from the LLVM/Clang dev mailing list, an email from wblock, along with a vernacular summary by myself.

//-------------Begin wblock

This is very interesting. It also explains how "native" is a bit safer, being less specific.

Yes, it would be good to post this to the forums. I'd say in General or possibly System Hardware.

Thanks!

//-----------End wblock

//-----------Begin Dave

Hello wblock,

I posted to a mailing list for LLVM/Clang and asked about the difference between "march=native" and "march=bdver2" (or any other arch). As you know, the value of CPUTYPE? gets passed to Clang/LLVM as the value for "march=".

Here is the response I received:

//------------------------Begin LLVM/CLANG Dev Response

By using march=native you leave it to the compiler's intelligence to figure out the underlying arch.

The compiler tries to identify the underlying arch using the cpuid flags. The compiler then tries to generate the best code for that arch.

By using march=<archname>, the user takes the responsibility of advising the compiler to generate the best code for the arch that they have selected.

For example,

· Bdver1 architecture has fma4 support.

· Bdver2 architecture has fma3 and fma4 support.

If you use march=bdver1 on a bdver2 machine, the compiler generates code which doesn't have fma3.

If you use march=native on a bdver2 machine, the compiler may generate both fma3 and fma4 code.

-Ganesh
//-----------------------End LLVM/CLANG Dev Mail list response
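
Ganesh's example can be modelled as plain set arithmetic. The sketch below is only an illustration: the `MARCH_FEATURES` table and the `missed_features` helper are hypothetical names, and the table lists just the two FMA extensions mentioned in the email, not the full feature set of either CPU.

```python
# Illustrative only: this table is limited to the FMA extensions named in
# the email and is NOT a complete description of either architecture.
MARCH_FEATURES = {
    "bdver1": {"fma4"},
    "bdver2": {"fma3", "fma4"},
}


def missed_features(target_march, host_march):
    """Return the host features left unused when building for target_march."""
    return MARCH_FEATURES[host_march] - MARCH_FEATURES[target_march]


if __name__ == "__main__":
    # Building with march=bdver1 on a bdver2 machine leaves fma3 unused.
    print(sorted(missed_features("bdver1", "bdver2")))
    # Building with march=bdver2 on a bdver2 machine leaves nothing unused.
    print(sorted(missed_features("bdver2", "bdver2")))
```

The first call reports `fma3` as the extension that march=bdver1 gives up on a bdver2 host; the second reports nothing.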

//---------------Vernacular by Dave

Other responses from the same mailing list made it clear that, on any given arch, the "native" flag targets the instruction set extensions the processor supports (MMX, SSE, etc.), NOT the specific microarch family itself (cache, CPU core queues, and that type of thing).

In other words, "native != bdver2" and "native != corei7".

Basically, the compiler simply builds with the feature flags it sees fit for the detected CPU, but does not perform any special optimization to take advantage of that CPU's microarch (Nehalem, SandyBridge, Silvermont, etc.).

When you specify a particular CPU arch (corei7, bdver2, or whatever), the compiler not only uses all the special functionality that CPU provides, but also compiles to take advantage of that CPU's microarch: cache, pipelines, core queues, etc.

In summary, building for a specific arch ("march=corei7" or "march=bdver2") is the only way to get binaries that take advantage of ALL of the specified CPU's features and its microarch.

No one was surprised that "bdver2" gave me a measurable improvement over "native".

Just thought I'd pass along the information...

Do you think this would be something that I should post on the FreeBSD forums for others? If so, which forum topic should I post it under?

Sincerely and respectfully,

Dave

//--------------End Cut-N-Paste

worldi · Mar 24, 2015

dcbdbis said:
No one was surprised that "bdver2" gave me a measurable improvement over "native".

Well, I for one am surprised. Do you have any numbers? Did you objdump the binaries and diff them?

This sounds like there's something wrong with clang's CPU detection. It should be able to figure out every nitty-gritty detail of the CPU and act accordingly, e.g. it should be able to figure out the little differences between corei7-class CPUs. If it fails to do so, then "-march=native" is basically useless.
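
One way to run the comparison suggested above is to compile the same source file twice and diff the disassembly. The following is a rough sketch, assuming `clang` and `objdump` are on the PATH; the file name `test.c`, the object file names, and all helper functions are placeholders, not anything taken from this thread.

```python
import difflib
import subprocess


def compile_cmd(march, src, obj):
    """Build the clang command line for one -march value."""
    return ["clang", "-O2", f"-march={march}", "-c", src, "-o", obj]


def disassemble(obj):
    """Return the disassembly of obj as text, using objdump -d."""
    result = subprocess.run(["objdump", "-d", obj], check=True,
                            capture_output=True, text=True)
    return result.stdout


def diff_text(a, b, name_a, name_b):
    """Return a unified diff of two text listings as a list of lines."""
    return list(difflib.unified_diff(a.splitlines(), b.splitlines(),
                                     fromfile=name_a, tofile=name_b,
                                     lineterm=""))


def compare(src, march_a="native", march_b="bdver2"):
    """Compile src with both -march values and diff the disassembly."""
    listings = []
    for march in (march_a, march_b):
        obj = f"{src}.{march}.o"  # placeholder naming scheme
        subprocess.run(compile_cmd(march, src, obj), check=True)
        listings.append(disassemble(obj))
    return diff_text(listings[0], listings[1], march_a, march_b)


if __name__ == "__main__":
    # Print the compile commands only; call compare("test.c") to run them.
    for march in ("native", "bdver2"):
        print(" ".join(compile_cmd(march, "test.c", f"test.c.{march}.o")))
```

Because `objdump -d` prints the object file name in its header, that line will always differ. If nothing else shows up in the diff, both flags produced the same code for that file.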
