dumb 32 vs 64bit questions...

Sorry, I'm sure these are dumb questions...

I've had a fair bit of experience with FreeBSD 32-bit Intel, but want to make my next machine an AMD64 FreeBSD machine. However, I've got important programs written for 32-bit Unix.

1) Will these programs run unmodified on AMD64 FreeBSD if they were compiled on 32bit FreeBSD?
2) Is the C compiler on AMD64 FreeBSD set up for compiling 32-bit code as well, such that it should compile with minimum or zero changes?
3) Are there significant differences in what ports are available for AMD64 FreeBSD?
4) Any differences in kernel/device driver support for hardware?

thanks...
 
1) No. Well, an amd64 CPU can run i386 binaries. I don't know how running 32-bit binaries on a 64-bit machine works in FreeBSD; on Linux, you would either install 32-bit libraries in {/usr,}/lib32 or install a 32-bit world inside a chroot, and I would expect a similar procedure on FreeBSD. Or you can install FreeBSD i386 on your amd64 machine and run your binaries unmodified, but then you won't take advantage of all the features of amd64.

2) I'm not sure I understand. With "-m32" (IIRC), gcc will produce binaries suitable for a 32-bit machine, but this won't be the default. Programs that don't assume a specific width for data types, and don't do things like storing a pointer inside an int will compile unmodified. Programs that do are already broken, but you just don't know it yet.

3) 4) As far as I'm aware, no.
 
An amd64 installation has two things by default:

In the kernel:
Code:
options         COMPAT_FREEBSD32        # Compatible with i386 binaries

In the base system:
/usr/lib32/*
/libexec/ld-elf32.so.1
 
monkeyboy said:
2) Is the C compiler on AMD64 FreeBSD set up for compiling 32-bit code as well, such that it should compile with minimum or zero changes?

I'm not sure if you are asking:

A) Can it compile source code which assumes a 32-bit environment into native 64-bit binaries.

or

B) Can it compile source code and produce 32-bit binaries which will run on both amd64 and i386.

The answer to A is "maybe" - a lot depends on how clean the source code is, as well as how old it is. Where you normally run into trouble is with code that assumes a certain size for "int" and so forth. One hint is if the code already compiles on both big-endian and little-endian systems, which would indicate that somebody at least thought about different architectures when writing it.

The answer to B is "probably". I find that adding:
Code:
-m32 -DCOMPAT_32BIT -L/usr/lib32 -B/usr/lib32
to the cc command line will generate 32-bit binaries and doesn't trigger any of the issues in A. I'm not sure if those binaries will actually run on an i386-only system.

I have a multi-100K-line monstrosity that I have been dragging around from system to system from the early 4BSD/VAX days, and I can compile, link, and run it using the above steps.
 
Thanks, you guys, for all the input...

B) Can it compile source code and produce 32-bit binaries which will run on both amd64 and i386.
I pretty much mean 'B'... I've got a big 32-bit program (actually an interactive language/incremental compiler) that I need to have work well under FreeBSD AMD64.

I will eventually port it to native 64-bit, but it's going to take work, I'm sure. It's not that easy to write a truly portable C-based interactive interpreter that runs on 8-bit, 16-bit and 32-bit CPUs using C from the 1980s... initial development was on PDP-11 Unix V6.

And yes, it runs on Sun SPARC and 68K, so it handles endianness okay...
 
Thanks again for the help in compiling for 32-bit in AMD64.

Just a followup. I managed to also get the program to compile 64-bit without TOO much modification, mostly things related to switching between sgttyb and termios, then changing defs of long and int.

But then comes the second level of porting, with possibly some fairly subtle differences that will take some thinking and I suppose "regression" testing.

One problem area I discovered is printf(3). There doesn't seem to be a portable way to tell printf() to do the right thing for 16-bit, 32-bit and 64-bit ints ACROSS 16-bit, 32-bit and 64-bit platforms and C compilers. For example, the format spec %ld means signed long int, but that is 32 bits in the 16- and 32-bit worlds and 64 bits in the 64-bit world. So it seems I have to craft #ifdefs to choose different printf formats for different compile/run-time environments. Differences arise when, for example, you try to print the value of (~1), depending on the size of the integer.

I guess the point/issue/question is equivalent to asking for C99 printf() that handles int32_t, etc correctly.

Looking around, I see stuff like fprintf(stdout, "%"PRIu32" \n", var); with PRIu32 defined in <inttypes.h>, but that's doing just what I'm doing and doesn't really ask anything of printf(); it's just a bunch of #defines. Also, I don't think it will do the right thing for a 32-bit program in a 64-bit world. I guess there must be a different inttypes.h for 32-bit compilation. (Is there? Nothing in /usr/lib32.)
 
monkeyboy said:
...
I guess the point/issue/question is equivalent to asking for C99 printf() that handles int32_t, etc correctly...
...

For FreeBSD and most other *x systems, which employ the LP64 data model (= I32LP64), everything boils down to three things:

  1. Avoiding the long data type, since it is the only integer type that differs in size between 32- and 64-bit systems.
     In my programs, I simply replaced any instance of long by int (and in printf() formatting, %ld by %d) when I wanted to stay with 32-bit integers, and I replaced long by long long / %lld when I really wanted 64-bit integers.
  2. Not storing pointers into integers.
  3. Using the #pragma pack(n) directive when typedefing structs that you need to read/write in binary form from/to any device.

By adhering to these few things, my programs compile and run well on 32 and 64 bit systems without any #ifdef.

Furthermore, some of my 32-bit/64-bit programs can read ancient, 25-year-old binary data created by Turbo Pascal programs.

Note: MS Windows employs the LLP64 (= IL32P64) model, i.e. on Windows long is still 32 bits. Anyway, avoiding long doesn't hurt there either, and it is a good idea at least for improving portability.

Best regards

Rolf
 
rolfheinrich said:
...
In my programs, I simply replaced any instance of long by int (and in printf() formatting %ld to %d), when I wanted to stay with 32 bit integers, and I replaced long by long long / %lld, when I really wanted to have 64bit integers.
Good advice. Better would be to use defined types which can be changed in one place, or to use the predefined _8/_16/_32/_64 types. The size of int is implementation-defined in C; it can be 16 to 64 bits without any warning. Loops going up to 70000 can sometimes take a very long time, if you get my meaning ;)
Not storing pointers into integers.
Good advice.
Using the #pragma pack(n) directive when typedefing structs that you need to read/write in binary form from/to any device.
...
Argh! Here I just got mugged down memory lane.

Without going into details, this can get you into serious hot water, as some processors tolerate unaligned accesses to floating-point or multibyte values and others do not. When possible, provide access-layer code that does not depend on such features. You may use packed data on one platform where it works, but where it doesn't there needs to be a fallback.
 
Crivens said:
... Better would be to use defined types which can be changed in one place or use the predefined _8/_16/_32/_64 types ...

I assume that you mean the exact-width integer types of the ISO/IEC 9899:1999 spec, defined in <stdint.h>: int8_t, int16_t, int32_t, int64_t. I agree for new projects, where you start right away with [-std=c99]. The OP was talking about a huge code base:

monkeyboy said:
... that runs on 8bit, 16bit and 32bit CPUs using C from the 1980's... initial development was on the PDP-11 Unix V6.

I figured that switching everything to C99 might be a more involved option than sedding "long" to "int" ;-)


Crivens said:
...Without going into details, this can get you into serious hot water as some processors tolerate unaligned accesses to floating point or multibyte values - and others do not.

#pragma pack(2) works well for mc68k, ppc, i386, and x86_64

Of course, with any programming you can get into deep water if you do not know exactly what you are doing and why.

My advice there was meant to complete a list of issues that I came across when converting code that had evolved over the decades from 32 to 32/64 bit. The binary data had been written long ago, and using #pragma pack(2) I was able to read it in without issues.

Nowadays, I write out data in different ways.

Best regards

Rolf
 
rolfheinrich said:
For FreeBSD and most other *x systems, which employ the LP64 data model (= I32LP64), everything boils down to three things:

  1. Avoiding the long data type, since it is the only integer type that differs in size between 32- and 64-bit systems.
     In my programs, I simply replaced any instance of long by int (and in printf() formatting, %ld by %d) when I wanted to stay with 32-bit integers, and I replaced long by long long / %lld when I really wanted 64-bit integers.

By adhering to these few things, my programs compile and run well on 32 and 64 bit systems without any #ifdef.

Sure, but the program I am referring to was written starting in the 16-bit world, when the only way to ask for a 32-bit int was "long". And I need the program to still compile in the 16-bit world. Also, I have to handle shorts (16-bit ints) correctly as well. And since the program itself is a programming-language compiler/interpreter where data types have specific sizes, and its binary data needs to port across the 16-, 32- and 64-bit worlds, all of this needs to be "tight".

Is there a C99 for the 16-bit world (like DOS)? Also, it doesn't look like FreeBSD comes set up for 32-bit compilation on AMD64 with proper <stdint.h> and <inttypes.h> headers.

I vaguely recall the debate about whether to make longs on AMD64 32-bit or 64-bit. I think I would have voted to keep longs at 32 bits (and shorts at 16 bits). The type int was always the one that was not guaranteed to be a particular size and was generally the native word length of the CPU. It is weird to keep ints at 32 bits on AMD64 but change longs. I'm not bothered by the "longs shorter than ints" issue.
 
monkeyboy said:
... And I need the program to still compile in the 16-bit world... Also I have to handle shorts (16-bit ints) correctly as well...

If you need the same code base to compile on 16/32/64-bit machines, and you would like to get away with minimal code changes, then perhaps the following is an option:

Instead of replacing "long" with "int", you would replace "long" with "int32".

Then in one central header file you would make the following definitions:

Code:
#if __LP64__
#define int32 int
#else
#define int32 long
#endif
 
rolfheinrich said:
If you need the same code base to compile on 16/32/64-bit machines, and you would like to get away with minimal code changes, then perhaps the following is an option:

Instead of replacing "long" with "int", you would replace "long" with "int32".

Then in one central header file you would make the following definitions:

Code:
#if __LP64__
#define int32 int
#else
#define int32 long
#endif

Yes, it's what I already did. It doesn't solve the printf() problem, though. I needed similar #ifdefs to use the right printf format strings for the different types.
 
monkeyboy said:
Yes, it's what I already did. It doesn't solve the printf() problem, though. I needed similar #ifdefs to use the right printf format strings for the different types.

If it were only 32/64-bit systems, you could simply get away with %d instead of %ld for formatting 32-bit integers.

On 16-bit systems, you need %ld for formatting 32-bit integers, and indeed you need some sort of compile-time switch for this. So, if you absolutely need to stay with the same code base on every platform, then you would best achieve your goals, i.e. with no performance hit on any platform, with the construct that you already mentioned:

monkeyboy said:
... looking around, I see stuff like: fprintf(stdout, "%"PRIu32" \n", var); ...

Code:
#if (defined(__this16bitCC__) || defined(__that16bitCC__))
#define int32 long
#define PRI32 "ld"
#define PRU32 "lu"
#else
#define int32 int
#define PRI32 "d"
#define PRU32 "u"
#endif
 
rolfheinrich said:
I assume that you mean the exact-width integer types of the ISO/IEC 9899:1999 spec, defined in <stdint.h>: int8_t, int16_t, int32_t, int64_t. I agree for new projects, where you start right away with [-std=c99]. The OP was talking about a huge code base:
It was not meant as criticism, but if you are sed-ing the source anyway, then why not use the correct types? And I meant these defines, yes. Currently I am not at my box, so I face limitations in information availability. :(
I figured that switching everything to C99 may be perhaps a more involved option than sedding "long" by "int" ;-)
Maybe it would be worthwhile to try to refactor the code base with respect to this using Eclipse or some other capable tool? It does not mean you have to keep using the tool afterwards, but I found the power to smartly change types across a code base pretty convenient.
#pragma pack(2) works well for mc68k, ppc, i386, and x86_64

Of course, with any programming you can come into deep water, if you do not know exactly what you are doing and for what reason.
True.
But I am sorry that I cannot go into details here about why this triggered me; it has to do with weeks of pulled hair, cursing, and a bug report being closed because "our tests work", those tests having been run on i386, which tolerates badly aligned floating point. That is pretty frustrating when you can demonstrate that they urgently need to change something, and you use a machine that traps such accesses.

Again, it was not meant as an insult or anything, just a dark flashback. ;)

Best Regards
 
Crivens said:
It was not meant as criticism, but if you go sed-ing the source anyway then why not use the correct types then? And I meant these defines, yes.

There was no need to sed, as the data structures that are type/size sensitive were already written as #defines in the original code base. It was the printf formats that were a little less obvious and a little more scattered, but even those weren't that hard or extensive. I do hope to keep a single code base for the variety of 16-, 32- and 64-bit platforms -- been successful up 'til now (20+ years, first version written in 1987). It's the older 16- and 32-bit compilers that make it challenging, but even the 16-bit versions are still in very active use.
 
Maybe you also need to define the format part of the printf as macros and let the preprocessor's string concatenation join them; that would solve the problem. But it will take some work on every printf/scanf line.
 