C Moving from 32-bit to 64-bit

Hi,
I compiled some source code under FreeBSD 9.2 64-bit that was meant for 32-bit. The build succeeded using G++ 4.8 with C++11 and libc++.

But I have a problem: the application is not working properly, and I can't connect to my app. Is there a problem with moving from 32-bit to 64-bit? Is there something I should modify to make it work? I mean, the source built fine, so I don't get it. Thank you.
 
There are bad assumptions in the code somewhere. A common one I've seen is code assuming a long is 32 bits, which breaks when long is 64 bits on a 64-bit platform. Both variants may be technically correct and compile, but that doesn't mean there are no issues at runtime. I would contact whoever develops or maintains the software and have them fix it. Realistically, 64-bit is common enough that whoever developed your application should have accounted for it.
 
One of the worst things you can do is take a pointer, treat it as an int (by an explicit cast), do calculations on it, and then put the result back into a pointer variable. This only works when sizeof(void *) equals sizeof(int). At least on amd64 that assumption does not hold: an int is 32 bits but pointers are 64 bits. I think compilers will catch most errors of this type nowadays, but if you use explicit casts the compiler might not actually give you a warning.
 
At least on amd64 that assumption does not hold: an int is 32 bits but pointers are 64 bits.
In fact, although for most C/C++ implementations on amd64 this is indeed the case, it is not required; int could be 64 bits. The C and C++ standards say that a plain int should have the natural size suggested by the architecture, so that would be a very straightforward choice. Apart from that there are few requirements on the integer sizes (char <= short <= int <= long, along with some minimum ranges). Leaving int at 32 bits on most 64-bit implementations was probably done to avoid a whole class of incompatibilities.

On the other hand, casting between any integer type and a pointer is always "smelly" -- there's no guarantee that the size of a pointer corresponds to any integer type. AFAIR, this was done for architectures where pointers include e.g. some segment bits.

Cataclismo, in a nutshell, check the code and/or file a bug report. Normally, these issues are fixed easily (although there are lots of possible erroneous assumptions leading to these). One notable exception I've come across was code for emulating a Motorola 68000 CPU. Addresses (32bit for this 16bit CPU) were mapped directly, using pointer offsets, which creates huge problems when porting to a 64bit platform. So, if the code you tried to compile explicitly states it's intended for 32bit -- maybe there's a good reason for it, e.g., it is emulating some piece of hardware?
 
One notable exception I've come across was code for emulating a Motorola 68000 CPU. Addresses (32bit for this 16bit CPU) were mapped directly, using pointer offsets, which creates huge problems when porting to a 64bit platform.
The 68000 is a 32 bit CPU but was only capable of addressing 24 bits. All registers were 32 bit though. ;)

(I did a lot of 68000 assembly back in the day)
 
The 68000 is a 32 bit CPU but was only capable of addressing 24 bits. All registers were 32 bit though. ;)

Oh my fault ... so it was more like a "hybrid", only the data bus was 16bit ... well, I stopped coding assembly after the MOS 6502 :)

But I think for my example of "special" C software that's hard to port to a 64bit platform, I was ... close enough ;) Still thanks for the clarification!
 
In fact, although for most C/C++ implementations on amd64 this is indeed the case, it is not required; int could be 64 bits. The C and C++ standards say that a plain int should have the natural size suggested by the architecture, so that would be a very straightforward choice. Apart from that there are few requirements on the integer sizes (char <= short <= int <= long, along with some minimum ranges). Leaving int at 32 bits on most 64-bit implementations was probably done to avoid a whole class of incompatibilities. ...

The most important property in this respect is the data model of the platform in question. FreeBSD (amd64), like most Unix-like platforms, uses LP64. That means long and pointers are 64 bit, while short and int are 16 and 32 bit respectively. Win64 uses LLP64 though; there even long stays 32 bit, and only long long and pointers are 64 bit.

Let's assume the 32-bit code uses the generic C types, and let's compare the data models:
Code:
Type:   char    short    int    long    long long    pointer
ILP32    8       16      32      32        64          32
LP64     8       16      32      64        64          64
Only the long and pointer types differ in size; the rest is the same.

In order to make 32-bit code (using generic C types) LP64-compatible while keeping it working on 32-bit machines, the first thing to do is to replace every occurrence of type long with either long long or int, depending on the width the code actually needs; both of those are invariant between said data models.
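For example (the variable names here are invented), a sketch of that replacement:

```c
/* Before: 'long' is 32 bits on ILP32 but 64 bits on LP64, so any code
 * that relies on its exact width behaves differently after the move:
 *
 *     long counter;
 *
 * After: both replacement types have the same size under either model. */
int counter;             /* if 32 bits have always been enough      */
long long big_counter;   /* if the full 64 bits are actually needed */
```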

If the code in question places the content of a pointer into an int variable, then the type of that variable should be changed to intptr_t.

In formatted output of size_t and ssize_t, place a z in front of the respective format specifier, i.e. %zu instead of %u for size_t and %zd instead of %d for ssize_t. This works on both 32- and 64-bit machines.

If the code reads/writes binary data (structures) from/to external devices, then a canonical format needs to be defined and some glue code needs to be developed which converts between the canonical format and the respective internal one.

For new programs it is best to utilize the explicit integer types from <stdint.h>.
 
The most important property in this respect is the data model of the platform in question. FreeBSD (amd64), like most Unix-like platforms, uses LP64. That means long and pointers are 64 bit, while short and int are 16 and 32 bit respectively. Win64 uses LLP64 though; there even long stays 32 bit, and only long long and pointers are 64 bit.
I didn't want to go that deep into details, but, for the sake of completeness: there's a third 64-bit data model (ILP64) where even int is 64 bit. It basically comes down to the fact that ONLY the size of a pointer is dictated by the hardware architecture; everything else depends on the implementation.

If the code in question places the content of a pointer into an int variable, then the type of that variable should be changed to intptr_t.
That's probably the best you can do if you absolutely MUST store a pointer in an integer. On some "exotic" architectures, intptr_t could actually be bigger than void* because there is no integer matching the size of a pointer exactly. So, better avoid storing pointers in integers altogether.

Even converting between void * and void (*)(void) is generally unsafe; the two could have different sizes. Unfortunately, dlsym() forces you to do this. The cleaner alternative dlfunc() (see dlopen(3)) isn't available on all platforms, notably Linux.

AFAIR, the only guarantee given by intptr_t is being big enough to hold a data pointer (void *) without truncation; strictly speaking, the C standard doesn't extend that guarantee to function pointers, although POSIX systems provide it in practice.

For new programs it is best to utilize the explicit integer types from <stdint.h>.
Agreed, with one restriction: IFF you need exact-sized integers. Stick with the plain integer types where it doesn't matter if they're larger than the minimum required.

There's also a catch with using <stdint.h> in C (not C++) programs: It's not available prior to C99. This may sound ridiculous, but MSVC doesn't support parts of C99, including <stdint.h>. So, when writing a C program using this and wanting to be portable to MSVC, you have to do some preprocessor magic and define your own u?int_\d\d_t for Win32 and Win64. BTDT.
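That preprocessor magic might look roughly like this (a sketch; the version check marks roughly where MSVC gained <stdint.h>, and the __intNN types are MSVC built-ins):

```c
/* Hypothetical portability header: use the real <stdint.h> where it
 * exists, and fall back to MSVC's built-in sized types otherwise. */
#if defined(_MSC_VER) && _MSC_VER < 1600   /* pre-VS2010: no <stdint.h> */
typedef signed __int8    int8_t;
typedef signed __int16   int16_t;
typedef signed __int32   int32_t;
typedef signed __int64   int64_t;
typedef unsigned __int8  uint8_t;
typedef unsigned __int16 uint16_t;
typedef unsigned __int32 uint32_t;
typedef unsigned __int64 uint64_t;
#else
#include <stdint.h>
#endif
```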

I think after all these technical details and catches of the C language, it's quite obvious that subtle bugs related to integer and pointer sizes are quite numerous. I'd still say they SHOULD be easy to fix for the original authors (given some exceptions like CPU emulators written in C), but it can be really hard to spot them in third party code ....
 
I have a piece of legacy code I've been dragging around since 4.2BSD on a VAX-11/750. It was a commercial product from a vendor that has gone out of business, and the C is obfuscated enough that it is a nightmare to work on. It doesn't run properly if compiled natively on FreeBSD amd64, but it does behave if it is compiled for the 32-bit emulation provided by the kernel config:
Code:
options COMPAT_FREEBSD32
Building in 32-bit mode on amd64 is done by adding the following to the software's Makefile compiler flags:
Code:
-m32 -DCOMPAT_32BIT -L/usr/lib32 -B/usr/lib32
 
Oh my fault ... so it was more like a "hybrid", only the data bus was 16bit ... well, I stopped coding assembly after the MOS 6502 :)

But I think for my example of "special" C software that's hard to port to a 64bit platform, I was ... close enough ;) Still thanks for the clarification!
Then there was the 68008, which actually had an 8-bit data bus. The registers in all of the 68k CPUs were 32 bits, but the 24-bit limit on the 68000's address bus led to the 'clever idea' (cough, cough) of a certain software company to store some status information in that uppermost byte, paving the way to havoc once the 68020 was used and was fed with more memory.

I loved these things, and still do. Writing assembler for them was about as straight forward as some high-level language.
 