C Named semaphore, uninitialized bytes

ProphetOfDoom · Apr 17, 2022

Hullo, I hope someone can help? I have this C code:

C:

#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>
#include <semaphore.h>
#include <fcntl.h>

int main(void)
{
  errno = 0;
  sem_t *s = sem_open("/lala",O_CREAT,0777,0);
  perror("Our error?");
  sem_close(s);
  sem_unlink("/lala");
}

It "works" and reports no error, but when I run it under valgrind, it says:

Code:

username@freebsd:~/horrorshow $ valgrind --track-origins=yes -s ./a.out
==19415== Memcheck, a memory error detector
==19415== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==19415== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info
==19415== Command: ./a.out
==19415== 
==19415== Syscall param write(buf) points to uninitialised byte(s)
==19415==    at 0x499680A: _write (in /lib/libc.so.7)
==19415==    by 0x4913551: sem_open (in /lib/libc.so.7)
==19415==    by 0x2019EF: main (in /usr/home/username/horrorshow/a.out)
==19415==  Address 0x7fc0003fc is on thread 1's stack
==19415==  in frame #1, created by sem_open (???:)
==19415==  Uninitialised value was created by a stack allocation
==19415==    at 0x49132ED: sem_open (in /lib/libc.so.7)
==19415== 
Our error?: No error: 0
==19415== 
==19415== HEAP SUMMARY:
==19415==     in use at exit: 63 bytes in 3 blocks
==19415==   total heap usage: 6 allocs, 3 frees, 230 bytes allocated
==19415== 
==19415== LEAK SUMMARY:
==19415==    definitely lost: 0 bytes in 0 blocks
==19415==    indirectly lost: 0 bytes in 0 blocks
==19415==      possibly lost: 0 bytes in 0 blocks
==19415==    still reachable: 63 bytes in 3 blocks
==19415==         suppressed: 0 bytes in 0 blocks
==19415== Rerun with --leak-check=full to see details of leaked memory
==19415== 
==19415== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==19415== 
==19415== 1 errors in context 1 of 1:
==19415== Syscall param write(buf) points to uninitialised byte(s)
==19415==    at 0x499680A: _write (in /lib/libc.so.7)
==19415==    by 0x4913551: sem_open (in /lib/libc.so.7)
==19415==    by 0x2019EF: main (in /usr/home/username/horrorshow/a.out)
==19415==  Address 0x7fc0003fc is on thread 1's stack
==19415==  in frame #1, created by sem_open (???:)
==19415==  Uninitialised value was created by a stack allocation
==19415==    at 0x49132ED: sem_open (in /lib/libc.so.7)
==19415== 
==19415== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

The only lead I have is that if the semaphore already exists when the program starts, the valgrind error goes away. What am I doing wrong?
Thank you.

ProphetOfDoom · Apr 19, 2022

Okay so, I know there are some really smart people on this forum. So can I take it that everyone is as mystified as I am about this? I’m starting to think it might be a bug in valgrind as it’s not the most stable piece of software and is always leaving core dumps. I seriously doubt it’s a bug in libc as it would have been fixed decades ago.
I’ve read through the semaphores man page at least three times now and can’t see anything to suggest I’m doing anything wrong. But then again, a bad workman blames his tools...

Mathieu · Apr 19, 2022

Valgrind might be right here. I did a quick check and the _padding field of the "tmp" sem_t stack variable in _sem_open() isn't initialized before writing. IIUC the field is never actually used (even after reading it back) so "using" uninitialized memory arguably isn't a bug in this case. But it probably should be fixed for other reasons...

Diff:

diff --git i/lib/libc/gen/sem_new.c w/lib/libc/gen/sem_new.c
index 409f4ce76608..c059523db76b 100644
--- i/lib/libc/gen/sem_new.c
+++ w/lib/libc/gen/sem_new.c
@@ -215,9 +215,13 @@ _sem_open(const char *name, int flags, ...)
             goto error;
     }
     if (sb.st_size < sizeof(sem_t)) {
-        tmp._magic = SEM_MAGIC;
-        tmp._kern._count = value;
-        tmp._kern._flags = USYNC_PROCESS_SHARED | SEM_NAMED;
+        tmp = (sem_t){
+            ._magic = SEM_MAGIC,
+            ._kern = {
+                ._count = value,
+                ._flags = USYNC_PROCESS_SHARED | SEM_NAMED,
+            },
+        };
         if (_write(fd, &tmp, sizeof(tmp)) != sizeof(tmp))
             goto error;
     }

ProphetOfDoom · Apr 19, 2022

Well that is interesting. Thanks Mathieu. I suppose if the field is not used it could be argued it’s a performance thing.
Valgrind should probably have a suppression for this so it doesn’t confuse people.

Paul Floyd · Apr 19, 2022

AlexanderProphet said:
Okay so, I know there are some really smart people on this forum. So can I take it that everyone is as mystified as I am about this? I’m starting to think it might be a bug in valgrind as it’s not the most stable piece of software and is always leaving core dumps. I seriously doubt it’s a bug in libc as it would have been fixed decades ago.
I’ve read through the semaphores man page at least three times now and can’t see anything to suggest I’m doing anything wrong. But then again, a bad workman blames his tools...

Please report any bugs that you find with Valgrind, preferably at https://bugs.kde.org, but the FreeBSD Bugzilla will also reach me.

Since I upstreamed the FreeBSD port of Valgrind (3.18.1), it should be reasonably stable on amd64.

Paul Floyd · Apr 19, 2022

AlexanderProphet said:
Well that is interesting. Thanks Mathieu. I suppose if the field is not used it could be argued it’s a performance thing.
Valgrind should probably have a suppression for this so it doesn’t confuse people.

Valgrind checks that the memory passed to the write syscall (in this case) is valid.

It would be possible to suppress errors containing

_write (in /lib/libc.so.7)
sem_open (in /lib/libc.so.7)

at the top of the stack.

However, that looks like a bad idea, as it would mask any errors due to the contents of "value" being invalid.

So I won't add it to the default Valgrind suppression file.

ProphetOfDoom · Apr 19, 2022

Hi Paul,
Sorry for sounding so unappreciative; it was wrong of me. I really appreciate your work on valgrind. It has got me out of some real programming difficulties on many occasions. Do you still want me to report this as a bug?

Paul Floyd · Apr 19, 2022

IMO it's a bug in libc. I can't see an easy way to fix this in Valgrind. sem_open isn't a syscall (unlike on Darwin), so I can't deal with it at the syscall interface.
Like I said, adding a suppression would risk false negatives.

You can always use your own suppression file as a workaround.
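
For anyone who wants to try that workaround, a minimal sketch of such a suppression file follows. The suppression name and the file name sem_open.supp are made up for illustration; the frames are taken from the stack in the report above. It has the drawback already mentioned: it will also hide a genuinely uninitialised "value" passed to sem_open.

```
{
   freebsd_libc_sem_open_write_padding
   Memcheck:Param
   write(buf)
   fun:_write
   fun:sem_open
}
```

It would then be used along the lines of `valgrind --suppressions=./sem_open.supp ./a.out`. Running with `--gen-suppressions=all` makes Valgrind print a ready-made suppression for each error it reports, which is a safer starting point than writing one by hand.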

grantcm · Apr 30, 2022

Based on this https://sourceware.org/git/?p=glibc.git;a=commit;h=05598a0907cad1350962e89b781215209a785d92, it looks like it is a bug in libc. If you use a version after 2.26, it should be fixed.

Paul Floyd · May 2, 2022

That's glibc, but it does confirm that this ought to be fixed in FreeBSD libc.