C How to catch errors when writing in C [popen() issue]

I am trying to write data in C with fputs(), and I need to get an error when the destination doesn't exist.
This is my code:
Code:
#include <stdlib.h>
#include <unistd.h>
#include <time.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
  FILE *wp;
  int i, j;

  if((wp = popen("/usr/crap/bogus", "w")) == NULL) {
    printf("popen failed\n");
    exit(1);
  }
  printf("popen okay\n");

  for(i = 1; i < 1000; i++)
    if((j = fputs("crapcrapcrap\n", wp)) == EOF) {
      printf("fputs error: EOF\n");
      exit(1);   
    } else
      printf("fputs #%d okay %d chars\n", i, j);
 
  printf("pclose: \n");
  i = pclose(wp);
  printf("pclose result=%d\n", i);

  exit(0);
}

And this is the output:
Code:
$ ./a.out
popen okay
fputs #1 okay 13 chars
fputs #2 okay 13 chars
fputs #3 okay 13 chars
fputs #4 okay 13 chars
...
fputs #31 okay 13 chars
fputs #32 okay 13 chars
fputs #33 okay 13 chars
sh: /usr/crap/bogus: not found
fputs #34 okay 13 chars
fputs #35 okay 13 chars
fputs #36 okay 13 chars
...
fputs #312 okay 13 chars
fputs #313 okay 13 chars
fputs #314 okay 13 chars
fputs #315 okay 13 chars
$

And here the program just disappears without reaching exit() and without a message.
How can I catch the error condition in code?
 
Your problem is popen(3). Quoting the manual:

Code:
     The command argument is a pointer to a null-terminated string containing
     a shell command line.  This command is passed to /bin/sh using the -c
     flag; interpretation, if any, is performed by the shell.
[...]
     The popen() function returns NULL if the fork(2) or pipe(2) calls fail,
     or if it cannot allocate memory.

Executing the shell will always work. So, just don't use popen(3) if you need better error checking. Use fork(2), pipe(2), dup2(2) and execve(2) instead.

(personal note, I think popen() is a massively stupid function...)
 
Your problem is popen(3).
Yeah, I like Your style. But popen() is easy (for people like me who don't have their own C library and use it every day. ;)).

Quoting the manual:
Yeah, read on:
Failure to execute the shell is indistinguishable from the shell's
failure to execute command, or an immediate exit of the command. The
only hint is an exit status of 127.

So it should work. (I don't care about distinguishing these, I just want to receive that exit status - anywhere. Currently it is nowhere.)
Fancily, when I don't write to the handle, things do work:

Code:
$ ./a.out
popen okay
pclose: 
sh: /usr/crap/bogus: not found
pclose result=32512
$

Executing the shell will always work.
Yes, and after the execution fails, the shell becomes a zombie process, while the cooked file writer happily fills the buffer, until it is full...

So, just don't use popen(3) if you need better error checking. Use fork(2), pipe(2), dup2(2) and execve(2) instead.
Do you see my painful face?
 
Okay, working thru this one now... let's see if that does better.
 
I don't know what you're trying to achieve. In general popen is a source of many issues. @zirias made a good point, you should handle it with pipe, fork and exec (they come from the same library as popen, you don't need a custom library for that).

For the sake of popen though, what is your goal? You're trying to open a process and push 13 bytes 1000 times (you are not executing this command 1000 times, if that was the goal). As sh does exist, this call (popen) is successful. The error you see is due to the pipe being full, not because the program doesn't exist.
Truss will show you:
Code:
11211: write(4,"crapcrapcrap\ncrapcrapcrap\ncrap"...,4096) ERR#32 'Broken pipe'
 
fputs, does that not return EOF or some other indication of an error?
Typically one would do something like if (fputs() != EOF) then "all good" else "oh crap" endif
One also typically would look at errno on a failure.
 
fputs, does that not return EOF or some other indication of an error?
Typically one would do something like if (fputs() != EOF) then "all good" else "oh crap" endif
One also typically would look at errno on a failure.
That is exactly what I am doing.
And it does NOT return EOF, instead it silently terminates the program in midflight.
 
And it does NOT return EOF, instead it silently terminates the program in midflight.
You're getting a SIGPIPE, see signal(3). I assume this is only generated once the pipe buffer is full...

If there was a reader, and this reader closed their end, fputs() could know about the closed stream and return EOF *). But there never was a reader, and thanks to popen() madness, you have no way to reliably detect that. Your only chance would be to install a signal handler for SIGPIPE.

Really, avoid popen(3) for reliable code. The fact that the additional shell process it starts is completely superfluous in at least 90% of all cases just adds insult to injury for this annoyingly stupid function.

Side note for "defensive coding", check fputs() return value for <=0 instead of ==EOF: Anything that's not a "success" return value is an error. But this is obviously not the issue here.

*) edit, this was wrong, I confused it with being on the reading end. left for transparency, just strike out to avoid confusing future readers.
 
I don't know what you're trying to achieve. In general popen is source of many issues. @zirias made a good point, you should handle it with pipe, fork an exec (they are coming from the same lib as popen, you don't need custom library for that).
I once read one of the implementations of that scheme (net/dhcpcd), and it looked very difficult - they do a fork, then another fork from that fork, dup()ing and closing all kinds of file handles in between, and from there they fork their actual worker processes. It was hard to understand well enough for debugging purposes, and it didn't look inviting to implement something similar.

For the sake of popen though, what is your goal?

What I am trying to achieve: I simply want to write some data to an existing program. And I want to obtain a useful error when that program is not willing to operate (or doesn't exist). That's all.

You're trying to open a process and push 13B 1000 times (you are not executing this command 1000 times if that was the goal). As sh does exist this call (popen) is successful. The error you see is due to pipe being full, not because program doesn't exist.
The program doesn't exist. After the shell tries to execute a nonexistent program, it terminates. ps shows it as a zombie.

Now I did switch to the pipe/fork/dup/exec scheme, and that behaves exactly the same! When the exec() fails, it returns with an error - but that happens in the child!

The parent, who does the writes, does not see any errors - and continues to write successfully to the non-existent destination - until after some kB written the parent just dissolves in midflight - without a recognizable error, without anything one could catch and handle.
Again, the child terminates and is shown as zombie. It doesn't matter if there is a shell inbetween or not.

I now tried to call waitpid(pid, &status, WNOHANG); for the child, and get this:
Code:
waitpid returned with 512
status = WIFEXITED, exitstat=2

So, is it necessary to call waitpid() after every write?

Btw, in the other case, when and while the child program does exist and is nicely operating, waitpid() reports this:
Code:
waitpid returned with 4473840
status = WIFSIGNALED. signal=112
status = COREDUMP

(Not sure what that means, I just grabbed the values from the manpage)
 
As mentioned in my previous post, the failure is due to the pipe being full; that's why you have the abnormal program termination. The program (/usr/crap/bogus) doesn't need to exist for that. The pipe will exist with a successful popen call.
The waitpid manpage has a section that shows how to decode the status (more info is encoded in it).

How do you determine whether the program is cooperating? Your written data is buffered in that pipe. A successful write to the pipe doesn't necessarily mean it was read on the other side.
 
You're getting a SIGPIPE, see signal(3). I assume this is only generated once the pipe buffer is full...
setvbuf() can make the termination happen earlier, but not really different.

If there was a reader, and this reader closed their end, fputs() could know about the closed stream and return EOF. But there never was a reader, and thanks to popen() madness, you have no way to reliably detect that. Your only chance would be to install a signal handler for SIGPIPE.
And how would I detect it without popen() ?
It now looks like this:

Code:
  pid = fork();
  if (pid == 0) {    /* child */
    close(pipefd[1]);
    dup2(pipefd[0], STDIN_FILENO);
    if(execv(feedcmd, feedargs) != 0)
      do_error(98);
    
  }
  close(pipefd[0]);
  if((wp = fdopen(pipefd[1], "w")) == NULL)
    do_error(97);

Here I'm getting my error-98 - but that is of no consequence, it just terminates the child. I do NOT get an error-97, nor any error from subsequent writes. Only after some (random!) amount of data written, the parent dissolves. $? is 141 (i.e. 128 + SIGPIPE) - but that is not what I want, I want to handle the situation from within my code.
 
As mentioned in my previous post the failure is due to pipe being full, that's why you have the abnormal program termination. Program (/usr/crap/bogus) doesn't need to exist for that. Pipe will exist with the successful popen call.
waitpid manpage has a section that shows how to decode the status (status has more info encoded in it).
I did what is written there, without for now trying to understand it.

How do you determine whether program is cooperating? Your written data is buffered on that pipe. Successful write to the pipe doesn't necessarily mean it was read on the other side.
Well, I didn't think about that before. Only, at the place where I wrote the code, the respective program did not exist - and so I noticed that this situation is treated in a very ungraceful manner. (I don't like programs disappearing in midflight, without the chance to print out some custom message that one could grep for later on.)
 
It now looks like this:
What is do_error()? There's no need to ever check the return value of execve: if it ever returns, it's an error. But you should exit(3) then, otherwise you'll end up executing code that was meant for the parent process.

Still, I correct myself: popen() isn't the only issue here. As you're on the writing end, you can't detect when the reading end closed the pipe, no matter what. What you can detect however is when your child exits, and you can close your parent's pipe end when that happens. Of course, all of this isn't possible with popen() either. Here's some proof-of-concept (test) code (and it shows fputs() eventually returning EOF and setting errno to EBADF):

C:
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pid_t pid;
static int pfd[2];

static void handler(int sig)
{
    if (sig != SIGCHLD) return;
    int st;
    if (waitpid(pid, &st, WNOHANG) == pid && WIFEXITED(st)) close (pfd[1]);
}

int main(void)
{
    pipe(pfd);
    signal(SIGCHLD, handler);
    pid = fork();
    if (pid == 0)
    {
        close(pfd[1]);
        dup2(pfd[0], STDIN_FILENO);
        close(pfd[0]);
        char *const cmd[] = { "/foo/bar/baz", 0 };
        execve(cmd[0], cmd, 0);
        exit(1);
    }
    FILE *out = fdopen(pfd[1], "a");
    int i, rc;
    for (i = 0, errno = 0; (rc = fputs("test", out)) > 0 && errno == 0; ++i);
    printf("%s on write #%d, returned %d\n", strerror(errno), i, rc);
}
This code misses lots of sanity/error checks!

You could detect the error much quicker by disabling stdio buffering of course (or by avoiding FILE * entirely and instead using low-level POSIX I/O functions).

CAUTION: If your real code opens other file descriptors, don't just close the pipe end from the signal handler. Instead, set a volatile sig_atomic_t flag and check it from your main program. File descriptor numbers are reused, so otherwise you might end up writing to some random other file or socket or whatever...
In a nutshell, this code is NOT recommended at all! See post #16 for a better approach. Also, the code here fails to check for a child terminated by a signal, but I'll leave it that way, it's just poc stuff...

edit: Did a few changes to the example code: no need to reset the signal handler, as these are already reset by execve(); instead close the duplicate fd. Btw, you can check everything works well by replacing /foo/bar/baz with e.g. /bin/cat.
 
Paul Floyd, OP's problem is almost certainly an overflowing pipe buffer and a resulting SIGPIPE.

PMc, just for completeness, here's another example code (still missing lots of checks) handling the child process in a way I would recommend:
C:
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pid_t pid;
static volatile sig_atomic_t childexited;
static volatile int childrc;
static volatile int childsig;

static void handler(int sig)
{
    if (sig != SIGCHLD) return;
    int st;
    if (waitpid(pid, &st, WNOHANG) == pid)
    {
        if (WIFEXITED(st))
        {
            childrc = WEXITSTATUS(st);
            childexited = 1;
        }
        else if (WIFSIGNALED(st))
        {
            childsig = WTERMSIG(st);
            childexited = 1;
        }
    }
}

int main(void)
{
    int pfd[2];
    pipe(pfd);
    signal(SIGCHLD, handler);
    pid = fork();
    if (pid == 0)
    {
        close(pfd[1]);
        dup2(pfd[0], STDIN_FILENO);
        close(pfd[0]);
        char *const cmd[] = { "/foo/bar/baz", 0 };
        execve(cmd[0], cmd, 0);
        exit(errno);
    }
    FILE *out = fdopen(pfd[1], "a");
    fcntl(pfd[1], F_SETFL, fcntl(pfd[1], F_GETFL, 0) | O_NONBLOCK);
    for (;;)
    {
        if (childexited)
        {
            if (childsig)
            {
                fprintf(stderr, "child killed by signal %d\n", childsig);
                exit(1);
            }
            else
            {
                fprintf(stderr, "child process died: %s\n", strerror(childrc));
                exit(childrc);
            }
        }
        errno = 0;
        if (fputs("test", out) <= 0 && errno != EAGAIN)
        {
            perror("fputs()");
            exit(1);
        }
    }
    return 0;
}
When you run this, you will notice calling fputs() isn't even attempted, as we already have a childrc from the failed execve(). You could also try with e.g. /bin/cat for the child, kill the child manually and observe what happens. A few more notes:
  • In signal handlers, be very careful to only execute async-signal-safe functions. There are very few of them. As a rule of thumb, any I/O is always unsafe. As a best practice, do as little as possible; instead just set volatile sig_atomic_t flags.
  • sig_atomic_t is only required to hold 8 bits. In most cases, best set another volatile variable first, then update a sig_atomic_t flag.
edit: updated the code, original version missed checking for child terminated by a signal....
edit2: updated again to fix another reliability issue: writing to the pipe might block when there is a reader but we have to wait for it ... but then we can't check childexited (so if the child dies while we're blocked, we have a deadlock...). In real-world code, you would probably use poll(2) or select(2) to know which fd is ready to write. Here with just the pipe, it's easier to set O_NONBLOCK and just ignore the EAGAIN error from fputs().
 
I'll try to summarize what can be taken from this thread in a nutshell:
  • Reliable error handling in POSIX/C is not a simple thing 🙈
  • On the reading end of a pipe, you will normally get an error trying to read after the writing end was properly closed. That's not possible the other way around.
  • Writing to a pipe with no reader will ultimately overflow the operating system's pipe buffer, resulting in a SIGPIPE that, by default, kills your process.
  • If you have child processes, be sure to properly monitor them. Handling SIGCHLD is a great way to do so.
  • Be careful with blocking I/O (which is the default mode). Either make sure some I/O operation won't block before actually doing it (using e.g. [p]select(), [p]poll(), ...), or use non-blocking I/O and properly handle EAGAIN/EWOULDBLOCK.
  • Avoid popen(). It's bad for reading and worse for writing for several reasons, most importantly you have no control over the child process.
 
Bonus example if you need to clearly distinguish whether execve() failed: Use a second pipe to transport the error message back to the parent, and set FD_CLOEXEC on it, so it is automatically closed in case execve() succeeds.

C:
#include <errno.h>                    
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pid_t pid;
static volatile sig_atomic_t childexited;
static volatile int childrc;
static volatile int childsig;

static void handler(int sig)
{
    if (sig != SIGCHLD) return;
    int st;
    if (waitpid(pid, &st, WNOHANG) == pid)
    {
        if (WIFEXITED(st))
        {
            childrc = WEXITSTATUS(st);
            childexited = 1;
        }
        else if (WIFSIGNALED(st))
        {
            childsig = WTERMSIG(st);
            childexited = 1;
        }
    }
}

int main(void)
{
    int pfd[2];
    pipe(pfd);
    int errpfd[2];
    pipe(errpfd);
    signal(SIGCHLD, handler);
    pid = fork();
    if (pid == 0)
    {
        close(pfd[1]);
        dup2(pfd[0], STDIN_FILENO);
        close(pfd[0]);
        close(errpfd[0]);
        fcntl(errpfd[1], F_SETFD, FD_CLOEXEC);
        char *const cmd[] = { "/foo/bar/baz", 0 };
        execve(cmd[0], cmd, 0);
        FILE *errout = fdopen(errpfd[1], "a");
        fprintf(errout, "Failed to execute `%s': %s\n",
                cmd[0], strerror(errno));
        exit(1);
    }
    close(pfd[0]);
    close(errpfd[1]);
    FILE *in = fdopen(errpfd[0], "r");
    char childmsg[256];
    if (fgets(childmsg, 256, in))
    {
        fputs(childmsg, stderr);
        exit(1);
    }
    fclose(in);
    fputs("child started.\n", stderr);
    FILE *out = fdopen(pfd[1], "a");
    fcntl(pfd[1], F_SETFL, fcntl(pfd[1], F_GETFL, 0) | O_NONBLOCK);
    for (;;)
    {
        if (childexited)
        {
            if (childsig)
            {
                fprintf(stderr, "child killed by signal %d\n", childsig);
                exit(1);
            }
            else
            {
                fprintf(stderr, "child terminated with exit code %d\n", childrc);
                exit(1);
            }
        }
        errno = 0;
        if (fputs("test", out) <= 0 && errno != EAGAIN)
        {
            perror("fputs()");
            exit(1);
        }
    }
    return 0;
}
 
What is do_error()? There's no need to ever check the return value of execve: if it ever returns, it's an error. But you should exit(3) then, otherwise you'll end up executing code that was meant for the parent process.
And that's exactly what it does (after reporting the fact to the stakeholders).

Still, I correct myself: popen() isn't the only issue here. As you're on the writing end, you can't detect when the reading end closed the pipe, no matter what.
Okay, that's the bad news.
But then, eventually, the running code does detect that it cannot get rid of the written data - and terminates unconditionally. Also, when I manage to close the pipe before it actually fails, then I get a proper error code from pclose().

You could detect the error much quicker by disabling stdio buffering of course (or by avoiding FILE * entirely and instead using low-level POSIX I/O functions).
Yeah, one could. But we can already see how this issue gets more and more elaborate.
And it wasn't the issue, anyway.
The issue was: I have one subsystem providing data, and another command that should receive that data - but some parsing and a regexp must be applied in between. Both subsystems are packaged from ports and I do not want to modify them. But since people no longer use the daemon group nowadays, the sending subsystem has a different uid:gid than the receiving one, so scripts will not work and I need a C program to do a SUID.

I once learned K&R C, I can do line-parsing in C, and there is a regexp library in C, so I decided to do the whole thing in C. Then somewhere around 4pm yesterday all of it worked, and I only needed to do that write to the receiving command. 5 more minutes, I thought - as this should really be easy...

BTW, if I had done it in ruby, it would look like this and, voila, just work:
Code:
irb(main):001:0> require "open3"
=> true
irb(main):027:0> stdin, stdout, stderr = Open3.popen3("cat | bogus")
=>
[#<IO:fd 26>,                                                              
...                                                                        
irb(main):028:0> stdin.puts("hi")
=> nil
irb(main):029:0> stdin.puts("hi")
(irb):29:in `write': Broken pipe (Errno::EPIPE)
        from (irb):29:in `puts'                                            
        from (irb):29:in `<main>'                                          
        from /usr/local/lib/ruby/gems/3.0/gems/irb-1.4.1/exe/irb:11:in `<top (required)>'
        from /usr/local/bin/irb:25:in `load'                               
        from /usr/local/bin/irb:25:in `<main>'

They apparently achieve this by creating an extra thread that waits for the child. (Which might be a better idea than twiddling around with the signals.)
 
But then, eventually, the running code does detect that it cannot get rid of the written data - and terminates unconditionally.
That's not what happens. It receives a SIGPIPE and the default handling for this is to terminate. You could instead just handle the SIGPIPE. I just think this makes little sense as it happens with a lot of delay (you'll first have to fill the OS' pipe buffer). Better handle errors as they occur, see examples above.

They apparently achieve this by creating an extra thread that waits for the child. (Which might be a better idea than twiddling around with the signals.)
It's not a better idea ... why add a thread you won't need? *)

But it's a necessity if you want to provide an API like popen() that works in a (somewhat) reliable way. You can't just fiddle with signal handling in library functions.

BTW, just implementing a popen() alternative for C using a thread monitoring the child wouldn't help much either. You still can't just close the fd, as this would wreak havoc if the main program opens new fds: a new one would just "replace" the old one. You would need to provide your own abstraction for reading and writing that pipe, so you can add an error condition specifically for "child exited". That's probably just not worth it. For C, just forget about it and avoid popen() ;)

---
*) edit, remember, threads not only add overhead but also complexity: to communicate something to another thread, you need at least memory barriers, possibly locks. IMHO, a thread just sitting there blocked waiting for something isn't the greatest design...
 
IMHO, a thread just sitting there blocked waiting for something isn't the greatest design...
This was the original promise of threads. They would be so lightweight that you could have any number of them sitting around doing nothing without incurring heavy overhead. Thus the socket-per-thread designs that were the new hotness at the turn of the millennium. That promise was not fulfilled, as we all know. I believe Linux threads are particularly expensive, and were called "lightweight processes" at one point.
 
"Lots of My Opinion" follows:

Jose I think it boils down to implementation. Solaris, down at the kernel level, has been fully preemptible for a while; lots of work winds up on what I'll call "kernel threads". Device interrupts, instead of running in the context of the current thread/process, do minimal work to service the hardware, then queue that work up. The queued work winds up getting done by a thread of some sort.
I think a lot of FreeBSD has migrated to similar patterns (see the grand bikesheds with Matt Dillon around the 4.x to 5.x transition).

Threads at/up to the user level, I think, become a different story. User programs inherently block, waiting on something a lot of the time. They wait for something on a socket before they do something (pretty much every webserver out there, no?), then after they are done, they go back and wait for more. So I think "user threads" wind up being a bit heavier than kernel threads simply because they have to carry a lot more context with them.

"Interprocess communications" or talking with another thread, a lot I think depends on what the communications is.
Look at a pipe: effectively socket communications. Thread 1 writes to a socket, Thread 2 reads from the socket. The kernel can arbitrate between the two threads. Reader doesn't unblock until Writer has completed a write.

IMHO, a thread just sitting there blocked waiting for something isn't the greatest design...
Yet this is pretty much every program that sits waiting on a file descriptor, no?
A main thread of execution, sitting blocked/waiting on something, then spawning off a thread of execution to handle the input is not a bad thing (servers, listen on socket, accept, spawn thread to handle the request).

Multiple threads, yes you can get it wrong very quickly and wind up losing at the end (performance), but properly designed software utilizing threads is not inherently bad.
 
Yet this is pretty much every program that sits waiting on a file descriptor, no?
A main thread of execution, sitting blocked/waiting on something, then spawning off a thread of execution to handle the input is not a bad thing (servers, listen on socket, accept, spawn thread to handle the request).
You're talking about the typical IO-dispatcher thread in a modern server application. This will use some form of async I/O (events) and then just offload the "computational" work to some worker thread, which is a very good design. But spawning a thread to do nothing but wait for one singular event is more or less the opposite of that.
 
Would love to continue the threads/async I/O discussion, but we're seriously redshifting PMc's thread for a very specific question. Mind if I ask the mods for a split?
 