C/C++ Writing on a closed socket doesn't fail

RalfPeg

Member


Messages: 22

Hi! When I try to write to a socket that the other side has already closed (in my project I wanted to use this to inform the other side about an abort), the data is written successfully. Shouldn't there be an error, an EOF, or something? Reading from the closed socket works as expected: read returns 0.

Minimal example to reproduce:

Code:
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <stdio.h>

//server

int main()
{
        int sd;
        int client;
        struct sockaddr_in sa;
        int reuse = 1;

        sd = socket(AF_INET, SOCK_STREAM, 0);
        if(sd == -1)
        {
                printf("socket\n");
                return 1;
        }

        memset(&sa, 0, sizeof(sa));
        sa.sin_family = AF_INET;
        sa.sin_port = htons(12345);
        sa.sin_addr.s_addr = htonl(INADDR_ANY);

        if(setsockopt(sd, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof(reuse)) == -1)
        {
                printf("setsockopt\n");
                return 1;
        }

        if(bind(sd, (struct sockaddr *) &sa, sizeof(sa)) == -1)
        {
                printf("bind\n");
                return 1;
        }

        if(listen(sd, 1) == -1)
        {
                printf("listen\n");
                return 1;
        }

        if((client = accept(sd, NULL, NULL)) == -1)
        {
                printf("accept\n");
                return 1;
        }

        if(close(sd) == -1)
        {
                printf("close sd\n");
                return 1;
        }

        if(close(client) == -1)
        {
                printf("close client\n");
                return 1;
        }

        printf("success\n");
    
        return 0;
}



Code:
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <stdio.h>

//client

int main()
{
        int sd;
        struct sockaddr_in sa;
        int reuse = 1;
        ssize_t bytessent;

        sd = socket(AF_INET, SOCK_STREAM, 0);
        if(sd == -1)
        {
                printf("socket\n");
                return 1;
        }

        memset(&sa, 0, sizeof(sa));
        sa.sin_family = AF_INET;
        sa.sin_port = htons(12345);
        if(inet_pton(AF_INET, "127.0.0.1", &sa.sin_addr) != 1)
        {
                printf("inet_pton\n");
                return 1;
        }

        if(connect(sd, (struct sockaddr *) &sa, sizeof(sa)) == -1)
        {
                printf("connect\n");
                return 1;
        }

        //wait until the server closed the connection
        sleep(5);

        bytessent = write(sd, "abcdefgh", 8);
        if(bytessent <= 0)
        {
                printf("write\n");
                return 1;
        }

        if(close(sd) == -1)
        {
                printf("close sd\n");
                return 1;
        }

        printf("success\n");

        return 0;
}
 

mark_j

Aspiring Daemon

Reaction score: 420
Messages: 788

When you use write(2), you also need to check errno (see intro(2)) whenever the result is -1.
Your test (bytessent <= 0) lumps two scenarios together: 0 bytes actually sent, and an error. Is this intended? I think not. You need to fix that.
If errno is EPIPE, it tells you the other side has closed the connection.
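
A minimal sketch of the check described above. The helper names checked_write and epipe_demo are made up for illustration, and the demo uses a local socketpair(2) instead of the TCP connection from the first post, so the error shows up on the very first write. SIGPIPE is ignored so that write returns -1 instead of killing the process.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 and stores the byte count on success,
 * or returns the errno value when write(2) fails. */
static int checked_write(int fd, const void *buf, size_t len, ssize_t *written)
{
    ssize_t n = write(fd, buf, len);

    if (n == -1) {
        int err = errno;            /* save it before any other call */
        fprintf(stderr, "write: %s\n", strerror(err));
        return err;
    }
    *written = n;                   /* may be less than len */
    return 0;
}

/* Hypothetical demo: write to a local socket whose peer is already gone. */
static int epipe_demo(void)
{
    int sv[2];
    ssize_t written = 0;
    int err;

    signal(SIGPIPE, SIG_IGN);       /* get EPIPE instead of being killed */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    close(sv[1]);                   /* the peer goes away */
    err = checked_write(sv[0], "abcdefgh", 8, &written);
    close(sv[0]);
    return err;
}
```

Calling epipe_demo() from main should give EPIPE on a local socket pair. Over TCP the first write after the remote close can still succeed, which is what the rest of this thread is about.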
 
OP

RalfPeg

Member


Messages: 22

Well, the expected behaviour of this example program is to terminate after calling write, because write should return 0 when writing to the (closed) sd, which matches the condition. Instead it returns 8 (or whatever), as if the connection were still active, and the program prints "success". errno is zero, by the way.
 

Emrion

Aspiring Daemon

Reaction score: 155
Messages: 544

I've seen this before. IIRC, it's how the TCP/IP stack works. Hoping I'm not saying something false: the bytes are actually written because the kernel hasn't yet cleaned the socket out of its own memory. I need to check my old code to be sure, but I don't have time now.

As you can see, using write or read to find out whether a connection is alive or dead is a bad idea.
 

Emrion

Aspiring Daemon

Reaction score: 155
Messages: 544

Read returns 0 because the other side's close has been received and there is nothing left to read in the corresponding TCP/IP buffer. The bytes sent by write are accepted and go into a TCP/IP send buffer (in the end they are lost).
 

mark_j

Aspiring Daemon

Reaction score: 420
Messages: 788

RalfPeg said:
Well, the expected behaviour of this example program is to terminate after calling write, because write returns 0 after writing to the (closed) sd which matches the condition, but it returns 8 (or whatever) as if the connection was still active and says "success". errno equals zero btw.

That's certainly extra information you didn't provide beforehand.
Emrion is quite correct: the kernel is handling the buffer, so it reports success. Therefore, to ascertain the state, simply read(2) from the socket; once the other side has closed, it returns 0 (end of file) instead of data.
 
OP

RalfPeg

Member


Messages: 22

That would mean that the client should send some kind of command ("sending data") to the server, the server answers ("sending data acknowledged" or "aborted" or ...), and if that succeeds everything is fine. A read that returns 0 shouldn't happen at that point and should be treated as an error, shouldn't it?

But isn't it a bug if the kernel doesn't know that the other side has already closed the connection?
 

Emrion

Aspiring Daemon

Reaction score: 155
Messages: 544

I think the kernel knows, at least if the connection ended gracefully.

You might want to use poll(2) in a while loop. You wait for events, and among these you have POLLERR, POLLHUP and POLLNVAL, but also others that indicate, for example, that there is a packet to read.

My old code was something like (in a dedicated thread):
Code:
pfd[0].fd = sock;

while (bRun) {

    poll(pfd, 1, -1);

    // Connection died?
    if (pfd[0].revents & (POLLERR | POLLHUP | POLLNVAL)) {
        ...
        bRun = false;
        break;
    }

    // Something to read
    if (pfd[0].revents & (POLLIN | POLLPRI)) {
        ....
    }
}

It was originally written for Linux. On FreeBSD you also have kqueue(2), but I have never used that one.
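
For a one-shot check rather than a loop, something along these lines might work. This is only a sketch: peer_closed is a made-up name, and it assumes MSG_PEEK and MSG_DONTWAIT are available (they are on Linux and FreeBSD). It polls with a zero timeout and, if the socket is readable, peeks one byte to tell "data waiting" apart from end of file.

```c
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: returns 1 if the other side has closed or the socket
 * is in an error state, 0 if it still looks open, -1 if poll(2) failed. */
static int peer_closed(int fd)
{
    struct pollfd pfd;
    char c;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    if (poll(&pfd, 1, 0) == -1)         /* timeout 0: do not block */
        return -1;

    if (pfd.revents & (POLLERR | POLLHUP | POLLNVAL))
        return 1;

    if (pfd.revents & POLLIN) {
        /* Readable: either data is waiting or we are at end of file.
         * Peek so that real data stays in the receive buffer. */
        if (recv(fd, &c, 1, MSG_PEEK | MSG_DONTWAIT) == 0)
            return 1;
    }
    return 0;
}
```

Keep in mind that this only reports what the local kernel already knows. A peer that vanished without sending FIN or RST still looks open.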
 

Eric A. Borisch

Aspiring Daemon

Reaction score: 310
Messages: 528

Welcome to the world of network programming.

Reads return 0 because the remote side has closed and there is nothing left to read, but write() only copies data into a buffer and returns. TCP needs to buffer user data to be able to handle retransmits without user intervention.

Buffering and returning for small (even though technically "blocking") sends is an optimization, so that the writing program doesn't have to wait for transmission and acknowledgement from the far side on every write() call, which could take tens to hundreds of milliseconds depending on the network. A small write like this one, with no other traffic, is certainly going to be buffered and return right away.

Behind the scenes, the stack waits until either a timer expires or a minimum buffered/waiting size is reached before actually transmitting data on the network. Again, an optimization. (One you can disable with setsockopt(2) and tcp(4)'s TCP_NODELAY; this may also cause your small write, into a remotely closed socket, to fail; I'm not sure.)

Once it actually goes to transmit, it will figure out fairly quickly that the remote side isn't interested in receiving data anymore. If you wait a second (likely much less) after your first write() following the remote close(), a subsequent write() will likely return -1 with errno=EPIPE.
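
A rough way to see this in a single process, assuming a working loopback interface. The struct and function names are made up for the example; the server side is played by the accepted socket, which is closed before the client writes twice with a one-second pause in between.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result record for the two writes. */
struct write_result {
    ssize_t first;          /* return value of the first write */
    ssize_t second;         /* return value of the second write */
    int second_errno;       /* errno after the second write */
};

/* Returns 0 if the experiment could be set up, -1 otherwise. */
static int write_after_peer_close(struct write_result *res)
{
    struct sockaddr_in sa;
    socklen_t salen = sizeof(sa);
    int lsd, csd, asd;

    signal(SIGPIPE, SIG_IGN);           /* report EPIPE via errno instead */

    lsd = socket(AF_INET, SOCK_STREAM, 0);
    csd = socket(AF_INET, SOCK_STREAM, 0);
    if (lsd == -1 || csd == -1)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(0);             /* let the kernel pick a free port */
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (bind(lsd, (struct sockaddr *)&sa, sizeof(sa)) == -1 ||
        listen(lsd, 1) == -1 ||
        getsockname(lsd, (struct sockaddr *)&sa, &salen) == -1 ||
        connect(csd, (struct sockaddr *)&sa, sizeof(sa)) == -1 ||
        (asd = accept(lsd, NULL, NULL)) == -1) {
        close(lsd);
        close(csd);
        return -1;
    }

    close(asd);                         /* "server" closes: FIN goes out */
    close(lsd);
    sleep(1);                           /* give the FIN time to arrive */

    res->first = write(csd, "abcdefgh", 8);   /* buffered by the kernel */
    sleep(1);                           /* give the peer's answer time to arrive */

    errno = 0;
    res->second = write(csd, "abcdefgh", 8);
    res->second_errno = errno;

    close(csd);
    return 0;
}
```

On the systems discussed here I would expect first to be 8 and second to be -1 with EPIPE or ECONNRESET, but the timing depends on the stack, so treat it as an experiment rather than a guarantee.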
 

mark_j

Aspiring Daemon

Reaction score: 420
Messages: 788

RalfPeg said:
That would mean that the client should send some kind of command ("sending data") to the server, the server answers ("sending data acknowledged" or "aborted" or ...) and if it's successful everything is great, and a read with return code 0 shouldn't happen at that point and should be treated as an error, shouldn't it?

It's more complex than this. The kernel cannot just assume the socket has been totally disconnected. It has to assume that there's a packet on its way. Perhaps you could read up on TIME_WAIT, CLOSE_WAIT, etc., to help you understand? TCP is designed to be resilient: dropouts, retransmissions, etc. can be recovered from.
Oh, and it happens regardless of whether the client or server is located on the same machine or a remote one; the kernel acts the same either way.

RalfPeg said:
But isn't it a bug, if the kernel doesn't know that the other side already closed the connection?

It can't control the remote end, so no, it's not a bug.
 

Matlib

Member

Reaction score: 18
Messages: 21

First of all, when dealing with sockets you should ignore SIGPIPE to avoid unnecessary head scratching:
Code:
signal (SIGPIPE, SIG_IGN);


Second, there are a number of pitfalls here:
– many of these calls fail with EINTR if interrupted by a signal, and you must repeat the call,
– write(2) may write fewer bytes than it was given, so it must be put in a loop until all data is actually sent.
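
Both points can be handled by one small loop. A sketch, with write_all being a name picked for this example:

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keeps calling write(2) until all len bytes are sent.
 * Returns 0 on success, -1 on error (errno is left as set by write). */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n == -1) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;              /* real error, e.g. EPIPE */
        }
        p += n;                     /* short write: advance and go on */
        len -= (size_t)n;
    }
    return 0;
}
```

A call would then look like write_all(sd, "abcdefgh", 8), and a return value of -1 with errno set to EPIPE or ECONNRESET means the other side is gone.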

In general, there's nothing wrong with write returning 0. It will happen if the third argument (data size) is also 0.

Apparently the first small write to a socket closed by the remote end succeeds, even if it was closed many seconds before. A subsequent write fails with either ECONNRESET or EPIPE (plus SIGPIPE).
 

yayj

New Member

Reaction score: 2
Messages: 15

According to your code, the TCP connection is in the so-called half-closed state when the client invokes write(2). To be more precise, the client-side socket is in CLOSE-WAIT and the server-side socket is in FIN-WAIT-2. The TCP state machine describes this clearly. It is easily verified by running:
Code:
$ ./server && netstat -anp tcp | grep 12345
success
tcp4       0      0 127.0.0.1.12345        127.0.0.1.41534        FIN_WAIT_2
tcp4       3      0 127.0.0.1.41534        127.0.0.1.12345        CLOSE_WAIT

According to the TCP specification:
"The user who CLOSEs may continue to RECEIVE until he is told that the other side has CLOSED also."
So it's not a bug that some data can still be sent successfully after a FIN has been received. If we capture the packets, we can see that the data was transferred to the server (the penultimate packet).
Code:
$ tcpdump -tni lo0 tcp port 12345
...
IP 127.0.0.1.12345 > 127.0.0.1.53912: Flags [F.], seq 1, ack 1, win 1277, options [nop,nop,TS val 1746571422 ecr 3079116957], length 0
IP 127.0.0.1.53912 > 127.0.0.1.12345: Flags [.], ack 2, win 1277, options [nop,nop,TS val 3079116958 ecr 1746571422], length 0
IP 127.0.0.1.53912 > 127.0.0.1.12345: Flags [P.], seq 1:9, ack 2, win 1277, options [nop,nop,TS val 3079119965 ecr 1746571422], length 8
IP 127.0.0.1.12345 > 127.0.0.1.53912: Flags [R], seq 3260343863, win 0, length 0

As you may have noticed, the server sent an RST to refuse the data. As a result, SIGPIPE is expected if you try to write again.
Code:
bytessent = write(sd, "abcdefgh", 8);  // supposed to be successful
bytessent = write(sd, "abcdefgh", 8);  // SIGPIPE generated

This doesn't behave like the half-closed connection described in the RFC. The reason is that the server sent its FIN via close(2), not shutdown(2). If we modify the server:
Code:
...
// if(close(client) == -1)
if (shutdown(client, SHUT_WR) == -1)
{
  printf("shutdown client\n");
  return 1;
}

char buf[8];
if (recv(client, buf, 8, 0) != 8)
{
  printf("not received\n");
  return 1;
}

printf("success\n");
...

Then it prints "success". Here are the captured packets.
Code:
$ tcpdump -tni lo0 tcp port 12345
...
IP 127.0.0.1.12345 > 127.0.0.1.36202: Flags [F.], seq 1, ack 1, win 1277, options [nop,nop,TS val 611164976 ecr 3022371360], length 0
IP 127.0.0.1.36202 > 127.0.0.1.12345: Flags [.], ack 2, win 1277, options [nop,nop,TS val 3022371360 ecr 611164976], length 0
IP 127.0.0.1.36202 > 127.0.0.1.12345: Flags [P.], seq 1:9, ack 2, win 1277, options [nop,nop,TS val 3022374361 ecr 611164976], length 8
IP 127.0.0.1.36202 > 127.0.0.1.12345: Flags [F.], seq 9, ack 2, win 1277, options [nop,nop,TS val 3022374361 ecr 611164976], length 0
IP 127.0.0.1.12345 > 127.0.0.1.36202: Flags [.], ack 10, win 1276, options [nop,nop,TS val 611167977 ecr 3022374361], length 0
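
The same half-close idea can be tried without a network at all. This is a sketch under assumptions: half_close_demo is a made-up name, and a local socketpair(2) stands in for the TCP connection, so no FIN or RST is involved, only the shutdown(2) semantics.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: returns the number of bytes the "server" end receives
 * after it has shut down its own sending direction, or -1 on failure. */
static ssize_t half_close_demo(void)
{
    int sv[2];                      /* sv[0] = "server", sv[1] = "client" */
    char buf[8];
    ssize_t n = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    /* Server: done sending, still willing to receive. */
    if (shutdown(sv[0], SHUT_WR) == 0 &&
        /* Client: sees end of file on its reading side ... */
        read(sv[1], buf, sizeof(buf)) == 0 &&
        /* ... but can still send in the other direction. */
        write(sv[1], "abcdefgh", 8) == 8)
        n = recv(sv[0], buf, sizeof(buf), 0);

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

If shutdown(2) is replaced by close(2) on the server end, the client's write has nobody left to deliver to, which matches the RST seen in the first capture.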
 