C remusock: a tool to make a local unix domain socket available on a remote machine

zirias@ · Oct 11, 2021

Quite some time ago, I wrote a tool for my own needs (see thread title) and called it remusock. While programming it, I asked a few questions here (so some of you might have seen it already) and learned a lot about basic socket programming in pure POSIX, about the challenges of "async" models when the POSIX APIs don't provide them (and how to emulate them with a thread pool), as well as about a strange bug in Linux.

Now, I was finally fed up with having to start the tool manually on every reboot and created a FreeBSD port including an init script. I didn't submit this port to FreeBSD because I'm very unsure whether anyone else could actually use this tool… I'm partly posting this thread to find out more.

The tool really only does the one thing described in the thread title. You could probably come up with something somewhat similar using net/socat. But then, I tried to do that one job as well and as reliably as possible. So, remusock will automatically re-establish lost TCP connections, allows the TCP connection to work in whichever direction you prefer, allows multiple TCP clients if it's using the same direction as the unix socket connection, and of course multiplexes several unix socket connections over a single TCP connection. Still, it's very "special purpose".

So, is there anyone who thinks they could use such a tool as it is so far?

I also have in mind to maybe add encryption (TLS) and authentication (maybe using PAM?) some day, although I don't need this myself (I'm only using it over a trusted VPN). With these features added, would it be more interesting for someone?

zirias@ · Oct 11, 2021

Hehe, thanks for your "likes"! I'm a bit unsure how to interpret them though: Do you just like the effort (maybe even the code, idk?) – or would you have a usecase for that kind of tool?

zirias@ · Oct 11, 2021

Another remark if you look at the code: Designing this tiny tool, I followed an event-driven OOP approach. It can be done in C and I like it. And I know some C programmers will detest it with a passion.

One thing I noticed though: For the multiplexing of several connections through a single TCP "tunnel", I needed a protocol. Although this protocol is hilariously simple (just prefixing short "commands" to the payload, most of the time just a single letter), the "class" (protocol.c) quickly grew huge, more than 700 lines. This might be a "code smell". Maybe I already touched the limitations of programming in C, choosing that design?

Or maybe I mixed in too many aspects (e.g. handling of lost connections is also the responsibility of the protocol), but I really run into problems trying to separate these concerns…
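To make the idea of "prefixing short commands to the payload" concrete, here is a minimal sketch of how several connections could be multiplexed over one TCP stream. The wire format (one command letter, a 16-bit client id, a 16-bit length), the command letters, and the function names are all invented for illustration; this is not remusock's actual protocol.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical wire format, NOT remusock's actual protocol:
 *   1 byte  command (e.g. 'D' for data; the letters are invented here)
 *   2 bytes client id, big endian
 *   2 bytes payload length, big endian
 *   n bytes payload
 */
#define FRAME_HDR 5

/* Write one frame into out; returns bytes written, or 0 if it does not fit. */
static size_t frame_encode(uint8_t *out, size_t outsz, char cmd,
        uint16_t client, const void *data, uint16_t len)
{
    if (outsz < (size_t)FRAME_HDR + len) return 0;
    out[0] = (uint8_t)cmd;
    out[1] = (uint8_t)(client >> 8);
    out[2] = (uint8_t)(client & 0xff);
    out[3] = (uint8_t)(len >> 8);
    out[4] = (uint8_t)(len & 0xff);
    if (len) memcpy(out + FRAME_HDR, data, len);
    return (size_t)FRAME_HDR + len;
}

/* Parse one frame from in; returns bytes consumed, or 0 if incomplete. */
static size_t frame_decode(const uint8_t *in, size_t insz, char *cmd,
        uint16_t *client, const uint8_t **data, uint16_t *len)
{
    if (insz < FRAME_HDR) return 0;
    uint16_t n = (uint16_t)((in[3] << 8) | in[4]);
    if (insz < (size_t)FRAME_HDR + n) return 0;   /* wait for more bytes */
    *cmd = (char)in[0];
    *client = (uint16_t)((in[1] << 8) | in[2]);
    *data = in + FRAME_HDR;                       /* points into the input */
    *len = n;
    return (size_t)FRAME_HDR + n;
}
```

The receiver would call `frame_decode()` on its read buffer until it returns 0, dispatching each payload to the unix socket connection identified by the client id. Keeping the framing in small functions like these is one possible way to separate it from connection handling.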

Still, my primary goal posting this thread is knowing whether anyone could actually need such a tool, so it would make sense for me to put some effort into it, implementing features that others might need although I personally don't…

zirias@ · Oct 22, 2021

Just committed a small improvement on my daemon code after learning that file locking is an elegant thing to do on pidfiles for detecting stale ones (yep, probably stupid I didn't realize that much earlier). It should be much more reliable than just checking for the presence of a process with the pid, using kill().

Hope I got it correct: https://github.com/Zirias/remusock/commit/878c18e4af2dadcf531db8537098e1f470f9fe1e

(This whole code to daemonize grew pretty large…)
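For readers who don't want to dig through the commit, the general technique can be sketched as follows. This is not the code from the linked commit; the helper name `pidfile_acquire` is made up, and it assumes `flock()`, which is available on FreeBSD and Linux but is not part of POSIX.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not remusock's actual code: open (or create) the
 * pidfile and try to take an exclusive, non-blocking lock on it.
 * Returns the locked fd on success, or -1 if the lock is held elsewhere
 * (a live instance) or the file cannot be opened or written.
 */
static int pidfile_acquire(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) return -1;

    if (flock(fd, LOCK_EX | LOCK_NB) < 0)
    {
        /* Typically EWOULDBLOCK: a running instance still holds the lock. */
        close(fd);
        return -1;
    }

    /* We own the lock, so any old content is stale: overwrite it. */
    char buf[32];
    int len = snprintf(buf, sizeof buf, "%ld\n", (long)getpid());
    if (ftruncate(fd, 0) < 0 || write(fd, buf, (size_t)len) != (ssize_t)len)
    {
        close(fd);
        return -1;
    }

    /* Keep the fd open: the lock goes away once this open file is closed. */
    return fd;
}
```

The advantage over probing the recorded pid with `kill()` is that the kernel drops the lock when the owning process exits, however it exits, so a leftover pidfile can never be mistaken for a running daemon just because its pid was reused.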

Jose · Oct 22, 2021

Zirias said:
Just committed a small improvement on my daemon code after learning that file locking is an elegant thing to do on pidfiles for detecting stale ones (yep, probably stupid I didn't realize that much earlier). It should be much more reliable than just checking for the presence of a process with the pid, using kill().

Hope I got it correct: https://github.com/Zirias/remusock/commit/878c18e4af2dadcf531db8537098e1f470f9fe1e

(This whole code to daemonize grew pretty large…)

I'm not much of a C programmer, so please forgive me if this is stupid. Starting at line 149 where you have:

Code:

    if (pf)
    {
        fprintf(pf, "%d\n", pid);
        fclose(pf);
    }
    return EXIT_SUCCESS;

Would it be better to do

Code:

    if (pf)
    {
        fprintf(pf, "%d\n", pid);
    }
    rc = EXIT_SUCCESS;
    goto done;

Yes this is somewhat stupid, and does the same thing as your code but in a slightly less efficient way. The only reason I suggest it is someone who is a good C programmer told me once that in the presence of the "goto cleanup" C idiom, the function should have a single return path that always goes through the cleanup code.

zirias@ · Oct 22, 2021

That might be a little optimization indeed, I'll look into it (and maybe find more occurrences), although:

Jose said:
a good C programmer told me once that in the presence of the "goto cleanup" C idiom, the function should have a single return path that always goes through the cleanup code.

this is of course true (and kind of corresponds to the finally block in some "modern" languages), but I'm not sure yet whether it's possible to fully adhere to this in some code that is potentially executed by three different processes.

edit: about efficiency, I normally trust in modern optimizers, they often do a good job

eternal_noob · Oct 22, 2021

the only reason to use a GOTO is if you programmed yourself so far into a corner that it is the only way out. In other words, proper design ahead of time and you won't need to use a GOTO later.

GOTO still considered harmful?

Everyone is aware of Dijkstra's Letters to the editor: go to statement considered harmful (also here .html transcript and here .pdf) and there has been a formidable push since that time to eschew the

stackoverflow.com

zirias@ · Oct 22, 2021

eternal_noob, in C, goto is good practice, provided you use it correctly. That means for cleanup and/or error handling. Sometimes, escaping nested blocks is also acceptable. Never goto backwards. Avoiding goto in a dogmatic way will impair code quality in C.
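A minimal sketch of the cleanup use described here, with two resources and a single exit path. The function `first_line_length` and its 256-byte buffer are made up for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical example: return the length of the first line of a file
 * (including the newline), or -1 on any error. Every failure jumps
 * forward to the one cleanup label, so there is exactly one return.
 */
static int first_line_length(const char *path)
{
    int rc = -1;
    FILE *f = NULL;
    char *buf = NULL;

    f = fopen(path, "r");
    if (!f) goto done;

    buf = malloc(256);
    if (!buf) goto done;

    if (!fgets(buf, 256, f)) goto done;

    rc = (int)strlen(buf);

done:
    free(buf);              /* free(NULL) is a no-op */
    if (f) fclose(f);
    return rc;
}
```

Each new resource only adds one line to the cleanup section, and no jump ever goes backwards.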

I'd even add: anyone who thinks goto should be banned from C never programmed anything of value in C.

zirias@ · Oct 22, 2021

BTW, just for fun, FreeBSD source (on releng/13.0):

Code:

# grep -R goto /usr/src | wc -l
   71815

Dijkstra's paper is about the horrible "spaghetti code" you can construct with goto. There's no need to do so in C. But there's still a need for goto. "Modern" languages handle it with exceptions and finally blocks. I don't think this is a good idea. Error handling should be explicit. But without exceptions and without goto, your code will become an illegible mess.

Jose · Oct 23, 2021

Eternal_noob since you're interested in code purity, I suggest you look into Haskell.

unitrunker · Oct 23, 2021

There's an old trick that mimics try/finally in C without goto.

C:

/* Resource variables. */
int handle = -1;
char *bytes = NULL;

/* Try to allocate resources. */
do
{
    handle = get_handle_to_some_resource();
    if (handle < 0)
        continue;

    bytes = allocate_some_object();
    if (!bytes)
        continue;

    do_some_work();
}
while (false);

/* Finally cleanup. */
if (bytes) free(bytes);
if (handle >= 0)
    close(handle);

I'm not fond of the technique but I've seen it used and it definitely works.

zirias@ · Oct 23, 2021

Yes, it works, and it's braindead. I see a loop. Oops, it's not really a loop? That's just unreadable code.

Dijkstra's paper misled some people into avoiding goto, just for the sake of avoiding goto. That was never the intention.

Jose · Oct 23, 2021

I can see the attraction of the simplicity of just using continue, but yeah it violates the POLA. There's no loop where there appears to be a loop. If only the break reserved word could be used to break out of arbitrary blocks...

It reminds me of a similar hack used to work around the limitations of macros that look like functions but are not functions

do{..}while(0) macro substitutions

groups.google.com

Not a fan of that one, either.

unitrunker · Oct 23, 2021

I omitted the macros for clarity. Actual code looks like this:

C:

/* Resource variables. */
int handle = -1;
char *bytes = NULL;

TRY
{
    handle = get_handle_to_some_resource();
    if (handle < 0)
        continue;

    bytes = allocate_some_object();
    if (!bytes)
        continue;

    do_some_work();
}
FINALLY
{
    if (bytes) free(bytes);
    if (handle >= 0)
        close(handle);
}

Disclaimer: I'm not a fan of heavy macro use. This is just something I've seen done. When you work on other people's code for a living, you run into tricks like this.
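The macro definitions themselves are not shown in the thread. One plausible way to define them, purely as a guess, is to hide the `do`/`while (0)` pair behind the two names. The function `fill_buffer` below is hypothetical and only there to exercise the macros.

```c
#include <stdlib.h>

/* Guessed definitions: the thread never shows the real macros. */
#define TRY     do
#define FINALLY while (0);

/* Hypothetical function using them: returns 0 on success, -1 on failure. */
static int fill_buffer(size_t size)
{
    int rc = -1;
    char *bytes = NULL;

    TRY
    {
        if (size == 0)
            continue;       /* jumps to the while (0) test, leaving the block */

        bytes = malloc(size);
        if (!bytes)
            continue;

        bytes[0] = '\0';
        rc = 0;
    }
    FINALLY
    {
        free(bytes);        /* always runs; free(NULL) is a no-op */
    }
    return rc;
}
```

With these definitions the body expands to `do { ... } while (0); { ... }`, which explains why `continue` appears to work without a visible loop.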

zirias@ · Oct 23, 2021

Yes, I've seen these things, too.

It's still a stupid "look! I'm clever! no goto!", this time even hiding control structures behind macros. A C programmer looking at this will first wonder about the semantics of this construct. And what's that "continue" doing there without a loop? To really get what's happening, there's no other choice than looking at the macros themselves. How much extra time did it take you to figure it out when you've first found it?

It's ridiculous, all this weird stuff just to avoid the one clear and obvious structure, using goto.

zirias@ · Oct 23, 2021

Jose said:
It reminds me of a similar hack used to work around the limitations of macros that look like functions but are not functions

do{..}while(0) macro substitutions

groups.google.com

Not a fan of that one, either.

This is, IMHO, "less bad". Yes, it's the same dirty abuse of a loop, but unfortunately, if you want to implement one function of an API as a macro, there's often no alternative to that hack.
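A small sketch of why the hack is needed. The macro `BUMP_TWICE` and the counter `calls` are made up for illustration.

```c
/* Hypothetical counter, only here so the effect of the macro is visible. */
static int calls;

/*
 * A braces-only definition such as
 *     #define BUMP_TWICE()  { calls++; calls++; }
 * fails to compile at the call site below, because the expansion
 * "{ ... };" ends the if statement before the else is reached.
 * Dropping the braces is worse: only the first calls++ would be guarded.
 *
 * Wrapped in do/while (0), the expansion is exactly one statement that
 * still needs the trailing ';' the caller writes anyway.
 */
#define BUMP_TWICE() do { calls++; calls++; } while (0)

static void bump_if(int enabled)
{
    if (enabled)
        BUMP_TWICE();
    else
        calls--;
}
```

The caller can use the macro exactly like a function call, regardless of whether their own style puts braces around `if` branches.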

zirias@ · Oct 23, 2021

I revisited the topic once again (correct/reliable handling of a pidfile can be surprisingly complex!) and the result looks like this: https://github.com/Zirias/remusock/commit/f769e5e9c8cd7009950a721eacbe36f10075b1c0

The main reason for the change is that there was still a race condition on the pidfile if another instance was still in the process of starting up. Not sure whether I completely solved it, but getting the lock as early as possible (and passing it down to the children following the fork()s), as well as also acquiring a lock before attempting to remove a "stale" pidfile, should now be pretty reliable.

While at it, I restructured the code a bit, so the pidfile and lock handling is separated out into two local functions, which should improve readability. Jose, I also looked into the "extra returns" issue, but found I only used them for the "tail ends" of the respective parent processes – and I think this kind of makes sense…

Kaminar · Oct 23, 2021

Zirias said:
This is, IMHO, "less bad". Yes, it's the same dirty abuse of a loop, but unfortunately, if you want to implement one function of an API as a macro, there's often no alternative to that hack

I almost always put "if" branches into the curly brackets. If you stick with this rule, you won't have issues like in:

do{..}while(0) macro substitutions

groups.google.com

There are really only a few exceptions, where it isn't good to do it.

zirias@ · Oct 23, 2021

Kaminar, agreed so far, this styleguide prevents the problem, but this way, using the macro as if it was a function is not fully transparent to the consumer, so I think this isn't an option e.g. for libraries where you don't know the consumer's coding style. It definitely is a clean way to deal with it in your own codebase.

zirias@ · Oct 24, 2021

Mr. Salty said:
Doesn't a lock introduce a performance hit

Sure, the way it's implemented now, both children might block for a very short time on acquiring the lock (until the parent releases it by closing the fd to the pidfile, which is "immediate").
I doubt it would be measurable. I'm pretty sure it's irrelevant for a one-time action (service startup).

Mr. Salty said:
or potentially introduce deadlocks?

This would require at least two locks being involved.

Mr. Salty said:
Would it be better to use atomic variables?

How would they allow me to reliably detect a still running instance of the same daemon?

zirias@ · Oct 25, 2021

Just released remusock 1.2 (with only the improved pidfile handling) and updated the FreeBSD port accordingly.

As I need the tool on a Debian box as well, I hacked together a systemd unit. I now feel like not publishing it. systemd wants me to use a "simple" mode with the daemon running in foreground (which remusockd can do, of course), only problem is, there's no way to check for successful startup. systemd wants me to solve this by using a "notifying" mode where my daemon is supposed to call some systemd API (wtf? alternative: dbus. again: wtf?). I used the (strongly discouraged) "forking" mode instead, which works just fine with a well-behaved daemon. Then, for configuring the service, systemd wants me to use "snippets" in unit-syntax or something like that. Doesn't work for me, I just hardcoded the command line options I need for my usecase.

It's just crap. Whoever wants to use remusock with systemd, please write your own unit file (and I recommend to use the "discouraged" forking mode).

Ok, sorry for the rant.

mrbeastie0x19 · Oct 25, 2021

use of goto is fine, the famous paper is more about the unreadability in languages like BASIC, and yes systemd is a mess, you have to write things for systemd, not port them after, it is very opinionated software.

zirias@ · Oct 25, 2021

mrbeastie0x19 said:
use of goto is fine, the famous paper is more about the unreadability in languages like BASIC

Exactly. Or a bit more broadly, it's about the spaghetti-code created by using goto as a replacement for every other (more specific) control structure.

SCNR:

"Idiomatic goto":

Code:

    int rc = -1;
    Foo *foo = 0;

    [...]
    if (!(foo = createFoo())) goto cleanup;
    [...]

    rc = 0;
cleanup:
    if (foo) destroyFoo(foo);
    return rc;

Not to be confused with "idiotic goto":

Code:

    if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
        goto fail;
        goto fail;

zirias@ · Oct 26, 2021

Great, found kind of a schroedinbug the day after release…

I checked /usr/include/x86/signal.h on FreeBSD: no volatile there either. Seems it just doesn't break at -O2 (and I never tested -O3).
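The post does not say which variable was affected. Assuming the bug was a flag written from a signal handler and declared with plain `sig_atomic_t`, the usual pattern looks like the sketch below. The names `shutdown_requested`, `on_sigterm` and `install_handler` are hypothetical.

```c
#include <signal.h>

/*
 * Assumption: the flag is shared between a signal handler and the main
 * flow. sig_atomic_t by itself is just an integer typedef, so the
 * volatile qualifier has to be added by the program. Without it, an
 * optimizer is free to keep the value in a register and never re-read it.
 */
static volatile sig_atomic_t shutdown_requested = 0;

static void on_sigterm(int signum)
{
    (void)signum;
    shutdown_requested = 1;     /* the only thing the handler does */
}

/* Returns 0 on success, -1 if the handler could not be installed. */
static int install_handler(void)
{
    return signal(SIGTERM, on_sigterm) == SIG_ERR ? -1 : 0;
}
```

Code that is missing the qualifier may work by accident at one optimization level and break at another, which would fit the behaviour described here.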

Jose · Oct 26, 2021

Signpost up ahead - your next stop the optimized code twilight zone.