Solved "Options" objects in C

zirias@ · May 21, 2023

Bobi B. said:
Later, in code, you can handle compatibility gracefully.

Yes, as already mentioned above ... although I'm not sure how exactly it could go wrong in practice, this is non-conforming code. Strictly speaking, passing an older/shorter version of the struct means passing a pointer to an incompatible type and all that C has to say about that is: Undefined behavior.

Bobi B. said:
Or you can use shared library versioning (isn't this what it is for?)

Sure you handle breaking changes with library versioning. Typically, the major version number is part of your SONAME (so the dynamic linker looks for it), then all you have to do on a breaking change is bump the major version.

But: You certainly don't want to do that for every added feature. It requires all consumers to at least re-build. I really want to avoid breaking the ABI unless it's really necessary.

That said, I'm almost done implementing my idea above (fully opaque options objects, but keep just one thread-local instance of each). I guess that will be quite usable, let's see.
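A minimal sketch of what "fully opaque options objects with one thread-local instance each" could look like. All names here (`DemoOpts_*`, `Demo_*`) are hypothetical and not taken from the actual library, and the sketch assumes a C11 compiler for `_Thread_local`:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is not the poser API. */

/* Private to the library: consumers never see this layout, so members
 * can be appended in a later version without changing the ABI. */
struct DemoOpts
{
    int port;
    int tls;
    char bindhost[64];
};

/* Exactly one options instance per thread (C11). */
static _Thread_local struct DemoOpts opts;

/* Reset the calling thread's options to defaults and set the port. */
void DemoOpts_init(int port)
{
    memset(&opts, 0, sizeof opts);
    opts.port = port;
}

/* Store a (possibly truncated) copy of the host name. */
void DemoOpts_bind(const char *host)
{
    assert(host);
    strncpy(opts.bindhost, host, sizeof opts.bindhost - 1);
    opts.bindhost[sizeof opts.bindhost - 1] = 0;
}

void DemoOpts_enableTls(void)
{
    opts.tls = 1;
}

typedef struct Demo
{
    struct DemoOpts cfg;    /* private copy taken at creation time */
} Demo;

/* Create an object from whatever the calling thread configured so far. */
Demo *Demo_create(void)
{
    Demo *self = malloc(sizeof *self);
    if (!self) return 0;
    self->cfg = opts;       /* copy, so later option changes don't leak in */
    return self;
}

int Demo_port(const Demo *self) { return self->cfg.port; }
int Demo_tls(const Demo *self) { return self->cfg.tls; }
void Demo_destroy(Demo *self) { free(self); }
```

Because `Demo_create()` copies the options, calling `DemoOpts_init()` again afterwards does not affect objects that were already created.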

zirias@ · May 22, 2023

Hm, guess this turns out actually quite readable. E.g., here's some startup function when using the library ... the RunOpts configure the behavior of both Daemon_run() and Service_run() (which is called later inside daemonrun()):

C:

SOLOCAL int Tlsc_run(const Config *config)
{
    cfg = config;

    PSC_RunOpts_init(daemonrun, 0, Config_pidfile(cfg));
    PSC_RunOpts_runas(Config_uid(cfg), Config_gid(cfg));

    if (Config_verbose(cfg))
    {
        PSC_Log_setMaxLogLevel(PSC_L_DEBUG);
    }
    if (Config_daemonize(cfg))
    {
        PSC_Log_setSyslogLogger(LOGIDENT, LOG_DAEMON, 1);
    }
    else
    {
        PSC_RunOpts_foreground();
        PSC_Log_setFileLogger(stderr);
    }

    return PSC_Daemon_run();
}

kpedersen · May 22, 2023

It looks like a decent solution from a tech point of view. I think maybe developers would like that API.

Me personally, for a library, I am a little bit creeped out by the "apparently global" config options. I wouldn't be able to assume they are thread local just by looking at them.

What you don't want is your users to start to work around an (incorrectly) perceived state driven design (read: ratty OpenGL) API via weirdness like:

Code:

TcpClientOpts_enableTls(0, 0); // no client certificate needed
Connection *client = Connection_createTcpClient();
TcpClientOpts_enableTls(0, 1); // reset state

Or worse, they just assume your API isn't thread-safe. But again, that could just be me. I don't recall seeing similar in the wild outside of OpenGL (which isn't thread safe), but perhaps this is more common.

zirias@ · May 22, 2023

kpedersen, I guess that's a job for a sane API documentation (which is still a TODO here ...).

I'm even re-thinking whether I need thread-local at all. In all my own services, I wouldn't need it, because all these objects are only created on a single thread anyways (and you can't create them on worker threads from the pool anyways, another thing that needs clear documentation).

But then, it would be possible to launch completely separate main event loops from multiple threads (manually created). Not sure why anyone would ever want to do that, but then, these config objects would indeed need to be thread-local.

zirias@ · May 22, 2023

zirias@ said:
it would be possible to launch completely separate main event loops from multiple threads

Just challenged that, computer says no.

It could be supported by making more "global" state thread-local. But I currently really don't see how it would be useful.

In a nutshell, my code is thread-safe where it has to be (which is mostly passing jobs to the threadpool and getting their results back ... you can even create a new job from within a worker thread, this is needed sometimes). But most of the framework objects must live on the "main thread" anyways, which is a consequence of the general design (a huge event loop for asynchronous request processing on a single thread).

So, in general, I think thread-local options objects might be a suitable way. For my specific case here, I guess I can indeed even drop the thread-local. The important thing will be to properly document the thread model, so consumers know what they can/should do and what not ....

Crivens · May 23, 2023

Maybe this is of help?

zirias@ · May 23, 2023

It starts to take form ... here's a little TLS-enabled "hello server" using the new lib interface:

C:

#include <poser/core.h>

static const uint8_t hello[] = "Hello!\n";

void sent(void *receiver, void *sender, void *args)
{
    (void)receiver;
    (void)args;

    PSC_Connection_close(sender, 0);
}

void newclient(void *receiver, void *sender, void *args)
{
    (void)receiver;

    PSC_Server *server = sender;
    PSC_Connection *client = args;

    PSC_Connection_write(client, hello, sizeof hello - 1, client);
    PSC_Event_register(PSC_Connection_dataSent(client), 0, sent, 0);
}

int service(void *data)
{
    (void)data;

    PSC_Service_init();

    PSC_ThreadOpts_init(8);
    PSC_ThreadPool_init();

    PSC_TcpServerOpts_init(8080);
    PSC_TcpServerOpts_bind("localhost");
    PSC_TcpServerOpts_enableTls("/tmp/cert/cert.pem", "/tmp/cert/key.pem");
    PSC_Server *server = PSC_Server_createTcp();
    PSC_Event_register(PSC_Server_clientConnected(server), 0, newclient, 0);

    int rc = PSC_Service_run();

    PSC_Server_destroy(server);
    PSC_ThreadPool_done();
    PSC_Service_done();
    return rc;
}

int main(void)
{
    PSC_Log_setFileLogger(stderr);
    PSC_RunOpts_init(service, 0, 0);
    PSC_RunOpts_foreground();
    return PSC_Daemon_run();
}

edit ... and yes, it works

Code:

$ socat OPENSSL:localhost:8080,verify=0 STDOUT
Hello!
$

JAW · May 24, 2023

kpedersen said:
Me personally, for a library, I am a little bit creeped out by the "apparently global" config options. I wouldn't be able to assume they are thread local just by looking at them.

I also share this sentiment from kpedersen.

Personally, if I were a user of the lib I think it would be more straightforward if the lib used "normal" opaque objects (as the OP originally proposed), even if under the hood that pattern requires some additional overhead (allocs, extra function calls/boilerplate). Most of the time these overheads are in fact negligible unless it can be proven otherwise with real world use cases and profiling.

Also, sometimes I tend to provide some extra "constructors" for common use cases e.g.

C:

MyOptions *MyOptions_Create(void);
MyOptions *MyOptions_CreateWithDimensions(int width, int height);

etc...

zirias@ · May 24, 2023

JAW said:
I also share this sentiment from kpedersen.

Well, a "sentiment" alone isn't too convincing ...

JAW said:
additional overhead (allocs, extra function calls/boilerplate). Most of the time these overheads are in fact negligible

Just to make it clear, I'm much more bothered by the boilerplate in consumer code than by possible runtime drawbacks. I think in the tiny lab example above, the code looks cleaner and more straight-forward. E.g. compare these two functions creating a server:

C:

PSC_Server *createServer(void)
{
    PSC_TcpServerOpts_init(8080);
    PSC_TcpServerOpts_bind("localhost");
    PSC_TcpServerOpts_enableTls("/tmp/cert/cert.pem", "/tmp/cert/key.pem");
    return PSC_Server_createTcp();
}

vs

C:

PSC_Server *createServer(void)
{
    PSC_TcpServerOpts *opts = PSC_TcpServerOpts_create(8080);
    PSC_TcpServerOpts_bind(opts, "localhost");
    PSC_TcpServerOpts_enableTls(opts, "/tmp/cert/cert.pem", "/tmp/cert/key.pem");
    PSC_Server *server = PSC_Server_createTcp(opts);
    PSC_TcpServerOpts_destroy(opts);
    return server;
}

As long as PSC_Server copies everything it needs from the options, there's no technical issue making it static. If it's needed on multiple threads, thread_local will immediately solve that as well. I see of course how the second form is what someone used to "object-oriented C" will immediately understand...

Still I'm thinking about actual (technical) drawbacks of this approach of course. So far, I found one thing: With static options objects, a consumer can't store some fully configured options object to create multiple objects from it later. I'm not sure yet, but this might be a reason to go back to the "full boilerplate" version.

zirias@ · May 24, 2023

Well, thanks for all the input so far. I think I found a solution now I'm satisfied with:

I have some "static classes" that need flexible (and possibly extensible) options. For these, I'll use static options classes. As they all need to be used on the main thread anyways, there isn't even a need for thread_local.
Options objects for objects constructed at runtime will need to be instantiated (and therefore also destroyed). Although this is not strictly necessary, it enables reuse of them and (more importantly) looks like what people will expect.
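The reuse argument for instantiated options objects can be sketched as follows. The names (`SrvOpts_*`, `Srv_*`) are made up for illustration and are not the library's API; in a real library both struct definitions would be private:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names; a sketch, not the poser implementation. */
typedef struct SrvOpts { int port; int tls; } SrvOpts;
typedef struct Srv { int port; int tls; } Srv;

/* Allocate an options object with defaults and the given port. */
SrvOpts *SrvOpts_create(int port)
{
    SrvOpts *self = calloc(1, sizeof *self);
    if (self) self->port = port;
    return self;
}

void SrvOpts_setPort(SrvOpts *self, int port) { self->port = port; }
void SrvOpts_enableTls(SrvOpts *self) { self->tls = 1; }
void SrvOpts_destroy(SrvOpts *self) { free(self); }

/* The server copies what it needs, so the options object may be
 * modified, reused or destroyed right after this call. */
Srv *Srv_create(const SrvOpts *opts)
{
    assert(opts);
    Srv *self = malloc(sizeof *self);
    if (!self) return 0;
    self->port = opts->port;
    self->tls = opts->tls;
    return self;
}

int Srv_port(const Srv *self) { return self->port; }
int Srv_tls(const Srv *self) { return self->tls; }
void Srv_destroy(Srv *self) { free(self); }
```

A consumer could configure TLS once, create a first server, change only the port with `SrvOpts_setPort()`, and create a second server from the same object before destroying it. That is the kind of reuse a single static options instance cannot offer.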

Not really related here, but I eliminated the need for boilerplate somewhere else, by providing a configurable and simple standard method to launch a service.

With these changes, my simple "hello server" now changed to this code:

C:

#include <poser/core.h>
#include <stdlib.h>

static PSC_Server *server;
static const uint8_t hello[] = "Hello!\n";

void sent(void *receiver, void *sender, void *args)
{
    (void)receiver;
    (void)args;

    PSC_Connection_close(sender, 0);
}

void newclient(void *receiver, void *sender, void *args)
{
    (void)receiver;
    (void)sender;

    PSC_Connection *client = args;

    PSC_Connection_write(client, hello, sizeof hello - 1, client);
    PSC_Event_register(PSC_Connection_dataSent(client), 0, sent, 0);
}

void startup(void *receiver, void *sender, void *args)
{
    (void)receiver;
    (void)sender;

    PSC_TcpServerOpts *opts = PSC_TcpServerOpts_create(8080);
    PSC_TcpServerOpts_bind(opts, "localhost");
    PSC_TcpServerOpts_enableTls(opts,
            "/tmp/cert/cert.pem", "/tmp/cert/key.pem");
    server = PSC_Server_createTcp(opts);
    PSC_TcpServerOpts_destroy(opts);

    if (!server)
    {
        PSC_EAStartup_return(args, EXIT_FAILURE);
        return;
    }
    PSC_Event_register(PSC_Server_clientConnected(server), 0, newclient, 0);
}

void shutdown(void *receiver, void *sender, void *args)
{
    (void)receiver;
    (void)sender;
    (void)args;

    PSC_Server_destroy(server);
}

int main(void)
{
    PSC_RunOpts_enableDefaultLogging(0);
    PSC_RunOpts_foreground();
    PSC_Event_register(PSC_Service_prestartup(), 0, startup, 0);
    PSC_Event_register(PSC_Service_shutdown(), 0, shutdown, 0);
    return PSC_Service_run();
}

A "non-lab" example can be seen in the updated code of security/tlsc here: https://github.com/Zirias/tlsc/blob/master/src/bin/tlsc/tlsc.c (I'll release a new version of it as soon as I added a few more features to the lib and finally documented its API).

Thanks again for the input!

Jose · May 25, 2023

zirias@ said:
Still I'm thinking about actual (technical) drawbacks of this approach of course. So far, I found one thing: With static options objects, a consumer can't store some fully configured options object to create multiple objects from it later. I'm not sure yet, but this might be a reason to go back to the "full boilerplate" version.

I work near a large C++ application, and the folks who work on that are busy trying to remove all static initialization. The reason why is that it's impossible to do lazily. It's a user-interactive process, and they want to be responsive to the user as quickly as possible and so defer initialization of non-essential components for as long as possible. It also runs on resource-constrained hardware so throwing CPU at it is not an option.

zirias@ · May 25, 2023

Jose said:
I work near a large C++ application, and the folks who work on that are busy trying to remove all static initialization. The reason why is that it's impossible to do lazily.

That's a well-known issue, but I think it only really applies to C++. Both languages "default-initialize" static objects on startup, but that just means filling the segment where they are stored with 0-bytes, which does not take any relevant time. And if you want, you can have static initializers with just fixed plain values.

But in C++, for objects of a class with a non-trivial static constructor, this is also executed directly on startup, and sure, if that's a costly thing, here's the problem.

C doesn't know constructors, so if you need any non-trivial initialization, you have to call it manually anyways.
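To illustrate the C side of this: a small sketch (with a hypothetical `Settings` struct) showing that objects of static storage duration start out zeroed, that static initializers must be constant values, and that anything beyond that is an explicit call:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example type, only for illustration. */
struct Settings
{
    int verbose;
    const char *pidfile;
    double timeout;
};

/* No initializer: static storage duration guarantees zero/null members. */
static struct Settings defaults;

/* Constant initializer, fixed at build time; members not named here
 * (pidfile) are still zero/null. */
static struct Settings preset = { .verbose = 1, .timeout = 2.5 };

/* Returns 1 if the object without an initializer is all-zero. */
int Settings_defaultsAreZero(void)
{
    return defaults.verbose == 0
        && defaults.pidfile == NULL
        && defaults.timeout == 0.0;
}

/* Anything non-trivial has to be an explicit call made by the program;
 * nothing runs implicitly before main(). */
void Settings_setup(const char *pidfile)
{
    assert(pidfile);
    preset.pidfile = pidfile;
}

const struct Settings *Settings_preset(void)
{
    return &preset;
}
```

Since no code has to run for either object before `main()`, there is nothing here that could be deferred.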

But then, it's nothing to worry about for these "options objects", they're just simple/dumb data structures. The reason I don't want to expose the structures really is to allow adding features without breaking the library ABI.

Crivens · May 25, 2023

zirias@ said:
That's a well-known issue, but I think it only really applies to C++. Both languages "default-initialize" static objects on startup,

Errr, no.
consider this:

Code:

if (condition) {
  static ComplexClass WhatsItsName(ComplexConstructorArgs);
  ...

The static variable is initialized when first reached by control flow in C++, and the compiler has to keep track and generate tracking code that this is done only once. Did someone say thread safety? Yes, this is one of the absolute heisenbugs to find. And once you have seen constructors where the *beeping beep BEEP* PROTOTYPE has a signature of 1400 lines!!!, you can see that the window can be wide open for the midden to hit the windmill in one run, and maybe never again.

kpedersen · May 25, 2023

Crivens said:
static variable

For C++ I tend to use the following static analysis program:

Code:

echo "You have $(grep -R static * | wc -l) bugs in your project"

zirias@ · May 26, 2023

Crivens said:
The static variable is initialized when first reached by control flow in C++,

I think you're confusing something here.

First, "default initialization" has nothing to do with constructors at all. It just sets objects of static storage duration to zero. This ALWAYS happens before even entering main(), in both C and C++.

Second, I'm pretty sure this example is not what Jose talked about. You're showing a weird case here, a variable of static storage duration, but locally scoped. Then, indeed, its constructor will only run when program execution first enters its scope. I think people writing this kind of code should be shot. What I was referring to was static storage duration in global scope (the only somewhat safe way to do it), and then, there's indeed no way to do that lazily.

BTW, reminds me yet again why I think C++ should never be used ....

Crivens · May 26, 2023

zirias@ said:
First, "default initialization" has nothing to do with constructors at all. It just sets objects of static storage duration to zero. This ALWAYS happens before even entering main(), in both C and C++

That is usually done by placing it in .bss by the linker. So this happens even before the first instruction runs. Once the program starts, the global constructors run for things that are not compile time constants, which C++ allows. Here the problem of ordering comes up. It's a mess.

zirias@ · May 26, 2023

Crivens said:
That is usually done by placing it in .bss by the linker. So this happens even before the first instruction runs.

.bss is a "virtual" segment (it doesn't exist in the binary image), so some code has to zero it. Many C runtimes do exactly that before calling main(). Having the linker or even the OS provide zeroed memory would be fine as well. The language standards say nothing about the how, they just require objects of static storage duration being "default initialized" (or, if a static initializer is present, statically initialized) before main() is entered.

Crivens said:
Once the program starts, the global constructors run for things that are not compile time constants, which C++ allows. Here the problem of ordering comes up.

Yes, that's this further step that comes with non-trivial "construction" in C++. As long as these objects are in global scope and have no mutual dependencies (and none of them will fire up threads), it can be handled. I just said what you showed in your example is a lot worse, a locally scoped object of static storage duration with non-trivial construction.

Crivens said:
It's a mess.

Always one of my thoughts when discussing or looking at C++.

zirias@ · May 26, 2023

Well, I was confident enough about the API design now to actually take the time and document every public piece (horrible work, but will even be helpful to me in the future).

poser: poser – a C framework for POsix SERvices

kpedersen · May 26, 2023

Very cool. I was wondering what was the rationale behind "poser" as a name. Now I know.

Now there is just one thing left for you to do:

You should rewrite it in Rust (TM)

Jose · May 26, 2023

zirias@ said:
You're showing a weird case here, a variable of static storage duration, but locally scoped.

This was a super common idiom in C libraries back before threads became prevalent. Now doing it in C++...
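For context, a sketch of that idiom in C, next to a thread-safe variant. The function names are invented for illustration; the second variant assumes POSIX threads and uses `pthread_once()`:

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

/* Classic pre-threads idiom: a locally scoped static object that is
 * filled in on first use. The unsynchronized flag check is a data race
 * as soon as several threads may call this function. */
const char *Demo_banner(void)
{
    static char banner[32];
    static int initialized;

    if (!initialized)
    {
        strcpy(banner, "demo 1.0");
        initialized = 1;
    }
    return banner;
}

/* Thread-safe variant: pthread_once() guarantees that the init routine
 * runs exactly once, even with concurrent first calls. */
static char safebanner[32];
static pthread_once_t banneronce = PTHREAD_ONCE_INIT;
static int initcalls;

static void initbanner(void)
{
    strcpy(safebanner, "demo 1.0");
    ++initcalls;
}

const char *Demo_bannerSafe(void)
{
    int rc = pthread_once(&banneronce, initbanner);
    assert(rc == 0);
    (void)rc;   /* avoid an unused warning when NDEBUG is defined */
    return safebanner;
}

/* How often the init routine actually ran (for demonstration only). */
int Demo_initCalls(void)
{
    return initcalls;
}
```

Depending on the platform, this may need to be built with `-pthread`.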

zirias@ · May 26, 2023

kpedersen said:
Very cool. I was wondering what was the rationale behind "poser" as a name. Now I know.

Hehe, every open-source project is instantly better when it has some silly name.

kpedersen said:
Now there is just one thing left for you to do:

There was something left to do: Hunt down some bug/regression I introduced while "improving" my code. Seems to be fixed now, some member initialization was missing. On the way to find it, I fixed quite some other subtle bugs ....

So now there is something left to do: Test this in realistic usage for at least some days, so I can be confident I found all the newly introduced bugs or regressions (and then: release it!)

kpedersen said:
You should rewrite it in Rust (TM)

Haha, what a horrible troll!

zirias@ · May 27, 2023

zirias@ said:
Test this in realistic usage for at least some days,

Made a lot of progress now on the "testing" part:

First, I tested the main new feature (TLS server support) a lot using dedicated test programs (of course also doing all sorts of "weird" things, hehe).

Then, I updated https://github.com/Zirias/tlsc to use the posercore library and have been using the updated version for my real use case for a while now. After finding and fixing some bugs, it now works reliably. I also used that to test the improved "client certificate" feature, by connecting to libera irc with a valid certificate identifying some nick.

Finally, I also updated https://github.com/Zirias/remusock to use the new lib. That was the project where almost 3 years ago, most of this code was initially created (only the logging and event-handling parts existed earlier). It's a special case, because the actual "protocol" there is one giant monster module which I even felt bad about back then, and now, when looking at it, I can directly spot bugs ... and it really stands out now that all the other code (of better quality) is gone – so, work left to do there, probably reimplement it more or less from scratch. Anyways, this is a nice test case for some more complex usage which also makes use of "local UNIX" connections. And so far, the updated tool also works reliably for my use case.

So, all in all, I think I can indeed release a first version of "poser" (only including the "posercore" lib) quite soon!

Of course, it would be very nice to automate testing .... but .....

Testing this kind of code "in isolation" (aka: unit testing) seems close to impossible. Or how would you ever mock e.g. the behavior of POSIX/BSD sockets?
Automating "integration tests" (that e.g. include a real network communication) seems also pretty tough ...
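One commonly used middle ground, sketched here under the assumption that the code under test only needs a connected stream socket: `socketpair()` provides two connected local sockets inside the test process, so no real network is involved. The function names are hypothetical and unrelated to the library:

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Code under test: writes a greeting to any connected stream socket. */
ssize_t Demo_greet(int fd)
{
    static const char hello[] = "Hello!\n";
    assert(fd >= 0);
    return write(fd, hello, sizeof hello - 1);
}

/* Test helper: returns 0 if the peer end receives exactly the greeting,
 * -1 otherwise. Uses a local socket pair instead of a real connection. */
int Demo_testGreet(void)
{
    int sv[2];
    char buf[16] = {0};

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) return -1;

    int ok = Demo_greet(sv[0]) == 7
        && read(sv[1], buf, sizeof buf - 1) == 7
        && strcmp(buf, "Hello!\n") == 0;

    close(sv[0]);
    close(sv[1]);
    return ok ? 0 : -1;
}
```

This does not cover TLS handshakes, name resolution or error paths of real network connections; it only shows that plain read/write logic can be exercised without mocking the socket API.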
