C sysctl equivalent/alternative to access a userland program's state?

... or, in other words, is there a ready-made or easy way of accessing the state of a running program, and possibly reconfiguring it, similar to the way sysctl(3) does for kernel code?

Recently I stumbled upon the Linux Appliance Design book, where the authors came up with a pretty clever way of interacting with a running program: they wrote a little library that emulates a PostgreSQL server, which one can connect to with psql; this way you can see your program's state, and change it as well, from any language/platform where a PostgreSQL connector is available. Basically you, the programmer, choose which program state to export and which toggles may be changed at runtime. Later someone (a developer, a tester, an interface program) can access that state and change it at runtime. The book is solid and well thought out; IMHO a good read (and, surprisingly, they're giving it away for free).

Nevertheless, I've been wondering what the alternatives are. I know one can always sit down and write something: a socket or two, a message queue, an HTTP interface, whatever. But does such a project/library already exist out there to fill that gap? What would you use?
 
This is an old question, in two senses: you asked it two weeks ago and got no answer, and anyone who writes daemons or servers has had to grapple with it for ages. The term IPC, or Inter-Process Communication, is one of the buzzwords describing it.

Let's begin with a non-answer, to show the pitfalls: You could find out where the "variables" of the program are, by examining the load map of the program, and then you could read/write binary data right into it. There are several ways of doing it; one is to use the same technique that debuggers use to run programs (read the source code to gdb to find out how, ha ha), or just run the program with all of its memory coming from a shared memory pool (for example using SysV shmem), and have a cooperating program reach in there.

The reason this is insane is: the state of a program is typically much more complex than a single variable. If you reach in and change one integer, string or floating-point value, you probably leave the original program in an inconsistent state, causing it to "crash" soon after (here when I say "crash" I don't just mean dump core and exit, but perhaps act weirdly, or do things that it shouldn't do). By having a second program reach into the first one with shared memory, you just created a multi-threaded monster, with all the attendant problems of synchronization, locking, state consistency, atomicity, and so on.
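To make the shared-memory variant (and its hazard) concrete, here is a minimal sketch in Python; the segment name my_prog_state and the single "knob" field are made up for the example, and note that there is deliberately no locking of any kind, which is exactly the inconsistent-state problem described above:

    import struct
    import time
    from multiprocessing import Process, shared_memory

    def poker():
        # A second process attaches to the segment by name and pokes a value
        # into it, with no coordination with the owner at all.
        shm = shared_memory.SharedMemory(name="my_prog_state")
        struct.pack_into("q", shm.buf, 0, 42)
        shm.close()

    if __name__ == "__main__":
        # The "real" program keeps one of its knobs in a named shared segment.
        shm = shared_memory.SharedMemory(name="my_prog_state", create=True, size=8)
        struct.pack_into("q", shm.buf, 0, 0)
        Process(target=poker).start()
        for _ in range(3):
            (knob,) = struct.unpack_from("q", shm.buf, 0)
            print("knob is now", knob)
            time.sleep(1)
        shm.close()
        shm.unlink()

For a single integer this happens to work; the moment the shared state is a linked structure or several fields that must change together, the second process can observe or create half-updated state.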

The reason there is no single cut-and-dried answer is that it is complicated, which means there is no one-size-fits-all answer. I think the biggest factor in designing an answer is: how much internal state does the program have, compared to how much state needs to be passed into or out of it? If the program is internally very complex (gigabytes of interwoven data structures being used by hundreds of internal threads), and only needs minimal adjustment from outside (like a one-byte adjustment knob that goes from 0 to 255 and only gets changed once an hour), a very simple technique could work: have your program re-read a single-byte configuration file at a well-known location like /etc/myprog.conf every second, and you're done. At the other extreme, if your program is really simple and has little internal state (like the more program), then just design it to exit immediately after each task, and restart it each time, perhaps with different parameters.
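A minimal sketch of that re-read-the-config-file idea in Python; the path /etc/myprog.conf and the knob variable are just the example names from above:

    import threading
    import time

    CONF_PATH = "/etc/myprog.conf"   # well-known location (example name)
    knob = 0                         # the single adjustment knob, 0..255

    def watch_config():
        """Re-read the one-byte config file every second and update the knob."""
        global knob
        while True:
            try:
                with open(CONF_PATH, "rb") as f:
                    data = f.read(1)
                if data:
                    knob = data[0]          # one byte, interpreted as 0..255
            except OSError:
                pass                        # missing/unreadable file: keep old value
            time.sleep(1)

    threading.Thread(target=watch_config, daemon=True).start()
    # ... the rest of the program just consults `knob` whenever it needs the setting ...

The charm of this is that "reconfiguring the running program" is then nothing more than echo-ing a byte into a file.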

Unfortunately, the real problems are in the middle. Here is what I'm doing these days: I find that most real-world problems require multi-threaded solutions, because doing IO is too slow. That means any program needs to be designed around common state, how to lock that state, how to change it consistently and atomically under lock, and which thread does what. At that point, it's relatively easy to add one extra thread that listens for commands and gives out information. That extra thread then needs a communication interface.

For a while I used named pipes in well-known locations (one for command input, one for status output), and a small second user-interface program that talks to the human from a command line and uses the pipes to send/receive stuff to and from the main program. Since I program mostly in Python at home, the way to transmit complex information is to send whole Python objects, using the "pickle" mechanism to serialize them into blocks of bytes. This kind of works, but has restrictions: the second program has to be on the same machine, and you can only have one of them running at a time. And Python's pickle has a zillion little problems that keep annoying me.

One of the holiday vacation projects I want to tackle is to replace that with a network socket and a protocol, and then have a thread per socket in my main program to handle commands and status. That seems easy, but it immediately opens the question of how to implement the network protocol. Having done this professionally before, I know the complexities and pitfalls, which is why I really don't want to do it myself. So I'm planning to use the open-source gRPC mechanism (with both ends implemented in Python) next time.
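As a rough sketch of the "one extra thread that listens for commands" idea, here is what the socket version could look like in Python; the port number, the get/set command names and the state dictionary are invented for this example, and a real program would replace the ad-hoc line protocol with something like gRPC-generated stubs:

    import socketserver
    import threading

    # Shared state of the "real" program; the lock is the same one the worker
    # threads use whenever they touch this state.
    state_lock = threading.Lock()
    state = {"knob": 0, "requests_served": 0}

    class ControlHandler(socketserver.StreamRequestHandler):
        """One thread per connection; a tiny line-based get/set protocol."""
        def handle(self):
            for raw in self.rfile:
                parts = raw.decode().split()
                with state_lock:
                    if parts[:1] == ["get"] and len(parts) == 2:
                        reply = str(state.get(parts[1], "unknown"))
                    elif parts[:1] == ["set"] and len(parts) == 3:
                        state[parts[1]] = int(parts[2])
                        reply = "ok"
                    else:
                        reply = "error"
                self.wfile.write((reply + "\n").encode())

    class ControlServer(socketserver.ThreadingTCPServer):
        allow_reuse_address = True

    def start_control_thread(port=9999):
        server = ControlServer(("127.0.0.1", port), ControlHandler)
        threading.Thread(target=server.serve_forever, daemon=True).start()
        return server

    if __name__ == "__main__":
        start_control_thread()
        # ... the real program's worker threads run here, sharing `state` ...
        threading.Event().wait()   # keep the demo alive

With that in place, something as dumb as "nc localhost 9999" and typing "get knob" or "set knob 42" already gives you a human interface, and several clients can be connected at once, which is exactly the restriction the named-pipe version couldn't lift.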
 