What happens to a process when it can't keep up with writing logs?

Consider the following scenario: I placed httpd logs on a very slow device, while the rest of the system is quite fast. And I start serving lots of pages, very fast, so the log partition just can't keep up, i.e. serving a page completes in far less time than writing a single log entry. What happens to the process? Will it wait until the line(s) have been written before serving another request? Do different packages handle this differently (mail, web, db...)? What will the difference be if I use syslogd logging for that process vs native logging? Does the filesystem matter at all (speed of handling files notwithstanding)?
 
Some packages handle logging on their own (I believe apache does its own logging directly), while others send it through the system's logging facility. Either way, I assume that if the logging system can't keep up, some messages may end up being dropped.
 
Using syslog for logging:
The program sends a chunk of memory over to another process, basically over a socket on localhost. That is effectively a "zero copy": a buffer from process A gets pushed to syslogd. Once that hand-off completes successfully, process A can continue. At some point there's a context switch, syslogd reads the buffer and writes it to the device. Process A is still running (if you ignore context switches), so syslogd is not slowing it down.
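For reference, this is what that hand-off looks like from the program's side; a minimal sketch using the standard syslog(3) API ("myservice" and the message are placeholders):

#include <syslog.h>

int main(void)
{
    /* Open the connection to the local syslogd; LOG_NDELAY opens
     * the socket immediately rather than on the first message. */
    openlog("myservice", LOG_PID | LOG_NDELAY, LOG_DAEMON);

    /* The formatted message is handed to syslogd over the local
     * socket; syslogd, not this process, writes it to the device. */
    syslog(LOG_INFO, "request served in %d ms", 5);

    closelog();
    return 0;
}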

If process A has a dedicated thread to log to a device, that thread runs in the same process space, so it would likely interrupt the current "processing" thread. But you'd still have a similar pattern: a buffer handed from here to there, and "there" writes to the device. If a program creates multiple threads, they run in the same process space, so there's very little copying of buffers, just a hand-off, and not much of a performance hit.

The thread/process that does the physical writes to the slow device will be slow. Everything else should be fine until the producers (the threads logging) overwhelm the consumer (the thread writing to the device). The consumer can't process the next input until it finishes writing to the device; a sketch of that hand-off follows.
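A minimal sketch of such a producer/consumer hand-off, assuming pthreads and FreeBSD's strlcpy(3); all names here are hypothetical. Producers only block once the consumer, which alone touches the slow device, falls behind:

#include <pthread.h>
#include <string.h>
#include <unistd.h>

#define QSIZE 64

static char queue[QSIZE][256];
static int head, tail, count;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t notfull = PTHREAD_COND_INITIALIZER;
static pthread_cond_t notempty = PTHREAD_COND_INITIALIZER;
static int logfd;                 /* fd of the (slow) log device */

/* Called by worker threads: hand the line off, no device I/O here.
 * Blocks only when the queue is full, i.e. when the producers have
 * overwhelmed the consumer. */
void log_line(const char *line)
{
    pthread_mutex_lock(&lock);
    while (count == QSIZE)
        pthread_cond_wait(&notfull, &lock);
    strlcpy(queue[tail], line, sizeof queue[tail]);
    tail = (tail + 1) % QSIZE;
    ++count;
    pthread_cond_signal(&notempty);
    pthread_mutex_unlock(&lock);
}

/* The dedicated logging thread: the only place that touches the
 * slow device, so only this thread ever waits on it. */
void *logger(void *arg)
{
    (void)arg;
    for (;;) {
        char line[256];
        pthread_mutex_lock(&lock);
        while (count == 0)
            pthread_cond_wait(&notempty, &lock);
        strlcpy(line, queue[head], sizeof line);
        head = (head + 1) % QSIZE;
        --count;
        pthread_cond_signal(&notfull);
        pthread_mutex_unlock(&lock);
        write(logfd, line, strlen(line));   /* slow: blocks here only */
    }
    return NULL;
}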
 
Either way, I assume that if the logging system can't keep up, some messages may end up being dropped.
There's a loop in syslogd's fprintlog_write function that deals with failed writes. What happens depends on the target of the write (e.g. socket or file) and the value of errno(2). There is some effort to recover from transient errors, but non-transient errors will generally cause messages to be dropped (and, in some cases, log files to be abandoned).
 
It depends on the architecture of the software. For writing a log message, there's always an I/O operation involved, either over the network or a local socket (e.g. in the case of syslog, where the "problem" with the slow disk becomes syslog's problem), or directly on disk. For simplicity, let's assume we're writing directly to disk.

Then, the simplest method would be a blocking I/O call, like write(2). If this is done inside the thread serving the request, this thread would just have to wait until the log message is actually written. So, one possible result of your scenario would be that the service will just slow down.
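In code, that naive design is simply this (a sketch; handle_request and logfd are placeholders):

#include <stdio.h>
#include <unistd.h>

void handle_request(int fd);      /* placeholder for the actual work */
extern int logfd;                 /* fd of the log file on the slow device */

/* Naive design: the thread serving the request also writes the log
 * line, so a slow log device directly slows down every request. */
void serve(int request_fd)
{
    char logline[256];
    handle_request(request_fd);                    /* fast */
    int n = snprintf(logline, sizeof logline,
                     "served fd %d\n", request_fd);
    write(logfd, logline, n);     /* may block on the slow device */
}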

Of course, that would be a bad design, because even if logging is fast, you don't want to wait for it while serving client requests. That's just unnecessary; writing the log message isn't essential for the job of serving the client. So, in a better design, logging is somehow offloaded to other threads. One possibility would be to pass jobs to a thread pool, which is what I did in remusock. The number of threads is limited, and there's a queue for waiting jobs that's also limited. If all threads are busy and the queue is full, the job just won't be queued, so this log message will never be written. In general, that's the other possible result of your scenario: you will experience missing log messages. IMHO, that's much more likely if the service is well-written.
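The enqueue side of such a design could look like this (a sketch reusing the hypothetical queue declarations from the pthreads example a few posts up; instead of waiting when the queue is full, the message is dropped):

/* Non-blocking variant of log_line(): if the logger can't keep up,
 * drop the message instead of stalling the thread serving the client. */
int log_line_try(const char *line)
{
    pthread_mutex_lock(&lock);
    if (count == QSIZE) {                 /* logger overwhelmed */
        pthread_mutex_unlock(&lock);
        return -1;                        /* message lost, by design */
    }
    strlcpy(queue[tail], line, sizeof queue[tail]);
    tail = (tail + 1) % QSIZE;
    ++count;
    pthread_cond_signal(&notempty);
    pthread_mutex_unlock(&lock);
    return 0;
}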
 
Thanks for the replies, you all. So basically it boils down to this: if the design is good, the queue overflows and messages are simply discarded; otherwise things slow down. I would guess that it's universally the former approach (good design), because otherwise the service would completely stall if something happened to the logging device, which is especially relevant if it's on a network. Also, in the case of syslogd there can be multiple outputs, if I remember correctly, and you can't guarantee all of them are always ready.

So in the case of syslogd, there's nothing it can potentially do (like crash and restart, or otherwise deliberately slow down responses) to interfere with the process that's using it for logging?

Oh, and the question is purely theoretical; I'm not having issues with logging at all, I just wanted to understand the architecture a bit.
 
So in the case of syslogd, there's nothing it can potentially do (like crash and restart, or otherwise deliberately slow down responses) to interfere with the process that's using it for logging?
Not sure whether I understand that question correctly, but I'll try to give an answer. The programmer will normally just use the syslog(3) function. What happens then depends on the local configuration.

Either it's configured to log to a remote server; then UDP is used and this is "fire and forget". Normally, the message will arrive and be logged, but in case of any error, nothing happens.

Or (maybe more likely), logging is done by the local syslog daemon, then the message is passed via the local logging socket. This is a bit more interesting, because two things could go wrong:
  • Syslog might have a full queue itself; then this will block until syslog can accept the message.
  • Syslog might not be running at all; then connecting to the socket will fail.
A well-written service should be resilient to both. For the first problem, see my answer above – in a good design, logging to syslog is done in separate threads, just like logging directly to a file. And of course, any errors should be ignored, because logging is non-essential for the service's job. A sketch dealing with both failure modes follows.
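For illustration only, a sketch that talks to the local log socket directly and tolerates both failure modes; it assumes FreeBSD's /var/run/log datagram socket (Linux uses /dev/log) and omits the "<priority>" prefix a real syslog message carries:

#include <sys/socket.h>
#include <sys/un.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Send one message straight to the local log socket. Any failure
 * just loses the message; the service itself never waits. */
int log_nonessential(const char *msg)
{
    struct sockaddr_un sa = { .sun_family = AF_UNIX };
    strlcpy(sa.sun_path, "/var/run/log", sizeof sa.sun_path);

    int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;                    /* can't even create a socket */
    fcntl(fd, F_SETFL, O_NONBLOCK);   /* never wait on syslogd */

    /* EAGAIN here means syslogd's queue is full: drop the message.
     * ENOENT/ECONNREFUSED means syslogd isn't running: also drop. */
    ssize_t rc = sendto(fd, msg, strlen(msg), 0,
                        (struct sockaddr *)&sa, sizeof sa);
    close(fd);
    return rc < 0 ? -1 : 0;
}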
 
Zirias, I was thinking of a configuration where messages are piped to syslog, like in the case of apache:
ErrorLog "|/usr/bin/logger -p local6.err -t httpd"
But it probably doesn't change the general concept anyway.
 
Zirias, I was thinking of a configuration where messages are piped to syslog, like in the case of apache:
ErrorLog "|/usr/bin/logger -p local6.err -t httpd"
But it probably doesn't change the general concept anyway.
This probably just means apache is using a pipe(2) to write(2) to (maybe using popen(3) to simplify the code, but that's just a detail), and you should always expect that such things might block or even fail. So, you're right, it doesn't change anything about the general concept: writing logs is non-essential, so best practice is to ignore errors and design your service so it never slows down the essential functionality.
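Something along these lines; a sketch only, assuming the popen(3) route (apache's real pipe handling is more involved, and the pipe is set up once at startup):

#include <stdio.h>

static FILE *logpipe;

/* Start the logger(1) child once, as in the ErrorLog line above. */
void open_logpipe(void)
{
    logpipe = popen("/usr/bin/logger -p local6.err -t httpd", "w");
}

/* Errors are deliberately ignored: if logger blocks or dies, the
 * log line is lost, not the service. Note fprintf() can still
 * block if the pipe buffer fills up. */
void log_error(const char *msg)
{
    if (logpipe == NULL)
        return;                     /* logging is non-essential */
    fprintf(logpipe, "%s\n", msg);
    fflush(logpipe);
}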

edit: talking about a large and widespread project like apache, I'd assume they "got it right". But I wouldn't assume that for every service/daemon out there. Writing a good daemon isn't exactly easy; it's easy to write one that works fine until something "unexpected" (like slow logging) happens. You need knowledge and experience to get it right. POSIX offering "async" semantics, except when it doesn't (like e.g. with syslog(3)), isn't too helpful either; you have to get creative working around that (e.g. by implementing your own thread pool). If you follow old tutorials for writing a daemon, you'll probably find a model that fork(2)s a process for each and every client and does blocking I/O exclusively. That's nice and simple, but not scalable (forking thousands of processes for a busy service isn't the greatest idea) and suffers from this very problem: anything slow (like logging) will slow down servicing the client…
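That classic tutorial model boils down to something like this (a sketch; handle_client is a placeholder, and SIGCHLD reaping is omitted):

#include <sys/socket.h>
#include <unistd.h>

void handle_client(int fd);   /* blocking read/write/log calls */

void serve_forever(int sfd)   /* sfd: a listening socket */
{
    for (;;) {
        int cfd = accept(sfd, NULL, NULL);
        if (cfd < 0)
            continue;
        if (fork() == 0) {    /* child: one whole process per client */
            handle_client(cfd);
            _exit(0);
        }
        close(cfd);           /* parent just keeps accepting */
    }
}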

edit2: Avoiding a fork per client doesn't necessarily mean having a thread per client. You could just go for an event-driven design (and I'd recommend doing so). POSIX offers a simple mechanism for that with e.g. pselect(2). Unfortunately, this doesn't scale to a huge number of clients either (the set of file descriptors monitored is, well, limited). If that becomes a problem, you have to resort to non-portable solutions, like e.g. kqueue(2) on FreeBSD or epoll(7) on Linux. There are libraries, like e.g. libevent, offering a portable API for all these platform-specific solutions.
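The skeleton of such an event-driven loop, using the portable pselect(2) variant (a sketch; client registration and per-client handling are elided):

#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

void event_loop(int listenfd)
{
    fd_set readfds;
    for (;;) {
        FD_ZERO(&readfds);
        FD_SET(listenfd, &readfds);
        /* ... FD_SET() every connected client fd as well ... */
        int maxfd = listenfd;

        /* One thread multiplexes everything; the only blocking is
         * the wait for *some* fd to become ready. */
        if (pselect(maxfd + 1, &readfds, NULL, NULL, NULL, NULL) < 0)
            continue;                       /* e.g. EINTR */

        if (FD_ISSET(listenfd, &readfds)) {
            int cfd = accept(listenfd, NULL, NULL);
            (void)cfd;
            /* ... register cfd and set it non-blocking ... */
        }
        /* ... serve each ready client fd without blocking ... */
    }
}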
 
Thanks for that message, explaining that "it depends". Just one addition:

Then, the simplest method would be a blocking I/O call, like write(2). If this is done inside the thread serving the request, this thread would just have to wait until the log message is actually written.
Note that a typical write() call (which is behind most logging and printing calls) does not actually wait until the data is on disk. It simply holds the written data in a kernel buffer. Writing to disk happens later, asynchronously in the kernel. Therefore, write() calls for small amounts of data tend to be very fast. Compared to all the other work a common network server has to do, writing one log line per request is a minor effect, and unlikely to be the thing that is blocking.

When I say "typical" above, I mean: If the file is opened without special flags, and the writing is done via just a write() call. It is possible to open a file in a sync mode, or to follow each write() call with an fsync() call, in which case the application (the server) will block until the data is actually on disk.
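The difference in a sketch: a plain write() that only reaches a kernel buffer, versus fsync(2) and O_SYNC, which make the caller wait for the disk (the file name is a placeholder):

#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const char *line = "GET / 200\n";

    /* Typical case: write() returns once the data is in a kernel
     * buffer; the disk write happens later, asynchronously. */
    int fd = open("access.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
    write(fd, line, strlen(line));          /* fast */
    fsync(fd);      /* synchronous: block until the data is on disk */
    close(fd);

    /* Or open in a sync mode: every write() then waits for the disk. */
    int sfd = open("access.log", O_WRONLY | O_APPEND | O_SYNC);
    write(sfd, line, strlen(line));         /* blocks until on stable storage */
    close(sfd);
    return 0;
}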
 
Right. But POSIX doesn't consider the case of the OS crashing. So after a successful write(), followed by a system crash (for example caused by a power outage or hard reset), a read() does not need to return the written data. POSIX is simply silent on that detail.

And therefore, normal file systems implement write() by just putting the written data into a kernel buffer (known as a dirty buffer). The read() code in the file system knows that it has to consult dirty buffers before going to disk, so this works correctly. The performance implications are obvious: If we were to write synchronously (the write() call has to wait for the disk drive), then on a system with just one disk, we would be able to do ~100 small writes (of 100 bytes each, typical for logging) per second; the limiting factor is the number of random seeks and rotations the disk can do. With the buffered asynchronous design in the file system, the same system could do about one million such small writes; the limiting factor would be the sequential bandwidth of the disk.
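As a rough sanity check on those figures: 100 synchronous 100-byte writes per second move only ~10 KB/s, while 10^6 buffered 100-byte writes per second amount to ~100 MB/s, which is indeed in the ballpark of one disk's sequential bandwidth. The four orders of magnitude in between are purely the cost of waiting for seeks and rotation.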
 
That's all correct, but even kernel buffers will reach a limit if the device is indeed as slow as outlined by OP's scenario ;) – and when that happens, you'll have blocking behavior. But thanks for adding that detail!

edit: I still rest my case: a well-behaved daemon/service must consider that writing logs might block, and should have a strategy to deal with that (ultimately, dropping the log message).
 
One other question I had was whether the filesystem matters at all. There are writes and reads being done at the API level, but maybe there's another layer of abstraction (the filesystem) involved that complicates this process? Or is this particular problem governed purely by raw disk speed?
 
For the most part, no, the filesystem will not have much of an effect. Though adding encryption and compression can have a noticeable effect on read/write speeds. The one area where the filesystem does have an effect is when the drive's free space runs out; hence the recommendation to keep the drive less than 80% full.
 
Consider the following scenario: I placed httpd logs on a very slow device, while the rest of the system is quite fast. And I start serving lots of pages, very fast, so the log partition just can't keep up, i.e. serving a page completes in far less time than writing a single log entry. What happens to the process? Will it wait until the line(s) have been written before serving another request? Do different packages handle this differently (mail, web, db...)? What will the difference be if I use syslogd logging for that process vs native logging? Does the filesystem matter at all (speed of handling files notwithstanding)?
All your questions depend on the design of the program. write can be blocking or non-blocking, but essentially a programmer calls it, tests the return value and errno (if required), and that is all they're concerned about. The next layer does the work. The next layer can be vfs, network, serial, etc.
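That contract, from the caller's side, looks roughly like this (a sketch: a loop that checks the return value and errno, retrying interrupts and partial writes):

#include <errno.h>
#include <unistd.h>

/* Write all of buf, handling partial writes and EINTR. The caller
 * only sees success or an errno; which layer (vfs, network, ...)
 * did the work is invisible here. */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;             /* interrupted: retry */
            return -1;                /* real error: caller decides */
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}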

Using syslog means successive duplicate records will be suppressed (the "last message repeated N times" behaviour), but otherwise it will do what any other network-writing program does: write it away and let the next layer worry about it. The same as above.

Filesystems can matter, as some are slower than others, but if you're a half-decent programmer you ensure you cater for this eventuality. Having said that, writing logs to an extremely slow device is a user problem, not a programmer's. You can't program around stupidity.
 