From my limited understanding, the big complaint with init is that it can't load processes in parallel. That seems to me like a small justification for replacing it with something far more complicated, more monolithic, and more prone to issues (aside from its poorly written code). So what's the big advantage, some milliseconds saved at boot?
This complaint about the traditional init is old, and correct. There have been research/hacking papers presented at conferences as early as the late 90s or early 2000s demonstrating that a parallel init can be much faster; often the saving is around 3x. Sometimes the difference ends up being minor, because with spinning disks system bootup may be disk limited, with the disk nearly continuously busy with random seeks. There was an interesting study (I think a PhD thesis) about improving that by grouping files on disk in the order they are accessed during boot, which also demonstrated great improvement. And in those days it often took 5 minutes for a system to boot (in particular a server with many services), so a speedup by a factor of 3 is VERY significant.
Today with SSDs (very little seek time), the landscape has changed: during booting, a system is no longer purely limited by random IO. Yet many init tasks today end up being network limited, and those latencies still exist. On the other hand, machines today have multiple cores, so running init tasks in parallel is both less important and more important than it used to be.
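To make the parallelism point concrete, here is a minimal, purely illustrative Python sketch (not how systemd or any real init works): the service names and wait times are made up, and each sleep stands in for a latency-bound startup step. When the tasks are independent, running them concurrently makes the total time close to the longest individual wait rather than the sum of all of them.

    import time
    from concurrent.futures import ThreadPoolExecutor

    # Hypothetical boot tasks; each sleep stands in for a network or device
    # wait (DHCP lease, NTP sync, NFS mount, ...), not CPU work.
    def start_service(name, wait_seconds):
        time.sleep(wait_seconds)   # most of the "work" is waiting on something else
        return name

    services = [("network", 2.0), ("ntp", 1.5), ("nfs", 2.5), ("sshd", 0.5)]

    # Serial startup: total time is the sum of all waits (~6.5 s here).
    t0 = time.time()
    for name, wait in services:
        start_service(name, wait)
    print(f"serial:   {time.time() - t0:.1f}s")

    # Parallel startup: independent waits overlap, so total time is roughly
    # the single longest wait (~2.5 s here), ignoring real dependency ordering.
    t0 = time.time()
    with ThreadPoolExecutor() as pool:
        list(pool.map(lambda s: start_service(*s), services))
    print(f"parallel: {time.time() - t0:.1f}s")

A real init of course has to respect dependencies between services, so the achievable overlap is smaller than in this toy example, but the basic effect is the same.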
So the important thing is: faster init is not a matter of "milliseconds" or "seconds"; it can amount to several minutes, and it can be the long pole in the tent for recovery time. This stuff matters in many environments.
There are way more use cases for fast booting than just embedded computing. One particularly near to my heart is kernel development: for lack of good debugging techniques, one often ends up crashing the kernel, collecting a dump, rebooting, analyzing the dump, and trying again. In the old days (say ~98), that was often a 5-10 minute cycle. In one particular case, a colleague and I got that down to 27 seconds without hardware changes, and the difference in productivity is very significant.

Another important reason for fast boot times is asynchronous servers: if a web page takes 30 seconds to load, many users (in particular internal users) will be mildly upset but not walk away; if it takes 10 minutes to load, users will change their workflow, and productivity is disrupted. Really fast boot times often make it unnecessary to have redundant servers or reliable hardware (for example for planned maintenance), if instead one can crash the server and have it come back up really fast.

One particular example of great cost savings: handling power outages. It used to be that serious computers had both UPSes (to handle short power outages) and diesel generators, which kick in after a minute of outage. Today, with fast boot times and better generators, we can often get rid of the UPSes: if the power fails, the server just crashes, within 10 seconds the diesel is up, and another 30 seconds later the server is back online. The cost saving of that is large, since UPSes (with their short-lived batteries and regular battery maintenance) are expensive. Obviously, this requires looking at the whole system; having a UEFI BIOS that takes 3 minutes to initialize the motherboard, or needing to run fsck for 5 minutes, does not work in such an environment.
Disclaimer: I have no idea whether fast boot times are actually what drove Lennart to work on systemd; I've never asked him for his motivation.