Altering rcorder and /etc/rc to gain faster startup

elephant · Aug 20, 2020

olli@ said:
Running the rc scripts on large servers may take somewhat longer, especially when several jails have to be started. But again, starting a jail is I/O-bound. And it should also be noted that – unlike desktop machines – servers are not booted every day in the morning, so then it doesn’t matter that much. Just my opinion, of course.

I think this will help for servers with a large number of jails. The time savings is small for a minimal install - 2 seconds. For larger systems, it will be more.

The nice thing is the change is actually quite small for sh(1) changes - about two paragraphs of RC script edits.

I rewrote the C code for rcorder(8) because I wanted unit tests. The resulting binary is about 5kb larger.

Another nice thing is backwards compatibility. You can run the new rc script with the old rcorder, or the old rc script with the new rcorder. Everything still works: there is no speed improvement in the mixed case, but you can boot.

I'm curious how much sooner the login screen appears with MATE or SLiM.

About netwait ... other rc scripts that don't require it can continue to run.

Mjölnir · Aug 20, 2020

unitrunker said:
-j is the number of tasks, not number of threads or cores. These tasks are largely I/O bound. On an ancient Core 2 Duo I found 99 worked better than 30.

This surprises me. It looks like an effect of spinning disks: the more data you demand (in the request queue), the more likely it is that some of it comes under the moving head. So then, the #concurrency should not be derived from #threads, but from the I/O capabilities of the system? Did you test on spinning disks or SSD? At least on UFS, inserting an I/O scheduler should give a reasonable speedup. Still, I find it more natural to derive #concurrency from the threading capabilities of the system.

-j defaults to 1 (please see the wiki page above). Enabling concurrency is opt-in.

I mean: without -j, the default is 1. With -j but no argument, use the default described above; otherwise take the argument the user/admin gives to -j. In both cases, cap the result at a reasonable maximum.
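
The defaulting rule could be sketched roughly like this. The function name resolve_task_limit, the "absent"/"bare" markers and the cap passed as the third argument are all made up for illustration; nothing here is taken from the actual patch.

```sh
# Hypothetical helper: resolve the task count for the proposed -j option.
#   $1 = "absent" if -j was not given, "bare" if -j was given without a
#        number, otherwise the number the admin passed to -j
#   $2 = number of CPUs/threads (e.g. from `sysctl -n hw.ncpu` on FreeBSD)
#   $3 = upper bound (an assumed value; pick whatever seems reasonable)
resolve_task_limit() {
    arg=$1 ncpu=$2 max=$3
    case $arg in
        absent) echo 1; return ;;   # concurrency stays opt-in
        bare)   n=$ncpu ;;          # derive the default from the hardware
        *)      n=$arg ;;           # explicit request from the admin
    esac
    [ "$n" -lt 1 ] && n=1           # never go below one task
    [ "$n" -gt "$max" ] && n=$max   # clamp to the maximum
    echo "$n"
}
```

For example, `resolve_task_limit bare "$(sysctl -n hw.ncpu)" 99` would use the CPU count, capped at 99.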

I had not considered more than 99 tasks so that is a very good point.

Yes, very large systems might as well have enormous I/O beyond our wildest imagination. They might be able to start more than 100 jails in parallel.

olli@ as I already noted in another thread (nosh(8)?), there have been several PoCs to parallelize init/rc with the help of make(1) (on some FreeBSD projects web page?). Would be no problem to move make(1) to /bin & symlink in /usr/bin if it's needed on boot.

olli@ · Aug 20, 2020

mjollnir said:
olli@ as I already noted in another thread (nosh(8)?), there have been several PoCs to parallelize init/rc with the help of make(1) (on some FreeBSD projects web page?). Would be no problem to move make(1) to /bin & symlink in /usr/bin if it's needed on boot.

I’m pretty sure that make(1) is not going to be moved to the root file system. The binary is almost 1 MB (on my stable/12 amd64 machine). The whole size of the /bin directory is just 2.2 MB, so moving make(1) there would increase it by almost 50 %. rcorder(8) is just 13 KB and would need only very few changes. This is also more efficient.

olli@ · Aug 20, 2020

unitrunker said:
I think this will help for servers with a large number of jails. The time savings is small for a minimal install - 2 seconds. For larger systems, it will be more.

No, there will be no extra time saving for jails, because the jails framework already has a jail_parallel_start setting, so they are already started in parallel. See the appropriate section in the rc.conf(5) manual page.
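
For reference, a minimal rc.conf(5) fragment that turns this on might look like the following. Whether jail_enable is already set, and which jails are listed, depends on the individual setup.

```sh
# /etc/rc.conf (fragment)
jail_enable="YES"           # start jails at boot
jail_parallel_start="YES"   # start the configured jails in the background, in parallel
```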

About netwait ... other rc scripts that don't require it can continue to run.

Well, the stuff that comes afterwards requires a working network. That’s the purpose of netwait.

elephant · Aug 20, 2020

olli@ said:
No, there will be no extra time saving for jails, because the jails framework already has a jail_parallel_start setting, so they are already started in parallel. See the appropriate section in the rc.conf(5) manual page.

I see I've been missing out as I never knew about this.

Well, the stuff that comes afterwards requires a working network. That’s the purpose of netwait.

I should clarify. Only RC scripts that require netwait (directly or indirectly) need to actually wait. Others can go concurrently. The netwait task can begin sooner while allowing some tasks to continue.

2 scripts require netwait. 8 scripts require NETWORKING. This won't stop the rest of the RC ecosystem.
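
A rough way to reproduce such counts is to grep the REQUIRE lines. This is only a sketch: the helper name count_requiring is made up, and it counts direct requirements only, not indirect ones.

```sh
# Count the rc scripts in directory $1 whose "# REQUIRE:" line names $2 directly.
count_requiring() {
    dir=$1 name=$2
    # -l lists each matching file once; wc -l then counts the files.
    grep -l -E "^# REQUIRE:.*[[:space:]]${name}([[:space:]]|\$)" "$dir"/* 2>/dev/null |
        wc -l | tr -d ' '
}
```

Usage would be something like `count_requiring /etc/rc.d netwait` or `count_requiring /etc/rc.d NETWORKING`.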

olli@ · Aug 20, 2020

unitrunker said:
I should clarify. Only RC scripts that require netwait (directly or indirectly) need to actually wait. Others can go concurrently. The netwait task can begin sooner while allowing some tasks to continue.

2 scripts require netwait. 8 scripts require NETWORKING. This won't stop the rest of the RC ecosystem.

Practically everything that comes afterwards requires networking, directly or indirectly. When the netwait script is running, everything else has to wait. Keep in mind that things like NFS or NIS might be required to proceed, for example. Of course, if the machine is set up in a way that no network is required for anything, then there wouldn’t be such a problem (and the user wouldn’t have netwait enabled anyway).

elephant · Aug 20, 2020

olli@ said:
Practically everything that comes afterwards requires networking, directly or indirectly.

I get that. The point is netwait can start waiting much sooner which means the tasks that follow also run sooner.

Anyone can see the ordering:

$ rcorder /etc/rc.d/* /usr/local/etc/rc.d/*

Edit: On a simple install, there are 19 tasks ahead of FILESYSTEMS, 59 ahead of NETWORKING, 109 ahead of DAEMON, and 131 ahead of LOGIN.
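
Counts like these can be obtained by piping the rcorder output through a small filter. The helper name tasks_ahead_of is made up for this sketch; it assumes one script path per line, which is what plain rcorder prints.

```sh
# Read an rcorder-style listing (one script path per line) on stdin and
# print how many entries come before the script whose basename is $1.
tasks_ahead_of() {
    awk -F/ -v target="$1" '$NF == target { print NR - 1; exit }'
}
```

For example: `rcorder /etc/rc.d/* /usr/local/etc/rc.d/* | tasks_ahead_of NETWORKING`.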

When stuck in a traffic jam, it's all the cars in front of you that slow your progress.

olli@ and mjollnir - thank you both for your feedback. Your input has been very helpful.

mark_j · Aug 20, 2020

mjollnir said:
Well, as I wrote above: All this has been tried & evaluated multiple times by others. Issues to solve:

To ensure a service is up, it's not sufficient that the service is running (PID exists): it might need some additional time to do some housework, or a master task/thread might fire up worker tasks/threads. The shell's wait only waits for the PID of the service's rc script to exit. That does not in all cases mean the service is up. To solve that reliably, a standardized communication between a service & its rc script has to be introduced. The service needs to tell the service manager its status ("I'm up & ready to serve"). MAYBE this is solved by convention: the service rc script does not return before the service is ready (up). I just don't know.

I'm curious to read about your experience.

This is heading into the domain of launchd (and therefore systemd) where IPC is used to co-ordinate process startup.
That's new init system territory.
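
Short of real IPC, the gap between "the PID exists" and "the service is ready" is usually papered over by polling. A minimal sketch, assuming the service signals readiness by creating a marker file; fake_service and wait_ready are made-up names, and real daemons would need their own readiness check (a socket, a PID file, a status command).

```sh
# Hypothetical stand-in for a daemon: the process exists at once, but it
# only becomes "ready" (creates the marker file $1) after some housekeeping.
fake_service() { sleep 1; : > "$1"; }

# Poll once per second until the readiness marker $1 exists, giving up
# after $2 attempts. Returns 0 if the marker is there, non-zero otherwise.
wait_ready() {
    marker=$1 tries=$2
    while [ "$tries" -gt 0 ]; do
        [ -e "$marker" ] && return 0
        sleep 1
        tries=$((tries - 1))
    done
    [ -e "$marker" ]
}
```

A caller would start the service in the background and then run `wait_ready /path/to/marker 10` before starting anything that depends on it.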

mark_j · Aug 20, 2020

olli@ said:
Practically everything that comes afterwards requires networking, directly or indirectly. When the netwait script is running, everything else has to wait. Keep in mind that things like NFS or NIS might be required to proceed, for example. Of course, if the machine is set up in a way that no network is required for anything, then there wouldn’t be such a problem (and the user wouldn’t have netwait enabled anyway).

Could there be a case to create a memory disk and gobble up as many scripts as possible during netwait? This might give you a second or two.
I think the slowness is down to interpreted sh at least in part; this is why launchd dispensed with it.

elephant · Aug 21, 2020

I found these earlier attempts.

GitHub - buganini/rcexecr: Parallel rc.d scripts executer for FreeBSD

github.com

⚙ D3715 Add "rcorder -p".

reviews.freebsd.org

olli@ · Aug 21, 2020

mark_j said:
Could there be a case to create a memory disk and gobble up as many scripts as possible during netwait? This might give you a second or two.
I think the slowness is down to interpreted sh at least in part; this is why launchd dispensed with it.

No, the overhead of parsing the shell scripts is negligible:

Code:

$ cd /etc/rc.d
$ /usr/bin/time sh -c 'echo * | xargs -n1 sh -nx'
        0.18 real         0.10 user         0.07 sys

And even if it saved a second or two, that would be rather meaningless and not worth the effort at all.

When I booted up my workstation today, I noted the time stamps as follows, relative to switching it on:

Code:

T+00:00   power button pressed
T+00:47   rc.d begins
T+00:53   rc.d finished
T+00:57   xdm login window appears

So, all together it took almost a minute from switching the machine on to being able to log in.
Only 6 seconds of that is caused by the rc(8) framework.

Spending hours (or even days) trying to fix a “problem” that doesn’t really exist appears to be a terrible waste of time that could be spent on useful things instead.

Mjölnir · Aug 21, 2020

mark_j said:
[...] I think the slowness is down to interpreted sh at least in part; this is why launchd dispensed with it.

The statically linked /rescue/sh starts up faster than the dynamic default one. When it's fired up often, these differences add up to a significant time saving. IIRC this trick is/can be used in the build(7) system. In make.conf(5): MAKE_SHELL?=/rescue/sh. Would be a good idea to do that in rc(8), too.

olli@ · Aug 21, 2020

mjollnir said:
The statically linked /rescue/sh starts up faster than the dynamic default one. When it's fired up often, these differences add up to a significant time saving. IIRC this trick is/can be used in the build(7) system. In make.conf(5): MAKE_SHELL?=/rescue/sh. Would be a good idea to do that in rc(8), too.

No, it would not.
I invite you to do a little more research before making such suggestions, or even try it out in practice. It’s not difficult to replace /bin/sh with /rescue/sh if you’re curious to give it a try.

The difference in startup time between /bin/sh and /rescue/sh is rather small, and it only occurs when they are explicitly invoked for execution via the execve(2) system call, which happens when you enter the command at the shell prompt, for example. In this case, static binaries start up a little faster because the runtime linker has less work to do.

But the rc(8) framework does not do that. It opens a subshell (using the fork(2) system call) and then sources the rc script within that subshell. The runtime linker is not involved at all, so it does not make a difference whether the binary is static or dynamic.
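
A simplified illustration of the difference, not the actual rc.subr code. The function names run_sourced and run_execed are made up for this example.

```sh
# Fork a subshell and source the script inside it. No new sh binary is
# executed, so the runtime linker is not involved.
run_sourced() { ( . "$1" ); }

# Start a fresh sh process for the script. This goes through execve(2),
# which is where a static binary could start slightly faster.
run_execed() { sh "$1"; }
```

Because of the parentheses, variables set by the sourced script stay in the subshell and do not leak into the caller.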

By the way, the build system doesn’t use /rescue/sh, as far as I can see. The only purpose of /rescue is to be used for disaster recovery, when the root file system got damaged (so it saves you from having to use a bootable USB stick for recovery). Also, it was once meant to be used on very small, space-limited embedded systems, but that purpose isn’t meaningful anymore today. It should also be noted that /rescue is completely optional and not required for normal operation; it can be enabled or disabled via /etc/src.conf.

leebrown66 · Aug 21, 2020

olli@ said:
So, all together it took almost a minute from switching the machine on to being able to log in.

<rant on>
The last time I booted one of our Linux servers, which uses systemd, I got a login prompt almost immediately after the kernel was loaded. Impressively fast, to be honest.

However, I then had to wait several minutes before networking was available (lots of VLANs with CARP, starting services like DHCP, etc.), so I sat there and literally twiddled my thumbs.

Not being so familiar with the new Linux stuff, I did a tail -f /var/log/messages so that at least I would be entertained by the startup information, but nothing appeared until the entire boot sequence was completed; then many thousands of lines were appended at once.

So what's the point of being able to log in when the system is still booting? I just don't understand.
<rant off>

I'm still waiting for the box to die so I can replace it with FreeBSD.

elephant · Aug 22, 2020

Here's an update. I fixed a few bugs which invalidated my earlier test runs. I now have better numbers, both with and without concurrency enabled. Startup time is 8 seconds (in the area managed by /etc/rc). Increasing concurrency did not change that. 8 seconds is pretty quick so that's fine.

Concurrency is controlled by the rc.conf(5) variable 'rc_task_limit'.

I started adding five-second "placebo" services to /usr/local/etc/rc.d/ and enabling them in /etc/rc.conf.

With one service enabled - startup time was 13 seconds - with or without concurrency.

With two services enabled - startup time was 18 seconds for rc_task_limit=1 and 13 seconds for rc_task_limit=15.

With three services enabled - startup time was 23 seconds for rc_task_limit=1 and 13 seconds for rc_task_limit=15.

Setting rc_task_limit higher than 15 saw no improvement.

There's definitely some savings there but only if you have lots of extra services.
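
These numbers fit a very simple model: the base time plus one five-second "wave" per batch of placebo services that can run at once. This is only a back-of-the-envelope sketch; it assumes the placebo services are independent of each other and that nothing else limits them. The function name estimate_startup is made up.

```sh
# Estimate startup time in seconds.
#   $1 = base time without extra services (8 in the runs above)
#   $2 = duration of one placebo service (5)
#   $3 = number of placebo services enabled
#   $4 = rc_task_limit
estimate_startup() {
    base=$1 delay=$2 n=$3 limit=$4
    waves=$(( (n + limit - 1) / limit ))   # ceil(n / limit)
    echo $(( base + delay * waves ))
}
```

With three services, `estimate_startup 8 5 3 1` gives 23 and `estimate_startup 8 5 3 15` gives 13, matching the measurements.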

unitrunker/rcorder - FreeBSD Wiki

wiki.freebsd.org

Mjölnir · Sep 15, 2020

TWIMC: from today's FreeBSD's <svn-src-stable@freebsd.org> Digest: MFC r365449: Add a few features to rcorder(8):
o Add -p option that enables grouping items that can be processed in parallel. [...]
Thx!

Alain De Vos · Sep 15, 2020

When is the man page expected to be updated?

Mjölnir · Sep 15, 2020

Alain De Vos said:
When is the man page expected to be updated?

It was updated today... Since 12.2 was already branched off 12-STABLE, it will be in a -RELEASE in 12.3-REL I guess. If you sync your 12-STABLE source tree now, build & install, you have it today...

elephant · Sep 15, 2020

That does not include the changes to /etc/rc to take advantage of rcorder -p.
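
For illustration only, here is one conceivable way a script could consume grouped output. It assumes rcorder -p prints each group of files that may run simultaneously on one line, separated by whitespace. The function name run_groups and the runner argument are made up; this is not the change to /etc/rc discussed in this thread.

```sh
# Read groups on stdin (one group per line, items separated by whitespace).
# Start every item of a group in the background, then wait for the whole
# group to finish before moving on to the next line.
#   $1 = name of the command or function used to run a single item
run_groups() {
    runner=$1
    while read -r group; do
        for item in $group; do   # unquoted on purpose: split into items
            "$runner" "$item" &
        done
        wait                     # barrier between groups
    done
}
```

Something like `rcorder -p /etc/rc.d/* | run_groups start_one` would then run each line's scripts concurrently, with start_one being whatever wrapper starts a single script.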