Altering rcorder and /etc/rc to gain faster startup

OP

unitrunker

Aspiring Daemon

Reaction score: 204
Messages: 500

Running the rc scripts on large servers may take somewhat longer, especially when several jails have to be started. But again, starting a jail is I/O-bound. And it should also be noted that – unlike desktop machines – servers are not booted every day in the morning, so then it doesn’t matter that much. Just my opinion, of course.
I think this will help for servers with a large number of jails. The time savings is small for a minimal install - 2 seconds. For larger systems, it will be more.

The nice thing is that the sh(1) side of the change is quite small: about two paragraphs of rc script edits.

I rewrote the C code for rcorder(8) because I wanted unit tests. The resulting binary is about 5kb larger.

Another nice thing is backwards compatibility. You can run the new rc script with the old rcorder, or the old rc script with the new rcorder. Everything still works: a mixed setup gets no speed improvement, but it boots.

I'm curious how much sooner the login screen appears for mate or slim.

About netwait ... other rc scripts that don't require it can continue to run.
 

mjollnir

Daemon

Reaction score: 831
Messages: 1,267

-j is the number of tasks, not number of threads or cores. These tasks are largely I/O bound. On an ancient Core 2 Duo I found 99 worked better than 30.
This surprises me. It looks like the effect that, on a spinning disk, the more requests sit in the request queue, the more likely it is that some requested data is about to pass under the head. So then, the #concurrency should not be derived from #threads, but from the I/O capabilities of the system? Did you test on spinning disks or SSD? At least on UFS, inserting an I/O scheduler should give a reasonable speedup. Still, I find it more natural to derive #concurrency from the threading capabilities of the system.
-j defaults to 1 (please see the wiki page above). Enabling concurrency is opt-in.
I mean: without -j, the default is 1. With -j but no argument, use the default as above; otherwise take the argument the user/admin gives to -j. Then cap the result at a reasonable maximum.
I had not considered more than 99 tasks so that is a very good point.
Yes, very large systems might as well have enormous I/O beyond our wildest imagination. They might be able to start more than 100 jails in parallel.

olli@ as I already noted in another thread (nosh(8)?), there have been several PoCs to parallelize init/rc with the help of make(1) (on some FreeBSD projects web page?). Would be no problem to move make(1) to /bin & symlink in /usr/bin if it's needed on boot.
 

olli@

Aspiring Daemon
Developer

Reaction score: 848
Messages: 801

olli@ as I already noted in another thread (nosh(8)?), there have been several PoCs to parallelize init/rc with the help of make(1) (on some FreeBSD projects web page?). Would be no problem to move make(1) to /bin & symlink in /usr/bin if it's needed on boot.
I’m pretty sure that make(1) is not going to be moved to the root file system. The binary is almost 1 MB (on my stable/12 amd64 machine). The whole size of the /bin directory is just 2.2 MB, so moving make(1) there would increase it by almost 50 %. rcorder(8) is just 13 KB and would need only minor changes. This is also more efficient.
 

olli@

Aspiring Daemon
Developer

Reaction score: 848
Messages: 801

I think this will help for servers with a large number of jails. The time savings is small for a minimal install - 2 seconds. For larger systems, it will be more.
No, there will be no extra time saving for jails, because the jails framework already has a jail_parallel_start setting, so they are already started in parallel. See the appropriate section in the rc.conf(5) manual page.
About netwait ... other rc scripts that don't require it can continue to run.
Well, the stuff that comes afterwards requires a working network. That’s the purpose of netwait. ;)
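For reference, the jail_parallel_start setting mentioned above is an rc.conf(5) variable. A minimal sketch of an /etc/rc.conf fragment, assuming jails are otherwise already configured on the machine:

```sh
# /etc/rc.conf fragment (sketch; assumes the jails themselves are already defined)
jail_enable="YES"          # start configured jails at boot
jail_parallel_start="YES"  # start them in the background, in parallel
```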
 
OP

unitrunker

Aspiring Daemon

Reaction score: 204
Messages: 500

No, there will be no extra time saving for jails, because the jails framework already has a jail_parallel_start setting, so they are already started in parallel. See the appropriate section in the rc.conf(5) manual page.
I see I've been missing out as I never knew about this.

Well, the stuff that comes afterwards requires a working network. That’s the purpose of netwait. ;)
I should clarify. Only RC scripts that require netwait (directly or indirectly) need to actually wait. Others can go concurrently. The netwait task can begin sooner while allowing some tasks to continue.

2 scripts require netwait. 8 scripts require NETWORKING. This won't stop the rest of the RC ecosystem.
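Counts like these can be approximated with a few lines of sh. This is only a sketch: the helper name count_requirers is made up for illustration, and it only looks at direct "# REQUIRE:" lines, so scripts that depend on netwait indirectly are not counted.

```shell
#!/bin/sh
# Count the rc scripts in a directory whose "# REQUIRE:" lines name a
# given dependency directly. count_requirers is a hypothetical helper.
# usage: count_requirers NAME DIR
count_requirers() {
    dep=$1
    dir=$2
    count=0
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # Match NAME as a whole word anywhere on a REQUIRE line.
        if grep -Eq "^#[[:space:]]*REQUIRE:(.*[[:space:]])?${dep}([[:space:]]|\$)" "$f"; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Example (on a FreeBSD system):
#   count_requirers netwait /etc/rc.d
#   count_requirers NETWORKING /etc/rc.d
```

The numbers this prints will vary with the installed release and ports.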
 

olli@

Aspiring Daemon
Developer

Reaction score: 848
Messages: 801

I should clarify. Only RC scripts that require netwait (directly or indirectly) need to actually wait. Others can go concurrently. The netwait task can begin sooner while allowing some tasks to continue.

2 scripts require netwait. 8 scripts require NETWORKING. This won't stop the rest of the RC ecosystem.
Practically everything that comes afterwards requires networking, directly or indirectly. When the netwait script is running, everything else has to wait. Keep in mind that things like NFS or NIS might be required to proceed, for example. Of course, if the machine is set up in a way that no network is required for anything, then there wouldn’t be such a problem (and the user wouldn’t have netwait enabled anyway).
 
OP

unitrunker

Aspiring Daemon

Reaction score: 204
Messages: 500

Practically everything that comes afterwards requires networking, directly or indirectly.
I get that. The point is netwait can start waiting much sooner which means the tasks that follow also run sooner.

Anyone can see the ordering:

$ rcorder /etc/rc.d/* /usr/local/etc/rc.d/*

Edit: On a simple install, there are 19 tasks ahead of FILESYSTEMS, 59 ahead of NETWORKING, 109 ahead of DAEMON, and 131 ahead of LOGIN.
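To reproduce counts like these, the rcorder output can be piped through a small filter. A minimal sketch, where tasks_ahead_of is a made-up helper name and the input is assumed to be one script path per line, as rcorder prints it:

```shell
#!/bin/sh
# Print how many entries come before a given milestone in rcorder-style
# output read from stdin. The milestone is matched against the last path
# component. tasks_ahead_of is a hypothetical helper.
# usage: rcorder ... | tasks_ahead_of NAME
tasks_ahead_of() {
    milestone=$1
    awk -F/ -v m="$milestone" '
        $NF == m { print NR - 1; found = 1; exit }
        END { if (!found) exit 1 }
    '
}

# Example (on a FreeBSD system):
#   rcorder /etc/rc.d/* /usr/local/etc/rc.d/* 2>/dev/null | tasks_ahead_of NETWORKING
```

The function returns a non-zero status if the milestone does not appear in the input.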

When stuck in a traffic jam, it's all the cars in front of you that slow your progress.

olli@ and mjollnir - thank you both for your feedback. Your input has been very helpful.
 

mark_j

Aspiring Daemon

Reaction score: 346
Messages: 659

Well, as I wrote above: All this has been tried & evaluated multiple times by others. Issues to solve:
  • To ensure a service is up, it's not sufficient that the service is running (PID exists): it might need some additional time to do some housework, or a master task/thread might fire up worker tasks/threads. The shell's wait only waits for the PID of the service's rc script to exit, which does not in all cases mean that the service is up. To solve that reliably, a standardized communication channel between a service & its rc script has to be introduced. The service needs to tell the service manager its status ("I'm up & ready to serve"). MAYBE this is solved by convention: the service rc script does not return before the service is ready (up). I just don't know.
I'm curious to read about your experience.
This is heading into the domain of launchd (and therefore systemd) where IPC is used to co-ordinate process startup.
That's new init system territory.
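Short of IPC, the readiness problem quoted above is often worked around by polling. A minimal sketch, assuming the service offers some check command that succeeds only once it is ready; the helper name wait_until_ready and the socket path in the example are hypothetical:

```shell
#!/bin/sh
# Poll a readiness check until it succeeds or a timeout (in seconds)
# expires. wait_until_ready is a hypothetical helper, not part of rc.subr.
# usage: wait_until_ready TIMEOUT COMMAND [ARGS...]
wait_until_ready() {
    timeout=$1
    shift
    elapsed=0
    until "$@"; do
        # Give up once the timeout has been reached.
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Example (hypothetical socket path):
#   wait_until_ready 30 test -S /var/run/example.sock
```

Polling only tells the caller that the check passed; it is not the standardized status channel the quoted post asks for.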
 

mark_j

Aspiring Daemon

Reaction score: 346
Messages: 659

Practically everything that comes afterwards requires networking, directly or indirectly. When the netwait script is running, everything else has to wait. Keep in mind that things like NFS or NIS might be required to proceed, for example. Of course, if the machine is set up in a way that no network is required for anything, then there wouldn’t be such a problem (and the user wouldn’t have netwait enabled anyway).
Could there be a case to create a memory disk and gobble up as many scripts as possible during netwait? This might give you a second or two.
I think the slowness is down to interpreted sh at least in part; this is why launchd dispensed with it.
 

olli@

Aspiring Daemon
Developer

Reaction score: 848
Messages: 801

Could there be a case to create a memory disk and gobble up as many scripts as possible during netwait? This might give you a second or two.
I think the slowness is down to interpreted sh at least in part; this is why launchd dispensed with it.
No, the overhead of parsing the shell scripts is negligible:
Code:
$ cd /etc/rc.d
$ /usr/bin/time sh -c 'echo * | xargs -n1 sh -nx'
        0.18 real         0.10 user         0.07 sys
And even if it saved a second or two, that would be rather meaningless and not worth the effort at all.

When I booted up my workstation today, I noted the time stamps as follows, relative to switching it on:
Code:
T+00:00   power button pressed
T+00:47   rc.d begins
T+00:53   rc.d finished
T+00:57   xdm login window appears
So, altogether it took almost a minute from switching the machine on to being able to log in.
Only 6 seconds of that is caused by the rc(8) framework.

Spending hours (or even days) trying to fix a “problem” that doesn’t really exist appears to be a terrible waste of time that could be spent on useful things instead.
 

mjollnir

Daemon

Reaction score: 831
Messages: 1,267

[...] I think the slowness is down to interpreted sh at least in part; this is why launchd dispensed with it.
The statically linked /rescue/sh starts up faster than the dynamic default one. When it's fired up often, these differences add up to a significant time saving. IIRC this trick is/can be used in the build(7) system. In make.conf(5): MAKE_SHELL?=/rescue/sh. Would be a good idea to do that in rc(8), too.
 

olli@

Aspiring Daemon
Developer

Reaction score: 848
Messages: 801

The statically linked /rescue/sh starts up faster than the dynamic default one. When it's fired up often, these differences add up to a significant time saving. IIRC this trick is/can be used in the build(7) system. In make.conf(5): MAKE_SHELL?=/rescue/sh. Would be a good idea to do that in rc(8), too.
No, it would not.
I invite you to do a little more research before making such suggestions, or even try it out in practice. It’s not difficult to replace /bin/sh with /rescue/sh if you’re curious to give it a try.

The difference in startup time between /bin/sh and /rescue/sh is rather small, and it only occurs when they are explicitly invoked for execution via the execve(2) system call, which happens when you enter the command at the shell prompt, for example. In this case, static binaries start up a little faster because the runtime linker has less work to do.

But the rc(8) framework does not do that. It opens a subshell (using the fork(2) system call) and then sources the rc script within that subshell. The runtime linker is not involved at all, so it does not make a difference whether the binary is static or dynamic.

By the way, the build system doesn’t use /rescue/sh, as far as I can see. The only purpose of /rescue is to be used for disaster recovery, when the root file system got damaged (so it saves you from having to use a bootable USB stick for recovery). Also, it was once meant to be used on very small, space-limited embedded systems, but that purpose isn’t meaningful anymore today. It should also be noted that /rescue is completely optional and not required for normal operation; it can be enabled or disabled via /etc/src.conf.
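The fork-and-source pattern described above can be illustrated in a few lines. This is a simplified sketch, not the actual rc.subr code, and run_sourced is a made-up name:

```shell
#!/bin/sh
# Source a script inside a forked subshell. No new sh binary is exec'd,
# so the runtime linker is not involved, and nothing the script sets
# leaks back into the caller. run_sourced is a hypothetical helper.
# usage: run_sourced /path/to/script   (path should contain a slash)
run_sourced() {
    ( . "$1" )
}

# For contrast, this form goes through execve(2) and starts a fresh
# shell process image every time:
#   sh /path/to/script
```

Because the subshell is only a fork of the already running shell, whether the sh binary is static or dynamic makes no difference at this step.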
 

leebrown66

Well-Known Member

Reaction score: 152
Messages: 429

So, altogether it took almost a minute from switching the machine on to being able to log in.
<rant on>
The last time I booted one of our Linux servers, which uses systemd, I got a login prompt almost immediately after the kernel was loaded. Impressively fast, to be honest.

However, I then had to wait several minutes before networking was available (lots of VLANs with CARP, starting services like DHCP, etc.), so I sat there and literally twiddled my thumbs.

Not being so familiar with the new Linux stuff, I did a tail -f /var/log/messages so at least I would be entertained by the startup information, but nothing appeared until the entire boot sequence was completed; then many thousands of lines were appended at once.

So what's the point of being able to log in when the system is still booting? I just don't understand.
<rant off>

I'm still waiting for the box to die so I can replace it with FreeBSD.
 
OP

unitrunker

Aspiring Daemon

Reaction score: 204
Messages: 500

Here's an update. I fixed a few bugs which invalidated my earlier test runs. I now have better numbers, both with and without concurrency enabled. Startup time is 8 seconds (in the area managed by /etc/rc). Increasing concurrency did not change that. 8 seconds is pretty quick, so that's fine.

Concurrency is controlled by the rc.conf(5) variable 'rc_task_limit'.

I started adding five second "placebo" services to /usr/local/etc/rc.d/ and enabling them in /etc/rc.conf(5).
With one service enabled - startup time was 13 seconds - with or without concurrency.

With two services enabled - startup time was 18 seconds for rc_task_limit=1 and 13 seconds for rc_task_limit=15.

With three services enabled - startup time was 23 seconds for rc_task_limit=1 and 13 seconds for rc_task_limit=15.

Setting rc_task_limit higher than 15 saw no improvement.

There's definitely some savings there but only if you have lots of extra services.
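These measurements fit a simple model: 8 seconds of base startup plus 5 seconds per "wave" of placebo services, where a wave holds at most rc_task_limit services. A sketch of that arithmetic follows. It assumes the placebo services are independent of each other and that nothing else competes for the task slots; expected_startup is a made-up helper name.

```shell
#!/bin/sh
# Estimate startup time in seconds under the simple "waves" model.
# Assumes independent services of equal duration (an assumption, not
# something the rc framework guarantees). expected_startup is hypothetical.
# usage: expected_startup BASE DELAY SERVICES LIMIT
expected_startup() {
    base=$1
    delay=$2
    services=$3
    limit=$4
    # Number of waves needed: ceiling of SERVICES / LIMIT.
    waves=$(( (services + limit - 1) / limit ))
    echo $(( base + waves * delay ))
}

# Example, using the figures from the post above:
#   expected_startup 8 5 3 1     # three placebos, rc_task_limit=1
#   expected_startup 8 5 3 15    # three placebos, rc_task_limit=15
```

With the figures reported above, the model gives 23 seconds for three services at rc_task_limit=1 and 13 seconds at rc_task_limit=15, matching the measured times.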

 

mjollnir

Daemon

Reaction score: 831
Messages: 1,267

When is it foreseen that the man page will be updated?
It was updated today... Since 12.2 was already branched off 12-STABLE, I guess it will first appear in a release with 12.3-RELEASE. If you sync your 12-STABLE source tree now, build & install, you have it today...
 
OP

unitrunker

Aspiring Daemon

Reaction score: 204
Messages: 500

That does not include the changes to /etc/rc to take advantage of rcorder -p.
 