Slow boot, long delay at BTX loader

I'm running a newly built 8.0-STABLE and I've installed root on ZFS according to the Wiki.
http://wiki.freebsd.org/RootOnZFS/GPTZFSBoot/Mirror

When the system boots up it seems to stall for around 70 seconds at the following screen:

Code:
BTX Loader 1.00 BTX version is 1.02
Consoles: internal video/keyboard
BIOS drive A: is disk0
BIOS drive C: is disk1
BIOS drive D: is disk2
BIOS drive E: is disk3
BIOS drive F: is disk4
BIOS drive G: is disk5

No errors appear and eventually the system boots normally from the ZFS root.
Any ideas why it's taking so long at this stage? What is actually going on at this point?

Thanks
Gianni
 
I'm also getting a long pause of 30 to 40 seconds (estimated) after the disks are listed by BTX. It happens during the -\|/- (twirly character) spinner sequence and just before the memory information is reported. The twirly characters do change every so often, but they pause 10 to 15 seconds at a time between some changes.

I'm running gptzfsboot with zpool version 14. It was installed with that pool version; it wasn't upgraded from version 13 to 14.

It is FreeBSD 8.0p2 v4 amd64 from http://mfsbsd.vx.sk/ (the zfsinstall ISO).

I'm using four 500GB SATA drives on an ICH10R in AHCI mode (not RAID mode). They're set up as two mirrors in one pool (effectively RAID10 via ZFS).
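
For reference, that pool layout is the equivalent of something like this (the pool name and GPT labels below are just placeholders, not my actual ones):

Code:
# (pool name and labels are placeholders)
# zpool create tank mirror gpt/disk0 gpt/disk1 mirror gpt/disk2 gpt/disk3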

I'm happy to provide any information you need to compare against your system, to see if we have any commonalities that might explain the source of the trouble.
 
I timed it today and the total time waiting at BTX before the BIOS memory info line is displayed is 67 seconds, so it's a little longer than I estimated.

I think my main question is the same as that of the OP: What is going on during that time?

If that boot delay is just a necessary sacrifice to boot from ZFS, I suppose I am ok with that (ZFS is worth it), but I suspect that there is something less than optimal going on during boot with more than two large disks in the boot/root ZFS pool.

I'm generally ok reading code, but I've been out of the FreeBSD world since version 4.9, so I don't really remember where to find this code. If someone could point me to the right spot on the filesystem for the source code that is running during that delay, I could muddle through it and report back.
 
Have you tried to bisect[1] the changes in /usr/src/sys/boot? I'm pretty sure it was introduced in HEAD fairly recently, about half a year ago or less, but ZFSBoot has been available for more than a year. Not sure about -STABLE, but you can try to revert r198420 on -CURRENT.

[1] You can speed up testing by using qemu with real devices, e.g.:
Code:
# echo -D > /boot.config
# qemu -nographic -drive file=/dev/ada0 -drive file=/dev/ada1
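
If you do want to try reverting it, the rough shape would be something like this (a sketch, assuming /usr/src is an svn checkout of head; paths may need adjusting):

Code:
# cd /usr/src
# svn merge -c -198420 .        # reverse-merge just that revision
# cd sys/boot
# make obj && make depend && make && make install

then re-install the boot blocks on your disks and reboot.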
 
john_doe said:
Have you tried to bisect[1] the changes in /usr/src/sys/boot?

I would if I had a version that I knew didn't have the trouble. This is the first and only root/boot on ZFS that I've done.


john_doe said:
I'm pretty sure it was introduced in HEAD fairly recently, about half a year ago or less, but ZFSBoot has been available for more than a year.

By "it", do you mean the capability to boot from zfs in general or were you referring to a specific piece of code?

I may be just a confused newbie, but aren't gptzfsboot and zfsboot two different things? I got the impression from the sparse documentation I was able to find that gptzfsboot is a revamped zfsboot designed to find your ZFS pools on a GPT partition, whereas zfsboot is for dedicated disks. Is that not correct?



john_doe said:
Not sure about -STABLE, but you can try to revert r198420 on -CURRENT.

I'm not running on -CURRENT and, unfortunately, this isn't a machine I can just tear down and reinstall.

I read through that and the only thing that stands out is "If multiple partitions exist on a disk, probe them all." So maybe what's happening is that gptzfsboot (or zfsboot??) is scanning the full contents of all partitions on all disks to find the ZFS pool.

If that's what's happening, you'd think it would be pretty quick in my case if it were working correctly. There are only two partitions per disk as shown by
# gpart show:
Code:
=>       34  976773101  ad4  GPT  (466G)
         34        128    1  freebsd-boot  (64K)
        162  976772973    2  freebsd-zfs  (466G)

=>       34  976773101  ad6  GPT  (466G)
         34        128    1  freebsd-boot  (64K)
        162  976772973    2  freebsd-zfs  (466G)

=>       34  976773101  ad8  GPT  (466G)
         34        128    1  freebsd-boot  (64K)
        162  976772973    2  freebsd-zfs  (466G)

=>       34  976773101  ad10  GPT  (466G)
         34        128     1  freebsd-boot  (64K)
        162  976772973     2  freebsd-zfs  (466G)
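
For what it's worth, if testing a patched gptzfsboot ever comes up, re-installing it into those freebsd-boot partitions should just be the usual command from the wiki guide, repeated for each disk:

Code:
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ad4
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ad6
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ad8
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ad10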

john_doe said:
[1] You can speed up testing by using qemu with real devices, e.g.:
Code:
# echo -D > /boot.config
# qemu -nographic -drive file=/dev/ada0 -drive file=/dev/ada1

I just tried qemu as you suggested (good tip!) and I'm seeing less than 15 seconds at the problem spot with 4GB of RAM, like so:
Code:
# qemu -nographic -m 4G -drive file=/dev/ad4 -drive file=/dev/ad6 -drive file=/dev/ad8 -drive file=/dev/ad10

It's even faster with the default 128MB of RAM.

Would I be correct in my assessment that the difference lies in BIOS? We're using the same physical devices and the same boot code, so the only real difference is the BIOS. I suppose the type of disk device emulated by qemu might be different (old IDE vs. AHCI SATA), though.

There are a couple of things I can try in BIOS that might have an impact. I'll try those various things and report back.
 
Just a quick note that "-D" in /boot.config will cause the system to hang on boot before BTX shows the drives if a legacy serial port is not enabled in BIOS. I enabled a legacy serial port in BIOS and then removed /boot.config (it was empty other than "-D") so I could disable the legacy serial port again.

I'm happy to report that I found a workaround that cut that problem section from 67 seconds to 13 seconds, which is much more acceptable. In the BIOS there was a sub-option under AHCI mode to use either "Intel AHCI" or "BIOS native module". I was using "Intel AHCI"; changing it to "BIOS native module" eliminated the trouble.

Interestingly, dmesg still shows it detected as an "Intel ICH10 SATA300 controller", so there don't seem to be any noticeable differences once booted.
 
You're using quite recent hardware, and there are probably no PS/2 connectors anymore.

I had similar issues: switching consoles with ALT+{1...9} was painfully slow and boot was delayed for no obvious reason, until I removed the following devices from the kernel:

Code:
device atkbd
device atkbdc
device psm

It looks like the kernel is trying to access a non-existent keyboard.

If you're using a USB keyboard/mouse, rebuild your kernel, install it, and reboot.
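
Roughly, that rebuild looks like this (a sketch, assuming amd64 and a custom config named NOKBD copied from GENERIC; use whatever name you like):

Code:
# cd /usr/src/sys/amd64/conf
# cp GENERIC NOKBD
# vi NOKBD            # delete the "device atkbd", "device atkbdc" and "device psm" lines
# cd /usr/src
# make buildkernel KERNCONF=NOKBD
# make installkernel KERNCONF=NOKBD
# shutdown -r now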

It worked for me™
 
I came across a similar delay. Booting 9.0-CURRENT or 8.1 amd64 on a GigaByte EX58 motherboard causes the BTX loader to spin on and off for 75 seconds before booting the ZFS root kernel. If I change the motherboard's SATA controller mode to IDE, the delay is only about 15 seconds with 6 drives attached. In AHCI or RAID mode, the delay returns.

Removing atkbd/atkbdc/psm from the kernel didn't make a difference for me.
 
I can confirm this behavior: enabling or disabling AHCI changes the BTX loading time.

Chipset: Intel® 3210 + ICH9R
OS: mfsBSD 8.1-RELEASE-amd64 special edition

 
I upgraded my system to 8.2-RELEASE and also created a ZFS root on a USB stick. Everything works and the computer boots at normal speed. The only problem is a long delay right after the BTX loader lists the available disks. It takes about 5 to 6 minutes before the boot process starts. While waiting, the light on the USB stick blinks from time to time. Here is the gpart show output:

Code:
=>     34  3913597  da2  GPT  (1.9G)
       34     1886       - free -  (943K)
     1920      128    1  freebsd-boot  (64K)
     2048  3911583    2  freebsd-zfs  (1.9G)

=>     34  3913597  da3  GPT  (1.9G)
       34     1886       - free -  (943K)
     1920      128    1  freebsd-boot  (64K)
     2048  3911583    2  freebsd-zfs  (1.9G)
 
Hate to dredge up an old, possibly dead horse, but did anyone ever satisfactorily figure this out?

I'm running FreeBSD 8-STABLE, amd64, booting six SCSI disks as a RAIDZ2 pool. I'll time how long it takes, but it's on the order of a minute plus. 8 GB of RAM, as well. Pretty darned slow.

-bpl
 
I got a server with an Areca 1320 recently, and the problem got worse. It's stuck at the BTX loader for more than 20 minutes.

The server is running FreeBSD 9.1-RC3.
 
belon_cfy said:
The zfsloader doesn't work :e
Server kept rebooting.

NOOO! I'm so sorry, did you manage to get the system up and running again?

Shit man, my bad. But I tried that myself without issue a while back; I don't get why it would break like that... Maybe because I was using it on a 9.0 system? You do have amd64, right? Although at that point in the boot process, what arch you have shouldn't really matter, but just to make sure, maybe you could try i386 instead?
https://pub.allbsd.org/FreeBSD-snapshots/i386-i386/10.0-HEAD-20121006-JPSNAP/stage/trees/boot/zfsloader
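
If you want to give it a go, swapping it in is just a matter of copying the file into place, roughly like this (a sketch; the backup name and the /tmp staging path are just my suggestion):

Code:
# cp /boot/zfsloader /boot/zfsloader.bak
# fetch -o /tmp/zfsloader https://pub.allbsd.org/FreeBSD-snapshots/i386-i386/10.0-HEAD-20121006-JPSNAP/stage/trees/boot/zfsloader
# cp /tmp/zfsloader /boot/zfsloader

If it misbehaves again, boot from other media and copy the .bak back.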

/Sebulon
 
Sebulon said:
NOOO! I'm so sorry, did you manage to get the system up and running again?

Shit man, my bad. But I tried that myself without issue a while back; I don't get why it would break like that... Maybe because I was using it on a 9.0 system? You do have amd64, right? Although at that point in the boot process, what arch you have shouldn't really matter, but just to make sure, maybe you could try i386 instead?
https://pub.allbsd.org/FreeBSD-snapshots/i386-i386/10.0-HEAD-20121006-JPSNAP/stage/trees/boot/zfsloader

/Sebulon
It is a test server without any data :e, no problem.

Yes, it is running FreeBSD 9.0 amd64 too. Does the zfsloader work on your FreeBSD 9?
 
belon_cfy said:
It is a test server without any data :e, no problem.

Yes, it is running FreeBSD 9.0 amd64 too. Does the zfsloader work on your FreeBSD 9?

PHEW!:)

Nah, I just created a VM with 9.0-RELEASE and tested inside of that with root on ZFS, and it worked, plus improved boot time. I imagine you could benefit even more, judging by the number of drives you have.

/Sebulon
 
Sebulon said:
PHEW!:)

Nah, I just created a VM with 9.0-RELEASE and tested inside of that with root on ZFS, and it worked, plus improved boot time. I imagine you could benefit even more, judging by the number of drives you have.

/Sebulon

Oops... my server is FreeBSD 9.1-RC3. Will it work on 9.1 too?

I will try again on version 9.0.
 
belon_cfy said:
Oops... my server is FreeBSD 9.1-RC3. Will it work on 9.1 too?

I will try again on version 9.0.

No idea, but it'd be worth a shot trying the i386 version on 9.1 first, and if that fails, trying with 9.0 instead.

/Sebulon
 
You can also try setting the loader tunable hw.memtest.tests to 0 ("[amd64, i386, pc98] A loader(8) tunable hw.memtest.tests has been added. This controls whether to perform memory testing at boot time or not. The default value is 1 (perform a memory test)." [r224516])
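
For example, in /boot/loader.conf:

Code:
hw.memtest.tests="0"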
 
Sebulon said:
No idea, but it'd be worth a shot trying the i386 version on 9.1 first, and if that fails, trying with 9.0 instead.

/Sebulon

I tried both of the boot loaders but neither of them worked. The server still kept rebooting.
 