Updating to 11.1-RELEASE failed on EC2

uchman

Hi!

I have a few machines running on Amazon's cloud (EC2), and I was going to update them to 11.1 today. But all of them hang in the loader after the upgrade. (If I install a new machine with 11.0 and then update it, it works just fine. But my old machines hang.)

I already noticed this problem in 11.0, and it was added to the errata of that release; see:
https://www.freebsd.org/releases/11.0R/errata.html#open-issues

But I thought it was resolved by the Errata Notice:
https://www.freebsd.org/security/advisories/FreeBSD-EN-16:18.loader.asc

Here is what I did. I was running FreeBSD 11.0-RELEASE-p7 on the server and did a normal buildworld as described in the Handbook: https://www.freebsd.org/doc/handbook/makeworld.html
(Actually, I installed both kernel and world before rebooting.)

Then the machine doesn't boot because it hangs in the loader. It looks like this (EC2 screenshot):
Screen Shot 2017-07-28 at 11.31.43.png


Then I booted from a snapshot of / taken on EC2 (/var, /usr and so on live on ZFS). That works fine, and the machine boots with the old kernel and the new world. But then of course nothing works because of the world/kernel mismatch.

To try to resolve this I tried two things. First, I copied the working /boot into my 11.1 disk image and tried to boot. This time it gets past the loader but fails to boot for reasons unknown to me (the EC2 screenshot is blank).

root@core:~ # mount /dev/xbd5a /mnt
root@core:~ # cd /mnt/
root@core:/mnt # cp -Rp boot boot.old
root@core:/mnt # cd /boot/
root@core:/boot # cp * /mnt/boot/
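
One caveat about the transcript above: cp * without -R does not descend into directories, so subdirectories of /boot (the kernel directory, for example) are skipped. A minimal sketch of a recursive copy follows. The function name copy_boot_tree is made up for illustration, and the paths in the example are assumed from the transcript.

```shell
# Hypothetical helper: copy a whole boot tree, including subdirectories,
# preserving modes and timestamps. "src/." copies the directory's contents
# (dotfiles included) instead of nesting src inside dst.
copy_boot_tree() {
    src=$1
    dst=$2
    mkdir -p "$dst" && cp -Rp "$src/." "$dst/"
}

# Example (assumed paths, matching the mount point used above):
# copy_boot_tree /boot /mnt/boot
```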


Then I tried keeping the new /boot and copying over only the old (working) loader. This also gets me past the loader step, but it does not boot (and gives a blank screenshot page).

root@core:~ # mount /dev/xbd5a /mnt
root@core:~ # cd /mnt/
root@core:/mnt # mv boot boot.notwork
root@core:/mnt # mv boot.old boot
root@core:/mnt # cp /boot/loader /mnt/boot/

It looks like this (when in the loader):
Screen Shot 2017-07-28 at 12.01.00.png


I have attached two files listing the files in /boot and their checksums before and after make installkernel && make installworld.

Right now I'm using my snapshot of / with 11.0 and have rolled /usr back with zfs rollback while trying to figure out a solution for this.

One workaround might be to just install a new machine and copy everything over from the old volume (because it seems new installs work fine). But I would really like to find out what the actual problem is.
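
A per-file checksum list of a boot directory can be produced and compared roughly as follows. This is only a sketch, not necessarily how the attached lists were generated: the helper names boot_manifest and boot_diff are made up for illustration, and it uses POSIX cksum rather than any particular hash tool.

```shell
# Hypothetical helper: print "CRC SIZE PATH" for every regular file under a
# directory, with paths relative to that directory so two trees line up.
boot_manifest() {
    dir=$1
    ( cd "$dir" && find . -type f | sort | while IFS= read -r f; do
        cksum "$f"
    done )
}

# Hypothetical helper: show manifest lines that differ between two trees.
# Lines starting with '<' come from the first tree, '>' from the second.
# Exit status follows diff: 0 if identical, 1 if there are differences.
boot_diff() {
    old=$(mktemp) && new=$(mktemp) || return 1
    boot_manifest "$1" > "$old"
    boot_manifest "$2" > "$new"
    diff "$old" "$new"
    status=$?
    rm -f "$old" "$new"
    return $status
}

# Example (assumed paths): compare the working /boot with the mounted image.
# boot_diff /boot /mnt/boot
```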

Thanks.

/Peter.
 
