Updating to 11.1-RELEASE failed on EC2

Hi!

I have a few machines running on Amazon cloud (EC2) and I was going to update them to 11.1 today. But all of them hang in the loader after the upgrade. (If I install a new machine with 11.0 and then update it, it works just fine. But my old machines hang.)

I noticed this problem already in 11.0 and it was added to the ERRATA of that release see:
https://www.freebsd.org/releases/11.0R/errata.html#open-issues

But I thought it was resolved by the Errata Notice:
https://www.freebsd.org/security/advisories/FreeBSD-EN-16:18.loader.asc

Here is what I did. I was running FreeBSD 11.0-RELEASE-p7 on the server and did a normal build world as described in the handbook: https://www.freebsd.org/doc/handbook/makeworld.html
(Actually, I installed both kernel and world before the reboot.)

Then the machine doesn't boot because it hangs in the loader. It looks like this (EC2 screenshot):
Screen Shot 2017-07-28 at 11.31.43.png


I then booted from a snapshot of / taken on EC2 (/var, /usr and so on live on ZFS). That works fine and the machine boots with the old kernel and new world. But then of course nothing works because of the world/kernel mismatch.

To try to resolve this I did two things. First I copied the working /boot into my 11.1 disk image and tried to boot. This time it gets past the loader but fails to boot for reasons unknown to me (the EC2 screenshot is blank).

root@core:~ # mount /dev/xbd5a /mnt
root@core:~ # cd /mnt/
root@core:/mnt # cp -Rp boot boot.old
root@core:/mnt # cd /boot/
root@core:/boot # cp * /mnt/boot/


Then I tried keeping the new /boot and copying over only the old (working) loader. This also gets me past the loader step, but it does not boot (and gives a blank screenshot page).

root@core:~ # mount /dev/xbd5a /mnt
root@core:~ # cd /mnt/
root@core:/mnt # mv boot boot.notwork
root@core:/mnt # mv boot.old boot
root@core:/mnt # cp /boot/loader /mnt/boot/

It looks like this (when in the loader)
Screen Shot 2017-07-28 at 12.01.00.png


I have attached two files with lists of the files in /boot and their checksums before and after make installkernel && make installworld.
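For anyone who wants to produce a similar before/after comparison, here is a minimal sketch. It is not how the attached lists were generated: the function names list_checksums and compare_boot_dirs are made up for illustration, and it uses the POSIX cksum utility on the assumption that a plain CRC is enough to spot changed files.

```shell
#!/bin/sh

# Print "checksum size path" for every regular file under directory $1,
# sorted by path so that two listings line up.
list_checksums() {
    (cd "$1" && find . -type f -exec cksum {} + | sort -k 3)
}

# Show the listing lines that differ between directory $1 and directory $2.
# Returns the exit status of diff: 0 if identical, 1 if they differ.
compare_boot_dirs() {
    old_list=$(mktemp) && new_list=$(mktemp) || return 1
    list_checksums "$1" > "$old_list"
    list_checksums "$2" > "$new_list"
    diff "$old_list" "$new_list"
    status=$?
    rm -f "$old_list" "$new_list"
    return $status
}
```

With the layout from this post it could be called as compare_boot_dirs /boot /mnt/boot, which prints only the files whose checksum or size differs, plus files present on one side only.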

Right now I'm using my snapshot of / with 11.0 and have rolled /usr back with zfs rollback while trying to figure out a solution for this.

One workaround might be to just install a new machine and copy everything over from the old volume (because it seems new installs work fine). But I would really like to find out what the actual problem is.

Thanks.

/Peter.
 

Attachments

  • 11.1.txt
  • working11.0.txt

I just upgraded an AWS EC2 instance from FreeBSD 11.0-RELEASE-p11 to 11.1 and everything went smoothly. I suggest updating your installations first from 11.0-RELEASE-p7 to -p11, and then trying again.

I upgraded a similar instance on Google Cloud as well and this is also working without any issues.

I only saw an unusual message for a minor upgrade after the 1st reboot and the 2nd repeat of freebsd-update install, namely:
Code:
Completing this upgrade requires removing old shared object files.
Please rebuild all installed 3rd party software (e.g., programs
installed from the ports tree) and then run "/usr/sbin/freebsd-update install"
again to finish installing updates.
Usually minor upgrades don't need this, but I reinstalled all ports and packages anyway. To speed things up, I wrote a script that installs the default stuff from binary packages and builds from source only those ports with non-default options. Pay attention to the shell variable $portslist, where I defined MY ports having non-default options. This needs to be adapted to other situations, of course.
Code:
#!/bin/sh

### the list of the ports that shall be updated from sources
portslist="\
audio/lame \
devel/apr1 \
devel/subversion \
mail/postfix \
mail/roundcube"

### fetching updates of the FreeBSD ports tree
/usr/bin/printf "\nFetching updates of the FreeBSD ports tree...\n"
/usr/sbin/portsnap fetch update
/usr/sbin/pkg version -v
/usr/sbin/pkg updating -d `date -v-2w +%Y%m%d`

### ask and in case of y|Y run the updating processes
/usr/bin/printf "\nDo you want to continue (y/n)? "
save_stty_state=$(stty -g); stty raw -echo; answer=$(head -c 1); stty $save_stty_state
if echo "$answer" | grep -iq "^y" ; then
   /usr/bin/printf "\n\n"
   /usr/sbin/pkg update

   portmake="$portslist"

   pkgupgrd=""
   pkgslist=`/usr/sbin/pkg version -o | /usr/bin/cut -f1 -w`
   for pkg in $pkgslist ; do
      for port in $portslist ; do
         if [ "$port" = "$pkg" ] ; then
            continue 2
         fi
      done

      pkgupgrd="$pkgupgrd $pkg"
   done

   /usr/bin/printf "\nUpdating ports...\n"
   if [ "$portmake" != "" ] ; then
      cwd=$PWD

      for port in $portmake ; do
         cd /usr/ports/$port
         /usr/bin/make deinstall clean
         /usr/bin/make install clean
      done

      cd "$cwd"
   else
      echo "All installed ports are up-to-date."
   fi

   /usr/bin/printf "\nUpdating binary packages...\n"
   if [ "$pkgupgrd" != "" ] ; then
      /usr/sbin/pkg upgrade -fU $pkgupgrd
   else
      echo "All installed packages are up-to-date."
   fi

   /usr/bin/printf "\nCleaning up...\n"
   /usr/sbin/pkg clean -y

else

   /usr/bin/printf "\n\n"

fi
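The nested loop that decides which origins go to pkg upgrade can also be written as a small filter. This is only a sketch: the function name filter_binary_origins is made up here, and it assumes the source-built ports are passed as one space-separated string, which is what $portslist expands to.

```shell
#!/bin/sh

# Read lines of the form "origin [status]" on stdin and print every origin
# that is NOT in the space-separated list of source-built ports given as $1.
filter_binary_origins() {
    source_ports=$1
    while read -r origin _rest; do
        [ -n "$origin" ] || continue       # ignore empty lines
        case " $source_ports " in
            *" $origin "*) ;;              # built from source, skip it
            *) printf '%s\n' "$origin" ;;  # upgrade from a binary package
        esac
    done
}
```

It would be fed with the same input as the loop above, for example: /usr/sbin/pkg version -o | filter_binary_origins "$portslist"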
BTW, for me the highlight of 11.1 is that my L2TP/IPsec service now works out of the box. No need anymore for patching and customizing a kernel for IPsec NAT-T. Even connections from Windows clients behind a NAT work without any hiccup. So, I am already quite satisfied with this release.
 