Update 10.3 to 11.0 Lost Swap Partition

Hello All,

After upgrading from 10.3 to 11.0 my two FreeBSD boxes were not booting anymore. Both had the same issue, which was easy to solve as long as you have direct access to the system. It prompted me at the mountroot> for a root device. In the past I used /dev/ad0s1a as / and /dev/ad0s1b as swap. Now ad0s1a was not available anymore and "?" gave me /dev/ada0s1 as the proper device. Fine. I changed the fstab and both are booting again. BUT

With 10.3 gpart show gave me:

Code:
=>      63  20971457  ada0  MBR  (10G)
        63  20971377     1  freebsd  [active]  (10G)
  20971440        80        - free -  (40k)

=>       0  20971377  ada0s1  BSD  (10G)
         0  16777216       1  freebsd-ufs  (8.0G)
  16777216   4194161       2  freebsd-swap  (2G)

After upgrading to 11.0:
Code:
=>      63  20971457  ada0  MBR  (10G)
        63  20971377     1  freebsd  [active]  (10G)
  20971440        80        - free -  (40K)

I'm pretty sure I did something wrong. But what? Any ideas where my swap partition is?

In /dev/ I only find

Code:
0 crw-r-----  1 root  operator  0x4a Dec 16 16:22 /dev/ada0
0 crw-r-----  1 root  operator  0x4b Dec 16 16:22 /dev/ada0s1

Both systems are up and running. I would like to have my swap partitions back. Any suggestions?

Best Regards
Markus
 
You should not be able to boot from ada0s1, it's a slice. A slice should contain partitions (a, b, etc). I'm not sure what happened here and how to resolve it. How did you do the upgrade?
 
Good Morning,

Yes, that's what I thought, but I have no idea why it happened. It is reproducible. I did it the same way I have done upgrades in previous years. Coming from 10.3: "freebsd-update -r 11.0-RELEASE upgrade", ... install ... reboot ... (first problem: the root partition is not available) ... only the slice is available. All partitions are gone.

Environment: VirtualBox 5.0.xx on openSUSE 13.2 (current patch level). FreeBSD on a virtual disk.

I have four FreeBSD boxes (firewalls) on this host and have upgraded two of them so far. Same behavior on both.

Best Regards
Markus
 
Ok, it's good they're virtual, that'll make it easier to test. With VirtualBox you can create a 1-on-1 clone of a virtual machine (or take a snapshot before you start). So create a clone and test the next upgrade with that. That'll give us some "playroom" without interfering with the working ones.

Were these disk images extended at some point in time? I'm thinking the partition table may have been dodgy and the upgrade just destroyed it. I've never seen it happen but we can't rule anything out.

Before running the update I would try to create a backup of the partition table: gpart backup ada0 > table_ada0_backup.txt. If things go south you should be able to use it to restore the table again; cat table_ada0_backup.txt | gpart restore -l ada0.
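
A minimal sketch of that backup step, for what it's worth. gpart backup works on one geom at a time, so the MBR table on ada0 and the BSD label inside the slice ada0s1 each need their own file. The helper name backup_tables, the GPART override and the target directory are assumptions made for this example, not part of gpart itself.

```sh
#!/bin/sh
# Sketch: save one `gpart backup` file per geom (hypothetical helper name).
# GPART can be overridden for dry runs; it defaults to the real gpart(8).
backup_tables() {
    dir=$1
    shift
    mkdir -p "$dir" || return 1
    for geom in "$@"; do
        out="$dir/table_${geom}_backup.txt"
        if "${GPART:-gpart}" backup "$geom" > "$out" 2>/dev/null; then
            echo "saved $geom"
        else
            # gpart fails when the geom carries no table it recognizes
            rm -f "$out"
            echo "no table on $geom"
        fi
    done
}

# Usage on the FreeBSD box, as root (directory name is an example):
#   backup_tables /root/tables ada0 ada0s1
```

Keeping the files somewhere other than the disk being upgraded is probably wise, since they are only useful if the upgrade goes wrong.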
 
Now ad0s1a was not available anymore and "?" gave me /dev/ada0s1 as the proper device. Fine. I changed the fstab and both are booting again.

What is the size of the current filesystem? The previous one was 8 GB; the updated one is apparently 10 GB, but I suspect you will find it is still 8 GB, with the remainder being the swap space. That would be a further hint toward understanding what happened.
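
The sizes in question can be checked from the sector counts in the gpart show output of the first post. This is only a sketch and assumes 512-byte sectors, which the thread does not state explicitly; sectors_to_mib is a made-up helper name.

```sh
#!/bin/sh
# Sketch: convert a sector count from `gpart show` into MiB,
# assuming 512-byte sectors (an assumption, not confirmed in the thread).
sectors_to_mib() {
    echo $(( $1 * 512 / 1048576 ))
}

sectors_to_mib 16777216   # freebsd-ufs partition
sectors_to_mib 4194161    # freebsd-swap partition
sectors_to_mib 20971377   # whole slice ada0s1
```

Under that assumption the three values come out as 8192, 2047 and 10239 MiB, which matches an 8 GB root plus roughly 2 GB of swap inside a 10 GB slice.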
 
Hello Gentlemen,

At the moment I'm at a customer site with no direct access to my BSD boxes. I will create a clone and update the cloned system when I return, tomorrow at the latest. Before doing so, I will back up the partition table.

Your observation is correct. The virtual disk has a size of 10GB. ~8GB / and ~2GB swap. I cloned all boxes and the partitions and slices were never touched. Actual size is still ~8GB.

Code:
root@janus:~ # df -h
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/ada0s1    7.7G    5.0G    2.2G    70%    /

Just out of curiosity I made a partition table backup on one of the not-yet-upgraded systems and on one of the upgraded ones. Both show exactly the same content:

Code:
MBR 4
1 freebsd       63 20971377   [active]

I upgraded the base system (which the clones are based on) from ... I assume 8.x, maybe 7.x ... Maybe this is the reason? It could be that the partitioning was done with fdisk in the past.

Best Regards
Markus
 
I can think of two types of problem:

a) the partition table was changed when the system was upgraded
b) the partition table didn't change and the upgraded system misreports its content.

If you cloned your boxes from one original image, they should all have the same partition table, in which case you could back up the partition table from one of the unaffected clones and restore it on one of the upgraded ones and see if that restores the swap (you will have to revert the fstab changes too).

Follow SirDice's advice on backing up and restoring the partition table; that would help to discriminate between cases a) and b).
 
I did, as you can see in my last post. There is no difference between the partition tables. I assume the problem is older and only shows up now with the update from 10.3 to 11.0. If I back up the partition table on a system that has not been updated, it's the same as on an updated system.

Mars: FreeBSD 9.3
Code:
root@mars:~ # gpart backup ada0
MBR 4
1 freebsd       63 20971377   [active]

Janus: FreeBSD 11.0
Code:
root@janus:~ # gpart backup ada0
MBR 4
1 freebsd       63 20971377   [active]

But we have a difference if I make a backup of the table of ada0s1 ..

Mars:
Code:
root@mars:~ # gpart backup ada0s1
BSD 8
1  freebsd-ufs        0 16777216
2 freebsd-swap 16777216  4194161

Janus:
Code:
root@janus:~ # gpart backup ada0s1
gpart: No such geom: ada0s1.

Mars is a 9.3 system, Janus was a 10.3 system. I still have one 10.3 system left, but no access to it at the moment. The question now is what happens if I restore the partition table from Janus to Mars. I will test that and report when I'm back in the office, because I need direct access: Mars is the VPN access gateway and if it stops, I'm offline. :)
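
For comparing backups taken on two hosts, a small sketch like the following may help. It compares two saved gpart backup files while ignoring differences in column padding. The helper names normalize_table and same_table and the file names in the usage line are invented for this example.

```sh
#!/bin/sh
# Sketch: compare two saved `gpart backup` files, ignoring column padding.
# normalize_table and same_table are hypothetical helper names.
normalize_table() {
    # Re-join the fields of every line with single spaces
    awk '{ $1 = $1; print }' "$1"
}

same_table() {
    a=$(normalize_table "$1") || return 2
    b=$(normalize_table "$2") || return 2
    [ "$a" = "$b" ]
}

# Usage, with files copied from both hosts (file names are examples):
#   same_table mars_ada0.txt janus_ada0.txt && echo identical || echo different
```

A missing file makes same_table return 2, which is how a geom without a readable table (like ada0s1 on Janus above) would show up in such a comparison.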

Best Regards
Markus
 
Do you have devices /dev/ada0s1a and /dev/ada0s1b? I had a similar problem and my root device is /dev/ada0s1a and swap device /dev/ada0s1b now.
 
Hello Daniel,

Unfortunately not.

Mars has them:

Code:
root@mars:~ # ls -l /dev/ad*
lrwxr-xr-x  1 root  wheel        4 Dec  6 21:08 /dev/ad0 -> ada0
lrwxr-xr-x  1 root  wheel        6 Dec  6 21:08 /dev/ad0s1 -> ada0s1
lrwxr-xr-x  1 root  wheel        7 Dec  6 21:08 /dev/ad0s1a -> ada0s1a
lrwxr-xr-x  1 root  wheel        7 Dec  6 21:08 /dev/ad0s1b -> ada0s1b
crw-r-----  1 root  operator  0x43 Dec  6 21:08 /dev/ada0
crw-r-----  1 root  operator  0x45 Dec  6 21:08 /dev/ada0s1
crw-r-----  1 root  operator  0x47 Dec  6 21:08 /dev/ada0s1a
crw-r-----  1 root  operator  0x49 Dec  6 21:08 /dev/ada0s1b

But Janus does not:

Code:
root@janus:~ # ls -l /dev/ad*
crw-r-----  1 root  operator  0x48 Dec 14 07:53 /dev/ada0
crw-r-----  1 root  operator  0x49 Dec 14 07:53 /dev/ada0s1

Best Regards
Markus
 
Update:

Daniel_R has a similar problem and his post ends up with the hint to use bsdlabel. Not a bad idea. Look what "Janus" shows if I execute

Code:
root@janus:~ # bsdlabel ada0s1
# /dev/ada0s1:
8 partitions:
#          size     offset    fstype   [fsize bsize bps/cpg]
  a:   16777216          0    4.2BSD        0     0     0
  b:    4194161   16777216      swap
  c:   20971377          0    unused        0     0     # "raw" part, don't edit

There are the "lost" partitions. The question now is why they were not shown after the update and why they were not imported correctly. Where is the missing link? Did I do something wrong, or is it some kind of bug?

Best Regards
Markus
 
Ok. So the only question that remains is why the BSD partitions don't show up as device nodes in /dev. Have you tried running fsck on /dev/ada0s1a?
 
That's not possible if the partition doesn't show up in the first place ;)
 
Well, at least he would get an error message. If it is just the standard "file not found" error message, it wouldn't be of interest. :)
However, the OP has a very non-standard situation, so it would be interesting to see whether fsck gave that normal error message or something more unusual. Clearly, bsdlabel is able to see the partitions...
 
I'd use the data from bsdlabel(8) and, carefully(!), recreate them using gpart(8). If done correctly, with the exact same values, things should be back to normal again.

Normal caveats regarding backups and all will apply of course.
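
As a rough sketch of that idea, the bsdlabel output can be turned into candidate gpart commands. The function below only prints them for review and never runs gpart itself. The helper name label_to_gpart is made up, and the mapping of 4.2BSD to freebsd-ufs and of partition letters to index numbers (a=1, b=2, c skipped) is an assumption based on the gpart show and bsdlabel output earlier in this thread.

```sh
#!/bin/sh
# Sketch: read bsdlabel(8) output on stdin and print (not run) gpart(8)
# commands that would recreate the same layout on the given geom.
# label_to_gpart is a hypothetical helper name.
label_to_gpart() {
    awk -v geom="$1" '
        BEGIN { printf "gpart create -s BSD %s\n", geom }
        # Partition lines look like: "  a:   16777216   0   4.2BSD ..."
        $1 ~ /^[a-h]:$/ {
            letter = substr($1, 1, 1)
            if (letter == "c") next            # raw whole-slice entry
            if ($4 == "4.2BSD")    type = "freebsd-ufs"
            else if ($4 == "swap") type = "freebsd-swap"
            else next                          # unknown type: skip, do not guess
            idx = index("abcdefgh", letter)    # a=1, b=2, d=4, ...
            printf "gpart add -t %s -i %d -b %s -s %s %s\n", type, idx, $3, $2, geom
        }'
}

# Usage (prints commands only; check every number before running anything):
#   bsdlabel ada0s1 | label_to_gpart ada0s1
```

Printing instead of executing is deliberate here: with the exact same offsets and sizes the data should stay untouched, but a single wrong value would not, so the output is meant to be compared against the bsdlabel listing by eye first.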
 
Hello Gentlemen,

As expected

Code:
root@janus:~ # fsck /dev/ada0s1a
fsck: cannot open `/dev/ada0s1a': No such file or directory

ends up in an error message. Something which isn't there cannot be checked. Unfortunately I again didn't manage to stay in my office today and run some tests. I will clone a drive this evening and recreate the partitions with gpart afterwards. Quick and easy job. However, it was very good luck that the root partition came first; had the swap partition been first, the system would not have booted from /dev/ada0s1 ...

Does anyone have an idea why this happened? I have a lot of FreeBSD firewall boxes running at different sites and I need to update them remotely.

Best Regards
Markus
 
Is the problem reproducible? In other words, can you install 10.3 on a new machine, upgrade to 11.0 and get the problem?
If so, it might be worthwhile to reproduce your actual steps here, so others can test it.
 
Is the problem reproducible?

Judging from post #3, it seems so:
I have four FreeBSD boxes (firewalls) on this host and actually upgraded two of them. Same behavior on both.

/dev/ad0s1a as / and /dev/ad0s1b as swap. Now ad0s1a was not available anymore and "?" gave me /dev/ada0s1 as the

I noticed the different notation, ad0 vs. ada0, and initially I thought it was a typo ... a confirmation would clarify, because an old notation was used previously (although that should have been ad4 ...).

As far as I can see from the bsdlabel source code, that information is read directly from the disk, which makes me think that the partition table did not in fact change.

You are using VirtualBox; a quick test would be to attach an updated disk (as a second disk) to a box that has not been updated, and see how the disk is recognized by the older system.
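
For that test, a small sketch that reports which of the expected device nodes exist might save some typing. The helper name check_nodes is invented, and the assumption that the attached disk shows up as ada1 is just that, an assumption; the real name depends on the controller and port.

```sh
#!/bin/sh
# Sketch: report which expected device nodes exist under a directory.
# check_nodes is a hypothetical helper name.
check_nodes() {
    devdir=$1
    shift
    for node in "$@"; do
        if [ -e "$devdir/$node" ]; then
            echo "present $node"
        else
            echo "missing $node"
        fi
    done
}

# Usage on the non-updated box, assuming the attached disk appears as ada1:
#   check_nodes /dev ada1 ada1s1 ada1s1a ada1s1b
```

If the older system lists the a and b nodes for the attached disk, that would point toward the on-disk label being intact.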
 
A short update. I restored a snapshot of Saturn at revision 10.3, made a clone, and then restored rev 11.0 again. Afterwards I stripped everything from the clone that isn't needed: all ports, packages and configs. Now the clone is a simple FreeBSD 10.3 box with one NIC (DHCP). I will now make a snapshot and then do the upgrade again. Status with 10.3:

Code:
root@saturn-clone:~ # gpart show
=>      63  20971457  ada0  MBR  (10G)
        63  20971377     1  freebsd  [active]  (10G)
  20971440        80        - free -  (40K)

=>       0  20971377  ada0s1  BSD  (10G)
         0  16777216       1  freebsd-ufs  (8.0G)
  16777216   4194161       2  freebsd-swap  (2.0G)

root@saturn-clone:~ # uname -a
FreeBSD saturn-clone 10.3-RELEASE-p11 FreeBSD 10.3-RELEASE-p11 #0: Mon Oct 24 18:49:24 UTC 2016     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64

root@saturn-clone:~ # mount
/dev/ad0s1a on / (ufs, local)
devfs on /dev (devfs, local, multilabel)

root@saturn-clone:~ # df -h
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/ad0s1a    7.7G    1.8G    5.3G    25%    /
devfs          1.0K    1.0K      0B   100%    /dev

root@saturn-clone:~ # swapinfo
Device          1K-blocks     Used    Avail Capacity
/dev/ada0s1b      2097080        0  2097080     0%

root@saturn-clone:~ # ls -l /dev/ad*
lrwxr-xr-x  1 root  wheel        4 Dec 21 01:08 /dev/ad0 -> ada0
lrwxr-xr-x  1 root  wheel        6 Dec 21 01:08 /dev/ad0s1 -> ada0s1
lrwxr-xr-x  1 root  wheel        7 Dec 21 01:08 /dev/ad0s1a -> ada0s1a
lrwxr-xr-x  1 root  wheel        7 Dec 21 01:08 /dev/ad0s1b -> ada0s1b
crw-r-----  1 root  operator  0x45 Dec 21 01:08 /dev/ada0
crw-r-----  1 root  operator  0x47 Dec 21 01:08 /dev/ada0s1
crw-r-----  1 root  operator  0x4a Dec 21 01:08 /dev/ada0s1a
crw-r-----  1 root  operator  0x4c Dec 21 01:08 /dev/ada0s1b

Best Regards
Markus
 
I got the same problem after upgrading from 10.4 to 11.1.
If you specify
mountroot> ufs:/dev/ada0s1
at the prompt when it fails to mount root, it will mount "/" normally, but /dev/ will only contain device nodes like these:
# ls /dev/ada0*
/dev/ada0 /dev/ada0s1

The solution is to mount the root fs with the right partition name (e.g. /dev/ada0s1a), and not to forget to make the corresponding corrections in /etc/fstab.
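
A sketch of the fstab correction, in case it helps. It rewrites legacy device names such as /dev/ad0s1a to /dev/ada0s1a and leaves everything else alone. The helper name fix_fstab_names is made up, and it assumes the unit number stays the same (ad0 becomes ada0), as the ad0 symlinks in the listings earlier in this thread suggest; that may not hold on other hardware.

```sh
#!/bin/sh
# Sketch: rewrite legacy /dev/adNsMx names at the start of fstab lines to
# /dev/adaNsMx. Assumes the unit number is unchanged (ad0 -> ada0).
# fix_fstab_names is a hypothetical helper name; it filters stdin to stdout.
fix_fstab_names() {
    sed -E 's#^/dev/ad([0-9]+s[0-9]+[a-h][[:space:]])#/dev/ada\1#'
}

# Usage: write to a new file, review it, then replace /etc/fstab by hand.
#   fix_fstab_names < /etc/fstab > /tmp/fstab.new
```

Lines that already use ada names are not matched, so running the filter twice gives the same result.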

 