ZFS corrupt GPT table

Hello,

We have just finished maintenance on our Xen server: some cleaning, a RAM upgrade, and a BIOS upgrade. The OS was not changed: 13.1-RELEASE-p9 (I ran neither freebsd-update nor pkg upgrade).
After the RAM upgrade and the chassis cleaning, dmesg showed:

Dec 29 10:31:45 dom0 kernel: GEOM: zvol/zroot/guest_tms_data: the secondary GPT table is corrupt or invalid.
Dec 29 10:31:45 dom0 kernel: GEOM: zvol/zroot/guest_tms_data: using the primary only -- recovery suggested.
Dec 29 10:31:45 dom0 kernel: GEOM: ufsid/62b43bf75b235530: the secondary GPT table is corrupt or invalid.
Dec 29 10:31:45 dom0 kernel: GEOM: ufsid/62b43bf75b235530: using the primary only -- recovery suggested.

I use a ZFS mirror with root on ZFS:
Code:
root@dom0:~ # gpart show
=>       40  937703008  ada0  GPT  (447G)
         40       1024     1  freebsd-boot  (512K)
       1064        984        - free -  (492K)
       2048    4194304     2  freebsd-swap  (2.0G)
    4196352  933505024     3  freebsd-zfs  (445G)
  937701376       1672        - free -  (836K)

=>       40  937703008  ada1  GPT  (447G)
         40       1024     1  freebsd-boot  (512K)
       1064        984        - free -  (492K)
       2048    4194304     2  freebsd-swap  (2.0G)
    4196352  933505024     3  freebsd-zfs  (445G)
  937701376       1672        - free -  (836K)

root@dom0:~ # zpool status
  pool: zroot
 state: ONLINE
  scan: scrub repaired 0B in 00:02:21 with 0 errors on Fri Dec 29 12:05:51 2023
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0p3  ONLINE       0     0     0
            ada1p3  ONLINE       0     0     0

errors: No known data errors

I suspect some data on the disks is corrupted.

I found this thread about the problem: https://forums.freebsd.org/threads/gpt-table-corrupt.52102/

What I don't understand is how a GPT table can be reported on a ZFS volume: zvol/zroot/guest_tms_data is a volume in the zroot pool!
Despite the error message, everything seems OK on the system.
How can I fix this problem? Should I follow the steps in the linked thread above?
 
Was the disk expanded at some point? That might explain the corrupted table: GPT also stores a backup copy of its header and table at the end of the disk. If the disk is then extended, the end of the disk moves and the backup GPT metadata is no longer where it is expected. Just run gpart recover.
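To make the "end of the disk moves" point concrete, here is a minimal sketch that computes where the backup GPT header is expected, namely the last sector of the device. The helper name backup_gpt_lba and the 10 GiB size are made up for illustration; they are not taken from this system.

```shell
# Hypothetical helper: print the LBA where the backup GPT header is
# expected, i.e. the last sector of the device.
# $1 = media size in bytes, $2 = sector size in bytes
backup_gpt_lba() {
    echo $(( $1 / $2 - 1 ))
}

# Made-up example: a 10 GiB device with 512-byte sectors.
backup_gpt_lba 10737418240 512    # prints 20971519
```

If that device were later grown to 20 GiB, the expected location would become LBA 41943039, while the old backup header would still sit at LBA 20971519. That mismatch is what gpart recover repairs.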

The issue in the other thread is about using the whole disk for ZFS, which is not the same problem as yours.

You do need to upgrade though, 13.1 is now end-of-life and not supported anymore.
 
It was not expanded. I didn't change anything at the disk level, only the RAM and BIOS upgrades. The error message appeared before the BIOS upgrade.
"13.1 is now end-of-life": I know, and I keep my guest OSes up to date, but this is the hypervisor and upgrading it is more stressful for me, which is why I postponed it. An upgrade to 14 (a fresh install from scratch) is planned.

What would be the correct recovery commands?
Code:
gpart recover /dev/ada0
and
gpart recover /dev/ada1
I just don't want to screw it up. Thank you for your answer!
 
It is strange:

Code:
root@dom0:~ # gpart recover /dev/ada0
ada0 recovering is not needed
root@dom0:~ # gpart recover /dev/ada1
ada1 recovering is not needed
 
Oh, right. Looking more closely at the error messages, it seems the partition table on one of your ZVOLs is the corrupted one. Specifically this one: zvol/zroot/guest_tms_data. The ufsid/62b43bf75b235530 probably refers to the same volume.
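As a rough sketch of how to read the affected provider names out of the log, the filter below keeps only the names GEOM flagged with a corrupt secondary GPT. The function name corrupt_gpt_providers is hypothetical, and the pattern assumes the exact message wording shown in the log lines earlier in this thread.

```shell
# Hypothetical helper: read kernel log lines on stdin and print the unique
# GEOM provider names reported with a corrupt secondary GPT table.
corrupt_gpt_providers() {
    sed -n 's/^.*GEOM: \(.*\): the secondary GPT table is corrupt or invalid\.$/\1/p' |
        sort -u
}
```

On the host, something like `dmesg | corrupt_gpt_providers` should list the providers. Here neither name is ada0 or ada1, which fits with gpart recover reporting that no recovery is needed on the physical disks. Running `gpart show zvol/zroot/guest_tms_data` on the host would then show what GEOM believes is on the volume.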
 
zvol/zroot/guest_tms_data doesn't contain a GPT table; it contains just UFS. It is only a volume for storing data.

On the hypervisor:
Code:
root@dom0:~ # zfs list
NAME                                           USED  AVAIL     REFER  MOUNTPOINT
...
zroot/guest_tms                               22.3G   292G     6.94G  -
zroot/guest_tms@backup                         623M      -     6.78G  -
zroot/guest_tms_data                           103G   352G     28.3G  -
zroot/tmp                                      140K   277G      140K  /tmp
...

On the guest:
Code:
root@tms:~ # gpart show
=>      40  31457200  xbd0  GPT  (15G)
        40    532480     1  freebsd-boot  (260M)
    532520  28827648     2  freebsd-ufs  (14G)
  29360168   1572864     3  freebsd-swap  (768M)
  30933032    524208        - free -  (256M)

root@tms:~ # mount
/dev/xbd0p2 on / (ufs, local, soft-updates, journaled soft-updates)
devfs on /dev (devfs)
fdescfs on /dev/fd (fdescfs)
procfs on /proc (procfs, local)
/dev/xbd1 on /tms (ufs, NFS exported, local, soft-updates)

root@tms:~ # gpart show /dev/xbd1
gpart: No such geom: /dev/xbd1.
root@tms:~ #

The zroot/guest_tms is the /dev/xbd0 on the guest and contains GPT table and the OS.
The zroot/guest_tms_data is the /dev/xbd1 on the guest and contains just UFS.
 
The zroot/guest_tms_data is the /dev/xbd1 on the guest and contains just UFS.
Then there likely once was a partition table on it. The metadata is still around and this gets picked up by the host.
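One non-destructive way to check for leftover GPT metadata is to look for the 8-byte header signature "EFI PART" in a given sector. The sketch below is only an illustration: has_gpt_signature is a hypothetical helper, and it only reads from the device. The primary header normally lives at LBA 1. One plausible explanation for it surviving (an assumption, not verified on this system) is that the UFS filesystem did not overwrite the first sectors of the volume, while the backup copy at the end was overwritten by filesystem data.

```shell
# Hypothetical helper: succeed if the sector at the given LBA starts with
# the 8-byte GPT header signature "EFI PART". Read-only.
# $1 = device or image path, $2 = sector size in bytes, $3 = LBA to inspect
has_gpt_signature() {
    sig=$(dd if="$1" bs="$2" skip="$3" count=1 2>/dev/null | head -c 8 | tr -d '\0')
    [ "$sig" = "EFI PART" ]
}
```

On the host this could be called as `has_gpt_signature /dev/zvol/zroot/guest_tms_data 512 1`, assuming a 512-byte sector size; `diskinfo -v` on the same device reports the real value.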
 
That may be; I don't remember. I may have tried to put a GPT on it when I installed the system.
But what is strange to me is why it started reporting this after the RAM upgrade (or maybe after the BIOS upgrade)! I checked the archived dmesg and this message first appeared today. The server has been running for more than a year and a half.
However, it seems I don't have to worry about it!
Thank you for your help!
 