UFS corrupt or invalid GPT detected - can it be saved?

Hopefully someone can help me out. I've searched but haven't found any solutions for my situation.
OS: FreeBSD 12.x
Problem: I'm running two hard drives in a software RAID (JBOD) formatted with UFS. I recently found that the "array" is up, but I'm missing data.
Looking in the logs I found the messages below. I understand there is a primary and a backup GPT; it looks like the primary is corrupted but the backup is working? Since it's spanned across two disks, I'm not having any luck figuring out how to back up, restore, or fix it. gpart recover says it doesn't need to be recovered, but I'm still getting that error and still missing data. So I'm looking for suggestions.
Thanks
Code:
GEOM: ada0: GPT rejected -- may not be recoverable.
Trying to mount root from ufs:/dev/md0 []...
GEOM_CONCAT: Device BACKUP created (id=3754403038).
GEOM_CONCAT: Disk ada1 attached to BACKUP.
GEOM_CONCAT: Disk ada0 attached to BACKUP.
GEOM_CONCAT: Device concat/BACKUP activated.
GEOM: ada0: corrupt or invalid GPT detected.
GEOM: ada0: GPT rejected -- may not be recoverable.
Additional information, in case it's helpful:
Code:
Geom name: concat/BACKUP
modified: false
state: OK
fwheads: 255
fwsectors: 63
last: 19534951230
first: 34
entries: 128
scheme: GPT
Providers:
1. Name: concat/BACKUPp1
   Mediasize: 10001894965248 (9.1T)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e2
   efimedia: HD(1,GPT,6f2d871e-d2a8-11e5-ad14-0cc47ab37c84,0x40,0x48c5fb2c0)
   rawuuid: 6f2d871e-d2a8-11e5-ad14-0cc47ab37c84
   rawtype: 516e7cb6-6ecf-11d6-8ff8-00022d09712b
   label: (null)
   length: 10001894965248
   offset: 32768
   type: freebsd-ufs
   index: 1
   end: 19534951167
   start: 64
Consumers:
1. Name: concat/BACKUP
   Mediasize: 10001895047168 (9.1T)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e3
Code:
gpart show concat/BACKUP
=>         34  19534951197  concat/BACKUP  GPT  (9.1T)
           34           30                 - free -  (15K)
           64  19534951104              1  freebsd-ufs  (9.1T)
  19534951168           63                 - free -  (32K)
Code:
geom concat list
Geom name: BACKUP
State: UP
Status: Total=2, Online=2
Type: AUTOMATIC
ID: 3754403038
Providers:
1. Name: concat/BACKUP
   Mediasize: 10001895047168 (9.1T)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e3
Consumers:
1. Name: ada1
   Mediasize: 5000947524096 (4.5T)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e4
   Start: 5000947523584
   End: 10001895047168
2. Name: ada0
   Mediasize: 5000947524096 (4.5T)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e4
   Start: 0
   End: 5000947523584
 
If mounting the file system read-only doesn't work: periodic(8) runs a daily script which should have saved the partition table of concat/BACKUP into a backup file (/var/backups/gpart.concat_BACKUP.bak).

You can use that backup to restore the partition table.

To avoid making things worse, I would operate on a clone of the concat/BACKUP device first.

For example: if it is mounted, umount(8) concat/BACKUP, dd(1) /dev/concat/BACKUP into an .img file, mdconfig(8) that .img, then apply gpart restore -F device < backup_file (see the EXAMPLES section of the manual).
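Roughly like this, as a sketch only (the image path and md unit number are placeholders, and the image file needs enough free space to hold the whole 9.1 TB provider):
Code:
umount /dev/concat/BACKUPp1                                # only if it is currently mounted
dd if=/dev/concat/BACKUP of=/some/path/backup.img bs=1m    # clone the whole provider (~9.1 TB!)
mdconfig -a -t vnode -f /some/path/backup.img -u 10        # attach the clone as /dev/md10
gpart restore -F md10 < /var/backups/gpart.concat_BACKUP.bak
gpart show md10                                            # verify the restored table before touching the real device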
 
The boot process complains about the GPT on ada0, which it shouldn't, because the GPT should be on /dev/concat/BACKUP.
ada0 can hold either the primary GPT (if it's the first disk in the concat) or the backup (if it's the last one, which it appears to be).
 
Can you show us the gpart show ada0 and ada1 so we have an idea of the disk layout? Also share the diskinfo ada0 and ada1.
 
The boot process complains about the GPT on ada0, which it shouldn't, because the GPT should be on /dev/concat/BACKUP.
The error message might be misleading and shouldn't be the primary focus. I've tested this in a VM (and double-checked): I concatenated two disks (da0, da1), created a GPT partition table on concat/BACKUP, then rebooted without loading the geom_concat kernel module.

I found this in dmesg(8):
Code:
GEOM: da0: corrupt or invalid GPT detected.
GEOM: da0: GPT rejected -- may not be recoverable.
After loading the kernel module the device is accessible.
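For reference, roughly what I did in the VM (device names are from the test VM, not the OP's system):
Code:
gconcat label BACKUP da0 da1            # create the concatenated device
gpart create -s gpt concat/BACKUP       # put a GPT on the logical volume
gpart add -t freebsd-ufs concat/BACKUP
# reboot without geom_concat_load="YES" -> the GPT complaints above appear;
# loading the module brings the device back:
kldload geom_concat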

ada0 can hold either the primary GPT (if it's the first disk in the concat) or the backup (if it's the last one, which it appears to be).
In either case the partition table should be recoverable from the daily periodic(8) backup.

Let's check which partition tables are backed up. ghvader1, please post the output of ls -l /var/backups/gpart*.

I would also run sysutils/smartmontools to check the health of the disks.
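For example (installing from packages; adjust if you build from ports):
Code:
pkg install smartmontools
smartctl -a /dev/ada0
smartctl -a /dev/ada1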
 
Can gpart(8) work with ada0 whilst GPT rejected is reported?
Yes, you can work with a disk/provider that has a corrupted primary or backup header. You can also recover it with gpart recover.

The concatenated provider is not the provider where the bootcode lives; that's most likely on the ada0/ada1 disks. I don't understand why the OP chose to create another GPT layout on the gconcat provider, though. It would make sense to use /dev/concat/BACKUP directly, without any GPT structures.
According to the OP's messages the error is already coming from GEOM_CONCAT, so I'd assume the module is loaded.

I asked for gpart show on the actual disks to see the partition layout and for diskinfo to check the sizes (the size is actually shown in the gconcat output). I would recommend checking the primary and backup GPT on ada0 manually (with dd) to try to see what the nature of the corruption is.
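For example, something like the following only reads and prints the two headers (assuming 512-byte sectors, which matches the Sectorsize shown above); a healthy header begins with the "EFI PART" signature:
Code:
# primary GPT header lives in LBA 1
dd if=/dev/ada0 bs=512 skip=1 count=1 | hexdump -C | head
# backup GPT header lives in the last LBA; take the sector count from diskinfo
dd if=/dev/ada0 bs=512 skip=$(( $(diskinfo ada0 | awk '{print $4}') - 1 )) count=1 | hexdump -C | head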

Also, don't start blindly restoring GPT headers, because you may be sorry. Working on a dd image of those disks, as mentioned above, is a good idea, especially if you value the data (but then you shouldn't be using concat at all without an actual backup in place).
 
Thank you all for taking the time to respond to my post. _martin, my original post has all the gpart list info; I will include the additional requested info in this post.
I have a ZFS array and a ZFS backup array; this was a situation where I had 2x 5 TB drives lying around, so I made them a backup to my backups :) Except I put my VMs on this software array for testing and just left them there because they worked. If it weren't for those, I would just scrap it, but I have data on those VMs that I was hoping to save. They appear to be missing, though.
If the backup partition is being used and it doesn't show my VMs, I guess I'm out of luck.

Code:
file -s /dev/concat/BACKUPp1
/dev/concat/BACKUPp1: Unix Fast File system [v2] (little-endian) last mounted on /mnt/Backup, last written at Fri Mar 11 21:14:48 2022, clean flag 0, readonly flag 0, number of blocks 2441868888, number of data blocks 2365201252, number of cylinder groups 15236, block size 32768, fragment size 4096, average file size 16384, average number of files in dir 64, pending blocks to free 0, pending inodes to free 0, system-wide uuid 0, minimum percentage of free blocks 8, TIME optimization
Code:
diskinfo ada0
ada0 512 5000947524096 9767475633 4096 0 9689955 16 63
diskinfo ada1
ada1 512 5000947524096 9767475633 4096 0 9689955 16 63
 
file -s /dev/concat/BACKUPp1
/dev/concat/BACKUPp1: Unix Fast File system [v2] (little-endian) last mounted on /mnt/Backup, last written at Fri Mar 11 21:14:48 2022, clean flag 0, readonly flag 0, number of blocks 2441868888, number of data blocks 2365201252, number of cylinder groups 15236, block size 32768, fragment size 4096, average file size 16384, average number of files in dir 64, pending blocks to free 0, pending inodes to free 0, system-wide uuid 0, minimum percentage of free blocks 8, TIME optimization

This looks OK.
Try mounting it read-only.
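For instance (read-only, so nothing on the file system gets changed; /mnt/Backup is the mount point shown in the file output above):
Code:
mkdir -p /mnt/Backup
mount -o ro /dev/concat/BACKUPp1 /mnt/Backup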
 
my original post has all the gpart list info
Does that mean you created this concatenated volume on two raw disks (without any partition layout)?

If so, by creating the GPT on the logical volume (concat/BACKUP) you may have created a weird situation where the primary GPT is written onto ada0 and the backup onto ada1, and hence why the system is complaining about it. But if you can access /dev/concat/BACKUPp1, I'd assume it's mountable. It's also something of a "false alarm", meaning you should be able to mount this FS manually.
 
(Is my mind playing tricks on me? I can't visualise FFS in the context of any ZFS array …)

Postscript: I did originally see this in the opening post – still, sorry, I can't visualise the use of ZFS …
It was for context, I guess. I have a ZFS array and a ZFS backup array, and because I had two extra drives lying around (different from all my ZFS drives), I created a JBOD software array to get a 10 TB FS, which is used as a backup to my backups. The data is already in two other places so I don't care about most of it, but I did move my VMs there for testing and that's really what I care about, as I didn't have backups of those.
 