ZFS Pool Lost Due to Corrupt GPT

Hey guys,

in the last 4 days I learned 1 important lesson.

1. Keep a backup of your important files somewhere safe!

I have read topics about my problem over the last 2 days, but I am not sure that I have found the right solution.
First, what happened?

Since 2014 I have been running ZFS on the same home-built server hardware: 16 bays, HBA, working fine and solid.
I decided to buy a bigger machine: 24 bays, 64 GB ECC RAM, 10G network and an Adaptec ASR-71605 RAID controller.

My first action was to export the pool from FreeBSD 11 to get a clean system.
I installed FreeBSD 12 on the new machine, imported the pool, upgraded the pool and resilvered it.
Perfect, everything looked good and healthy!

Then I decided to look into the RAID controller BIOS; the controller mode was "RAID: Expose RAW".


OK, the documentation told me that nothing would happen when I changed it to HBA mode, so I changed the mode. Big mistake.
After the reboot, FreeBSD had lost all drives, the device names had changed, and dmesg told me:

GEOM: diskid/DISK-WD-WMC300033578: corrupt or invalid GPT detected.
GEOM: diskid/DISK-WD-WMC300033578: GPT rejected -- may not be recoverable.

on all 16 disks

The pool was down. I rebooted the machine and reverted my settings in the RAID controller, without any effect.

I got into a bit of a panic and searched for my problem. First I shut down the server, disconnected all drives, booted up, deleted the pool in FreeBSD, shut down again, connected all drives and booted up.
But zpool import castor did not work: no pools found.

Gdisk tells me:
root@castor:~ # gdisk /dev/da1
GPT fdisk (gdisk) version 1.0.4

Partition table scan:
  MBR: not present
  BSD: not present
  APM: not present
  GPT: not present

Creating new GPT entries in memory.

Command (? for help): i
No partitions

Command (? for help): p
Disk /dev/da1: 3907029168 sectors, 1.8 TiB
Sector size (logical): 512 bytes
Disk identifier (GUID): 222061DC-4D53-4B94-A3E2-7245FEC56346
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 33
First usable sector is 34, last usable sector is 3907029134
Partitions will be aligned on 2048-sector boundaries
Total free space is 3907029101 sectors (1.8 TiB)

Number  Start (sector)    End (sector)  Size       Code  Name
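
One thing that could be checked (just a sketch, with da1 as the example and the sector count taken from the gdisk output above) is whether the backup GPT header in the very last sector survived; a valid one starts with the "EFI PART" signature:

# the disk has 3907029168 sectors, so the backup GPT header would sit at LBA 3907029167
dd if=/dev/da1 bs=512 skip=3907029167 count=1 | hexdump -C | head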

zdb does not tell me the right things:
The pool name is castor, not Storage, and it is a raidz2.

root@castor:~ # zdb -l /dev/da0
------------------------------------
LABEL 0
------------------------------------
failed to unpack label 0
------------------------------------
LABEL 1
------------------------------------
failed to unpack label 1
------------------------------------
LABEL 2
------------------------------------
    version: 5000
    name: 'Storage'
    state: 0
    txg: 39
    pool_guid: 9982131470648287908
    hostname: 'lucy'
    top_guid: 10799481239905619071
    guid: 13308286758891230642
    vdev_children: 1
    vdev_tree:
        type: 'raidz'
        id: 0
        guid: 10799481239905619071
        nparity: 1
        metaslab_array: 34
        metaslab_shift: 34
        ashift: 12
        asize: 1991818346496
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 6874939737376970087
            path: '/dev/ada0p3'
            phys_path: '/dev/ada0p3'
            whole_disk: 1
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 13308286758891230642
            path: '/dev/ada1p3'
            phys_path: '/dev/ada1p3'
            whole_disk: 1
            create_txg: 4
        children[2]:
            type: 'disk'
            id: 2
            guid: 3334074266236166235
            path: '/dev/ada2p3'
            phys_path: '/dev/ada2p3'
            whole_disk: 1
            create_txg: 4
        children[3]:
            type: 'disk'
            id: 3
            guid: 7583283394416999456
            path: '/dev/ada3p3'
            phys_path: '/dev/ada3p3'
            whole_disk: 1
            create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
------------------------------------
LABEL 3
------------------------------------
failed to unpack label 3

zdb can read da0 and da1; on all the other disks I get:
root@castor:~ # zdb -l /dev/da2
------------------------------------
LABEL 0
------------------------------------
failed to unpack label 0
------------------------------------
LABEL 1
------------------------------------
failed to unpack label 1
------------------------------------
LABEL 2
------------------------------------
failed to unpack label 2
------------------------------------
LABEL 3
------------------------------------
failed to unpack label 3
gpart found nothing,

but hexdump found segments of the ZFS label:

root@castor:~ # hexdump -C /dev/da0 | more
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00103fd0 00 00 00 00 00 00 00 00 11 7a 0c b1 7a da 10 02 |.........z..z...|
00103fe0 3f 2a 6e 7f 80 8f f4 97 fc ce aa 58 16 9f 90 af |?*n........X....|
00103ff0 8b b4 6d ff 57 ea d1 cb ab 5f 46 0d db 92 c6 6e |..m.W...._F....n|
00104000 01 01 00 00 00 00 00 00 00 00 00 01 00 00 00 24 |...............$|
00104010 00 00 00 20 00 00 00 07 76 65 72 73 69 6f 6e 00 |... ....version.|
00104020 00 00 00 08 00 00 00 01 00 00 00 00 00 00 13 88 |................|
00104030 00 00 00 24 00 00 00 20 00 00 00 04 6e 61 6d 65 |...$... ....name|
00104040 00 00 00 09 00 00 00 01 00 00 00 06 63 61 73 74 |............cast|
00104050 6f 72 00 00 00 00 00 24 00 00 00 20 00 00 00 05 |or.....$... ....|
00104060 73 74 61 74 65 00 00 00 00 00 00 08 00 00 00 01 |state...........|
00104070 00 00 00 00 00 00 00 00 00 00 00 20 00 00 00 20 |........... ... |
00104080 00 00 00 03 74 78 67 00 00 00 00 08 00 00 00 01 |....txg.........|
00104090 00 00 00 00 00 18 ca 8c 00 00 00 28 00 00 00 28 |...........(...(|
001040a0 00 00 00 09 70 6f 6f 6c 5f 67 75 69 64 00 00 00 |....pool_guid...|
001040b0 00 00 00 08 00 00 00 01 69 d6 13 7d 14 fe 0a 46 |........i..}...F|
001040c0 00 00 00 24 00 00 00 20 00 00 00 06 68 6f 73 74 |...$... ....host|
001040d0 69 64 00 00 00 00 00 08 00 00 00 01 00 00 00 00 |id..............|
001040e0 84 0f b9 3d 00 00 00 28 00 00 00 28 00 00 00 08 |...=...(...(....|
001040f0 68 6f 73 74 6e 61 6d 65 00 00 00 09 00 00 00 01 |hostname........|
00104100 00 00 00 06 63 61 73 74 6f 72 00 00 00 00 00 24 |....castor.....$|
00104110 00 00 00 28 00 00 00 08 74 6f 70 5f 67 75 69 64 |...(....top_guid|
00104120 00 00 00 08 00 00 00 01 00 86 12 5f 4f 7f 9a ae |..........._O...|
00104130 00 00 00 20 00 00 00 20 00 00 00 04 67 75 69 64 |... ... ....guid|
00104140 00 00 00 08 00 00 00 01 34 e1 99 ae 78 9d ac 91 |........4...x...|
00104150 00 00 00 2c 00 00 00 28 00 00 00 0d 76 64 65 76 |...,...(....vdev|
00104160 5f 63 68 69 6c 64 72 65 6e 00 00 00 00 00 00 08 |_children.......|
00104170 00 00 00 01 00 00 00 00 00 00 00 02 00 00 0b c0 |................|
00104180 00 00 00 38 00 00 00 09 76 64 65 76 5f 74 72 65 |...8....vdev_tre|
00104190 65 00 00 00 00 00 00 13 00 00 00 01 00 00 00 00 |e...............|
001041a0 00 00 00 01 00 00 00 24 00 00 00 20 00 00 00 04 |.......$... ....|
001041b0 74 79 70 65 00 00 00 09 00 00 00 01 00 00 00 05 |type............|
001041c0 72 61 69 64 7a 00 00 00 00 00 00 20 00 00 00 20 |raidz...... ... |
001041d0 00 00 00 02 69 64 00 00 00 00 00 08 00 00 00 01 |....id..........|
001041e0 00 00 00 00 00 00 00 00 00 00 00 20 00 00 00 20 |........... ... |
001041f0 00 00 00 04 67 75 69 64 00 00 00 08 00 00 00 01 |....guid........|
00104200 00 86 12 5f 4f 7f 9a ae 00 00 00 24 00 00 00 20 |..._O......$... |
00104210 00 00 00 07 6e 70 61 72 69 74 79 00 00 00 00 08 |....nparity.....|
00104220 00 00 00 01 00 00 00 00 00 00 00 02 00 00 00 2c |...............,|
00104230 00 00 00 28 00 00 00 0e 6d 65 74 61 73 6c 61 62 |...(....metaslab|
00104240 5f 61 72 72 61 79 00 00 00 00 00 08 00 00 00 01 |_array..........|
00104250 00 00 00 00 00 00 00 22 00 00 00 2c 00 00 00 28 |......."...,...(|
00104260 00 00 00 0e 6d 65 74 61 73 6c 61 62 5f 73 68 69 |....metaslab_shi|
00104270 66 74 00 00 00 00 00 08 00 00 00 01 00 00 00 00 |ft..............|
00104280 00 00 00 25 00 00 00 24 00 00 00 20 00 00 00 06 |...%...$... ....|
00104290 61 73 68 69 66 74 00 00 00 00 00 08 00 00 00 01 |ashift..........|
001042a0 00 00 00 00 00 00 00 0c 00 00 00 24 00 00 00 20 |...........$... |
001042b0 00 00 00 05 61 73 69 7a 65 00 00 00 00 00 00 08 |....asize.......|
001042c0 00 00 00 01 00 00 0e 8d a1 c0 00 00 00 00 00 24 |...............$|
001042d0 00 00 00 20 00 00 00 06 69 73 5f 6c 6f 67 00 00 |... ....is_log..|
001042e0 00 00 00 08 00 00 00 01 00 00 00 00 00 00 00 00 |................|
001042f0 00 00 00 28 00 00 00 28 00 00 00 0a 63 72 65 61 |...(...(....crea|
00104300 74 65 5f 74 78 67 00 00 00 00 00 08 00 00 00 01 |te_txg..........|
00104310 00 00 00 00 00 00 00 04 00 00 0a 1c 00 00 01 20 |............... |
00104320 00 00 00 08 63 68 69 6c 64 72 65 6e 00 00 00 14 |....children....|
00104330 00 00 00 08 00 00 00 00 00 00 00 01 00 00 00 20 |............... |
00104340 00 00 00 20 00 00 00 04 74 79 70 65 00 00 00 09 |... ....type....|
00104350 00 00 00 01 00 00 00 04 64 69 73 6b 00 00 00 20 |........disk... |
00104360 00 00 00 20 00 00 00 02 69 64 00 00 00 00 00 08 |... ....id......|
00104370 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 20 |............... |
00104380 00 00 00 20 00 00 00 04 67 75 69 64 00 00 00 08 |... ....guid....|
00104390 00 00 00 01 34 e1 99 ae 78 9d ac 91 00 00 00 2c |....4...x......,|
001043a0 00 00 00 28 00 00 00 04 70 61 74 68 00 00 00 09 |...(....path....|
001043b0 00 00 00 01 00 00 00 0f 2f 64 65 76 2f 67 70 74 |......../dev/gpt|
001043c0 2f 64 69 73 6b 30 30 00 00 00 00 34 00 00 00 30 |/disk00....4...0|
001043d0 00 00 00 09 70 68 79 73 5f 70 61 74 68 00 00 00 |....phys_path...|
001043e0 00 00 00 09 00 00 00 01 00 00 00 0f 2f 64 65 76 |............/dev|
001043f0 2f 67 70 74 2f 64 69 73 6b 30 30 00 00 00 00 28 |/gpt/disk00....(|
00104400 00 00 00 28 00 00 00 0a 77 68 6f 6c 65 5f 64 69 |...(....whole_di|
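
Since the label text is readable on da0 around the 1 MiB mark, one could check (sketch only, da2 as example) whether the other disks still carry the same label data, without paging through the whole device:

# dump only the first 2 MiB and search for label field names
# (the pool name itself can be split across hexdump lines, so grep for "version"/"pool_guid" instead)
dd if=/dev/da2 bs=1m count=2 | hexdump -C | grep -E 'version|pool_guid'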

I also tried:
zpool import -a
zpool import -D castor
zpool import castor
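
As far as I understand there are a few more import variants (only a sketch, and as long as no labels or partitions are visible they will probably also find nothing):

zpool import -d /dev -f castor         # search /dev explicitly and force the import
zpool import -o readonly=on castor     # read-only, so nothing gets written to the disks
zpool import -fFn castor               # -F recovery mode, -n dry run only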


Is it possible to restore the pool if I create a new GPT partition table on each disk?
I hope someone is reading this thread and can help me.

thank you

Jens
 
Hello Guys,

has anyone used the tool TestDisk? It is a live system to recover lost partitions.

Maybe it can help to get out of this disaster?

regards

Jens
 
The big question is this: What did you change to cause the problem?

Here is what I understand from your description.
  • You moved the disks physically from the old server to the new server.
  • On the new machine, the disks are connected to an Adaptec ASR-71605 controller.
  • At this point, everything still worked. But we don't know how you had formatted the disks (GPT, MBR, or whole disk, and so on, and the ZFS configuration).
  • You never told us how many disks you have, and how the RAID controller is configured. In particular, is it configured for hardware RAID?
  • Then you looked in the RAID controller BIOS, and now the controller is configured for "expose RAW", which probably means that it does not do any RAID itself, but instead every physical disk is given to the FreeBSD OS.
  • The crucial question is: Did you change the RAID controller configuration? Because if yes, the content of the disks will either vanish or be changed.
  • And now ZFS no longer works, because the disks it sees are no longer formatted correctly: they don't have GPT tables.
At this point, I fear that further messing with reformatting the disks will create disaster (which may have already happened).

For us to help you, you have to fill in some of that missing information above. You probably want to get the RAID controller back to its original configuration if it was changed. Unfortunately, I know nothing about how Adaptec controllers work (and even for LSI controllers, I only know how they work in JBOD mode).
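
As a first, harmless step, a couple of standard FreeBSD commands (just a sketch, with da0 as an example device) would show what the OS currently sees behind that controller, and in particular whether the reported disk size changed when the mode was switched:

camcontrol devlist     # how the drives show up behind the controller
diskinfo -v da0        # sector size and total size as currently reported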

In general, it is considered a bad idea to run a hardware RAID controller underneath ZFS. Your current configuration is probably a good idea, but changing the configuration may have destroyed your data.
 
Hello Ralphbsz,

big thank you for the reply.
Here are the answers to your questions:
In the server there are 16 drives and it is a RAIDZ2,
formatted as GPT.
Yes, I changed the RAID mode from "Expose RAW" to HBA, but I never created a RAID array etc.; the controller doesn't touch the disks, they are just passed through in "RAID: Expose RAW" mode.

I looked at the disks with GParted; the ZFS partitions are found.
With TestDisk I did a scan over the whole disk to find the secondary GPT.

My question: if I create a new GPT, does that solve my problem? The partitions will not be affected?

gpart create -s gpt /dev/sda
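
Or should I first try gpart recover, which as far as I understand rebuilds a damaged GPT from whichever copy is still intact (sketch only, and probably better tried on a copy of a disk)?

gpart recover da1      # try to rebuild the GPT from the surviving primary/backup copy
gpart show da1         # check what gpart sees afterwards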
 
[attachment 6855]

Could this be the GPT table?
 
Yes, I changed the RAID mode from "Expose RAW" to HBA, but I never created a RAID array etc.; the controller doesn't touch the disks, they are just passed through in "RAID: Expose RAW" mode.

It doesn't appear intelligible to me what that controller may or may not do. It seems to employ some creepy intelligence of its own to do what it might consider useful - which usually is NOT useful.

In the server there are 16 drives and it is a RAIDZ2,
formatted as GPT.

This means ZFS just expects to find 16 objects (vdevs - which are practically nothing other than bunches of contiguous sectors on disk, i.e. "files"). These objects can be whole disks or, as in this case, partitions within a GPT.
The question is: is there a way to get (enough of) these objects back untampered?
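
One way to check a single disk for such an object (only a sketch, with da2 as example, and assuming the old partitions started at the usual 1 MiB alignment, which the hexdump above suggests) would be to copy the suspected label area into a scratch file and let zdb try to decode it:

# a vdev label is 256 KiB; copy the first two labels (512 KiB) from the assumed
# partition start at sector 2048 into a file and expose it as a memory disk
dd if=/dev/da2 of=/tmp/label.bin bs=512 skip=2048 count=1024
mdconfig -a -t vnode -f /tmp/label.bin     # prints the md device name, e.g. md0
zdb -l /dev/md0                            # labels 0 and 1 should decode if the guess is right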

I looked at the disks with GParted; the ZFS partitions are found.
With TestDisk I did a scan over the whole disk to find the secondary GPT.

My question: if I create a new GPT, does that solve my problem? The partitions will not be affected?

gpart create -s gpt /dev/sda

Nobody knows. What I would do is not touch the havoc any more. Attach one of the disks to a raw connector (where there is no stupidly intelligent "raid" controller in between), and copy it to an empty disk (which is also on a raw connector). Then start to play with the copy, only.
Then figure out whether the ZFS object (i.e. one of these 16 objects) can be found. If we're lucky and the disk is indeed not mangled with raid controller crap, it should be possible to identify that object (and then maybe construct a GPT that will properly address it).
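
A minimal sketch of that copy step (assuming da2 is one of the affected disks on a plain SATA/HBA port and da9 is the empty scratch disk of at least the same size; the device names are examples only):

dd if=/dev/da2 of=/dev/da9 bs=1m     # raw copy of the whole disk
# add conv=noerror,sync only if the source has read errors (failed blocks then get zero-padded)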

I suppose zdb can recognize such a single object as part of the original pool, and report the original design of the pool.
One could even do that manually: when looking into my pools, on all of the devices (even on l2arc cache devices) I find the string "version" at offset 0x4018. And after that follows the pool data, i.e. name etc., quite like this one:

root@castor:~ # hexdump -C /dev/da0 | more
00104010 00 00 00 20 00 00 00 07 76 65 72 73 69 6f 6e 00 |... ....version.|
00104020 00 00 00 08 00 00 00 01 00 00 00 00 00 00 13 88 |................|
00104030 00 00 00 24 00 00 00 20 00 00 00 04 6e 61 6d 65 |...$... ....name|
00104040 00 00 00 09 00 00 00 01 00 00 00 06 63 61 73 74 |............cast|
00104050 6f 72 00 00 00 00 00 24 00 00 00 20 00 00 00 05 |or.....$... ....|

If this is not some artefact, then it should be a vdev, probably starting at 0x00100000. The pool name is castor. One might probably copy this out with dd, beginning at 0x00100000, put it on whatever disk right from the start, and that might be accepted as a vdev.
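
A sketch of that dd idea, assuming the data really does start at 0x00100000 (= 1 MiB) and daX stands for a scratch disk that may be overwritten (again: work on the copy, not the original):

dd if=/dev/da0 of=/dev/daX bs=1m skip=1    # shift everything forward by 1 MiB, so the suspected vdev starts at sector 0 of daX
zdb -l /dev/daX                            # labels 0 and 1 sit at the start of a vdev, so at least those should show up if the guess is right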
 