ZFS Recover data from zroot pool

Dear Members,

I hope someone can help me with this.

One of our servers ran out of disk space (yeah, booooh!). It was set up with ZFS on one single disk (2TB). It is part of a 5-node cloud storage cluster.

While trying to expand the storage with a new disk, I screwed up the zpool boot loader and had to reinstall the whole OS. Here I am with a new pool with two disks:
Code:
# zfs list
NAME                 USED  AVAIL  REFER  MOUNTPOINT
zroot               1.66G  3.48T    96K  /zroot
zroot/ROOT          1.05G  3.48T    96K  none
zroot/ROOT/default  1.05G  3.48T  1.05G  /
zroot/tmp           68.5K  3.48T  68.5K  /tmp
zroot/usr            630M  3.48T    96K  /usr
zroot/usr/home        96K  3.48T    96K  /usr/home
zroot/usr/ports      629M  3.48T   629M  /usr/ports
zroot/usr/src         96K  3.48T    96K  /usr/src
zroot/var            586K  3.48T    96K  /var
zroot/var/audit       96K  3.48T    96K  /var/audit
zroot/var/crash       96K  3.48T    96K  /var/crash
zroot/var/log        113K  3.48T   113K  /var/log
zroot/var/mail      92.5K  3.48T  92.5K  /var/mail
zroot/var/tmp       92.5K  3.48T  92.5K  /var/tmp

My intention was to recover the files from the damaged former zroot pool by mounting the "full" disk of the old pool and copying the files to a new location.

Unfortunately I am not able to import/mount/whatever the old disk into the new zpool:

Code:
mount  /dev/ada2p3 /riakData
mount: /dev/ada2p3: Invalid argument

Code:
# ls -l /dev/ada2*
crw-r-----  1 root  operator  0x84 Nov 26 17:14 /dev/ada2
crw-r-----  1 root  operator  0x87 Nov 26 17:14 /dev/ada2p1
crw-r-----  1 root  operator  0x89 Nov 26 17:14 /dev/ada2p2
crw-r-----  1 root  operator  0x8b Nov 26 17:14 /dev/ada2p3

Code:
# zpool import
   pool: zroot
     id: 7653252978644668067
  state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
    devices and try again.
   see: http://illumos.org/msg/ZFS-8000-6X
config:

    zroot                     UNAVAIL  missing device
     diskid/DISK-Z1X295YXp3  ONLINE

    Additional devices are known to be part of this pool, though their
    exact configuration cannot be determined.

No, I don't have the "missing device" anymore.

In short:

Old system:
zroot: disk_A

Attempt to increase space with disk_B
zroot: disk_A, disk_B => not bootable anymore, reinstall

New system:
zroot: disk_C, disk_B (yes, the disk from the old system)

disk_A not mountable in any way

Thank you very much for any advice !!
 
It's not 100% clear but I get the impression you've done the following:

1) Added disk_B to old pool, creating a stripe (effectively RAID0) between disk_A and disk_B
2) Then moved disk_B to a new system and created a new pool using disk_B and disk_C

As you may know, you cannot lose any disk from a stripe/RAID0 without data loss. This is why RAID0 is pretty much universally discouraged unless you have a specific requirement for it. Additionally, if the data is the slightest bit important it should be backed up, even if the pool has redundancy. ZFS makes this easy: just zfs send the data to another ZFS system.

Code:
mount /dev/ada2p3 /riakData
You can't mount a zpool, you need to import it.

Code:
# zpool import
   pool: zroot
     id: 7653252978644668067
  state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
    devices and try again.
   see: http://illumos.org/msg/ZFS-8000-6X
config:

    zroot                     UNAVAIL  missing device
     diskid/DISK-Z1X295YXp3  ONLINE

    Additional devices are known to be part of this pool, though their
    exact configuration cannot be determined.
Here's your main issue. As above, I assume you attached disk_B to create a stripe, and have now pulled disk_B out and used it in a new pool (you probably had to use -f to make the new pool as well, because it usually complains if it finds existing ZFS labels on the disk). ZFS is refusing to import the old pool because it knows you added another disk to it that is now missing. There is no supported way out of this other than restoring from backup.
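If you want to confirm what the old pool expected, the on-disk labels can be dumped with zdb -l. The following is only a minimal sketch, assuming the old disk is still visible as /dev/ada2p3 as in your output; the show_labels helper and the DRY_RUN switch are hypothetical names used for illustration. As far as I know, a vdev_children value greater than 1 in the label means the pool had more than one top-level vdev, which would match the stripe theory.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print, or with DRY_RUN=0
# actually run, the command that dumps the ZFS labels of one device.
# The labels record the pool name, pool GUID and vdev layout.
DRY_RUN="${DRY_RUN:-1}"

show_labels() {
    dev="$1"
    if [ "$DRY_RUN" = "1" ]; then
        # Dry run: only show what would be executed.
        echo "zdb -l ${dev}"
    else
        # Read-only inspection; this does not modify the disk.
        zdb -l "${dev}"
    fi
}

# /dev/ada2p3 is the old pool's partition as shown in the question.
show_labels /dev/ada2p3
```

This only reads the labels, so it cannot make things worse, but it also does not bring the missing device back.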

To be honest I can't see why it should have stopped booting after adding disk_B, although obviously I don't know exactly what you did. It may well have been fixable at that point. Unfortunately it looks like you've kept going without really understanding what you're doing, and have probably destroyed the old pool beyond any simple recovery.

Edit:
Just to re-iterate, because I can't stress how important this is, back up your data! You have 5 machines, storing anywhere up to 2TB each, which is a hell of a lot of data, and I assume it's at least mildly important if you're asking for recovery help. My most important system has a pool with just under 1TB of data, replicated twice, and I still worry about it. Your new system has 4TB just by itself, and you've doubled the risk by striping 2 disks. Just some unused workstation with a bunch of on-board SATA ports is enough. Fill it with big disks in RAID-Z2 and get a scheduled send/recv going. It's incredibly quick once you've got the initial full sync out of the way (unless you're changing hundreds of gigs a day).
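A scheduled send/recv could look roughly like the sketch below. It is an illustration under assumptions, not a finished backup script: the dataset tank/data, the snapshot names, the host backuphost and the target backup/data are all hypothetical placeholders, and it assumes the initial full sync has already been done so that the previous snapshot exists on both sides. With DRY_RUN=1 (the default) it only prints the commands.

```shell
#!/bin/sh
# Sketch of one incremental replication step. All dataset, snapshot and
# host names passed in are hypothetical placeholders.
# DRY_RUN=1 (default) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Print a command line in dry-run mode, otherwise execute it via sh
# (needed because the send/receive step is a pipeline).
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        sh -c "$*"
    fi
}

# replicate DATASET PREV_SNAP NEW_SNAP REMOTE_HOST TARGET_DATASET
replicate() {
    dataset="$1"; prev="$2"; new="$3"; host="$4"; target="$5"
    # Take a recursive snapshot of the dataset and its children.
    run "zfs snapshot -r ${dataset}@${new}"
    # Send only the changes between the two snapshots and receive them
    # on the backup host without mounting the received datasets (-u).
    run "zfs send -R -i ${dataset}@${prev} ${dataset}@${new} | ssh ${host} zfs receive -u -F ${target}"
}
```

For example, replicate tank/data daily-1 daily-2 backuphost backup/data prints the snapshot and send/receive commands for that step; run from cron with DRY_RUN=0 and rotating snapshot names, it becomes the scheduled job.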

The only time I would agree with not backing up is if the cloud storage system is replicating between 3 or more storage servers at the application level, in which case you've taken a specific decision to move redundancy from the server level to the application. People like Backblaze do this (I don't think they use ZFS) and their application handles everything. If a server fails, it pro-actively duplicates all the data that was on that system somewhere else from the additional copies they still have, in order to maintain the number of copies they've told the system to keep.
 
Dear usdmatt,

thank you very much for your explanation.
Yes, your assumption is right: this server is part of a system which handles redundancy at the application layer. This is why we don't use mirrored disks. If something fails with the hardware, we just set up a new box and attach it to the cluster.

This is why I kept going and tried things that I've never done before.

So, I have no chance to recover the files from this disk of the old pool?

Thank you
 