zpool split in two

Just tell me upfront, am I SOL?

I'll admit, I'm new to zfs and I didn't RTFM. I've been using it for a little over a year, but our FreeBSD 8.0 system crashed on me and I lost some binaries, including the zfs tools. I tried fixing it with Fixit but had no luck, so I rebuilt world and kernel on a fresh hard drive. The old system had a raidz zpool containing da0 and da1 (which are actually two links to an array of 16 drives), and I needed to remount these in a hurry. I asked the guy who had first set it up for us (not me) for some advice, and all I could get out of him was "Google it" after I had already been doing so for 4 hours.

Unfortunately, I started acting on my Google finds before I fully knew what I was doing. I began trying to "mount" the drives...

I know now that I should have been using zpool import, but before I came to that revelation I had already tried to force a mount of da0, which created a new "tank" pool (same name as the old one) and pulled da0 out of the original pool. By the time I realized that wasn't what I wanted, it was already too late. I've since destroyed that new pool, and this is where we stand:

Code:
# zpool import
   pool: tank
     id: 4433502968625883981
  state: UNAVAIL
status: One or more devices contains corrupted data.
action: The pool cannot be imported due to damaged devices or data.
    see: http://www.sun.com/msg/ZFS-8000-5E
config:

         tank        UNAVAIL  insufficient replicas
           da1       ONLINE

If I list destroyed pools:

Code:
# zpool import -D
   pool: tank
     id: 12367720188787195607
  state: ONLINE (DESTROYED)
action: The pool can be imported using its name or numeric identifier.
config:

         tank        ONLINE
           da0       ONLINE


If I debug each drive...
The bad drive:

Code:
# zdb -l /dev/da0
--------------------------------------------
LABEL 0
--------------------------------------------
     version=13
     name='tank'
     state=2
     txg=50
     pool_guid=12367720188787195607
     hostid=2180312168
     hostname='proj.bullseye.tv'
     top_guid=6830294387039432583
     guid=6830294387039432583
     vdev_tree
         type='disk'
         id=0
         guid=6830294387039432583
         path='/dev/da0'
         whole_disk=0
         metaslab_array=23
         metaslab_shift=36
         ashift=9
         asize=6998387326976
         is_log=0
--------------------------------------------
LABEL 1
--------------------------------------------
     version=13
     name='tank'
     state=2
     txg=50
     pool_guid=12367720188787195607
     hostid=2180312168
     hostname='proj.bullseye.tv'
     top_guid=6830294387039432583
     guid=6830294387039432583
     vdev_tree
         type='disk'
         id=0
         guid=6830294387039432583
         path='/dev/da0'
         whole_disk=0
         metaslab_array=23
         metaslab_shift=36
         ashift=9
         asize=6998387326976
         is_log=0
--------------------------------------------
LABEL 2
--------------------------------------------
     version=13
     name='tank'
     state=2
     txg=50
     pool_guid=12367720188787195607
     hostid=2180312168
     hostname='proj.bullseye.tv'
     top_guid=6830294387039432583
     guid=6830294387039432583
     vdev_tree
         type='disk'
         id=0
         guid=6830294387039432583
         path='/dev/da0'
         whole_disk=0
         metaslab_array=23
         metaslab_shift=36
         ashift=9
         asize=6998387326976
         is_log=0
--------------------------------------------
LABEL 3
--------------------------------------------
     version=13
     name='tank'
     state=2
     txg=50
     pool_guid=12367720188787195607
     hostid=2180312168
     hostname='proj.bullseye.tv'
     top_guid=6830294387039432583
     guid=6830294387039432583
     vdev_tree
         type='disk'
         id=0
         guid=6830294387039432583
         path='/dev/da0'
         whole_disk=0
         metaslab_array=23
         metaslab_shift=36
         ashift=9
         asize=6998387326976
         is_log=0
And the good drive:

Code:
# zdb -l /dev/da1
--------------------------------------------
LABEL 0
--------------------------------------------
     version=13
     name='tank'
     state=0
     txg=4
     pool_guid=4433502968625883981
     hostid=2180312168
     hostname='zproj.bullseye.tv'
     top_guid=11718615808151907516
     guid=11718615808151907516
     vdev_tree
         type='disk'
         id=1
         guid=11718615808151907516
         path='/dev/da1'
         whole_disk=0
         metaslab_array=23
         metaslab_shift=36
         ashift=9
         asize=7001602260992
         is_log=0
--------------------------------------------
LABEL 1
--------------------------------------------
     version=13
     name='tank'
     state=0
     txg=4
     pool_guid=4433502968625883981
     hostid=2180312168
     hostname='zproj.bullseye.tv'
     top_guid=11718615808151907516
     guid=11718615808151907516
     vdev_tree
         type='disk'
         id=1
         guid=11718615808151907516
         path='/dev/da1'
         whole_disk=0
         metaslab_array=23
         metaslab_shift=36
         ashift=9
         asize=7001602260992
         is_log=0
--------------------------------------------
LABEL 2
--------------------------------------------
     version=13
     name='tank'
     state=0
     txg=4
     pool_guid=4433502968625883981
     hostid=2180312168
     hostname='zproj.bullseye.tv'
     top_guid=11718615808151907516
     guid=11718615808151907516
     vdev_tree
         type='disk'
         id=1
         guid=11718615808151907516
         path='/dev/da1'
         whole_disk=0
         metaslab_array=23
         metaslab_shift=36
         ashift=9
         asize=7001602260992
         is_log=0
--------------------------------------------
LABEL 3
--------------------------------------------
     version=13
     name='tank'
     state=0
     txg=4
     pool_guid=4433502968625883981
     hostid=2180312168
     hostname='zproj.bullseye.tv'
     top_guid=11718615808151907516
     guid=11718615808151907516
     vdev_tree
         type='disk'
         id=1
         guid=11718615808151907516
         path='/dev/da1'
         whole_disk=0
         metaslab_array=23
         metaslab_shift=36
         ashift=9
         asize=7001602260992
         is_log=0

Notice the different hostnames? The good drive (da1) still shows the old hostname ("zproj"), while the "bad" drive (da0) shows the new hostname ("proj").

Can anyone tell me if my ignorance (and the lack of professional assistance) has totally screwed me here? Since they're a RAID, is there any chance I may still be able to recover this data? Better yet, is there a way to get this pool back together to its former glory?

I really need some guidance with this. Any help is greatly appreciated.
 
Was it really a raidz vdev of only 2 devices? That sounds unlikely, as you need 3 drives for raidz1, 4 drives for raidz2, and 5 drives for raidz3. More likely it's a mirror vdev. If it really is a raidz vdev, then you're screwed.

If it is a mirror vdev, then you can force the import of the other device, and bring it up degraded. To do so, remove the da0 device from the system, so that zfs won't see it at all.

Then, just do a # zpool import tank and it should import the pool in degraded mode. If that works, you can convert the vdev back to a single device via # zpool detach tank da0, then export the pool to save the changes.

Then remove the da1 device from the system, add the da0 device back, import the "broken" tank pool, and # zpool destroy tank to remove all traces of the "tank" name from the da0 device.

Next, add both devices back and import the "tank" pool. It should come up with only da1 listed as part of the pool. Finally, you can # zpool attach tank da1 da0 to re-add the da0 device to the pool (attaching it to da1), turning it into a mirror vdev again. Then wait while it resilvers the data back onto da0.
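
Condensed into commands, the whole sequence might look roughly like this. This is only a sketch, not tested: the numeric pool ids come from the zpool import output earlier in the thread, and device names can shift when a disk is pulled, so adjust as needed.

Code:
## Step 1: with da0 physically removed, bring the surviving half up degraded
zpool import tank
zpool detach tank da0            # forget the missing half of the mirror
zpool export tank                # save the change

## Step 2: swap disks (da1 out, da0 back in) and wipe the accidental pool;
## importing by numeric id avoids any ambiguity between the two "tank"s
zpool import -D -f 12367720188787195607
zpool destroy tank

## Step 3: reattach both disks, import the real pool, and rebuild the mirror
zpool import 4433502968625883981
zpool attach tank da1 da0        # add -f if it complains about the old label
zpool status tank                # watch the resilver progress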
 
As I said, I didn't set it up and I'm unfamiliar with all this. From what I gathered I thought it was raidz, but if you say that's not possible, that gives me some hope.

I'll give your suggestion a shot. At this point it's the only thing I have to go on.
 
A note to keep in mind for the future: keep notes!
Whether you do it via a wiki, a blog or simply on paper - keep notes when you set up your systems (printouts of commands, configuration files, etc). It will save you lots of time.
 
tingo said:
A note to keep in mind for the future: keep notes!
Whether you do it via a wiki, a blog or simply on paper - keep notes when you set up your systems (printouts of commands, configuration files, etc). It will save you lots of time.

Agreed. I wish the guy who set this up had provided some notes. I have some emails, but they don't get too much into the technical setup. Luckily I've been keeping my own wiki notes of the configs on the system, so rebuilding won't be a problem, but I can't really live with data loss. (Did I mention this wasn't fully backed up?)
 
The old system had a raidz zpool containing da0 and da1 (which are actually two links to an array of 16 drives), and I needed to remount these in a hurry

Any chance that this ZFS "mirror" was actually built from two different hardware RAID arrays?
 
phoenix said:
If it is a mirror vdev, then you can force the import of the other device, and bring it up degraded. To do so, remove the da0 device from the system, so that zfs won't see it at all.

No such luck:
Code:
# zpool import tank
cannot import 'tank': one or more devices is currently unavailable
# zpool import -f tank
cannot import 'tank': one or more devices is currently unavailable
That is with da0 detached (and da1 then shows up as da0). It still recognizes that another device should be there. Does that give any hint as to whether it is a raidz or a mirror? Are there any other arguments that might force it to import degraded?
 
The pool might not have had any ZFS-level redundancy at all (that's what gkontos is hinting at above): just the two devices da0 and da1 striped together, with the underlying hardware RAID handling redundancy for each device individually. If that's the case, it's going to be very tough to recover anything from the remaining disk.
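
One hint is already in the zdb output above: a mirror member's label carries a vdev_tree with type='mirror' and a children[] list, whereas da1's labels show a bare type='disk' with id=1, which is what a plain striped member looks like. If you want to double-check on the surviving disk (device name as in the earlier posts, the grep is just a convenience):

Code:
# Show just the vdev type lines from the on-disk labels.
# type='mirror' (plus type='disk' children) -> redundant, degraded import may work
# only type='disk'                          -> plain stripe member, no ZFS redundancy
zdb -l /dev/da1 | grep type=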
 
After following advice from this forum and elsewhere, I've determined that there isn't much I can do about da0. As far as the zfs system is concerned, it is blank. I'm really kicking myself over this, because I know that if I had read certain things before trying others I wouldn't have had any problems.

Hopefully the RAID tables are all that's been overwritten; it's not like the entire disks were formatted. Everything has been taken to a data recovery center, so we'll see what happens.

Let mine be the example that scares any other newbies straight: thoroughly read the manual before trying anything, and back up everything.
 
^^ I'd add the following:

With the availability of free hypervisors like Xen, VMware Player, VirtualBox, Virtual PC, etc., test in a virtual environment first!

i.e., RTFM, then get comfortable playing with the new technology in a test VM, and only then consider rolling it out to physical hardware. Free hypervisor + free OS = no real excuse these days.
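
For ZFS in particular you don't even need a VM: a throwaway pool built on sparse files is enough to rehearse import/export/detach/destroy safely. The paths and pool name below are only examples:

Code:
# Two sparse 1 GB files are plenty for a disposable practice pool.
truncate -s 1G /tmp/zdisk0 /tmp/zdisk1
zpool create testpool mirror /tmp/zdisk0 /tmp/zdisk1

# Practice the scary operations here: export, import, detach, destroy...
zpool export testpool
zpool import -d /tmp testpool    # file-backed pools need -d to be found

# Throw it all away when finished.
zpool destroy testpool
rm /tmp/zdisk0 /tmp/zdisk1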
 
Good point, throAU. I would also add that for the "hands on" type of people this is especially good, because you can read just a bit of the manual, play around, then continue reading, snapshot, play some more, etc.
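
And the snapshot step really is that cheap. On a throwaway pool like the one sketched above it could look like this (snapshot name is just an example):

Code:
# Checkpoint the test pool before an experiment...
zfs snapshot testpool@before-experiment

# ...experiment freely, then roll straight back if it went sideways.
zfs rollback testpool@before-experiment

# See which snapshots exist.
zfs list -t snapshot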
 