
ZFS Panic Galore

Discussion in 'General' started by hackish, May 3, 2012.

  1. kpa

    kpa Well-Known Member

    Messages:
    4,284
    Thanks Received:
    887
    It starts to look like there's something corrupted pretty badly in the pool, I can not provide much more help unfortunately. You could try your luck at the freebsd-fs mailing list.
     
  2. hackish

    hackish New Member

    Messages:
    24
    Thanks Received:
    0
    Thanks for your help. I'll take it up with them. In the meantime I'm dumping the filesystem to a file to test it like that. If it still croaks I'll sign up for FreeBSDCon and take it down there on an external drive.
     
  3. Beeblebrox

    Beeblebrox Member

    Messages:
    930
    Thanks Received:
    118
    1. Backup your pool somewhere
    2. Connect the HDD holding the pool to the amd64 9.0 system you have installed
    # zpool list
    3. Is email listed? Does it say ONLINE or FAULTED?
    4. If it shows ONLINE or FAULTED, run the three commands below: the first gives general pool info, the second shows partition and zpool error info, the third runs the repair (scrub).
    # zpool get all email
    # zpool status -v email
    # zpool scrub email
    5. If you get a kernel panic: the second command above showed the info for the slice. Why was the pool online or visible without an import to begin with? Reboot and (assuming pool email has no sub-datasets):
    # zfs get all email
    # zfs set canmount=noauto email
    # umount -t zfs email
    # zpool export email
    Now you have more control over the pool, since you can mount and unmount it as you like. Before starting recovery, though, do some reading so as not to make any mistakes during the process:
    - Read the section "Repairing ZFS Storage Pool-Wide Damage" in http://docs.oracle.com/cd/E19963-01/html/821-1448/gbbwl.html
    - This is also good: http://docs.oracle.com/cd/E19082-01/817-2271/gavwg/index.html. It suggests running
    # zpool history email
    to see where and how exactly the error messages start showing up. I strongly urge you to read the second link in full before beginning the procedure.

    From post #15 :)
     
  4. hackish

    hackish New Member

    Messages:
    24
    Thanks Received:
    0
    In process:
    # dd if=/dev/ad16s1g > zfsimage.dat
    Hope I'm doing it right...
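
    If I understand dd right, the more explicit form would be something like this; the block size and error-handling options are only my guess for imaging a possibly flaky disk:
    # dd if=/dev/ad16s1g of=zfsimage.dat bs=1m conv=noerror,sync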

    Step 2 is done, albeit a bit complicated since I have two pools named email.
    # zpool list
    Does not show the other pool, but
    # zpool import
    shows
    Code:
      pool: email
        id: 10433152746165646153
     state: ONLINE
    status: The pool was last accessed by another system.
    action: The pool can be imported using its name or numeric identifier and
            the '-f' flag.
       see: http://www.sun.com/msg/ZFS-8000-EY
    config:
    
            email       ONLINE
              ada1s1g   ONLINE
    


    Given that it doesn't find the pool on that disk, step 4 fails.
     
  5. Beeblebrox

    Beeblebrox Member

    Messages:
    930
    Thanks Received:
    118
    Apologies, but at this point, you deserve a "facepalm", buddy!
    How can you expect to mount two ZFS pools with the same name at the same time? Let's rename the pool you want to restore/rescue:
    # zpool import -f -R <folder_name> email <newname>
    folder_name is a directory path under root (/); newname can be anything you like. To keep things simple, you can make both the same, so that the next time you import the pool it mounts automatically under /<newname>.
     
  6. hackish

    hackish New Member

    Messages:
    24
    Thanks Received:
    0
    Please see post #25

    # zpool import -f -R /altroot 10433152746165646153 olddata

    The only reason there are 2 pools named email is that one is the original that I'm trying to get the data off.
     
  7. Beeblebrox

    Beeblebrox Member

    Messages:
    930
    Thanks Received:
    118
    Ah! Missed that one...
    Yeah, I figured that out.

    Post 25 also states that when you import the pool that way, the system crashes. The 1st link I posted has several different methods worth trying in order to get the pool imported. One is:
    # zpool import -f -o readonly=on -R /newname email newname
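
    From there, the rough idea would be to check where the datasets actually got mounted and then copy everything off; the paths below are just placeholders, so use whatever mountpoints zfs list reports:
    # zfs list -r -o name,mountpoint newname
    # cp -Rpv /newname/. /backup/email-copy/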

    Hope you get it sorted out...
     
  8. t1066

    t1066 Member

    Messages:
    186
    Thanks Received:
    53
    Would you try a minimalistic approach? Remove the new disk. Boot your system into single-user mode, preferably using a 9.0-RELEASE CDROM. Run

    # zpool import

    to see if your pool is recognized. If so, run

    # zpool import email

    or run

    # zpool import -f email

    if the above command fails. If either of the above imports works, you should then scrub the pool.
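
    For example (assuming the pool imported under its original name):

    # zpool scrub email
    # zpool status -v email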
     
  9. hackish

    hackish New Member

    Messages:
    24
    Thanks Received:
    0
    t1066:
    I have tried this a number of times.
    As soon as ZFS starts up, the kernel panics. Looking at the backtrace, it seems to happen as soon as the system tries to auto-scrub. I was looking at # zpool scrub -s but I think it will be a race. As soon as the filesystem copy is done I'll take a few more cracks at it. With the filesystem dumped to a file via dd, I think it will be easier to "play" with.
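
    One way to "play" with the image once the dd finishes might be to attach it as a memory disk and let zpool import scan for it; this is only a sketch, and the read-only idea from above would still apply:
    # mdconfig -a -t vnode -u 0 -f zfsimage.dat
    # zpool import -d /dev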
     
  10. Beeblebrox

    Beeblebrox Member

    Messages:
    930
    Thanks Received:
    118
    That's why I suggested the read-only mount (or any other method that will prevent the auto-scrub from running). Then you can hopefully (maybe) copy the data off the pool.
     
  11. hackish

    hackish New Member

    Messages:
    24
    Thanks Received:
    0
    Yes, good point, I didn't think of that. As soon as the dd has completed its 450 GB dump I'll try it out.
     
  12. hackish

    hackish New Member

    Messages:
    24
    Thanks Received:
    0
    Beeblebrox, thanks for your help. Mounting the pool read-only allowed me to read all the data from the volume. 100% of it was recovered and no files were corrupted. I've kept an image of the damaged filesystem so that on my own time I can try to find and fix the bug in the kernel.
     
  13. Beeblebrox

    Beeblebrox Member

    Messages:
    930
    Thanks Received:
    118
    Glad to hear!
    Quick note I reserved for the end: you were using an MBR partition structure on your first HDD; I hope you are using a GPT structure on your second/new HDD.

    ZFS on MBR results in a logical volume inside a logical partition (ada1s1g) and, though I'm not 100% sure on this, such a setup may have contributed to the initial problem. ZFS should play much nicer with an allocated partition named ada1p<n>, which you'll get under GPT. Good luck...
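
    For the record, a clean GPT layout on the new disk would look roughly like this (ada2 and the gpt label are only placeholders for whatever the new disk actually is, and gpart destroy wipes the existing table, so triple-check the device name first):
    # gpart destroy -F ada2
    # gpart create -s gpt ada2
    # gpart add -t freebsd-zfs -l email0 ada2
    # zpool create <poolname> /dev/gpt/email0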