ZFS pool unavailable, disk is there but can't re-attach it

Some kind of bizarre power event broke this ZFS pool. The motherboard blew, so I moved the drives into a similar spare chassis, but the device names have been reassigned and I don't know enough about ZFS to point the pool back at the right drives.

I didn't set this system up, and I find it a bit odd that we have a mix of SATA 1/2 devices as well as both GPT and MBR partition tables. In my opinion, the drives should have been wiped of their previous partition tables and configured as identically as possible.

I have tested the two drives independently, including a SeaTools scan on the same interface, and I can see nothing that should prevent this pool from coming online.

One of the drives, /dev/ada1 (the one that does come fully online), has a corrupted primary partition table; even so, it comes up successfully in the pool, and some files were accessible after a zpool clear followed by the 18-hour first run of the online command shown below. For the time being, I have restarted the online command and will see whether that drive comes fully online given more rebuild time.

Is it normal for the zpool online command to be non-verbose? And does a 2 TB rebuild normally take more than 18 hours?

ZFS appears to have decided the /dev/ada3s1 drive was 'off' and won't let me put it back online. I have tried clearing the pool and then running # zpool online zShare /dev/ada3s1, but this appears to hang (even after waiting 18 hours). Should I be waiting longer?
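To spell out exactly what I have been running (the zpool status check from a second session is just something I can run to watch the pool state while the online command sits there, not something I'm sure helps):

Code:
root@server:/root # zpool clear zShare
root@server:/root # zpool online zShare /dev/ada3s1
root@server:/root # zpool status -v zShare   # from a second session, while the online command sits there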

Code:
root@server:/root # uname -a
FreeBSD server 9.1-RELEASE FreeBSD 9.1-RELEASE #0 r243825: Tue Dec  4 09:23:10 UTC 2012     root@farrell.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  amd64
Code:
root@server:/root # zpool list -v
NAME                    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
zShare                 2.72T   796G  1.94T    28%  1.00x  UNAVAIL  -
  ada1                  928G   362G   566G         -
  8683733800792668130  1.81T   433G  1.39T     16.0E
Code:
root@server:/root # zpool status -xv
  pool: zShare
 state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://illumos.org/msg/ZFS-8000-HC
  scan: none requested
config:

        NAME                   STATE     READ WRITE CKSUM
        zShare                 UNAVAIL      0     0     0
          ada1                 ONLINE       0     0     0
          8683733800792668130  UNAVAIL      0     0     0  was /dev/ada3s1

errors: Permanent errors have been detected in the following files:

        zShare:<0x7386>
Code:
root@server:/root # gpart list
Geom name: ada0
modified: false
state: OK
fwheads: 16
fwsectors: 63
last: 250069646
first: 34
entries: 128
scheme: GPT
Providers:
1. Name: ada0p1
   Mediasize: 65536 (64k)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 17408
   Mode: r0w0e0
   rawuuid: 3040ef81-bebc-11e2-ab2c-d85d4c803337
   rawtype: 83bd6b9d-7f41-11dc-be0b-001560b84f0f
   label: (null)
   length: 65536
   offset: 17408
   type: freebsd-boot
   index: 1
   end: 161
   start: 34
2. Name: ada0p2
   Mediasize: 123480244224 (115G)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 82944
   Mode: r1w1e1
   rawuuid: 30412652-bebc-11e2-ab2c-d85d4c803337
   rawtype: 516e7cb6-6ecf-11d6-8ff8-00022d09712b
   label: (null)
   length: 123480244224
   offset: 82944
   type: freebsd-ufs
   index: 2
   end: 241172513
   start: 162
3. Name: ada0p3
   Mediasize: 4294967296 (4.0G)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 3221242880
   Mode: r1w1e0
   rawuuid: 3041c217-bebc-11e2-ab2c-d85d4c803337
   rawtype: 516e7cb5-6ecf-11d6-8ff8-00022d09712b
   label: (null)
   length: 4294967296
   offset: 123480327168
   type: freebsd-swap
   index: 3
   end: 249561121
   start: 241172514
Consumers:
1. Name: ada0
   Mediasize: 128035676160 (119G)
   Sectorsize: 512
   Mode: r2w2e3

Geom name: ada2
modified: false
state: OK
fwheads: 16
fwsectors: 63
last: 625142447
first: 63
entries: 4
scheme: MBR
Providers:
1. Name: ada2s1
   Mediasize: 263176704 (251M)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 32256
   Mode: r0w0e0
   attrib: active
   rawtype: 131
   length: 263176704
   offset: 32256
   type: linux-data
   index: 1
   end: 514079
   start: 63
2. Name: ada2s2
   Mediasize: 10248698880 (9.6G)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 263208960
   Mode: r0w0e0
   rawtype: 130
   length: 10248698880
   offset: 263208960
   type: linux-swap
   index: 2
   end: 20531069
   start: 514080
3. Name: ada2s3
   Mediasize: 40970119680 (38G)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 1921973248
   Mode: r0w0e0
   rawtype: 131
   length: 40970119680
   offset: 10511907840
   type: linux-data
   index: 3
   end: 100550834
   start: 20531070
4. Name: ada2s4
   Mediasize: 268588293120 (250G)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 4237387264
   Mode: r0w0e0
   rawtype: 131
   length: 268588293120
   offset: 51482027520
   type: linux-data
   index: 4
   end: 625137344
   start: 100550835
Consumers:
1. Name: ada2
   Mediasize: 320072933376 (298G)
   Sectorsize: 512
   Mode: r0w0e0

Geom name: ada3
modified: false
state: OK
fwheads: 16
fwsectors: 63
last: 3907029167
first: 63
entries: 4
scheme: MBR
Providers:
1. Name: ada3s1
   Mediasize: 2000397795328 (1.8T)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r0w0e0
   rawtype: 131
   length: 2000397795328
   offset: 1048576
   type: linux-data
   index: 1
   end: 3907028991
   start: 2048
Consumers:
1. Name: ada3
   Mediasize: 2000398934016 (1.8T)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r0w0e0
 
Sorry to inform you of this, but your pool is probably gone forever: it has no redundancy, in other words it is a striped (RAID0-style) pool. There is no such thing as a rebuild of a non-redundant ZFS pool. On your system, ZFS is simply waiting for the good disk/partition to be attached, and if it never finds the proper disk or partition it will wait forever. The ZFS term for a rebuild is "resilvering", and it applies only to mirror or RAID-Z vdevs. If a vdev of a pool is being resilvered, the zpool status output will show "resilvering" on that vdev.
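To illustrate the difference (the pool and device names below are only examples, not your exact setup): a pool created by simply listing the disks is striped, and redundancy has to be requested explicitly with the mirror or raidz keywords.

Code:
# striped, no redundancy -- effectively what zShare is; losing either device loses data
zpool create example ada1 ada3s1
# mirrored -- only a layout like this can be resilvered after a device goes missing
zpool create example mirror ada1 ada3s1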
 
Damn. I've got about 40% of the files off of it, and the most vital files have been checked and are complete. I honestly felt really lucky when I got a slew of complete files.
 
By the way, /dev/ada3s1 is there and available, but for some reason ZFS doesn't 'see' it entirely.

Issuing the command # zpool online zShare /dev/ada3s1 hangs, but there is disk activity, and after a few hours the /zShare directory became populated and I have been able to copy some files, but not all. In this case ZFS doesn't issue an error; it just hangs.

Frankly, some files, especially the vital files I've already recovered, are better than no files.
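In case it's useful to anyone else, something along these lines is roughly how the readable files can be pulled off while skipping the ones that error out; /backup and the error-log path are just placeholders for wherever there is space.

Code:
# rough salvage loop: /backup and the error log path are placeholders
find /zShare -type f | while read -r f; do
        d="/backup${f#/zShare}"
        mkdir -p "$(dirname "$d")"
        cp -p "$f" "$d" 2>> /root/salvage-errors.log
done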
 
One more question: the drive that ZFS reports as unavailable is shown by gpart like this:

Code:
Geom name: ada3
modified: false
state: OK
fwheads: 16
fwsectors: 63
last: 3907029167
first: 63
entries: 4
scheme: MBR
Providers:
1. Name: ada3s1
   Mediasize: 2000397795328 (1.8T)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r0w0e0
   rawtype: 131
   length: 2000397795328
   offset: 1048576
   type: linux-data
   index: 1
   end: 3907028991
   start: 2048
Consumers:
1. Name: ada3
   Mediasize: 2000398934016 (1.8T)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r0w0e0

The only thing I see is that ada3s1 is showing up as linux-data, which isn't right; before the fault this partition was in the zpool. So if it has reverted to a backup partition table or some such thing, are there any diagnostics I can run to ask ZFS what it doesn't like about the disk now?

Gpart indicates the drive is okay.
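If it matters, the kind of diagnostic I was imagining (assuming zdb is even the right tool for this, which I don't know) would be to dump the ZFS labels on the partition and see whether the guid in them matches the 8683733800792668130 that zpool status shows for the missing vdev:

Code:
root@server:/root # zdb -l /dev/ada3s1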

Any help is greatly appreciated, I'd love to get all the files.
 
Ask on the freebsd-fs mailing list. This kind of recovery is out of scope for these forums unless by luck there happens to be someone here who has done something similar.
 