Solved ZFS recovery: is it possible to skip importing one dataset?

tarkhil

Member

Reaction score: 19
Messages: 67

Hello

I'm working on a broken zpool.

All attempts to import it, including zpool import -F -T 12855264 -R /mnt -f rpool, resulted in:

Dec 4 12:56:54 freebsd kernel: Solaris: WARNING: can't open objset 1035, error 5
Dec 4 12:56:54 freebsd ZFS[10143]: pool I/O failure, zpool=rpool error=97
Dec 4 12:56:54 freebsd ZFS[10147]: pool I/O failure, zpool=rpool error=97

Code:
cannot import 'rpool': I/O error
        Destroy and re-create the pool from
        a backup source.

zdb suggests that one dataset is damaged beyond recovery:

zdb -d -e rpool

Code:
Dataset mos [META], ID 0, cr_txg 4, 1.02G, 304 objects
Dataset rpool/samba [ZPL], ID 100, cr_txg 135431, 276K, 9 objects
Dataset rpool/ROOT/pve-1 [ZPL], ID 515, cr_txg 10, 4.85G, 74685 objects
Dataset rpool/ROOT [ZPL], ID 259, cr_txg 8, 96K, 7 objects
Dataset rpool/secure/vm-103-state-good [ZVOL], ID 661, cr_txg 2925304, 2.94G, 2 objects
Dataset rpool/secure/vm-101-disk-1 [ZVOL], ID 1441, cr_txg 1445964, 188G, 2 objects
Dataset rpool/secure/vm-104-disk-0@good [ZVOL], ID 1285, cr_txg 2925290, 83.1G, 2 objects
Dataset rpool/secure/vm-104-disk-0 [ZVOL], ID 413, cr_txg 222940, 84.7G, 2 objects
Dataset rpool/secure/vm-104-state-good [ZVOL], ID 173, cr_txg 2925283, 438M, 2 objects
Dataset rpool/secure/subvol-105-disk-0@good [ZPL], ID 521, cr_txg 2925300, 1.36G, 36077 objects
failed to hold dataset 'rpool/secure/subvol-105-disk-0': Input/output error
Dataset rpool/secure/vm-101-disk-2 [ZVOL], ID 942, cr_txg 1445966, 114G, 2 objects
Dataset rpool/secure/vm-103-disk-0@good [ZVOL], ID 1027, cr_txg 2925319, 18.1G, 2 objects
Dataset rpool/secure/vm-103-disk-0 [ZVOL], ID 431, cr_txg 916036, 59.6G, 2 objects
Dataset rpool/secure/vm-101-disk-0 [ZVOL], ID 448, cr_txg 1445962, 160K, 2 objects
Dataset rpool/secure/vm-107-disk-0 [ZVOL], ID 285, cr_txg 219462, 16.9G, 2 objects
Dataset rpool/secure/vm-101-disk-3 [ZVOL], ID 654, cr_txg 2329396, 83.1G, 2 objects
Dataset rpool/secure [ZPL], ID 145, cr_txg 90, 200K, 7 objects
Dataset rpool/data/vm-100-state-good [ZVOL], ID 394, cr_txg 1864406, 114M, 2 objects
Dataset rpool/data/vm-106-disk-0@good [ZVOL], ID 1287, cr_txg 3889962, 3.97G, 2 objects
Dataset rpool/data/vm-106-disk-0 [ZVOL], ID 781, cr_txg 1890481, 27.5G, 2 objects
Dataset rpool/data/vm-100-disk-0@good [ZVOL], ID 518, cr_txg 1864412, 88.5M, 2 objects
Dataset rpool/data/vm-100-disk-0 [ZVOL], ID 266, cr_txg 36631, 88.6M, 2 objects
Dataset rpool/data/vm-106-state-good [ZVOL], ID 668, cr_txg 3889956, 358M, 2 objects
Dataset rpool/data [ZPL], ID 387, cr_txg 9, 96K, 6 objects
Dataset rpool [ZPL], ID 54, cr_txg 1, 10.3G, 26 objects
MOS object 403 (bpobj) leaked
MOS object 522 (DSL deadlist map) leaked
MOS object 1032 (zap) leaked
MOS object 1033 (DSL props) leaked
MOS object 1034 (DSL directory child map) leaked
MOS object 1035 (zap) leaked
MOS object 1036 (DSL dataset snap map) leaked
MOS object 1038 (zap) leaked
Verified large_blocks feature refcount of 0 is correct
Verified large_dnode feature refcount of 0 is correct
Verified sha512 feature refcount of 0 is correct
Verified skein feature refcount of 0 is correct
userobj_accounting feature refcount mismatch: 7 consumers != 8 refcount
encryption feature refcount mismatch: 14 consumers != 15 refcount
project_quota feature refcount mismatch: 7 consumers != 8 refcount

Is it possible to just skip the broken dataset and attempt to read the rest?
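
To see at a glance which datasets zdb could not open, the listing can be filtered for its "failed to hold dataset" lines. This is only a minimal sketch that assumes the message format shown in the output above; the helper name failed_datasets is made up for illustration.

```shell
# Print the name of every dataset that zdb reported it could not hold.
# Reads zdb output on stdin; assumes the message format shown in this thread.
failed_datasets() {
    sed -n "s/^failed to hold dataset '\([^']*\)'.*/\1/p"
}

# Usage (needs the pool's disks attached, so it is left commented out):
# zdb -d -e rpool 2>&1 | failed_datasets
```

For the output posted here, this would single out rpool/secure/subvol-105-disk-0 as the only dataset that could not be held.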
 

ralphbsz

Son of Beastie

Reaction score: 2,517
Messages: 3,378

I don't know a way of restricting import to skip datasets; it works at the granularity of a pool. It might be possible using ZBD, by modifying metadata to make the damaged pool look like it doesn't even exist.

But I have a question: You are having lots of IO errors. Is that something that could be fixed? Do you know the cause of the IO errors? If it is something like a loose cable or inadequate power, that could be improved. If it is a disk drive with errors on the platter (actual read errors from the disk), that could be overwritten with zeroes. To diagnose IO errors, start by looking at dmesg or /var/log/messages for the causes of the errors.
 
tarkhil

ZBD: did you mean zdb? man zdb does not mention any such trick as modifying metadata.
Regarding the errors: I don't understand. dd reads the whole disk without any errors at all.
 

ralphbsz


Sorry, I mean zdb, not zbd. Typo.

Errors: In the output you show at the top I see a few error 97 = EINTEGRITY, and at least one 5 = EIO, for pool rpool. If afterwards you were able to read the data with dd, that might mean that the error is intermittent.
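
A small sketch of how the error codes in such log lines could be tallied and named. It covers only the two codes discussed here (5 and 97, as named above for FreeBSD); the function names errno_name and summarize_zfs_errors are hypothetical helpers, not part of any ZFS tool.

```shell
# Map the errno values seen in the log excerpt to their FreeBSD names.
# Only the two codes discussed in this thread are covered.
errno_name() {
    case "$1" in
        5)  echo EIO ;;
        97) echo EINTEGRITY ;;
        *)  echo "unknown($1)" ;;
    esac
}

# Read log text on stdin, pull out every "error N" or "error=N" value,
# and print one "count code name" line per distinct code.
summarize_zfs_errors() {
    sed -n 's/.*error[ =]\([0-9][0-9]*\).*/\1/p' | sort -n | uniq -c |
    while read -r count code; do
        echo "$count $code $(errno_name "$code")"
    done
}

# Usage (log path as mentioned earlier in the thread):
# grep rpool /var/log/messages | summarize_zfs_errors
```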
 

Alain De Vos

Son of Beastie

Reaction score: 870
Messages: 2,827

Boot from a USB stick and mount the pool. Since the pool was last mounted by another OS, it should normally not be imported automatically?
 
tarkhil

Okay. Do I have any way to try reading as much as possible without, well, purchasing a license ($799) for some software that claims to be able to read broken ZFS?
 

Alain De Vos


A zpool that won't import doesn't look good, as there is redundancy in the zpool metadata.
 

ralphbsz


Define "broken". Which really means: Before deciding how to proceed, you have to try to get some root cause analysis.

If the root cause is an IO error, fix that as much as possible, for example by overwriting any sectors that are unreadable (the disk might revector, if it is a disk-internal error), or making a bit-identical copy of the raw disk with the unreadable sectors replaced by a fill pattern. After that, zpool import might get further, if it doesn't have to contend with IO errors.

Otherwise, if the root cause is that on-disk data is readable (without IO errors) but has become corrupted, due to whatever reason, then my choice would be to find someone who has a lot of ZFS internals experience, and can change the metadata by hand to salvage whatever can be salvaged.

I know that commercial software for ZFS recovery exists. I have no idea how good it is, which also implies that I have no idea whether it's worth the $799 you mentioned.
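
The "bit-identical copy with a fill pattern" idea can be sketched with plain dd. This is an illustration under stated assumptions, not a tested recovery procedure: the fill pattern is zero bytes, the function name clone_with_fill is made up, and the device and image paths in the usage line are placeholders.

```shell
# Copy SRC to DST block by block.
# conv=noerror keeps dd going after a read error; conv=sync pads every
# short or failed input block with zero bytes so offsets stay aligned.
clone_with_fill() {
    src=$1
    dst=$2
    dd if="$src" of="$dst" bs=4096 conv=noerror,sync
}

# Usage (placeholder device and target, left commented out):
# clone_with_fill /dev/ada0 /backup/ada0.img
```

Note that a read error anywhere inside a block zero-fills that whole block, so a block size close to the disk's sector size loses the least data, at the cost of a slower copy. The last block is also padded if the source size is not a multiple of the block size.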
 
tarkhil

After a hardware error, the remains of the data are not mountable. I want to mount what is still readable and read it. zdb hints that most of the data is intact.
Can you now tell me something more usable than "find a specialist"?
 

Alain De Vos


Realise it could be a case of FUBAR. What is the output of
Code:
zpool get guid
Or try to find the guid with zdb.
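
One way to find the guid with zdb when the pool is not imported is to read the on-disk labels with zdb -l and pick out the pool_guid field. A hedged sketch: it assumes the label dump prints lines of the form "pool_guid: NUMBER", the helper name pool_guids is made up, and the device name in the usage line is a placeholder.

```shell
# Extract the distinct pool_guid values from "zdb -l" label output on stdin.
# Assumes each label prints a line of the form "    pool_guid: <number>".
pool_guids() {
    sed -n 's/^[[:space:]]*pool_guid:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | sort -u
}

# Usage (placeholder device name, left commented out):
# zdb -l /dev/ada0p3 | pool_guids
```

A disk normally carries several copies of the label, so more than one distinct value in the output would itself be a sign of trouble.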
 

ralphbsz


tarkhil said: "After a hardware error, the remains of the data are not mountable."
Have you dealt with the hardware errors, and made sure that at the block- or sector-level, whatever is readable is readable without errors, and the stuff that was unreadable has had its read errors "fixed"? I think without that, no progress will be made. Once that is done, the missing parts of the block-level metadata will have to be hand-patched, and that will require ZFS internals expertise. No, I don't know how to hire someone who can help, but it should be possible to find the identity of some ZFS developers.

To be blunt: After a hardware error, it is usually time to go to the redundant copy. That might be RAID or a similar technology, or it might be the backup. If your storage hardware was set up so a hardware problem could destroy data, it might be too late to get the data back now.
 

T-Daemon

Daemon

Reaction score: 993
Messages: 1,891

Have a look at the following article, maybe it's of some help:

 

ShelLuser

Son of Beastie

Reaction score: 2,123
Messages: 3,797

Late response, I'm also on mobile so a little limited in the search & paste I can do.

Try importing the pool in read-only mode; it's a separate option which I didn't see up there. This prevents the ZFS processes from writing (which usually happens a lot), and sometimes gives you that little edge.
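
For reference, a read-only import could look roughly like the command printed below. This is a sketch: it combines the readonly property with the -f and -R /mnt flags from the first post, adds -N so that no dataset is mounted automatically, and the helper name readonly_import_cmd is made up. The function only prints the command; nothing is executed.

```shell
# Build and print a read-only import command for the given pool.
# Nothing is run here; review the printed line before executing it.
readonly_import_cmd() {
    pool=$1
    altroot=${2:-/mnt}
    echo "zpool import -o readonly=on -N -f -R $altroot $pool"
}

# Usage:
# readonly_import_cmd rpool
```

With -N, individual datasets can afterwards be mounted one at a time with zfs mount, which leaves room to avoid the damaged one.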
 
tarkhil

Right now, the disk is readable by dd without any errors or slowdowns. RAID... there WAS a RAID, but due to administrative misfortune, the second disk is totally lost.
 
tarkhil

ShelLuser said: "Try importing the pool in readonly mode, it's a separate option which I didn't seem to see up there."
-o readonly -N did not make any difference.

T-Daemon said: "Have a look at the following article, maybe it's of some help:"
Looking at it right now.

Alain De Vos said: "Realise it could be a case of FUBAR."
zdb shows lots of seemingly intact things. Finding the guid is not a problem.
 
tarkhil

T-Daemon said: "Have a look at the following article, maybe it's of some help:"

Ultimately, that helped!

Code:
sysctl vfs.zfs.spa.load_verify_metadata=0
sysctl vfs.zfs.spa.load_verify_data=0

and I've imported and mounted the pool, and can now try to save what's not lost.
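
Put together, the sequence might look like the sketch below. Several things here are assumptions rather than what was actually run: the read-only import flags, the restore of both sysctls to 1 afterwards (believed to be the default), and the helper names run and relaxed_import. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
# With DRY_RUN=1 (the default here) commands are only printed, not run.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Disable import-time verification, import the pool, then restore it.
relaxed_import() {
    pool=$1
    run sysctl vfs.zfs.spa.load_verify_metadata=0
    run sysctl vfs.zfs.spa.load_verify_data=0
    # Read-only import is an assumption, following the earlier suggestion.
    run zpool import -o readonly=on -f -R /mnt "$pool"
    # Restore verification for later imports (1 is assumed to be the default).
    run sysctl vfs.zfs.spa.load_verify_metadata=1
    run sysctl vfs.zfs.spa.load_verify_data=1
}

# Usage:
# relaxed_import rpool
```

Skipping verification does not repair anything, so the sensible next step is to copy the readable data off the pool right away.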

ZFS holds like Red Army in Stalingrad.
 