ZFS Help on zpool missing suddenly

Hi All:
I hit a serious issue importing a zpool after the hardware crashed one day (I eventually replaced the RAID card). In my scenario I do not use multiple HDs to form a raidz; I use a physical RAID card to form a hardware RAID-5 instead, so the pool was created on a single partition with zpool create pool_name /dev/da0p4. The symptoms are:

Code:
zpool import pool_name
zpool import -F pool_name

Both cause a kernel panic and an automatic reboot.

Code:
# zdb
cannot open '/boot/zfs/zpool.cache': No such file or directory

However, I can still perform a scan on this pool:

Code:
zdb -AAA -F -X -cc -L -e -mm pool_name

...
...
...
(many lines of output follow)

Traversing all blocks to verify checksums ...

3.54M completed (   3MB/s) estimated time remaining: 68hr 13min 42sec
173G completed (   6MB/s) estimated time remaining: 27hr 58min 58sec
579G completed (   5MB/s) estimated time remaining: 13hr 59min 19sec
836G completed (   4MB/s) estimated time remaining: 0hr 00min 00sec        block traversal size 897927784960 != alloc 897922660864 (unreachable -5124096)

        bp count:         6873514
        ganged count:           0
        bp logical:    899358429696      avg: 130844
        bp physical:   897602511872      avg: 130588 compression:   1.00
        bp allocated:  897927784960      avg: 130635 compression:   1.00
        bp deduped:             0    ref>1:      0 deduplication:   1.00
        SPA allocated: 897922660864     used:  6.70%

        additional, non-pointer bps of type 0:        909
        Dittoed blocks on same vdev: 23265

Is there any way I could remount/import the pool (even read-only) to get the data back?
 
OK, I searched this forum and found this:

Code:
zpool import -o readonly=on poolname

which thankfully worked for me.
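For anyone in the same spot, a minimal recovery sketch might look like the following. The pool name, paths, and backup host here are assumptions, not from the thread, and zfs send is just one of several ways to copy the data off before rebuilding:

```shell
# Import the damaged pool read-only so ZFS never writes to it
zpool import -o readonly=on pool_name

# Option 1: copy the files off at the filesystem level
rsync -a /pool_name/ /backup/pool_name/

# Option 2: stream whole datasets to another machine
# (backuphost and tank/rescue are hypothetical names)
zfs snapshot -r pool_name@rescue
zfs send -R pool_name@rescue | ssh backuphost zfs receive -F tank/rescue
```

Either way, the point is to get a full copy out of the read-only pool before destroying and recreating it.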
 
Is the problem fixed? Can you report what caused it, so we can learn from it?
... I do not use multiple HDs to form raidz, I use a physical RAID card to form a hardware RAID-5 instead, ...
That is a bad idea. ZFS has a RAID system built in that is far better than layering it on a hardware or outboard RAID. There are multiple reasons; let me name the biggest ones.

First, ZFS by design doesn't have the "RAID write hole", a.k.a. the "small update penalty", but it does like to make small writes. So performance will be better using its own write mechanism (which is safe and efficient) instead of a hardware mechanism (which is also safe, but can't be efficient, since it doesn't know that ZFS's small writes are non-overwriting).

Second: if a drive fails, ZFS only has to resilver, i.e. reconstruct, data that is actually allocated, while a hardware RAID has to rebuild all the space in the array. That makes reconstruction slower for hardware RAID than for ZFS (which is weird, isn't it?). And as we have known since the original RAID paper (Patterson, Gibson, and Katz, 1988), the MTTR (mean time to repair) enters linearly into the durability calculation, so faster reconstruction makes RAID more reliable. There are lots of other reasons too.
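To illustrate the recommended alternative: with the controller in JBOD/HBA mode, the equivalent single-parity redundancy is a raidz vdev built directly from the disks. This is only a sketch; the pool name and device names are assumptions:

```shell
# Single-parity raidz across three whole disks (analogous to RAID-5,
# but without the write hole); da0..da2 are example device names
zpool create pool_name raidz da0 da1 da2

# Verify the layout and health of the new pool
zpool status pool_name
```

This way ZFS sees the individual disks, so it can checksum, self-heal, and resilver only allocated blocks.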

...will cause kernel panic and reboot automatically too.
In general, if you want us to help debug things, just saying "kernel panic" isn't useful. You have to give us detailed error messages. If they aren't stored in /var/log/messages, posting a cell phone picture of the console is a good starting point.
 
Yes, the problem is fixed, though not perfectly (the zpool still needs to be rebuilt), but at least I could get back all the data within the pool.

... I do not use multiple HDs to form raidz, I use a physical RAID card to form a hardware RAID-5 instead, ...

Thanks for your explanation. I understand the advantages ZFS raidz brings, but how do you say NO when your client proposes and insists on such a design?

In general, if you want us to help debug things, just saying "kernel panic" isn't useful. You have to give us detailed error messages. If they aren't stored in /var/log/messages, posting a cell phone picture of the console is a good starting point.

It just panics and reboots, and no, there is no message at all. I ran dmesg and checked /var/log/messages; after the reboot it just showed the normal boot messages.
 
Thanks for your explanation. I understand the advantages ZFS raidz brings, but how do you say NO when your client proposes and insists on such a design?
In that case ... you go drink. I'll come along and buy you a drink. Sorry to have brought up this topic.

It just panics and reboots, and no, there is no message at all. I ran dmesg and checked /var/log/messages; after the reboot it just showed the normal boot messages.
Normally, a kernel panic will print a message and a stack trace on the console. I don't think those are stored in the log, because the kernel is already too far gone to write to a file system. I know one can set up the system so that kernel panics cause a kernel dump, which on the next boot will be automatically read and the error messages copied ... somewhere, I forget where.
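On FreeBSD the setup is roughly this; take it as a sketch, since the dump device and paths depend on the system:

```shell
# Tell the kernel to dump memory to the swap device on panic
# ("AUTO" picks the configured swap device)
sysrc dumpdev="AUTO"

# After the next panic and reboot, savecore(8) copies the dump
# into /var/crash; crashinfo(8) can then extract a backtrace
ls /var/crash
crashinfo -d /var/crash
```

With that in place, the panic message and stack trace survive the reboot and can be posted here.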

But since the problem is solved, not a problem right now.
 