ZFS pool import Kernel Panic - blkptr at 0x8068e69c0 has invalid CHECKSUM 0

Jimlad

Member

Thanks: 1
Messages: 24

#1
Hi all,

I'm hoping someone can help or at least point me in the right direction. That direction is probably Mr Matt Ahrens himself, but failing that, any help is very much welcome.

I've just built a new server (FreeBSD 11.1-RELEASE) on a Supermicro X11SDV-12C-TLN2F with an Intel D-2166NT. I imported my pool successfully from another FreeBSD machine and everything was fine for about a month. Now, out of the blue and with no disk or scrub issues, the machine is stuck in a reboot loop with this kernel panic: blkptr at 0x8068e69c0 has invalid CHECKSUM 0

I believe this is corruption, possibly of metadata, but I'm not sure how to start resolving it. I'm unable to use either zfs or zpool commands without a kernel panic, though zdb does work.

This is a 6-drive RAID-Z2 and, apart from a few disk replacements over the years, I've had no issues. The disks are on the same LSI 2008 HBA in IT mode as they were in the old machine. I tried importing on an 11.2-RC machine with the same result.

Thanks.
 

ShelLuser

Son of Beastie

Thanks: 1,490
Messages: 3,262

#2
I've had this happen to me once, and the only solution I could find was to make a backup, rebuild the pool and try again.

In my situation the problems started as soon as the system tried to write to the pools, but reading was no problem. So a read-only import is what I would suggest trying.

Boot from rescue media (a rescue CD or such) and then try these commands:

# zpool import. This should both load the ZFS drivers and eventually show you the available ZFS pools. I prefer doing it this way, keeping the loading process separate from the more serious stuff.

Then: # zpool import -fNR /mnt -o readonly=on zroot (I'm assuming your pool is called zroot).

If this doesn't give you any errors, try checking the pool condition with zpool status -v and your filesystems with zfs list.

At this point my recommendation would be to gain access to a remote host and start sending it ZFS backups. You can enable networking by using the /etc/netstart script, just start it as root. If you're not using DHCP you'll have to assign the IP address and routing manually (using ifconfig and route).

Then, for example: # zfs send zroot | ssh backup@backuphost "dd of=/opt/backups/zroot-1206.zfs".

I hope this can help out a bit.
 
OP
Jimlad

Member

Thanks: 1
Messages: 24

#3
Thanks ShelLuser. The issue I have is that any zpool/zfs command panics the system. I've booted into single-user safe mode and tried importing readonly, but it just panics. zdb works, but throws up the error about CHECKSUM 0. It's very strange how it just went bang. I was reading data from the pool and the next minute it was inaccessible. Connecting to the machine showed it stuck in a reboot panic loop.
 
OP
Jimlad

Member

Thanks: 1
Messages: 24

#4
Also, just to confirm: the pool with the corruption is not the zroot. It is a separate pool just for data. So the machine boots off a separate zroot disk and then panics during boot when it tries to import/mount the secondary pool.
 
OP
Jimlad

Member

Thanks: 1
Messages: 24

#5
Does anyone know where's best to get an answer to this? Basically I need to work out how to do low level zfs recovery of data.
 

t1066

Active Member

Thanks: 84
Messages: 226

#6
It is better to ask on the fs mailing list.
You may also try using Linux or OpenIndiana to import the pool.
 

Oko

Daemon

Thanks: 767
Messages: 1,620

#7
I am super interested in this thread.

Jimlad, could you please post the machine specifications, including the amount of RAM and whether you are using ECC RAM? Also please post dmesg. Please do the same for the original machine where the ZFS pool was created. I am guessing the original machine was trashed. If its dmesg is no longer available, please at least post the OS version of the original machine where the pool was created.

Do you know how to use DTrace?
 
OP
Jimlad

Member

Thanks: 1
Messages: 24

#8
Hey Oko

Thanks for the interest; this has hit me pretty hard. I was hoping just to mount read-only, recreate the pool somewhere and copy the data. As you can see, it isn't that simple. The constant kernel panic on any zfs/zpool command touching the pool in question is very worrying. I would have thought any data or metadata corruption would have little to no impact on the OS, especially as this is not the zroot, which is on separate disks.

The original pool was created on a FreeNAS install many moons ago on the original hardware, which was then reinstalled with FreeBSD 10.2 (upgraded up to 11.1), and the pool imported with no issues. Recently, with the new hardware, I virtualised FreeBSD 11.2 on ESXi 6.7 and passed through the LSI HBA (IT mode). This was running fine.

Current machine
Supermicro X11SDV-12C-TLN2F supermicro.com.tw
64GB DDR4-2133MHz Registered ECC
LSI 9211-8i IT Mode
6 3TB Disk RAID-Z2
ESX 6.7 VT-d

Virtual Machine
8vCPU
PCI passthrough LSI 9211-8i RAID-Z2 pool
FreeBSD 11.2
10GB RAM
ZROOT 80GB vDisk


Original machine
Supermicro A1SAi-2750F C2750 8 Core
64GB DDR3 1600MHz Registered ECC
LSI 9211-8i IT Mode
6 3TB Disk RAID-Z2
FreeBSD 11.1 on bare metal


The old system wasn't trashed. If I try to import into the old FreeBSD 11.1 install on bare metal (not a VM, but on the same new hardware) I get the same kernel panic.

I've also installed 10.4 on a new disk and tried to import the pool, with the same result, but this time I created a 20G swap and have the crash dump. Is the text output from the crash dump of any use?

I know of DTrace but have never had cause to use it. I'm very willing to learn, though, as this seems to be a real issue/bug.

Regards
James
 
OP
Jimlad

Member

Thanks: 1
Messages: 24

#9
So I can use zdb on the pool when it is exported.

zdb -eC TANK01
MOS Configuration:
        version: 5000
        name: 'TANK01'
        state: 0
        txg: 23938109
        pool_guid: 1562850898885502972
        hostid: 2818149293
        hostname: ''
        com.delphix:has_per_vdev_zaps
        vdev_children: 1
        vdev_tree:
            type: 'root'
            id: 0
            guid: 1562850898885502972
            children[0]:
                type: 'raidz'
                id: 0
                guid: 4805578129041917548
                nparity: 2
                metaslab_array: 35
                metaslab_shift: 37
                ashift: 12
                asize: 17990643351552
                is_log: 0
                create_txg: 4
                com.delphix:vdev_zap_top: 84
                children[0]:
                    type: 'disk'
                    id: 0
                    guid: 1205004479483820545
                    path: '/dev/diskid/DISK-WD-WMC4N1121870p2'
                    whole_disk: 1
                    DTL: 727
                    create_txg: 4
                    com.delphix:vdev_zap_leaf: 85
                children[1]:
                    type: 'disk'
                    id: 1
                    guid: 14646932765524974895
                    path: '/dev/diskid/DISK-858AR6XGSp2'
                    whole_disk: 1
                    DTL: 1092
                    create_txg: 4
                    com.delphix:vdev_zap_leaf: 91
                children[2]:
                    type: 'disk'
                    id: 2
                    guid: 7082033958519811740
                    path: '/dev/da2p2'
                    whole_disk: 1
                    DTL: 726
                    create_txg: 4
                    com.delphix:vdev_zap_leaf: 92
                children[3]:
                    type: 'disk'
                    id: 3
                    guid: 10759706636802759186
                    path: '/dev/diskid/DISK-WD-WCC070269158p2'
                    whole_disk: 1
                    DTL: 725
                    create_txg: 4
                    com.delphix:vdev_zap_leaf: 98
                children[4]:
                    type: 'disk'
                    id: 4
                    guid: 11465636190214487958
                    path: '/dev/diskid/DISK-WD-WCAWZ2272203p2'
                    whole_disk: 1
                    DTL: 719
                    create_txg: 4
                    com.delphix:vdev_zap_leaf: 102
                children[5]:
                    type: 'disk'
                    id: 5
                    guid: 10880955587773408470
                    path: '/dev/diskid/DISK-WD-WCAWZ2199169p2'
                    whole_disk: 1
                    DTL: 718
                    create_txg: 4
                    com.delphix:vdev_zap_leaf: 107
        features_for_read:
            com.delphix:hole_birth

space map refcount mismatch: expected 231 != actual 136
 
OP
Jimlad

Member

Thanks: 1
Messages: 24

#10
zdb -ed TANK01
Dataset mos [META], ID 0, cr_txg 4, 978M, 554 objects
Dataset TANK01/D* [ZPL], ID 42, cr_txg 17, 37.0G, 2991 objects
Dataset TANK01/S* [ZPL], ID 310, cr_txg 1368588, 126G, 20031 objects
Dataset TANK01/M* [ZPL], ID 70, cr_txg 102221, 5.01T, 27408 objects
Dataset TANK01/T* [ZPL], ID 74, cr_txg 633, 4.32T, 21444 objects
Dataset TANK01/M* [ZPL], ID 59, cr_txg 85541, 328G, 121641 objects
Dataset TANK01/P* [ZPL], ID 1121, cr_txg 11603134, 7.60G, 1635 objects
Dataset TANK01/C* [ZPL], ID 1236, cr_txg 11792609, 86.1M, 1232 objects
Dataset TANK01/S* [ZPL], ID 967, cr_txg 7016527, 8.09M, 17 objects
Dataset TANK01/Archive/T* [ZPL], ID 148, cr_txg 14717201, 990M, 34 objects
Dataset TANK01/A* [ZPL], ID 127, cr_txg 14717197, 304K, 8 objects
Dataset TANK01/iocage/download/11.0-RELEASE [ZPL], ID 140, cr_txg 19642422, 112M, 11 objects
Dataset TANK01/iocage/download/11.1-RELEASE [ZPL], ID 194, cr_txg 23848417, 260M, 12 objects
Dataset TANK01/iocage/download [ZPL], ID 54, cr_txg 19634294, 320K, 9 objects
Dataset TANK01/iocage/images [ZPL], ID 78, cr_txg 19634296, 288K, 7 objects
Dataset TANK01/iocage/releases/11.1-RELEASE/root@BRO-M* [ZPL], ID 306, cr_txg 23861009, 438M, 15694 objects
Dataset TANK01/iocage/releases/11.1-RELEASE/root@BRO-G* [ZPL], ID 276, cr_txg 23858936, 438M, 15694 objects
Dataset TANK01/iocage/releases/11.1-RELEASE/root@BRO-H* [ZPL], ID 355, cr_txg 23861124, 438M, 15694 objects
Dataset TANK01/iocage/releases/11.1-RELEASE/root [ZPL], ID 226, cr_txg 23848423, 1.50G, 95079 objects
Dataset TANK01/iocage/releases/11.1-RELEASE [ZPL], ID 218, cr_txg 23848422, 304K, 8 objects
Dataset TANK01/iocage/releases/11.0-RELEASE/root@BRO-P* [ZPL], ID 197, cr_txg 19642694, 455M, 16393 objects
Dataset TANK01/iocage/releases/11.0-RELEASE/root@BRO-S* [ZPL], ID 343, cr_txg 21975038, 455M, 16393 objects
Dataset TANK01/iocage/releases/11.0-RELEASE/root [ZPL], ID 171, cr_txg 19642428, 455M, 16393 objects
Dataset TANK01/iocage/releases/11.0-RELEASE [ZPL], ID 161, cr_txg 19642427, 304K, 8 objects
Dataset TANK01/iocage/releases [ZPL], ID 119, cr_txg 19634302, 320K, 9 objects
Dataset TANK01/iocage/log [ZPL], ID 108, cr_txg 19634300, 408K, 18 objects
Dataset TANK01/iocage/templates [ZPL], ID 132, cr_txg 19634304, 288K, 7 objects
Dataset TANK01/iocage/jails/BRO-H*-e396-418f-b8ed-37c67f7f1152/root [ZPL], ID 413, cr_txg 23861126, 682M, 27762 objects
Dataset TANK01/iocage/jails/BRO-H*-e396-418f-b8ed-37c67f7f1152 [ZPL], ID 368, cr_txg 23861125, 296K, 10 objects
Dataset TANK01/iocage/jails/BRO-P*/root [ZPL], ID 325, cr_txg 23851596, 30.4G, 1857635 objects
Dataset TANK01/iocage/jails/BRO-P* [ZPL], ID 316, cr_txg 23851595, 320K, 10 objects
Dataset TANK01/iocage/jails/BRO-M*-2d29-4083-bad5-5cad29cef825/root [ZPL], ID 346, cr_txg 23861011, 26.8G, 24531 objects
Dataset TANK01/iocage/jails/BRO-M*-2d29-4083-bad5-5cad29cef825 [ZPL], ID 335, cr_txg 23861010, 296K, 10 objects
Dataset TANK01/iocage/jails/BRO-S*/root [ZPL], ID 403, cr_txg 23851612, 1.07G, 37775 objects
Dataset TANK01/iocage/jails/BRO-S* [ZPL], ID 377, cr_txg 23851611, 320K, 10 objects
Dataset TANK01/iocage/jails/BRO-G*-cec5-4b2f-a55d-26933004f883/root [ZPL], ID 295, cr_txg 23858938, 19.0G, 432896 objects
Dataset TANK01/iocage/jails/BRO-G*-cec5-4b2f-a55d-26933004f883 [ZPL], ID 285, cr_txg 23858937, 296K, 10 objects
Dataset TANK01/iocage/jails [ZPL], ID 86, cr_txg 19634298, 368K, 12 objects
Dataset TANK01/iocage [ZPL], ID 48, cr_txg 19634292, 408K, 18 objects
Dataset TANK01/B* [ZPL], ID 263, cr_txg 356302, 251G, 27174 objects
Dataset TANK01/V* [ZPL], ID 271, cr_txg 360088, 148G, 4414 objects
Dataset TANK01/BT* [ZPL], ID 360, cr_txg 2080318, 35.6G, 2329 objects
Dataset TANK01 [ZPL], ID 21, cr_txg 1, 911K, 19 objects
Verified large_blocks feature refcount of 0 is correct
Verified sha512 feature refcount of 0 is correct
Verified skein feature refcount of 0 is correct
Verified device_removal feature refcount of 0 is correct
Verified indirect_refcount feature refcount of 0 is correct
space map refcount mismatch: expected 231 != actual 136
 
OP
Jimlad

Member

Thanks: 1
Messages: 24

#11
zdb -e TANK01 | grep "txg = "
txg = 24032147
checkpoint_txg = 0
Assertion failed: dn->dn_bonuslen <= ((1 << 9) - 64 - (1 << 7)) (0x2000 <= 0x140), file /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c, line 273.
Abort (core dumped)
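
A note on the numbers in that assertion: the limit it checks against, (1 << 9) - 64 - (1 << 7), works out to 320 bytes (0x140), while the dn_bonuslen that zdb read is 0x2000, i.e. 8192 bytes. The value is far over the limit, which presumably means the dnode contents read from disk are garbage rather than slightly off. A minimal shell-arithmetic sketch of the comparison follows; the variable names limit and bonuslen are only illustrative and are not taken from the ZFS source.

```shell
# Limit as written in the assertion text: (1 << 9) - 64 - (1 << 7)
limit=$(( (1 << 9) - 64 - (1 << 7) ))

# dn_bonuslen value reported in the same assertion message
bonuslen=$(( 0x2000 ))

printf 'limit=%d (0x%x) bonuslen=%d (0x%x)\n' "$limit" "$limit" "$bonuslen" "$bonuslen"

# Same comparison the assertion makes: bonuslen <= limit
if [ "$bonuslen" -le "$limit" ]; then
    echo "within limit"
else
    echo "exceeds limit"
fi
```

This prints limit=320 (0x140) bonuslen=8192 (0x2000) followed by "exceeds limit", matching the (0x2000 <= 0x140) shown in the failure.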
 
OP
Jimlad

Member

Thanks: 1
Messages: 24

#12
zdb -ul /dev/da1p2 | grep 'txg =' | sort | uniq | tail -20
txg = 24032128
txg = 24032129
txg = 24032130
txg = 24032131
txg = 24032132
txg = 24032133
txg = 24032134
txg = 24032135
txg = 24032136
txg = 24032137
txg = 24032138
txg = 24032139
txg = 24032140
txg = 24032141
txg = 24032142
txg = 24032143
txg = 24032144
txg = 24032145
txg = 24032146
txg = 24032147


root@test:/var/crash # zdb -ul /dev/da2p2 | grep 'txg =' | sort | uniq | tail -20
txg = 24032128
txg = 24032129
txg = 24032130
txg = 24032131
txg = 24032132
txg = 24032133
txg = 24032134
txg = 24032135
txg = 24032136
txg = 24032137
txg = 24032138
txg = 24032139
txg = 24032140
txg = 24032141
txg = 24032142
txg = 24032143
txg = 24032144
txg = 24032145
txg = 24032146
txg = 24032147

root@test:/var/crash # zdb -ul /dev/da3p2 | grep 'txg =' | sort | uniq | tail -20
txg = 24032128
txg = 24032129
txg = 24032130
txg = 24032131
txg = 24032132
txg = 24032133
txg = 24032134
txg = 24032135
txg = 24032136
txg = 24032137
txg = 24032138
txg = 24032139
txg = 24032140
txg = 24032141
txg = 24032142
txg = 24032143
txg = 24032144
txg = 24032145
txg = 24032146
txg = 24032147

root@test:/var/crash # zdb -ul /dev/da4p2 | grep 'txg =' | sort | uniq | tail -20
txg = 24032128
txg = 24032129
txg = 24032130
txg = 24032131
txg = 24032132
txg = 24032133
txg = 24032134
txg = 24032135
txg = 24032136
txg = 24032137
txg = 24032138
txg = 24032139
txg = 24032140
txg = 24032141
txg = 24032142
txg = 24032143
txg = 24032144
txg = 24032145
txg = 24032146
txg = 24032147


root@test:/var/crash # zdb -ul /dev/da5p2 | grep 'txg =' | sort | uniq | tail -20
txg = 24032128
txg = 24032129
txg = 24032130
txg = 24032131
txg = 24032132
txg = 24032133
txg = 24032134
txg = 24032135
txg = 24032136
txg = 24032137
txg = 24032138
txg = 24032139
txg = 24032140
txg = 24032141
txg = 24032142
txg = 24032143
txg = 24032144
txg = 24032145
txg = 24032146
txg = 24032147


root@test:/var/crash # zdb -ul /dev/da0p2 | grep 'txg =' | sort | uniq | tail -20
txg = 24032128
txg = 24032129
txg = 24032130
txg = 24032131
txg = 24032132
txg = 24032133
txg = 24032134
txg = 24032135
txg = 24032136
txg = 24032137
txg = 24032138
txg = 24032139
txg = 24032140
txg = 24032141
txg = 24032142
txg = 24032143
txg = 24032144
txg = 24032145
txg = 24032146
txg = 24032147
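
All six labels above end at the same newest uberblock, txg 24032147. A quicker way to make that comparison is to print only the highest txg per device. Below is a rough sketch, assuming the "txg = N" line format shown in the output above. The function name newest_txg is made up for illustration, and the sample input is a few lines copied from this thread (including the checkpoint_txg line from #11, to show that it is ignored).

```shell
# newest_txg: read `zdb -ul`-style text on stdin and print the highest
# value found on lines of the form "txg = N". Lines such as
# "checkpoint_txg = 0" are skipped because their first field is not "txg".
newest_txg() {
    awk 'BEGIN { max = 0 }
         $1 == "txg" && $2 == "=" && ($3 + 0) > max { max = $3 + 0 }
         END { printf "%d\n", max }'
}

# Sample input copied from the thread, standing in for real zdb output.
sample='txg = 24032145
txg = 24032146
txg = 24032147
checkpoint_txg = 0'

printf '%s\n' "$sample" | newest_txg

# On the affected machine the input would presumably come from zdb itself,
# for example (not run here):
#   for d in da0 da1 da2 da3 da4 da5; do
#       printf '%s: ' "$d"; zdb -ul "/dev/${d}p2" | newest_txg
#   done
```

If one disk reported a lower number than the others, that would point at a label that missed the last writes; here they all agree.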
 

VladiBG

Well-Known Member

Thanks: 136
Messages: 359

#13
Make a full backup first, then try to delete the last transaction group(s) that have the invalid checksum. Here's a similar problem:
https://github.com/zfsonlinux/zfs/issues/6414
https://gist.github.com/jshoward/5685757


We've used the (truly terrifying) zfs_revert script (from https://gist.github.com/jshoward/5685757). Removing 1 transaction was not sufficient -- still got the panic -- but after removing five transactions, we can mount again!

The filesystem seems to be functional at that point -- we're able to create 100,000 files and write a GB of data -- and we've brought Lustre back up.

We still have the original dd copies of the raw corrupted disks, so we can still try to learn what's happened here and how to "properly" recover.
 
OP
Jimlad

Member

Thanks: 1
Messages: 24

#14
Hi VladiBG

How would you create a full backup when you cannot import/mount the pool? Does zdb have access to the file system when the pool isn't imported?
 
OP
Jimlad

Member

Thanks: 1
Messages: 24

#17
Has this been raised as a bug? Surely FreeBSD shouldn't panic due to corruption in a pool, especially when it's not even the root file system.
 
OP
Jimlad

Member

Thanks: 1
Messages: 24

#18
Can anyone suggest any further data recovery techniques? I cannot believe my 12TB of data is gone due to metadata corruption. zdb shows the datasets, but FreeBSD still kernel panics on import. I've tried readonly and using -T to import at an older txg, but I assume I was too late. Are there any tools that can just pull raw file data from the zpool?
 

VladiBG

Well-Known Member

Thanks: 136
Messages: 359

#19
At boot, escape to the loader prompt and set the debug options:

set vfs.zfs.debug=1
set vfs.zfs.recover=1
set debug.bootverbose=1

then boot in single user with

boot -s

and see if you can import the pool

zpool import -o readonly=on -R /mnt pool
 

ShelLuser

Son of Beastie

Thanks: 1,490
Messages: 3,262

#20
I don't really have any ideas, although I am a little biased regarding the origin of the pool. You said it was made using FreeNAS? That is a FreeBSD derivative (somewhat off-topic here) but built upon FreeBSD CURRENT. In other words: a 'bleeding edge' developer snapshot which doesn't provide any guarantees of stability, let alone of it even working.

Obviously that doesn't help you, but it made me wonder: when you say you're trying to recover this mess, what environment are you using? Are you trying to boot your server or are you using other media? And how stable is that server exactly?

My point: have you also tried using a rescue CD with FreeBSD 11.2 for example? If not I'd start there. So don't rely on your own server environment but a vanilla FreeBSD. And of course using the command I mentioned above: readonly and without automatically mounting any filesystems.

And if that fails: ever considered trying this using a FreeNAS boot environment? It might just work...

Also: did you ever run # zpool upgrade on that thing?
 

_martin

Aspiring Daemon

Thanks: 142
Messages: 728

#21
I'd recommend submitting a PR. This way you have a chance of reaching the developers. You could also try asking on the freebsd-fs mailing list.

I'm guessing you did it, but just in case: If it's data you really care about I suggest you do a full dd of those drives so you have a backup of the current state of the disks.
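
As a rough illustration of the dd suggestion, here is a sketch that only prints one dd command per disk, so the plan can be reviewed before anything is actually run. The device names da0 to da5 are the ones that appear earlier in the thread; the /backup destination and the plan_images helper are hypothetical, and bs=1m with conv=noerror,sync is just one common choice for imaging a disk that may have read errors, not something the thread prescribes.

```shell
# plan_images DEST DEV...: print (do not run) one dd command per device.
# DEST is a hypothetical directory with enough free space for every image.
plan_images() {
    dest=$1
    shift
    for dev in "$@"; do
        # conv=noerror,sync: continue past read errors and pad short reads
        # so the image keeps the same offsets as the source disk.
        printf 'dd if=/dev/%s of=%s/%s.img bs=1m conv=noerror,sync\n' \
            "$dev" "$dest" "$dev"
    done
}

# Print the plan for the six pool members seen in the zdb output.
plan_images /backup da0 da1 da2 da3 da4 da5
```

Printing first is a deliberate choice: swapping if= and of= on a real disk would be destructive, so it seems safer to read the commands over before pasting them into a root shell.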
 
OP
Jimlad

Member

Thanks: 1
Messages: 24

#22
Thanks for the response. Something is really messed up: even in single-user mode with those loader options set, it still kernel panics. I'm using an LSI HBA, so I'm going to move the disks to another system with enough SATA ports to see if that makes a difference. It's just so crazy: I've had this zpool for four years or so and never had an issue... I guess that's what happens.
 

_martin

Aspiring Daemon

Thanks: 142
Messages: 728

#23
Jimlad, it panics because this assertion fails (/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c, line 273). It doesn't matter whether you are in single-user mode or not. I suggest you open that PR to reach the developers. It may be that ZFS is (was) just a victim of whatever happened when your machine crashed for the first time. It would be good to have the logs from that time too.
 