Is there any way to put ZFS into a 'safe mode' where it doesn't attempt to modify any state, so you can read your data off of it? I am stuck in a kernel panic reboot loop. When I boot into single-user mode, the first zfs command I run starts up ZFS and it panics again.
The long story:
I had two 2TB drives in a mirror configuration for the past 7 years (upgrading ZFS and FreeBSD as time went by). It was finally time to upgrade because I was running out of space. I added two 4TB drives as a second vdev mirror:
zpool add storage mirror /dev/ada3 /dev/ada4
Afterwards, zpool status looked like this:
Code:
  pool: storage
 state: ONLINE
status: One or more devices are configured to use a non-native block size.
        Expect reduced performance.
action: Replace affected devices with devices that support the
        configured block size, or migrate data to a properly configured
        pool.
  scan: scrub repaired 0 in 0 days 08:39:28 with 0 errors on Sat May 9 01:19:54 2020
config:
        NAME          STATE     READ WRITE CKSUM
        storage       ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            ada1      ONLINE       0     0     0  block size: 512B configured, 4096B native
            ada2      ONLINE       0     0     0  block size: 512B configured, 4096B native
          mirror-1    ONLINE       0     0     0
            ada3      ONLINE       0     0     0
            ada4      ONLINE       0     0     0
errors: No known data errors
I should have stopped there, but I thought I would 'fix' the block size warning. zdb showed mirror-0 as ashift 9 and mirror-1 as ashift 12. Next I issued:
zpool remove storage mirror-0
I thought this would migrate the data off the misconfigured mirror-0 to mirror-1, so that I could then recreate mirror-0. However, the system panicked immediately. Now I can't find a way to safely mount the filesystem to get the data off. The panic is in some code that verifies that the ashift makes sense.
Now I just want to get it into a read-only 'safe mode' where I can copy the data off without any modifications to the current zpool state.
I tried powering off the new drives. The system then does not panic, but I cannot access the data on the old drives. Is there any sysctl setting that will help me out here? I also tried zfs remove -s storage as the first command in single-user mode. I don't know whether that did anything, because it panicked again.
This system is running 'FreeBSD 12.1-RELEASE-p5 GENERIC amd64'. Here is part of the /var/crash/core.txt.X:
Code:
panic: solaris assert: ((offset) & ((1ULL << vd->vdev_ashift) - 1)) == 0 (0x400 == 0x0), file: /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c, line: 3593
cpuid = 1
time = 1590769420
KDB: stack backtrace:
#0 0xffffffff80c1d307 at kdb_backtrace+0x67
#1 0xffffffff80bd063d at vpanic+0x19d
#2 0xffffffff80bd0493 at panic+0x43
#3 0xffffffff82a6922c at assfail3+0x2c
#4 0xffffffff828a3b83 at metaslab_free_concrete+0x103
#5 0xffffffff828a4dd8 at metaslab_free+0x128
#6 0xffffffff8290217c at zio_dva_free+0x1c
#7 0xffffffff828feb7c at zio_execute+0xac
#8 0xffffffff80c2fae4 at taskqueue_run_locked+0x154
#9 0xffffffff80c30e18 at taskqueue_thread_loop+0x98
#10 0xffffffff80b90c53 at fork_exit+0x83
#11 0xffffffff81082c2e at fork_trampoline+0xe
Uptime: 7s
Dumping 433 out of 7980 MB:..4%..12%..23%..34%..41%..52%..63%..71%..82%..93%
__curthread () at /usr/src/sys/amd64/include/pcpu.h:234
234 __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (OFFSETOF_CURTHREAD));
(kgdb) #0 __curthread () at /usr/src/sys/amd64/include/pcpu.h:234
#1 doadump (textdump=<optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:371
#2 0xffffffff80bd0238 in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:451
#3 0xffffffff80bd0699 in vpanic (fmt=<optimized out>, ap=<optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:877
#4 0xffffffff80bd0493 in panic (fmt=<unavailable>)
at /usr/src/sys/kern/kern_shutdown.c:804
#5 0xffffffff82a6922c in assfail3 (a=<unavailable>, lv=<unavailable>,
op=<unavailable>, rv=<unavailable>, f=<unavailable>, l=<optimized out>)
at /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_cmn_err.c:91
#6 0xffffffff828a3b83 in metaslab_free_concrete (vd=0xfffff80004623000,
offset=137438954496, asize=<optimized out>, checkpoint=0)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c:3593
#7 0xffffffff828a4dd8 in metaslab_free_dva (spa=<optimized out>,
checkpoint=0, dva=<optimized out>)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c:3863
#8 metaslab_free (spa=<optimized out>, bp=0xfffff800043788a0, txg=41924766,
now=<optimized out>)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c:4145
#9 0xffffffff8290217c in zio_dva_free (zio=0xfffff80004378830)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3070
#10 0xffffffff828feb7c in zio_execute (zio=0xfffff80004378830)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1786
#11 0xffffffff80c2fae4 in taskqueue_run_locked (queue=0xfffff80004222800)
at /usr/src/sys/kern/subr_taskqueue.c:467
#12 0xffffffff80c30e18 in taskqueue_thread_loop (arg=<optimized out>)
at /usr/src/sys/kern/subr_taskqueue.c:773
#13 0xffffffff80b90c53 in fork_exit (
callout=0xffffffff80c30d80 <taskqueue_thread_loop>,
arg=0xfffff800041d90b0, frame=0xfffffe004dcf9bc0)
at /usr/src/sys/kern/kern_fork.c:1065
#14 <signal handler called>
Poking around in kgdb, I could see that vdev_ashift is 12 for the operation that causes the panic. The offset was 1 KiB aligned, but ashift 12 requires 4 KiB alignment. I just need it to stop attempting to 'free' anything while I read the data out of it.
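To spell out the arithmetic behind the failed assertion, here is a small Python sketch. It is not the kernel code, only a re-statement of the asserted expression ((offset) & ((1ULL << vd->vdev_ashift) - 1)) == 0. The function names are made up for illustration, and the offset is the one shown in kgdb frame #6.

```python
# Re-statement of the alignment check from the panic message.
# Function names are hypothetical; only the expression comes from the assert.

def misalignment(offset: int, ashift: int) -> int:
    """Return the low bits of offset that fall below the 2**ashift boundary."""
    mask = (1 << ashift) - 1
    return offset & mask

def is_aligned(offset: int, ashift: int) -> bool:
    """True when offset is a multiple of 2**ashift, i.e. the assert would pass."""
    return misalignment(offset, ashift) == 0

# offset argument of metaslab_free_concrete, taken from the kgdb backtrace
OFFSET = 137438954496

for ashift in (9, 12):
    print(
        f"ashift {ashift:2d}: block size {1 << ashift:4d} B, "
        f"remainder {hex(misalignment(OFFSET, ashift))}, "
        f"aligned: {is_aligned(OFFSET, ashift)}"
    )
```

With ashift 9 (512-byte blocks) the remainder is zero, so the offset is valid for the old mirror-0. With ashift 12 (4096-byte blocks) the remainder is 0x400, which is the left-hand value reported in the panic message.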