NVMe adapter with an HP EX900 PCIe NVMe SSD

I doubt it's a device problem... As I said, I tried on another machine, with exactly the same result.
Alain, have you tried removing the devices, rebooting, and then re-adding them, followed by another reboot?
 
No, there was never a reason to remove them because they always worked fine.
Here is the current output:
Code:
NAME            SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
ZHD             904G   594G   310G        -         -    13%    65%  1.00x    ONLINE  -
  ada0s2        832G   592G   240G        -         -    13%  71.1%      -    ONLINE
special            -      -      -        -         -      -      -      -  -
  gpt/special  72.5G  2.50G  70.0G        -         -    30%  3.44%      -    ONLINE
ZT              330G   149G   181G        -         -    48%    45%  1.00x    ONLINE  -
  ada2p3        149G  71.3G  77.7G        -         -    47%  47.8%      -    ONLINE
  ada2p10       181G  78.1G   103G        -         -    50%  43.1%      -    ONLINE
logs               -      -      -        -         -      -      -      -  -
  ada1s3       13.5G  2.66M  13.5G        -         -     0%  0.01%      -    ONLINE
cache              -      -      -        -         -      -      -      -  -
  ada1s1       27.0G  26.1G   956M        -         -     0%  96.5%      -    ONLINE
To be honest, I'm afraid that the process of removing them might corrupt data. I'm not certain, so I haven't tried it.
 
What if you nuke /etc/zfs/zpool.cache and /boot/zfs/zpool.cache after removing the log and cache devices?
(Do it on the test system.)
 
So... I installed FreeBSD 12.3... shut down the system, inserted the SSD... started the system, created two partitions on the new SSD, added them to the zroot pool: worked. Rebooted the system, removed the log and cache, rebooted: worked. Destroyed the GPT, recreated it, recreated the partitions, added them to the pool as cache and log, rebooted: the machine works...

The only difference I see is that when I run zpool iostat -v, the log device is placed right under the devices that are part of zroot.
Code:
zpool iostat -v
                capacity     operations    bandwidth
pool         alloc   free   read  write   read  write
-----------  -----  -----  -----  -----  -----  -----
zroot         998M   831G      7     12   120K   159K
  da0p3       493M   416G      4      6  58.5K  71.6K
  da1p3       505M   416G      3      6  57.7K  83.7K
  gpt/log      80K  7.50G      0      0  3.36K  4.13K
cache            -      -      -      -      -      -
  gpt/cache  28.5K   100G      0      0    993    934
-----------  -----  -----  -----  -----  -----  -----

Since I'm testing this, I will run the test again with FreeBSD 13.1, 13.2 and 14.0.
 
On FreeBSD 13.1-STABLE, I get the same result as on 13.1-RELEASE.

1. Adding the log and cache drive to the pool for the first time, then rebooting the machine - works.
2. Removing the log and cache drive from the pool, then rebooting the machine - works.
3. Destroying the GPT and recreating it on the log and cache drive, adding the drives back to the pool, then rebooting the machine... when it hits the mount of the log device, it reboots without any error message.
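
The three steps above can be sketched as commands; the device name (nvd0), partition sizes, and labels are assumptions for illustration, not taken from the thread, so don't run this against a pool you care about:

```shell
# Step 1: partition the drive and add log + cache to the pool (first time)
gpart create -s gpt nvd0
gpart add -t freebsd-zfs -l log -s 8G nvd0
gpart add -t freebsd-zfs -l cache -s 100G nvd0
zpool add zroot log gpt/log
zpool add zroot cache gpt/cache
# reboot: works

# Step 2: remove both devices from the pool again
zpool remove zroot gpt/log
zpool remove zroot gpt/cache
# reboot: works

# Step 3: destroy and recreate the GPT, then re-add the devices
gpart destroy -F nvd0
gpart create -s gpt nvd0
gpart add -t freebsd-zfs -l log -s 8G nvd0
gpart add -t freebsd-zfs -l cache -s 100G nvd0
zpool add zroot log gpt/log
zpool add zroot cache gpt/cache
# reboot: the machine resets when the log device is mounted
```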

On FreeBSD 14-CURRENT I got a kernel panic on step 3 instead of a reboot.

Code:
VERIFY0(0 == nvlist_lookup_uint64(nvl, name, &rv)) failed (0 == 22)

I also found this guy, who has the same problem, but after removing a ZIL... in my case it appears after adding a ZIL.

 
Heya! After reading the thread from the beginning, I don't understand what it is you want, cmiu147? Bugs aside, you've demonstrated that you are able to start the system with and without cache drives. Do you want help confirming that there is indeed a bug and it's not just you "holding it wrong"? Otherwise, I'd say file a bug report to eventually get it fixed.
 
Well, I don't understand why it works the first time I create the log device and attach it to the pool, but if I delete that device, recreate it, and attach it to the same pool again, the machine goes into a reboot loop.
I'm trying to find the logic and figure out what I'm doing wrong (perhaps I am doing something wrong).
 
Ok, thanks! In my opinion, you're not doing anything wrong at all; it's quite clearly a bug that should be reported and fixed.
 
I guess it bombs here:

txg = fnvlist_lookup_uint64(configs[i], ZPOOL_CONFIG_POOL_TXG);

in /sys/contrib/openzfs/module/os/freebsd/zfs/spa_os.c.
It looks like the log vdev config is missing a ZPOOL_CONFIG_POOL_TXG key and the assertion fails.
Why this fails after an add/remove beats me.
 
Just great... I wanted to remove the log device and reboot the machine... but the /dev/gpt labels disappeared... wtf?

Code:
ls -al /dev/gpt/
total 1
dr-xr-xr-x   2 root  wheel      512 Nov 28 09:32 .
dr-xr-xr-x  17 root  wheel      512 Nov 28 09:32 ..
crw-r-----   1 root  operator  0xa7 Nov 28 09:32 gptboot0
crw-r-----   1 root  operator  0xad Nov 28 09:32 gptboot1
crw-r-----   1 root  operator  0xbf Nov 28 09:32 gptboot2
crw-r-----   1 root  operator  0xc5 Nov 28 09:32 gptboot3

nfs# zpool iostat -v
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
backup      1.54T   280G     15      0   200K  79.4K
  ada4      1.54T   280G     15      0   200K  79.4K
----------  -----  -----  -----  -----  -----  -----
zroot       3.19T   450G      4     29  78.0K   729K
  mirror-0  1.61T   211G      2     11  39.7K   305K
    ada0p3      -      -      1      5  20.0K   153K
    ada1p3      -      -      1      5  19.7K   153K
  mirror-1  1.58T   239G      1     15  38.3K   354K
    ada2p3      -      -      0      7  19.4K   177K
    ada3p3      -      -      0      7  18.9K   177K
logs            -      -      -      -      -      -
  nvd0p1    27.1M  7.47G      0      2      1  71.4K
cache           -      -      -      -      -      -
  nvd0p2    28.5G  75.2G      6      1  48.0K   162K
----------  -----  -----  -----  -----  -----  -----


And if I do:

Code:
gpart list | grep label
   label: log
   label: cache
   label: gptboot0
   label: swap0
   label: zfs0
   label: gptboot1
   label: swap1
   label: zfs1
   label: gptboot2
   label: swap2
   label: zfs2
   label: gptboot3
   label: swap3
   label: zfs3
 
That's because they are in use by ZFS under their partition names instead of their label names. That's a safety mechanism to keep you from writing data from two places at once. When you remove the bare partitions from the zpool, the label names will reappear.
 
The patch below will fix it.
The problem is that adding and removing a device creates hole_array entries in the on-disk config, one per removal/addition, so you end up with data_vdev, hole, hole, hole, ..., log_vdev.
When the configs array is built, it ends up a sparse array with the middle entries NULL. The code then tries to select the config holding the highest txg number and iterates through the sparse array INCLUDING the NULL entries, which is what causes it to bomb: the hole children do not generate a valid config, so the configs[] array is sparse.
fnvlist_lookup_uint64(configs[i], ZPOOL_CONFIG_POOL_TXG) uses VERIFY0, which panics on failure, and when configs[i] is NULL the lookup will obviously fail.
See the zdb output below.
Code:
zroot:
    version: 5000
    name: 'zroot'
    state: 0
    txg: 1848
    pool_guid: 6875476664965044950
    errata: 0
    hostname: 'f13'
    com.delphix:has_per_vdev_zaps
    hole_array[0]: 1
    hole_array[1]: 2
    hole_array[2]: 3
    hole_array[3]: 4
    hole_array[4]: 5
    hole_array[5]: 6
    hole_array[6]: 7
    hole_array[7]: 8
    hole_array[8]: 9
    vdev_children: 11
    vdev_tree:
        type: 'root'
        id: 0
        guid: 6875476664965044950
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 3442502113579205730
            path: '/dev/da0p3'
            whole_disk: 1
            metaslab_array: 67
            metaslab_shift: 29
            ashift: 12
            asize: 15025569792
            is_log: 0
            create_txg: 4
            com.delphix:vdev_zap_leaf: 65
            com.delphix:vdev_zap_top: 66
        children[1]:
            type: 'hole'
            id: 1
            guid: 0
            whole_disk: 0
            metaslab_array: 0
            metaslab_shift: 0
            ashift: 0
            asize: 0
            is_log: 0
            is_hole: 1
        children[2]:
            type: 'hole'
            id: 2
            guid: 0
            whole_disk: 0
            metaslab_array: 0
            metaslab_shift: 0
            ashift: 0
            asize: 0
            is_log: 0
            is_hole: 1
        children[3]:
            type: 'hole'
            id: 3
            guid: 0
            whole_disk: 0
            metaslab_array: 0
            metaslab_shift: 0
            ashift: 0
            asize: 0
            is_log: 0
            is_hole: 1
        children[4]:
            type: 'hole'
            id: 4
            guid: 0
            whole_disk: 0
            metaslab_array: 0
            metaslab_shift: 0
            ashift: 0
            asize: 0
            is_log: 0
            is_hole: 1
        children[5]:
            type: 'hole'
            id: 5
            guid: 0
            whole_disk: 0
            metaslab_array: 0
            metaslab_shift: 0
            ashift: 0
            asize: 0
            is_log: 0
            is_hole: 1
        children[6]:
            type: 'hole'
            id: 6
            guid: 0
            whole_disk: 0
            metaslab_array: 0
            metaslab_shift: 0
            ashift: 0
            asize: 0
            is_log: 0
            is_hole: 1
        children[7]:
            type: 'hole'
            id: 7
            guid: 0
            whole_disk: 0
            metaslab_array: 0
            metaslab_shift: 0
            ashift: 0
            asize: 0
            is_log: 0
            is_hole: 1
        children[8]:
            type: 'hole'
            id: 8
            guid: 0
            whole_disk: 0
            metaslab_array: 0
            metaslab_shift: 0
            ashift: 0
            asize: 0
            is_log: 0
            is_hole: 1
        children[9]:
            type: 'hole'
            id: 9
            guid: 0
            whole_disk: 0
            metaslab_array: 0
            metaslab_shift: 0
            ashift: 0
            asize: 0
            is_log: 0
            is_hole: 1
        children[10]:
            type: 'disk'
            id: 10
            guid: 5287760721319056479
            path: '/dev/da1p2'
            whole_disk: 1
            metaslab_array: 0
            metaslab_shift: 0
            ashift: 12
            asize: 2142765056
            is_log: 1
            create_txg: 1848
            com.delphix:vdev_zap_leaf: 128
            com.delphix:vdev_zap_top: 135
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data


Diff:
--- contrib/openzfs/module/os/freebsd/zfs/spa_os.c    2022-05-17 07:18:53.560252000 +0300
+++ /tmp/spa_os.c    2022-12-02 16:33:04.665494000 +0200
@@ -95,6 +95,7 @@
     for (i = 0; i < count; i++) {
         uint64_t txg;
        
+        if(!configs[i]) continue;
         txg = fnvlist_lookup_uint64(configs[i], ZPOOL_CONFIG_POOL_TXG);
         if (txg > best_txg) {
             best_txg = txg;
 
I'm sorry, I'm not a developer, so my next question may sound stupid.

So, should I recompile OpenZFS with the change you made?
Code:
+        if(!configs[i]) continue;
Another question: isn't the ZFS in FreeBSD a different package from OpenZFS?

Meanwhile, I opened a bug in Bugzilla, which is here

Regards,
Chris
 
Edit /sys/contrib/openzfs/module/os/freebsd/zfs/spa_os.c at line 98, after:

Code:
uint64_t txg;
// add this
if (!configs[i])
	continue;
// end patch
txg = fnvlist_lookup_uint64(configs[i], ZPOOL_CONFIG_POOL_TXG);

Rebuild the kernel and modules (see the Handbook), or, quick and dirty:
Code:
cd /sys/modules/zfs
make
make install
mv /boot/modules/zfs.ko /boot/kernel
First, maybe back up the original zfs.ko.
 
The patch is for the base-system ZFS, not the openzfs port from ports.
I tested yesterday on a fresh install in VirtualBox and it did panic every time, but for some reason the panic message was displayed for less than a second, so it looked like a reboot.
 
Almost a year later... I tried to update my home server from 13.1 to 13.2... this bug is still there... I can't believe it... how can I contribute to this fix?
 