ZFS snapshot renames failing

Hello!

I have a server running FreeBSD 10.0. I use ZFS snapshots for backups, and I have a problem when I try to rename them: the renamed snapshot becomes inaccessible and remains so until reboot.

Code:
root@myserver /backup/.zfs/snapshot % uname -a
FreeBSD myserver 10.0-RELEASE-p9 FreeBSD 10.0-RELEASE-p9 #0: Mon Sep 15 14:35:52 UTC 2014     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64

Code:
root@myserver /backup/.zfs/snapshot % ls -la
ls: snapshot_4: Device busy
total 51
dr-xr-xr-x   5 root  wheel   5 Sep 29 20:09 .
dr-xr-xr-x   4 root  wheel   4 Jul 29 16:47 ..
drwxr-xr-x  22 root  wheel  30 Sep  1 14:17 snapshot_1
drwxr-xr-x  22 root  wheel  30 Sep  1 14:17 snapshot_2
drwxr-xr-x  22 root  wheel  30 Sep  1 14:17 snapshot_3

I haven't found a solution to this problem. Could you help me?
 
You're not trying to rename it through the .zfs directory, are you? You need to use the zfs rename command. I've used renames in my periodic scripts for the past couple of years, as shown below, and have never had a single issue.

One script to make a snapshot every day.
Code:
cat > /usr/local/etc/periodic/daily/900.rollingsnap << 'EOF'
#!/bin/sh
_30DAYSAGO=$(/bin/date -v -30d "+%Y%m%d")
_SNAPDATE=$(/bin/date "+%Y%m%d")
zfs destroy -r zfs/homedirs@$_30DAYSAGO-autodaily > /dev/null 2>&1
zfs snapshot -r zfs/homedirs@$_SNAPDATE-autodaily
EOF
chmod 555 /usr/local/etc/periodic/daily/900.rollingsnap

And another to rename it if it happens to be the first of the month.
Code:
cat > /usr/local/etc/periodic/monthly/900.monthlysnap << 'EOF'
#!/bin/sh
_SNAPDATE=$(/bin/date "+%Y%m%d")
zfs rename -r zfs/homedirs@$_SNAPDATE-autodaily @$_SNAPDATE-automonthly > /dev/null 2>&1
EOF
chmod 555 /usr/local/etc/periodic/monthly/900.monthlysnap
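
If it helps to see exactly what those two scripts issue, here is a dry run with zfs stubbed out as a shell function and the dates pinned, so nothing touches a real pool; the dataset name comes from the scripts above, everything else is purely illustrative.

```shell
#!/bin/sh
# Stub zfs so the command lines can be previewed without a real pool.
zfs() { echo "zfs $*"; }

# Dates pinned for illustration; the real scripts use /bin/date.
_30DAYSAGO=20140901
_SNAPDATE=20141001

# Daily script body: drop the 30-day-old snapshot, take today's.
zfs destroy -r zfs/homedirs@$_30DAYSAGO-autodaily
zfs snapshot -r zfs/homedirs@$_SNAPDATE-autodaily

# Monthly script body: promote today's daily snapshot to a monthly one.
zfs rename -r zfs/homedirs@$_SNAPDATE-autodaily @$_SNAPDATE-automonthly
```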
 
junovitch said:
You're not trying to rename it through the .zfs directory, are you? You need to use the zfs rename command. I've used renames in my periodic scripts for the past couple of years, as shown below, and have never had a single issue.
Of course I use the zfs rename command. But I've compared your script with mine and noticed that I don't use the -r flag. I think that may be the cause, although renaming snapshots without this flag works fine on another server of mine. I'll check it. Thanks for your answer.
 
OK. Sorry, that was probably a dumb question, but it's best to get the dumb easy questions out of the way first. The -r just recurses through the snapshots of child datasets and renames those too, so you shouldn't have to worry about that.
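
To make the recursion concrete: with -r, a single rename applies the same snapshot-name change to the dataset and every child dataset underneath it. A rough sketch of the effect (dataset names entirely hypothetical):

```shell
#!/bin/sh
# Hypothetical dataset tree; `zfs rename -r` applies the same
# snapshot rename to the parent and every child dataset.
for ds in zfs/homedirs zfs/homedirs/alice zfs/homedirs/bob; do
    echo "${ds}@20141001-autodaily -> ${ds}@20141001-automonthly"
done
```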

Something to consider, though: there was some discussion about insufficient locking somewhere between ZFS and VFS. I want to say what triggered the discussion was either a script that was going through the .zfs directory regularly, or multiple users going through there regularly. At the time, the comment was that .zfs had some issues when it was poked too hard and that you shouldn't be accessing it on a regular basis. I hadn't asked yet what version you are running, because I don't know how much of that still applies or if and when any of those issues were fixed. Nonetheless, you should be able to search the forums or your favorite search engine for "ZFS", "VFS", and "insufficient locking" and probably find some useful discussion. Given that, I will say that running an ls against the .zfs snapshot directory while a rename is going on concurrently would be a bad thing.

What could be useful, though, is to check with zfs list -t snapshot whether the snapshot has actually been renamed. If the zfs command hangs hard, take note of the states of the zfs process and of any other processes reading from or writing to ZFS. You can check the STATE column for all threads with top -HaS; you will probably see some stuck in zio->something.
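
To make that check concrete, here is the kind of filter I mean, run here against a made-up sample of top -HaS output so it is self-contained; on a live system you would pipe top -HaS itself through the grep instead.

```shell
#!/bin/sh
# Made-up sample of top -HaS lines; live usage: top -HaS | grep -E 'tx->|zio_'
sample='15822 jason   25  0 12292K 2508K tx->tx  3  0:00  1.17% dd if=/dev/zero of=zeros bs=1M
    0 root   -16  0     0K 2928K -       1 14:01  0.20% [kernel{zio_write_issue_}]
  827 root    20  0 25328K 3672K select  0  0:04  0.00% ntpd'

# Keep only lines showing a tx-> state or a zio kernel thread.
echo "$sample" | grep -E 'tx->|zio_'
```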
 
junovitch said:
Something to consider, though: there was some discussion about insufficient locking somewhere between ZFS and VFS. I want to say what triggered the discussion was either a script that was going through the .zfs directory regularly, or multiple users going through there regularly. At the time, the comment was that .zfs had some issues when it was poked too hard and that you shouldn't be accessing it on a regular basis. I hadn't asked yet what version you are running, because I don't know how much of that still applies or if and when any of those issues were fixed. Nonetheless, you should be able to search the forums or your favorite search engine for "ZFS", "VFS", and "insufficient locking" and probably find some useful discussion. Given that, I will say that running an ls against the .zfs snapshot directory while a rename is going on concurrently would be a bad thing.

This server is used only for remote backups. There are no users who could be doing anything while a snapshot is being created.

This is the script I use for remote backup.
Code:
for r in $(${SEQ} 1 ${max_repeat_number})
do
    # Sync data with remote server
    ${RSYNC} -a --relative --delete --numeric-ids --exclude=/dev --exclude=/proc ${source} ${destination}

    if [ "x$?" == "x0" ]
    then
        # full snapshot name
        sname="${zpool}/${zfs_dataset}@${snapshot_base_name}"

        # remove the oldest snapshot
        /sbin/zfs destroy "${sname}_${max_snapshot_number}"

        # rename other snapshots
        max_number=$(($max_snapshot_number-1))
        for i in $(${SEQ} $max_number 1)
        do
            /sbin/zfs rename "${sname}_${i}" "${sname}_$(($i+1))"
        done

        # make a new snapshot
        /sbin/zfs snapshot "${sname}_1"

        # exit with success
        exit 0
    fi
done

echo "ERROR: Unknown error while making snapshot."
exit 1
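
For what it's worth, the rotation ordering in the script can be sanity-checked without a pool by stubbing zfs out. This sketch (snapshot count reduced to 3, a while loop in place of seq so it runs anywhere, names hypothetical) prints the sequence the script should issue: destroy the oldest, rename n to n+1 from highest to lowest so nothing collides, then take the new _1.

```shell
#!/bin/sh
# Stub zfs so the rotation can be previewed without touching a pool.
zfs() { echo "zfs $*"; }

sname="backup/remote@snap"      # hypothetical pool/dataset@basename
max_snapshot_number=3

# Remove the oldest snapshot.
zfs destroy "${sname}_${max_snapshot_number}"

# Rename the rest, highest number first so nothing collides.
i=$((max_snapshot_number - 1))
while [ $i -ge 1 ]; do
    zfs rename "${sname}_${i}" "${sname}_$((i + 1))"
    i=$((i - 1))
done

# Take the new snapshot.
zfs snapshot "${sname}_1"
```

which prints:
zfs destroy backup/remote@snap_3
zfs rename backup/remote@snap_2 backup/remote@snap_3
zfs rename backup/remote@snap_1 backup/remote@snap_2
zfs snapshot backup/remote@snap_1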

I spent a lot of time with Google but couldn't find a solution.

After this bug occurs, the snapshots can still be renamed or destroyed correctly, but they remain inaccessible until reboot.

junovitch said:
What could be useful though, is to check using the zfs list -t snapshot to see if the snapshot has been renamed. If the ZFS command hangs hard, perhaps taking a note of what states the ZFS process or any other processes reading/writing to ZFS could be useful. You can check the STATE column for all processes with top -HaS and will probably see some stuck in zio->something.

zfs list -t snapshot shows that the snapshot has been renamed correctly.
top -HaS | grep zio shows a lot of entries:
Code:
    0 root       -16    0     0K  2656K -       1   0:09   0.00% [kernel{zio_read_intr_0}]
...
    0 root       -16    0     0K  2656K -       1   0:05   0.00% [kernel{zio_write_issue_}]
...
    0 root       -16    0     0K  2656K -       0   0:01   0.00% [kernel{zio_write_intr_5}]
...
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_null_issue}]
...
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_3}]
...
 
OK. So correct me if I am wrong: you can rename snapshots with zfs rename, and zfs list -t snapshot reflects that. The system is responsive during and after renaming snapshots, and processes accessing the current version of the filesystem work fine. The issue is just that trying to access the snapshots afterwards via something like ls /backup/.zfs/snapshot/snapshot_4 fails. Does umount /backup/.zfs/snapshot/snapshot_4 also fail? Maybe doing an unmount and trying to access it again will help. If the only issue is access through .zfs, then I do think the VFS issues are part of the problem.

My train of thought with top -HaS was this:
Code:
top -HaS | grep z
557 processes: 5 running, 532 sleeping, 2 zombie, 18 waiting
15822 jason         25    0 12292K  2508K tx->tx  3   0:00   1.17% dd if=/dev/zero of=zeros bs=1M count=1000
   36 root          -8    -     0K    96K arc_re  0   2:45   0.39% [zfskern{arc_reclaim_thre}]
    0 root         -16    0     0K  2928K -       1  14:01   0.20% [kernel{zio_write_issue_}]
    0 root         -16    0     0K  2928K -       0  14:01   0.20% [kernel{zio_write_issue_}]
    0 root         -16    0     0K  2928K -       2  13:59   0.20% [kernel{zio_write_issue_}]
    0 root         -16    0     0K  2928K -       0   6:56   0.20% [kernel{zio_write_intr_2}]
    0 root         -16    0     0K  2928K -       0   6:57   0.10% [kernel{zio_write_intr_1}]
    0 root         -16    0     0K  2928K -       0   6:57   0.10% [kernel{zio_write_intr_7}]

So the zio portion wasn't completely accurate, but if there were a process like that dd sitting there stuck trying to write to disk, it could indicate that something in ZFS was hung up and that processes were sitting there waiting for their disk I/O to finish.
 
junovitch said:
OK. So correct me if I am wrong: you can rename snapshots with zfs rename, and zfs list -t snapshot reflects that. The system is responsive during and after renaming snapshots, and processes accessing the current version of the filesystem work fine. The issue is just that trying to access the snapshots afterwards via something like ls /backup/.zfs/snapshot/snapshot_4 fails.
Yes, that's right.

junovitch said:
Does umount /backup/.zfs/snapshot/snapshot_4 also fail? Maybe doing an unmount and trying to access it again will help. If the only issue is access through .zfs, then I do think the VFS issues are part of the problem.
Trying to unmount it also results in "Device busy".

Code:
root@myserver /backup/.zfs/snapshot % umount /backup/.zfs/snapshot/snapshot_3
umount: /backup/.zfs/snapshot/snapshot_3: statfs: Device busy
umount: /backup/.zfs/snapshot/snapshot_3: unknown file system
root@myserver /backup/.zfs/snapshot % zfs umount /backup/.zfs/snapshot/snapshot_3
cannot unmount '/backup/.zfs/snapshot/snapshot_3': Device busy

junovitch said:
So the zio portion wasn't completely accurate, but if there were a process like that dd sitting there stuck trying to write to disk, it could indicate that something in ZFS was hung up and that processes were sitting there waiting for their disk I/O to finish.

Here is the complete output of top -HaS:
Code:
% top -HaS 1000 | cat
last pid:  7344;  load averages:  0.27,  0.16,  0.15  up 1+23:47:53    17:24:07
256 processes: 3 running, 234 sleeping, 19 waiting

Mem: 9796K Active, 40M Inact, 1371M Wired, 548M Free
ARC: 1011M Total, 46M MFU, 756M MRU, 16K Anon, 18M Header, 191M Other
Swap: 8192M Total, 8192M Free


  PID USERNAME   PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   11 root       155 ki31     0K    32K CPU1    1  47.7H 100.00% [idle{idle: cpu1}]
   11 root       155 ki31     0K    32K RUN     0  47.7H  99.46% [idle{idle: cpu0}]
   12 root       -60    -     0K   304K WAIT    0   2:08   0.00% [intr{swi4: clock}]
   16 root        16    -     0K    16K syncer  0   1:16   0.00% [syncer]
    0 root       -16    0     0K  2656K swapin  0   0:51   0.00% [kernel{swapper}]
   12 root       -88    -     0K   304K WAIT    1   0:13   0.00% [intr{irq14: ata0}]
   14 root       -16    -     0K    16K -       0   0:12   0.00% [rand_harvestq]
    2 root        -8    -     0K    96K tx->tx  0   0:12   0.00% [zfskern{txg_thread_enter}]
    9 root       -16    -     0K    16K vlruwt  0   0:09   0.00% [vnlru]
    0 root       -16    0     0K  2656K -       1   0:09   0.00% [kernel{zio_read_intr_0}]
    0 root       -16    0     0K  2656K -       1   0:09   0.00% [kernel{zio_read_intr_1}]
   13 root        -8    -     0K    48K -       1   0:09   0.00% [geom{g_down}]
   12 root       -92    -     0K   304K WAIT    0   0:06   0.00% [intr{irq257: re0}]
    0 root       -16    0     0K  2656K -       0   0:06   0.00% [kernel{zio_write_issue_}]
    0 root       -16    0     0K  2656K -       1   0:06   0.00% [kernel{zio_write_issue_}]
    2 root        -8    -     0K    96K arc_re  0   0:05   0.00% [zfskern{arc_reclaim_thre}]
  827 root        20    0 25328K  3672K select  0   0:04   0.00% /usr/sbin/ntpd -c /etc/ntp.conf -p /var/run/ntpd.pid -
   13 root        -8    -     0K    48K -       0   0:04   0.00% [geom{g_up}]
   15 root       -72    -     0K   320K -       1   0:04   0.00% [usb{usbus4}]
   12 root       -88    -     0K   304K WAIT    0   0:03   0.00% [intr{irq15: ata1}]
   12 root       -88    -     0K   304K WAIT    0   0:02   0.00% [intr{irq23: uhci0 ehc}]
  871 root        20    0 23980K  5364K select  1   0:02   0.00% sendmail: accepting connections (sendmail)
    0 root       -16    0     0K  2656K -       1   0:01   0.00% [kernel{zio_write_intr_5}]
    0 root       -16    0     0K  2656K -       1   0:01   0.00% [kernel{zio_write_intr_2}]
    0 root       -16    0     0K  2656K -       1   0:01   0.00% [kernel{zio_write_intr_3}]
    0 root       -16    0     0K  2656K -       1   0:01   0.00% [kernel{zio_write_intr_4}]
    0 root       -16    0     0K  2656K -       1   0:01   0.00% [kernel{zio_write_intr_6}]
    0 root       -16    0     0K  2656K -       1   0:01   0.00% [kernel{zio_write_intr_0}]
    0 root       -16    0     0K  2656K -       1   0:01   0.00% [kernel{zio_write_intr_1}]
    0 root       -16    0     0K  2656K -       1   0:01   0.00% [kernel{zio_write_intr_7}]
    2 root        -8    -     0K    96K spa->s  1   0:01   0.00% [zfskern{trim system}]
    4 root       -16    -     0K    16K ccb_sc  1   0:01   0.00% [xpt_thrd]
   12 root       -68    -     0K   304K WAIT    0   0:01   0.00% [intr{swi2: cambio}]
   15 root       -68    -     0K   320K -       0   0:01   0.00% [usb{usbus4}]
    0 root       -16    0     0K  2656K -       0   0:01   0.00% [kernel{zio_null_issue}]
    5 root       -16    -     0K    16K psleep  1   0:01   0.00% [pagedaemon]
   17 root       -16    -     0K    16K sdflus  1   0:01   0.00% [softdepflush]
   15 root       -68    -     0K   320K -       1   0:00   0.00% [usb{usbus3}]
   15 root       -68    -     0K   320K -       0   0:00   0.00% [usb{usbus2}]
   15 root       -68    -     0K   320K -       0   0:00   0.00% [usb{usbus1}]
    8 root       -16    -     0K    16K psleep  0   0:00   0.00% [bufdaemon]
   15 root       -68    -     0K   320K -       1   0:00   0.00% [usb{usbus0}]
  878 root        20    0 16520K  2076K nanslp  0   0:00   0.00% /usr/sbin/cron -s
    2 root        -8    -     0K    96K l2arc_  0   0:00   0.00% [zfskern{l2arc_feed_threa}]
  682 root        20    0 14424K  1968K select  1   0:00   0.00% /usr/sbin/syslogd -s
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_3}]
    2 root        -8    -     0K    96K tx->tx  0   0:00   0.00% [zfskern{txg_thread_enter}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_8}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_1}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_8}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_4}]
 7230 root        20    0 42064K  5940K pause   0   0:00   0.00% /usr/local/bin/zsh
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_6}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_5}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_5}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_8}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_6}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_5}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_7}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_9}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_2}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_3}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_8}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_4}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_9}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_7}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_5}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_8}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_write_issue_}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_7}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_1}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_8}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_7}]
 7222 helicopter  20    0 86084K  7016K select  0   0:00   0.00% sshd: helicopter@pts/0 (sshd)
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_6}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_7}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_6}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_3}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_8}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_2}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_4}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_4}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_8}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_0}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_3}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_1}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_6}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_3}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_1}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_1}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_1}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_7}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_4}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_write_issue_}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_9}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_4}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_9}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_9}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_4}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_9}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_6}]
 7321 root        20    0 31540K  4992K pause   1   0:00   0.00% /usr/local/bin/zsh
    0 root         8    0     0K  2656K -       1   0:00   0.00% [kernel{thread taskq}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_2}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_1}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_6}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_3}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_2}]
  844 root        20    0 30524K  3896K nanslp  0   0:00   0.00% /usr/local/sbin/smartd -c /usr/local/etc/smartd.conf -
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_null_intr}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_6}]
 7223 helicopter  36    0 31540K  4988K pause   0   0:00   0.00% -zsh (zsh)
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_2}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_9}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_5}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_write_issue_}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_9}]
  532 root        28    0 14556K  2012K select  0   0:00   0.00% dhclient: re0 [priv] (dhclient)
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_write_issue_}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_2}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_7}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_5}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_3}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_3}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_4}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_5}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_2}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_2}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_9}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_3}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_8}]
    1 root        24    0  9428K   752K wait    0   0:00   0.00% [init]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_5}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_5}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_5}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_3}]
  568 _dhcp       20    0 14556K  2068K select  1   0:00   0.00% dhclient: re0 (dhclient)
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_2}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_2}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_4}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_7}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_9}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_2}]
  569 root        20    0 13584K  4484K select  0   0:00   0.00% /sbin/devd
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_9}]
    0 root        -8    0     0K  2656K -       1   0:00   0.00% [kernel{zil_clean}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_write_issue_}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_5}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_3}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_ioctl_intr}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_4}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_1}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_1}]
  874 smmsp       20    0 23980K  4944K pause   0   0:00   0.00% sendmail: Queue runner@00:30:00 for /var/spool/clientm
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_1}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_8}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_write_intr_h}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_write_intr_h}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_write_intr_h}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_write_intr_h}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_write_intr_h}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_8}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_6}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_7}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_7}]
 7219 root        21    0 86084K  7016K select  0   0:00   0.00% sshd: helicopter [priv] (sshd)
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_7}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_6}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_1}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_issue_4}]
 7319 root        20    0 25668K  2904K pause   1   0:00   0.00% screen
 7320 root        20    0 25668K  3060K select  0   0:00   0.00% screen
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_free_issue_6}]
 7229 root        22    0 48160K  3140K select  0   0:00   0.00% sudo -s
   13 root        -8    -     0K    48K -       1   0:00   0.00% [geom{g_event}]
 7343 root        20    0 19768K  2748K CPU0    0   0:00   0.00% top -HaS 1000
    0 root        -8    0     0K  2656K -       1   0:00   0.00% [kernel{zil_clean}]
   12 root       -72    -     0K   304K WAIT    1   0:00   0.00% [intr{swi1: netisr 0}]
    0 root        -8    0     0K  2656K -       0   0:00   0.00% [kernel{zfs_vn_rele_task}]
  868 root        20    0 60816K  6068K select  1   0:00   0.00% /usr/sbin/sshd
    0 root        -8    0     0K  2656K -       1   0:00   0.00% [kernel{zil_clean}]
  923 root        52    0 14420K  1880K ttyin   0   0:00   0.00% /usr/libexec/getty Pc ttyv0
  929 root        52    0 14420K  1880K ttyin   0   0:00   0.00% /usr/libexec/getty Pc ttyv6
  930 root        52    0 14420K  1880K ttyin   1   0:00   0.00% /usr/libexec/getty Pc ttyv7
  924 root        52    0 14420K  1880K ttyin   1   0:00   0.00% /usr/libexec/getty Pc ttyv1
  926 root        52    0 14420K  1880K ttyin   1   0:00   0.00% /usr/libexec/getty Pc ttyv3
  925 root        52    0 14420K  1880K ttyin   0   0:00   0.00% /usr/libexec/getty Pc ttyv2
  927 root        52    0 14420K  1880K ttyin   0   0:00   0.00% /usr/libexec/getty Pc ttyv4
  928 root        52    0 14420K  1880K ttyin   0   0:00   0.00% /usr/libexec/getty Pc ttyv5
    7 root       155 ki31     0K    16K pgzero  0   0:00   0.00% [pagezero]
   12 root       -96    -     0K   304K WAIT    1   0:00   0.00% [intr{irq256: hdac0}]
 7344 root        20    0 12268K  1736K piperd  1   0:00   0.00% cat
  136 root        52    0 12264K  1664K pause   0   0:00   0.00% adjkerntz -i
    0 root        -8    0     0K  2656K -       1   0:00   0.00% [kernel{zil_clean}]
   12 root       -52    -     0K   304K WAIT    0   0:00   0.00% [intr{swi6: task queue}]
    0 root       -52    0     0K  2656K -       1   0:00   0.00% [kernel{mca taskq}]
    0 root       -100    0     0K  2656K -       0   0:00   0.00% [kernel{system_taskq_0}]
    0 root       -100    0     0K  2656K -       0   0:00   0.00% [kernel{system_taskq_1}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_read_issue_4}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_read_issue_3}]
    0 root        -8    0     0K  2656K -       1   0:00   0.00% [kernel{zil_clean}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_read_issue_1}]
   12 root       -52    -     0K   304K WAIT    0   0:00   0.00% [intr{swi6: Giant task}]
    0 root        -8    0     0K  2656K -       1   0:00   0.00% [kernel{zil_clean}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_read_issue_5}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_read_issue_7}]
   15 root       -68    -     0K   320K -       1   0:00   0.00% [usb{usbus4}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_read_issue_0}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_read_issue_2}]
    0 root        -8    0     0K  2656K -       1   0:00   0.00% [kernel{zil_clean}]
    0 root        -8    0     0K  2656K -       1   0:00   0.00% [kernel{zil_clean}]
    0 root       -16    0     0K  2656K -       0   0:00   0.00% [kernel{zio_read_issue_6}]
    0 root        -8    0     0K  2656K -       1   0:00   0.00% [kernel{zil_clean}]
    0 root        -8    0     0K  2656K -       1   0:00   0.00% [kernel{zil_clean}]
    0 root        -8    0     0K  2656K -       1   0:00   0.00% [kernel{zil_clean}]
    3 root       -16    -     0K    16K waitin  1   0:00   0.00% [sctp_iterator]
    0 root         8    0     0K  2656K -       0   0:00   0.00% [kernel{acpi_task_0}]
    0 root         8    0     0K  2656K -       1   0:00   0.00% [kernel{firmware taskq}]
   10 root       -16    -     0K    16K audit_  0   0:00   0.00% [audit]
    6 root       -16    -     0K    16K psleep  0   0:00   0.00% [vmdaemon]
   15 root       -68    -     0K   320K -       0   0:00   0.00% [usb{usbus0}]
   15 root       -68    -     0K   320K -       0   0:00   0.00% [usb{usbus1}]
    0 root         8    0     0K  2656K -       0   0:00   0.00% [kernel{acpi_task_2}]
    0 root         8    0     0K  2656K -       0   0:00   0.00% [kernel{acpi_task_1}]
    0 root         8    0     0K  2656K -       0   0:00   0.00% [kernel{ffs_trim taskq}]
    0 root         8    0     0K  2656K -       0   0:00   0.00% [kernel{kqueue taskq}]
    0 root        -8    0     0K  2656K -       1   0:00   0.00% [kernel{zil_clean}]
    0 root        -8    0     0K  2656K -       1   0:00   0.00% [kernel{zil_clean}]
    0 root        -8    0     0K  2656K -       1   0:00   0.00% [kernel{zil_clean}]
    0 root        -8    0     0K  2656K -       1   0:00   0.00% [kernel{zil_clean}]
    0 root        -8    0     0K  2656K -       1   0:00   0.00% [kernel{zil_clean}]
    0 root        -8    0     0K  2656K -       1   0:00   0.00% [kernel{zil_clean}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_free_intr}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_ioctl_issue}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_claim_intr}]
   15 root       -68    -     0K   320K -       0   0:00   0.00% [usb{usbus3}]
   15 root       -68    -     0K   320K -       0   0:00   0.00% [usb{usbus2}]
   15 root       -68    -     0K   320K -       0   0:00   0.00% [usb{usbus0}]
   15 root       -68    -     0K   320K -       0   0:00   0.00% [usb{usbus4}]
   15 root       -68    -     0K   320K -       0   0:00   0.00% [usb{usbus3}]
   15 root       -68    -     0K   320K -       0   0:00   0.00% [usb{usbus2}]
   15 root       -68    -     0K   320K -       0   0:00   0.00% [usb{usbus1}]
   15 root       -72    -     0K   320K -       0   0:00   0.00% [usb{usbus1}]
   15 root       -72    -     0K   320K -       0   0:00   0.00% [usb{usbus3}]
   15 root       -72    -     0K   320K -       0   0:00   0.00% [usb{usbus2}]
   15 root       -72    -     0K   320K -       0   0:00   0.00% [usb{usbus0}]
    0 root       -16    0     0K  2656K -       1   0:00   0.00% [kernel{zio_claim_issue}]
   12 root       -56    -     0K   304K WAIT    0   0:00   0.00% [intr{swi5: fast taskq}]
   12 root       -60    -     0K   304K WAIT    0   0:00   0.00% [intr{swi4: clock}]
   12 root       -64    -     0K   304K WAIT    0   0:00   0.00% [intr{swi3: vm}]
   12 root       -76    -     0K   304K WAIT    0   0:00   0.00% [intr{swi0: uart}]
   12 root       -84    -     0K   304K WAIT    0   0:00   0.00% [intr{irq1: atkbd0}]
   12 root       -84    -     0K   304K WAIT    0   0:00   0.00% [intr{irq7: ppc0}]
   12 root       -88    -     0K   304K WAIT    0   0:00   0.00% [intr{irq16: uhci3}]
   12 root       -88    -     0K   304K WAIT    0   0:00   0.00% [intr{irq18: uhci2}]
   12 root       -88    -     0K   304K WAIT    0   0:00   0.00% [intr{irq19: uhci1}]

I don't see any processes like dd, but I do see a lot of zio* entries. Could you tell me what they mean?
 
helicopter said:
I don't see any processes like dd, but I do see a lot of zio* entries. Could you tell me what they mean?

Mainly I was trying to look for anything stuck writing to disk. In my example the dd was just an example and closed right after producing the output shown. Had that process gotten stuck, it would have had the same state, and for that matter multiple processes may have been stuck writing to ZFS. This was back when I thought the issue was a general lockup with ZFS and not just the .zfs interface to snapshots.
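As a rough sketch of how to spot them (the wait-channel names here are examples, not an exhaustive list): processes sleeping inside ZFS show up in the MWCHAN column of ps axl, so filtering for ZFS-looking channels narrows things down quickly.

```shell
# Keep the header line plus any process whose wait channel (MWCHAN)
# looks ZFS-related, e.g. "tx->tx" or "zio->io"; everything else is
# dropped.  ps axl uses BSD-style (no dash) options.
ps axl | awk 'NR == 1 || /zfs|tx->tx|zio/'
```

Anything that sits in one of those wait channels across repeated runs is a good candidate for being stuck in ZFS.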
 
junovitch said:
helicopter said:
I don't see any processes like dd, but I do see a lot of zio* entries. Could you tell me what they mean?
Mainly I was trying to look for anything stuck writing to disk. In my example the dd was just an example and closed right after producing the output shown. Had that process gotten stuck, it would have had the same state, and for that matter multiple processes may have been stuck writing to ZFS. This was back when I thought the issue was a general lockup with ZFS and not just the .zfs interface to snapshots.

Could you give me some advice on how to identify such processes?
 
kpa said:
There could be more information on the freebsd-fs mailing list; at least this seems to be related to the problem:

http://lists.freebsd.org/pipermail/freebsd-fs/2013-November/018544.html

I read it and I checked zpool status -v.

Code:
root@myserver / % zpool status -v
  pool: system
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: none requested
config:

        NAME           STATE     READ WRITE CKSUM
        system         ONLINE       0     0     0
          mirror-0     ONLINE       0     0     0
            gpt/disk0  ONLINE       0     0     0
            gpt/disk1  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        system/backup:<0x494928>
        system/backup:<0x48eb57>
        system/backup:<0x49689e>
        system/backup:<0x48eda0>
        system/backup:<0x495fa7>
        system/backup:<0x113ab5>
        system/backup:<0x48ebc6>
        system/backup:<0x76c8>
        system/backup:<0x4930d9>
        system/backup:<0x4936e0>

Maybe this is the reason?

I tried zpool clear and it had no effect.
I've started zpool scrub and it's running now.
 
All disk errors were repaired, but that didn't solve the problem.

Code:
 % zpool status
  pool: system
 state: ONLINE
  scan: scrub repaired 0 in 1h47m with 0 errors on Fri Oct  3 18:11:19 2014
config:

        NAME           STATE     READ WRITE CKSUM
        system         ONLINE       0     0     0
          mirror-0     ONLINE       0     0     0
            gpt/disk0  ONLINE       0     0     0
            gpt/disk1  ONLINE       0     0     0

errors: No known data errors

I also noticed that the snapshot becomes accessible again when I rename it back.
Code:
root@myserver /backup/.zfs/snapshot % zfs rename system/backup@snapshot_1 system/backup@snapshot_tmp
root@myserver /backup/.zfs/snapshot % ls -la
ls: snapshot_tmp: Device busy
total 0
dr-xr-xr-x  4 root  wheel  4 Oct  6 11:18 .
dr-xr-xr-x  4 root  wheel  4 Jul 29 16:47 ..

root@myserver /backup/.zfs/snapshot % zfs rename system/backup@snapshot_tmp system/backup@snapshot_1
root@myserver /backup/.zfs/snapshot % ls -la
total 17
dr-xr-xr-x   5 root  wheel   5 Oct  6 11:20 .
dr-xr-xr-x   4 root  wheel   4 Jul 29 16:47 ..
drwxr-xr-x  20 root  wheel  28 Sep 29 23:26 snapshot_1

There were no processes using the data or snapshot directories.

Code:
root@myserver / % fstat /backup
USER     CMD          PID   FD MOUNT      INUM MODE         SZ|DV R/W NAME

root@myserver / % fstat /backup/.zfs/snapshot/snapshot_1
USER     CMD          PID   FD MOUNT      INUM MODE         SZ|DV R/W NAME
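In case fstat on those two paths misses something, here is a broader sweep I could run (just a sketch, guarded so it is a no-op on systems where procstat(1) is not available):

```shell
# Sweep the open files of all processes (FreeBSD procstat -a -f) and
# keep only lines referencing /backup; does nothing where procstat
# is missing, and grep finding no match is not treated as an error.
if command -v procstat >/dev/null 2>&1; then
    procstat -a -f 2>/dev/null | grep /backup || true
fi
```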

Do you have any ideas about it?
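The only workaround I can think of short of a reboot would be forcing the stale mount off while the snapshot is in its renamed (busy) state. This is an untested sketch, with the path taken from the rename example above; I have not verified that it actually clears the "Device busy" state:

```shell
# Untested sketch: the renamed snapshot may be left behind as a stale
# automount under .zfs/snapshot; forcing it unmounted might clear the
# "Device busy" state.  Errors are suppressed so this is harmless on
# systems where the path does not exist.
umount -f /backup/.zfs/snapshot/snapshot_tmp 2>/dev/null || true
ls /backup/.zfs/snapshot 2>/dev/null || true
```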
 