ZFS performance degradation over time

The issue with the usage of too much active memory only seems to happen when copying data from another filesystem to a ZFS filesystem, and not when uploading data via AFP (OS X).

loader.conf:
Code:
vfs.zfs.arc_max="1800M"
vfs.zfs.arc_min="1000M"
I think with these values I managed to reserve at least 1G for ZFS, and by doing so I kept rather good performance, judging by the values in iostat.
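(If you want to check whether the limits are actually being honoured at runtime, the ARC counters can be compared against the tunables - a quick sketch:)
Code:
# sysctl vfs.zfs.arc_min vfs.zfs.arc_max kstat.zfs.misc.arcstats.size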

Could someone try and confirm this?
 
I've also been playing around with my ARC sizes as my zfs write performance seems to decrease after about a day of uptime on 8.0-RELEASE-p2. I've found that at this time, kstat.zfs.misc.arcstats.memory_throttle_count seems to rise when I write to it.

The problem is, FreeBSD seems to ignore any of the values I set in /boot/loader.conf. I've set the following values in loader.conf:

Code:
vm.kmem_size="2048M"
vm.kmem_size_max="2048M"
vfs.zfs.arc_min="256M"
vfs.zfs.arc_max="1024M"
Upon rebooting it seems that FreeBSD is still automatically tuning those values by itself:

Code:
vm.kmem_size_max: 329853485875
vm.kmem_size: 2770690048
vfs.zfs.arc_min: 216460160
vfs.zfs.arc_max: 1731681280
 
Matty said:
My result posted above were on an empty pool.

After filling the pool to 90% I also see some degradation, from 61MB/s down to 35-40MB/s.


This is to be expected. ZFS has this issue on ALL systems.

It mainly has to do with the algorithm ZFS uses to pick the next free block to allocate. I've read that they are going to start using different algorithms depending on how full the pool is, and may even make it a tunable.
 
Cellsplicer said:
I've also been playing around with my ARC sizes as my zfs write performance seems to decrease after about a day of uptime on 8.0-RELEASE-p2. I've found that at this time, kstat.zfs.misc.arcstats.memory_throttle_count seems to rise when I write to it.

The problem is, FreeBSD seems to ignore any of the values I set in /boot/loader.conf. I've set the following values in loader.conf:

Code:
vm.kmem_size="2048M"
vm.kmem_size_max="2048M"
vfs.zfs.arc_min="256M"
vfs.zfs.arc_max="1024M"
Upon rebooting it seems that FreeBSD is still automatically tuning those values by itself:

Code:
vm.kmem_size_max: 329853485875
vm.kmem_size: 2770690048
vfs.zfs.arc_min: 216460160
vfs.zfs.arc_max: 1731681280

Nevermind. I had a typo in my loader.conf! Sorry for the inconvenience.
 
Information + first test

I have similar issues with performance degradation.
I've noticed it when copying from a UFS2+softupdates FS to a ZFS filesystem. Once the performance issue hits, it affects *all* writes to the ZFS pool, be it a direct copy from UFS2 to ZFS, over Samba or FTP, or dumping with dd.

It does not manifest itself until 'top' reports less than 200M of free memory, and even then it isn't constantly present.

The amount of inact memory increases roughly as fast as free memory decreases - which makes sense.

Below are details on the setup and initial tests.
Due to character limits, I'll have to post the rest of the results in followup posts.

Code:
CPU: Intel(R) Core(TM)2 Duo CPU     E7400  @ 2.80GHz (2806.98-MHz K8-class CPU)
real memory  = 8589934592 (8192 MB)
avail memory = 8253591552 (7871 MB)
atapci1: <Intel ICH10 SATA300 controller> port 0xb000-0xb007,0xac00-0xac03,0xa880-0xa887,0xa800-0xa803,0xa480-0xa49f mem 0xf9fff000-0xf9fff7ff irq 19 at device 31.2 on pci0

Pool:
Code:
NAME           STATE    READ  WRITE CKSUM
storage        ONLINE     0     0     0
 raidz1        ONLINE     0     0     0
   ad10        ONLINE     0     0     0
   ad12        ONLINE     0     0     0
   ad14        ONLINE     0     0     0
   ad16        ONLINE     0     0     0
These are 4x Western Digital Greenpower 1.5TB disks.

file: /boot/loader.conf
Code:
kern.hz=1000
net.inet.tcp.tcbhashsize=4096
net.inet.tcp.hostcache.hashsize=1024
kern.ipc.nmbclusters=65536
vm.kmem_size="4G"
vfs.zfs.vdev.min_pending=1 #default=4
vfs.zfs.vdev.max_pending=1 #default = 35
vfs.zfs.arc_min="3G"
vfs.zfs.arc_max="3G"

# zpool get all storage
Code:
NAME     PROPERTY       VALUE       SOURCE
storage  size           5.44T       -
storage  used           635G        -
storage  available      4.82T       -
storage  capacity       11%         -
storage  altroot        -           default
storage  health         ONLINE      -
storage  guid           <#here>     -
storage  version        13          default
storage  bootfs         -           default
storage  delegation     on          default
storage  autoreplace    off         default
storage  cachefile      -           default
storage  failmode       wait        default
storage  listsnapshots  off         default

Before writing anything
# top -SP
Code:
last pid:  1843;  load averages:  0.00,  0.00,  0.00  up 0+00:15:44  04:41:49
186 processes: 3 running, 163 sleeping, 20 waiting
CPU 0:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 1:  0.4% user,  0.0% nice,  0.0% system,  0.0% interrupt, 99.6% idle
Mem: 40M Active, 21M Inact, 161M Wired, 284K Cache, 44M Buf, 7692M Free
Swap: 4096M Total, 4096M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
   11 root        2 171 ki31     0K    32K CPU0    0  31:37 200.00% idle
   12 root       20 -60    -     0K   320K WAIT    0   0:07  0.00% intr
 1730 root        1  44    0 12652K  2200K nanslp  0   0:00  0.00% gstat

# sysctl kstat.zfs.misc.arcstats
Code:
kstat.zfs.misc.arcstats.hits: 1481
kstat.zfs.misc.arcstats.misses: 110
kstat.zfs.misc.arcstats.demand_data_hits: 0
kstat.zfs.misc.arcstats.demand_data_misses: 0
kstat.zfs.misc.arcstats.demand_metadata_hits: 1477
kstat.zfs.misc.arcstats.demand_metadata_misses: 100
kstat.zfs.misc.arcstats.prefetch_data_hits: 0
kstat.zfs.misc.arcstats.prefetch_data_misses: 0
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 4
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 10
kstat.zfs.misc.arcstats.mru_hits: 294
kstat.zfs.misc.arcstats.mru_ghost_hits: 0
kstat.zfs.misc.arcstats.mfu_hits: 1183
kstat.zfs.misc.arcstats.mfu_ghost_hits: 0
kstat.zfs.misc.arcstats.deleted: 28
kstat.zfs.misc.arcstats.recycle_miss: 0
kstat.zfs.misc.arcstats.mutex_miss: 0
kstat.zfs.misc.arcstats.evict_skip: 0
kstat.zfs.misc.arcstats.hash_elements: 82
kstat.zfs.misc.arcstats.hash_elements_max: 83
kstat.zfs.misc.arcstats.hash_collisions: 0
kstat.zfs.misc.arcstats.hash_chains: 0
kstat.zfs.misc.arcstats.hash_chain_max: 0
kstat.zfs.misc.arcstats.p: 1610612736
kstat.zfs.misc.arcstats.c: 3221225472
kstat.zfs.misc.arcstats.c_min: 3221225472
kstat.zfs.misc.arcstats.c_max: 3221225472
kstat.zfs.misc.arcstats.size: 821232
kstat.zfs.misc.arcstats.hdr_size: 17056
kstat.zfs.misc.arcstats.l2_hits: 0
kstat.zfs.misc.arcstats.l2_misses: 0
kstat.zfs.misc.arcstats.l2_feeds: 0
kstat.zfs.misc.arcstats.l2_rw_clash: 0
kstat.zfs.misc.arcstats.l2_writes_sent: 0
kstat.zfs.misc.arcstats.l2_writes_done: 0
kstat.zfs.misc.arcstats.l2_writes_error: 0
kstat.zfs.misc.arcstats.l2_writes_hdr_miss: 0
kstat.zfs.misc.arcstats.l2_evict_lock_retry: 0
kstat.zfs.misc.arcstats.l2_evict_reading: 0
kstat.zfs.misc.arcstats.l2_free_on_write: 0
kstat.zfs.misc.arcstats.l2_abort_lowmem: 0
kstat.zfs.misc.arcstats.l2_cksum_bad: 0
kstat.zfs.misc.arcstats.l2_io_error: 0
kstat.zfs.misc.arcstats.l2_size: 0
kstat.zfs.misc.arcstats.l2_hdr_size: 0
kstat.zfs.misc.arcstats.memory_throttle_count: 0

First write test
Writing 1GB to the pool.
# dd if=/dev/urandom of=./file1 bs=1m count=1024
Code:
1024+0 records in
1024+0 records out
1073741824 bytes transferred in 14.762347 secs (72735170 bytes/sec)

During write
# zpool iostat 5
Code:
storage      635G  4.82T      0      0      0      0
storage      635G  4.82T      1     29  3.00K  1.09M
storage      635G  4.82T      0      0      0      0
storage      635G  4.82T      0      0      0      0
storage      635G  4.82T      0      0      0      0
storage      635G  4.82T      0  1.59K      0   204M
storage      635G  4.82T      0      0      0      0
The first line is the start of the write; the write ends at around the third line. Some backlogging of writes?

After write
# top -SP
Code:
last pid:  1946;  load averages:  0.00,  0.02,  0.00  up 0+00:31:05  04:57:10
185 processes: 3 running, 162 sleeping, 20 waiting
CPU 0:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 1:  0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
Mem: 40M Active, 21M Inact, 1245M Wired, 284K Cache, 44M Buf, 6608M Free
Swap: 4096M Total, 4096M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
   11 root        2 171 ki31     0K    32K CPU0    0  60:30 199.56% idle
   12 root       20 -60    -     0K   320K WAIT    0   0:13  0.00% intr
    0 root       40  -8    0     0K   624K -       0   0:02  0.00% kernel

# sysctl kstat.zfs.misc.arcstats
Code:
kstat.zfs.misc.arcstats.hits: 1500
kstat.zfs.misc.arcstats.misses: 117
kstat.zfs.misc.arcstats.demand_data_hits: 0
kstat.zfs.misc.arcstats.demand_data_misses: 0
kstat.zfs.misc.arcstats.demand_metadata_hits: 1496
kstat.zfs.misc.arcstats.demand_metadata_misses: 107
kstat.zfs.misc.arcstats.prefetch_data_hits: 0
kstat.zfs.misc.arcstats.prefetch_data_misses: 0
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 4
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 10
kstat.zfs.misc.arcstats.mru_hits: 311
kstat.zfs.misc.arcstats.mru_ghost_hits: 0
kstat.zfs.misc.arcstats.mfu_hits: 1185
kstat.zfs.misc.arcstats.mfu_ghost_hits: 0
kstat.zfs.misc.arcstats.deleted: 38
kstat.zfs.misc.arcstats.recycle_miss: 0
kstat.zfs.misc.arcstats.mutex_miss: 0
kstat.zfs.misc.arcstats.evict_skip: 0
kstat.zfs.misc.arcstats.hash_elements: 8348
kstat.zfs.misc.arcstats.hash_elements_max: 8348
kstat.zfs.misc.arcstats.hash_collisions: 193
kstat.zfs.misc.arcstats.hash_chains: 190
kstat.zfs.misc.arcstats.hash_chain_max: 1
kstat.zfs.misc.arcstats.p: 1610612736
kstat.zfs.misc.arcstats.c: 3221225472
kstat.zfs.misc.arcstats.c_min: 3221225472
kstat.zfs.misc.arcstats.c_max: 3221225472
kstat.zfs.misc.arcstats.size: 1077603800
kstat.zfs.misc.arcstats.hdr_size: 1736384
kstat.zfs.misc.arcstats.l2_hits: 0
kstat.zfs.misc.arcstats.l2_misses: 0
kstat.zfs.misc.arcstats.l2_feeds: 0
kstat.zfs.misc.arcstats.l2_rw_clash: 0
kstat.zfs.misc.arcstats.l2_writes_sent: 0
kstat.zfs.misc.arcstats.l2_writes_done: 0
kstat.zfs.misc.arcstats.l2_writes_error: 0
kstat.zfs.misc.arcstats.l2_writes_hdr_miss: 0
kstat.zfs.misc.arcstats.l2_evict_lock_retry: 0
kstat.zfs.misc.arcstats.l2_evict_reading: 0
kstat.zfs.misc.arcstats.l2_free_on_write: 0
kstat.zfs.misc.arcstats.l2_abort_lowmem: 0
kstat.zfs.misc.arcstats.l2_cksum_bad: 0
kstat.zfs.misc.arcstats.l2_io_error: 0
kstat.zfs.misc.arcstats.l2_size: 0
kstat.zfs.misc.arcstats.l2_hdr_size: 0
kstat.zfs.misc.arcstats.memory_throttle_count: 0
 
Second batch of tests.

Second write test
Writing 1GB to the pool.
[CMD="dd"]if=/dev/urandom of=./file2 bs=1m count=1024[/CMD]
Code:
1024+0 records in
1024+0 records out
1073741824 bytes transferred in 15.163315 secs (70811814 bytes/sec)

During write
[CMD="zpool"]iostat 1[/CMD]
Code:
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
storage      636G  4.82T      0      0      0      0
storage      636G  4.82T      0    136      0  17.1M
storage      636G  4.82T      0      0      0      0
storage      636G  4.82T      0      0      0      0
storage      636G  4.82T      0      0      0      0
storage      636G  4.82T      0      0      0      0
storage      636G  4.82T      0      0      0      0
storage      636G  4.82T      0      0      0      0
storage      636G  4.82T      0      0      0      0
storage      636G  4.82T      0      0      0      0
storage      636G  4.82T      0      0      0      0
storage      636G  4.82T      0      0      0      0
storage      636G  4.82T      0      0      0      0
storage      636G  4.82T      0      0      0      0
storage      636G  4.82T      0      0      0      0
storage      636G  4.82T      0      0      0      0
storage      636G  4.82T      0  1.74K      0   211M
storage      636G  4.82T      0  1.82K      0   232M
storage      636G  4.82T      0  2.02K      0   257M
storage      636G  4.82T      0  1.90K      0   241M
storage      636G  4.82T      0    575      0  62.1M
storage      638G  4.81T      0      0      0      0
storage      638G  4.81T      0      0      0      0
storage      638G  4.81T      0      0      0      0
storage      638G  4.81T      0      0      0      0
storage      638G  4.81T      0      0      0      0

The first line is the start of the write; dd finished before the first 211M line. Still some backlogging of writes (which may be intended?).

After write
[CMD="top"]-SP[/CMD]
Code:
last pid:  1994;  load averages:  0.04,  0.03,  0.01  up 0+00:38:17  05:04:22
185 processes: 3 running, 162 sleeping, 20 waiting
CPU 0:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 1:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 40M Active, 21M Inact, 2281M Wired, 284K Cache, 44M Buf, 5571M Free
Swap: 4096M Total, 4096M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
   11 root        2 171 ki31     0K    32K CPU0    0  74:32 199.17% idle
   12 root       20 -60    -     0K   320K WAIT    0   0:16  0.00% intr
    0 root       40  -8    0     0K   624K -       1   0:04  0.00% kernel

[CMD="sysctl"]kstat.zfs.misc.arcstats[/CMD]
Code:
kstat.zfs.misc.arcstats.hits: 1518
kstat.zfs.misc.arcstats.misses: 117
kstat.zfs.misc.arcstats.demand_data_hits: 0
kstat.zfs.misc.arcstats.demand_data_misses: 0
kstat.zfs.misc.arcstats.demand_metadata_hits: 1514
kstat.zfs.misc.arcstats.demand_metadata_misses: 107
kstat.zfs.misc.arcstats.prefetch_data_hits: 0
kstat.zfs.misc.arcstats.prefetch_data_misses: 0
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 4
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 10
kstat.zfs.misc.arcstats.mru_hits: 323
kstat.zfs.misc.arcstats.mru_ghost_hits: 0
kstat.zfs.misc.arcstats.mfu_hits: 1191
kstat.zfs.misc.arcstats.mfu_ghost_hits: 0
kstat.zfs.misc.arcstats.deleted: 48
kstat.zfs.misc.arcstats.recycle_miss: 0
kstat.zfs.misc.arcstats.mutex_miss: 0
kstat.zfs.misc.arcstats.evict_skip: 0
kstat.zfs.misc.arcstats.hash_elements: 16605
kstat.zfs.misc.arcstats.hash_elements_max: 16605
kstat.zfs.misc.arcstats.hash_collisions: 927
kstat.zfs.misc.arcstats.hash_chains: 883
kstat.zfs.misc.arcstats.hash_chain_max: 3
kstat.zfs.misc.arcstats.p: 2150481920
kstat.zfs.misc.arcstats.c: 3221225472
kstat.zfs.misc.arcstats.c_min: 3221225472
kstat.zfs.misc.arcstats.c_max: 3221225472
kstat.zfs.misc.arcstats.size: 2154261472
kstat.zfs.misc.arcstats.hdr_size: 3453840
kstat.zfs.misc.arcstats.l2_hits: 0
kstat.zfs.misc.arcstats.l2_misses: 0
kstat.zfs.misc.arcstats.l2_feeds: 0
kstat.zfs.misc.arcstats.l2_rw_clash: 0
kstat.zfs.misc.arcstats.l2_writes_sent: 0
kstat.zfs.misc.arcstats.l2_writes_done: 0
kstat.zfs.misc.arcstats.l2_writes_error: 0
kstat.zfs.misc.arcstats.l2_writes_hdr_miss: 0
kstat.zfs.misc.arcstats.l2_evict_lock_retry: 0
kstat.zfs.misc.arcstats.l2_evict_reading: 0
kstat.zfs.misc.arcstats.l2_free_on_write: 0
kstat.zfs.misc.arcstats.l2_abort_lowmem: 0
kstat.zfs.misc.arcstats.l2_cksum_bad: 0
kstat.zfs.misc.arcstats.l2_io_error: 0
kstat.zfs.misc.arcstats.l2_size: 0
kstat.zfs.misc.arcstats.l2_hdr_size: 0
kstat.zfs.misc.arcstats.memory_throttle_count: 0

Write tests three through seven show no performance degradation (69-72 MB/s).
# top -SP (after seventh run)
Code:
last pid:  2058;  load averages:  0.20,  0.15,  0.06  up 0+00:45:42  05:11:47
186 processes: 3 running, 163 sleeping, 20 waiting
CPU 0:  0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
CPU 1:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 40M Active, 21M Inact, 3288M Wired, 284K Cache, 44M Buf, 4564M Free
Swap: 4096M Total, 4096M Free

So far so good.
 
Third batch.

Copying a file from a UFS2+softupdates FS (/usr/home) to the pool.
The file is 23178715182 bytes (22G). Let's see what happens.
Code:
15s: Mem: 42M Active, 287M Inact, 4211M Wired, 284K Cache, 828M Buf, 3374M Free
25s: Mem: 42M Active, 1139M Inact, 4128M Wired, 284K Cache, 828M Buf, 2605M Free
35s: Mem: 42M Active, 1932M Inact, 4081M Wired, 284K Cache, 828M Buf, 1860M Free
45s: Mem: 42M Active, 2767M Inact, 4180M Wired, 284K Cache, 828M Buf, 925M Free
'zpool iostat' says between 0 and 170MB/s writes.
later:
Code:
Mem: 42M Active, 3304M Inact, 4085M Wired, 284K Cache, 828M Buf, 483M Free
Performance just dropped to about 60MB/s, according to 'zpool iostat'.

Even later:
Code:
Mem: 56M Active, 3506M Inact, 4025M Wired, 116M Cache, 828M Buf, 211M Free
'zpool iostat' says between 10M and 50M writes, when there are any. gstat is still reporting "healthy" busy levels for all disks (below 70%).

The Problem
Even later, still copying the same file.
zpool iostat:
Code:
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
storage      683G  4.77T      0      0      0      0
storage      683G  4.77T      0     50      0  5.77M
storage      683G  4.77T      0      1      0   255K
storage      683G  4.77T      0      0      0      0
storage      683G  4.77T      0      1      0   255K
storage      683G  4.77T      0      0      0      0
storage      683G  4.77T      0     45      0  5.73M
storage      683G  4.77T      0     73      0  9.22M
storage      683G  4.77T      1    283  2.49K  26.3M
storage      683G  4.77T      0      0      0      0
storage      683G  4.77T      0      0      0      0
storage      683G  4.77T      0      0      0      0
storage      683G  4.77T      0     50      0  5.77M
storage      683G  4.77T      0    362      0  41.1M
storage      683G  4.77T      0      0      0      0
storage      683G  4.77T      2      2  3.99K  1.50K
storage      683G  4.77T      0      0      0      0
storage      683G  4.77T      0     17      0  25.4K
storage      683G  4.77T      0      0      0      0
storage      683G  4.77T      0     20      0  29.9K
storage      683G  4.77T      0      0      0    510
storage      683G  4.77T      0      0      0      0
storage      683G  4.77T      0     16      0  27.9K
storage      683G  4.77T      0      0      0      0
storage      683G  4.77T      0      0      0      0
storage      683G  4.77T      0     46      0  5.80M
storage      683G  4.77T      0    229      0  28.4M
storage      683G  4.77T      0      0      0      0
storage      683G  4.77T      0     50      0  5.87M
storage      683G  4.77T      0      0      0      0
storage      683G  4.77T      0    100      0  11.5M
gstat
Code:
dT: 1.003s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    1     91     91  11506    1.4      0      0    0.0   13.1| ad8
    1     91     91  11506    1.5      0      0    0.0   13.2| ad8s1
    0      0      0      0    0.0      0      0    0.0    0.0| ad8s1a
    0      0      0      0    0.0      0      0    0.0    0.0| ad8s1b
    0      0      0      0    0.0      0      0    0.0    0.0| ad8s1d
    0      0      0      0    0.0      0      0    0.0    0.0| ad8s1e
    1     91     91  11506    1.5      0      0    0.0   13.3| ad8s1f
    1     25      0      0    0.0     24   1963    0.9  184.6| ad10
    0    106      0      0    0.0    106   8761    1.1   12.0| ad12
    0    107      0      0    0.0    107   8760    1.1   12.2| ad14
    0    109      0      0    0.0    109   8754    1.0   11.1| ad16

top:
Code:
last pid:  2161;  load averages:  0.25,  0.33,  0.27  up 0+01:02:50  05:28:55
187 processes: 3 running, 164 sleeping, 20 waiting
CPU 0:  0.0% user,  0.0% nice,  0.0% system,  0.4% interrupt, 99.6% idle
CPU 1:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 56M Active, 3606M Inact, 4029M Wired, 38M Cache, 828M Buf, 186M Free
Swap: 4096M Total, 48K Used, 4096M Free
The moment the "Mem: Free" status in top went below 200M, there was a severe performance hit.
Okay, this seems to be pretty much exactly what I observed earlier.

sysctl kstat.zfs.misc.arcstats
Code:
kstat.zfs.misc.arcstats.hits: 49109
kstat.zfs.misc.arcstats.misses: 1769
kstat.zfs.misc.arcstats.demand_data_hits: 22112
kstat.zfs.misc.arcstats.demand_data_misses: 1
kstat.zfs.misc.arcstats.demand_metadata_hits: 6058
kstat.zfs.misc.arcstats.demand_metadata_misses: 1335
kstat.zfs.misc.arcstats.prefetch_data_hits: 20310
kstat.zfs.misc.arcstats.prefetch_data_misses: 0
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 629
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 433
kstat.zfs.misc.arcstats.mru_hits: 26543
kstat.zfs.misc.arcstats.mru_ghost_hits: 1073
kstat.zfs.misc.arcstats.mfu_hits: 1627
kstat.zfs.misc.arcstats.mfu_ghost_hits: 27
kstat.zfs.misc.arcstats.deleted: 294263
kstat.zfs.misc.arcstats.recycle_miss: 3619
kstat.zfs.misc.arcstats.mutex_miss: 0
kstat.zfs.misc.arcstats.evict_skip: 702
kstat.zfs.misc.arcstats.hash_elements: 28690
kstat.zfs.misc.arcstats.hash_elements_max: 48274
kstat.zfs.misc.arcstats.hash_collisions: 63018
kstat.zfs.misc.arcstats.hash_chains: 2737
kstat.zfs.misc.arcstats.hash_chain_max: 5
kstat.zfs.misc.arcstats.p: 3199266304
kstat.zfs.misc.arcstats.c: 3221225472
kstat.zfs.misc.arcstats.c_min: 3221225472
kstat.zfs.misc.arcstats.c_max: 3221225472
kstat.zfs.misc.arcstats.size: 3154617112
kstat.zfs.misc.arcstats.hdr_size: 6015360
kstat.zfs.misc.arcstats.l2_hits: 0
kstat.zfs.misc.arcstats.l2_misses: 0
kstat.zfs.misc.arcstats.l2_feeds: 0
kstat.zfs.misc.arcstats.l2_rw_clash: 0
kstat.zfs.misc.arcstats.l2_writes_sent: 0
kstat.zfs.misc.arcstats.l2_writes_done: 0
kstat.zfs.misc.arcstats.l2_writes_error: 0
kstat.zfs.misc.arcstats.l2_writes_hdr_miss: 0
kstat.zfs.misc.arcstats.l2_evict_lock_retry: 0
kstat.zfs.misc.arcstats.l2_evict_reading: 0
kstat.zfs.misc.arcstats.l2_free_on_write: 0
kstat.zfs.misc.arcstats.l2_abort_lowmem: 0
kstat.zfs.misc.arcstats.l2_cksum_bad: 0
kstat.zfs.misc.arcstats.l2_io_error: 0
kstat.zfs.misc.arcstats.l2_size: 0
kstat.zfs.misc.arcstats.l2_hdr_size: 0
kstat.zfs.misc.arcstats.memory_throttle_count: 1103
 
And the final bunch of data

Now to rerun the previous tests.
Test 9
dd if=/dev/urandom of=./file9 bs=1m count=1024
Code:
1024+0 records in
1024+0 records out
1073741824 bytes transferred in 26.445673 secs (40601796 bytes/sec)
It went down from ~70MB/s to ~40MB/s!

sysctl kstat.zfs.misc.arcstats
Code:
kstat.zfs.misc.arcstats.hits: 50068
kstat.zfs.misc.arcstats.misses: 1772
kstat.zfs.misc.arcstats.demand_data_hits: 22496
kstat.zfs.misc.arcstats.demand_data_misses: 1
kstat.zfs.misc.arcstats.demand_metadata_hits: 6257
kstat.zfs.misc.arcstats.demand_metadata_misses: 1338
kstat.zfs.misc.arcstats.prefetch_data_hits: 20686
kstat.zfs.misc.arcstats.prefetch_data_misses: 0
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 629
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 433
kstat.zfs.misc.arcstats.mru_hits: 27114
kstat.zfs.misc.arcstats.mru_ghost_hits: 1073
kstat.zfs.misc.arcstats.mfu_hits: 1639
kstat.zfs.misc.arcstats.mfu_ghost_hits: 30
kstat.zfs.misc.arcstats.deleted: 294281
kstat.zfs.misc.arcstats.recycle_miss: 3643
kstat.zfs.misc.arcstats.mutex_miss: 0
kstat.zfs.misc.arcstats.evict_skip: 702
kstat.zfs.misc.arcstats.hash_elements: 38480
kstat.zfs.misc.arcstats.hash_elements_max: 48274
kstat.zfs.misc.arcstats.hash_collisions: 65447
kstat.zfs.misc.arcstats.hash_chains: 4725
kstat.zfs.misc.arcstats.hash_chain_max: 5
kstat.zfs.misc.arcstats.p: 3199118848
kstat.zfs.misc.arcstats.c: 3221225472
kstat.zfs.misc.arcstats.c_min: 3221225472
kstat.zfs.misc.arcstats.c_max: 3221225472
kstat.zfs.misc.arcstats.size: 3149963720
kstat.zfs.misc.arcstats.hdr_size: 8003840
kstat.zfs.misc.arcstats.l2_hits: 0
kstat.zfs.misc.arcstats.l2_misses: 0
kstat.zfs.misc.arcstats.l2_feeds: 0
kstat.zfs.misc.arcstats.l2_rw_clash: 0
kstat.zfs.misc.arcstats.l2_writes_sent: 0
kstat.zfs.misc.arcstats.l2_writes_done: 0
kstat.zfs.misc.arcstats.l2_writes_error: 0
kstat.zfs.misc.arcstats.l2_writes_hdr_miss: 0
kstat.zfs.misc.arcstats.l2_evict_lock_retry: 0
kstat.zfs.misc.arcstats.l2_evict_reading: 0
kstat.zfs.misc.arcstats.l2_free_on_write: 0
kstat.zfs.misc.arcstats.l2_abort_lowmem: 0
kstat.zfs.misc.arcstats.l2_cksum_bad: 0
kstat.zfs.misc.arcstats.l2_io_error: 0
kstat.zfs.misc.arcstats.l2_size: 0
kstat.zfs.misc.arcstats.l2_hdr_size: 0
kstat.zfs.misc.arcstats.memory_throttle_count: 1169

Copying a bunch of files from the UFS2 filesystem to the ZFS filesystem, all of them LARGE files.
Performance is good for a while (zpool iostat claims ~40MB/s), until suddenly this happens:
# zpool iostat 1
Code:
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
storage      670G  4.78T      0    439      0  44.6M
storage      670G  4.78T      0      0      0      0
storage      670G  4.78T      0      0      0      0
storage      670G  4.78T      0      0      0      0
storage      671G  4.78T      0      3      0   464K
storage      671G  4.78T      0     50      0  5.75M
storage      671G  4.78T      0      0      0      0
storage      671G  4.78T      0    435      0  44.6M
storage      671G  4.78T      0      0      0      0
storage      671G  4.78T      0      0      0      0
storage      671G  4.78T      0      0      0      0
storage      671G  4.78T      0      0      0      0
storage      671G  4.78T      0     50      0  5.77M
storage      671G  4.78T      0    276      0  34.2M
storage      671G  4.78T      0      0      0      0
storage      671G  4.78T      0    158      0  10.3M
storage      671G  4.78T      0      0      0      0
storage      671G  4.78T      0      0      0      0
storage      671G  4.78T      0      0      0      0
storage      671G  4.78T      0      0      0      0
storage      671G  4.78T      0    288      0  35.5M
storage      671G  4.78T      0    199      0  15.1M
storage      671G  4.78T      0      0      0      0
storage      671G  4.78T      0      0      0      0
storage      671G  4.78T      0      0      0      0
storage      671G  4.78T      0     51      0  5.77M
storage      671G  4.78T      0    440      0  45.1M

sysctl kstat.zfs.misc.arcstats
Code:
kstat.zfs.misc.arcstats.hits: 246502
kstat.zfs.misc.arcstats.misses: 12076
kstat.zfs.misc.arcstats.demand_data_hits: 26592
kstat.zfs.misc.arcstats.demand_data_misses: 7
kstat.zfs.misc.arcstats.demand_metadata_hits: 193843
kstat.zfs.misc.arcstats.demand_metadata_misses: 7060
kstat.zfs.misc.arcstats.prefetch_data_hits: 24321
kstat.zfs.misc.arcstats.prefetch_data_misses: 2
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 1746
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 5007
kstat.zfs.misc.arcstats.mru_hits: 141208
kstat.zfs.misc.arcstats.mru_ghost_hits: 2008
kstat.zfs.misc.arcstats.mfu_hits: 79334
kstat.zfs.misc.arcstats.mfu_ghost_hits: 32
kstat.zfs.misc.arcstats.deleted: 365410
kstat.zfs.misc.arcstats.recycle_miss: 3737
kstat.zfs.misc.arcstats.mutex_miss: 0
kstat.zfs.misc.arcstats.evict_skip: 702
kstat.zfs.misc.arcstats.hash_elements: 33907
kstat.zfs.misc.arcstats.hash_elements_max: 48274
kstat.zfs.misc.arcstats.hash_collisions: 79293
kstat.zfs.misc.arcstats.hash_chains: 3648
kstat.zfs.misc.arcstats.hash_chain_max: 5
kstat.zfs.misc.arcstats.p: 3214337024
kstat.zfs.misc.arcstats.c: 3221225472
kstat.zfs.misc.arcstats.c_min: 3221225472
kstat.zfs.misc.arcstats.c_max: 3221225472
kstat.zfs.misc.arcstats.size: 1900052248
kstat.zfs.misc.arcstats.hdr_size: 7131904
kstat.zfs.misc.arcstats.l2_hits: 0
kstat.zfs.misc.arcstats.l2_misses: 0
kstat.zfs.misc.arcstats.l2_feeds: 0
kstat.zfs.misc.arcstats.l2_rw_clash: 0
kstat.zfs.misc.arcstats.l2_writes_sent: 0
kstat.zfs.misc.arcstats.l2_writes_done: 0
kstat.zfs.misc.arcstats.l2_writes_error: 0
kstat.zfs.misc.arcstats.l2_writes_hdr_miss: 0
kstat.zfs.misc.arcstats.l2_evict_lock_retry: 0
kstat.zfs.misc.arcstats.l2_evict_reading: 0
kstat.zfs.misc.arcstats.l2_free_on_write: 0
kstat.zfs.misc.arcstats.l2_abort_lowmem: 0
kstat.zfs.misc.arcstats.l2_cksum_bad: 0
kstat.zfs.misc.arcstats.l2_io_error: 0
kstat.zfs.misc.arcstats.l2_size: 0
kstat.zfs.misc.arcstats.l2_hdr_size: 0
kstat.zfs.misc.arcstats.memory_throttle_count: 1535

top:
Code:
last pid:  2307;  load averages:  0.32,  0.28,  0.17  up 0+01:23:49  05:49:54
186 processes: 3 running, 163 sleeping, 20 waiting
CPU 0:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 1:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 58M Active, 4637M Inact, 2894M Wired, 102M Cache, 828M Buf, 222M Free
Swap: 4096M Total, 40K Used, 4096M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
   11 root        2 171 ki31     0K    32K RUN     0 157:22 198.10% idle
    0 root       40  -8    0     0K   624K -       1   1:43  0.00% kernel
   12 root       20 -60    -     0K   320K WAIT    0   1:01  0.00% intr

After some time, performance picks up again, into the 40-50MB/s range. Still nowhere near where it was in the beginning, though.


Whew, that ended up being a lot of data.
I really hope this information helps in finding out where the write performance degradation resides. :)
 
The problem is in maxvnodes. Yesterday I set kern.maxvnodes to 10000 (down from 800000), and even after copying a few gigs from UFS to ZFS, active/inactive memory stayed below 500MB, which leaves enough free memory for ZFS.

If free memory goes below a certain level, performance drops as well.
As far as I can see, the vnode cache goes into active/inactive memory, while the ZFS ARC goes into wired. Correct me if I'm wrong :)
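(A minimal sketch of trying this at runtime - 10000 is just the value Matty used, and the setting can also be made persistent in /etc/sysctl.conf:)
Code:
# sysctl vfs.numvnodes kern.maxvnodes
# sysctl kern.maxvnodes=10000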
 
Today I added the following directive to my /boot/loader.conf:

Code:
vfs.zfs.txg.synctime="1"

It seems to have made a difference, and write throttles occur less often. The pool is able to sustain writes of over 80MB/s, versus the 20-30MB/s I was getting with the default sync time of 5 seconds. Of course this is still down from the 200MB/s+ that I should be getting, but it's a good start.

My theory is that by lowering the sync time, ZFS starts writing to the disk straight away, before the ARC fills up. Write throttles seem to occur when the ARC fills up and it needs to write to the disks. If the drives can't keep up during an I/O intensive operation then a write throttle will occur and applications will have to wait until ZFS can sync the disks.
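(Both the tunable and the throttle counter mentioned above can be read back at runtime, if you want to watch the effect during a copy:)
Code:
# sysctl vfs.zfs.txg.synctime
# sysctl kstat.zfs.misc.arcstats.memory_throttle_count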
 
No performance gain from doing that here - write speed actually plummeted to 30-40MB/s right away.

I notice I forgot to mention earlier that I'm running FreeBSD 8.0-RELEASE-p2 amd64.
 
@Savagedlight: Nice work. Thanks for your data. I see the exact same patterns as you do.

@Matty: I'm not sure about your vnodes theory. Whether my performance is good or bad, my vnode count seems to stay around 12,000.


Memory:
I've been watching my memory usage and I have no idea what is consuming memory as 'Active'.

Last night I had around 6500MB 'Active' again, 1500MB Wired, no inact, ~30MB buf, no free, and ~100MB swap used. My performance copying ZFS->ZFS was again slow (<1MB/s). I tried killing rTorrent and no significant amount of memory was reclaimed - maybe 100MB. `ps aux` showed no processes using any significant amount of memory, and I was definitely nowhere near 6500MB usage.

I tried running a perl oneliner to hog a bunch of memory (perl -e '$x="x"x3000000000'), and almost all of the Active memory was IMMEDIATELY marked as Free, and my performance was excellent again.

I'm not sure what in userland could be causing the issue. The only things I've installed are rTorrent, lighttpd, samba, smartmontools, vim, bash, Python, Perl, and SABNZBd. There is nothing that *should* be consuming any serious amount of memory.


Drives:
One thing that came up on FreeBSD-stable is that WD Green drives have a serious issue with their power management under FreeBSD/Linux. The drive heads park after 8 seconds of inactivity, but BSD/Linux sync to disk less frequently than this, so the drives are continually parking and unparking.

The drives are rated for ~300,000 load cycles; after 2000 power on hours my drives are already over 90,000 load cycles. I used a tool released by WD to change the idle timeout from 8 seconds to 5 minutes (the maximum) and haven't seen my load cycle count increase yet.

Use smartctl to check the Load_Cycle_Count attribute on your drives, and if it is unusually high, get the tool 'wdidle3' and boot to a DOS environment to use it to configure your drive timeouts.
Details: http://www.silentpcreview.com/Terabyte_Drive_Fix
wdidle3: http://home.arcor.de/ghostadmin/wdidle3_1_00.zip
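(To check the attribute, something along these lines should do; the device name is just one of the drives from the pool listings earlier in this thread:)
Code:
# smartctl -A /dev/ad10 | grep Load_Cycle_Count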

Doing this does not seem to have helped with my performance issues, but it ought to help drive longevity.
 
garrettmoore said:



Memory:
I've been watching my memory usage and I have no idea what is consuming memory as 'Active'.

Last night I had around 6500MB 'Active' again, 1500MB Wired, no inact, ~30MB buf, no free, and ~100MB swap used. My performance copying ZFS->ZFS was again slow (<1MB/s). I tried killing rTorrent and no significant amount of memory was reclaimed - maybe 100MB. `ps aux` showed no processes using any significant amount of memory, and I was definitely nowhere near 6500MB usage.

Which of these programs is reading or writing on UFS?
 
All of the apps are installed to a UFS gmirror. Configuration files are stored on UFS.

All data (rTorrent torrents, folders shared via samba, download locations for sabnzbd, etc) is on ZFS.
 
I realize it's a dirty workaround, but I've set the perl one-liner to run every five minutes. It really does make ZFS' performance improve for a while. By improve, I mean "jump up from about 20MB/s writes to about 230MB/s writes". :)
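(For reference, the workaround boils down to an /etc/crontab entry roughly like the following; the perl path and the 3GB allocation size are assumptions for this particular box:)
Code:
*/5  *  *  *  *  root  /usr/local/bin/perl -e '$x="x"x3000000000'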

It seems to me like there is *something* related to the filesystems (cache, perhaps?) which causes memory to be marked as Inact even after that data is no longer needed - and the system simply doesn't want to *use* Inact memory, even when there's no 'free' memory left.

Right now, while copying from UFS2+SoftUpdates to ZFS:
Code:
Mem: 14M Active, 2487M Inact, 5155M Wired, 202M Cache, 828M Buf, 56M Free
(yes, I bumped the ARC size to 4G and kernel size to start at 5G)
 
I can confirm that the perl trick works. After running the following:

Code:
perl -e '$x="x"x3000000000'

I am able to write at over 200MB/s for a short while.
 
Slightly off-topic:
garrettmoore said:
Drives:
I used a tool released by WD to change the idle timeout from 8 seconds to 5 minutes (the maximum) and haven't seen my load cycle count increase yet.

The documentation for the tool states you can use the /D switch to disable the timer. I assume this means it won't ever enter idle mode. Is this worth it, or would that cause longevity issues of its own?

On-topic again:
Custom kernel:
Amid the hassle of trying to make ZFS work properly, I forgot I had compiled a custom kernel. Said kernel was built with these make.conf lines:
Code:
CPUTYPE=core2
COPTFLAGS= -O2 -pipe

It also had these customizations:
Code:
#makeoptions     DEBUG=-g                # Build kernel without gdb(1) debug symbols
options         QUOTA
options         DEVICE_POLLING

I've recompiled the kernel w/o the COPTFLAGS, and reenabled the default debug options. Going to see if this changes anything.

Memory
Is there any way to tell FreeBSD's memory manager to flush out inactive memory, other than by forcing it as described earlier in the thread?
 
Savagedlight said:
Slightly off-topic:
The documentation for the tool states you can use the /D switch to disable the timer. I assume this means it won't ever enter idle mode. Is this worth it, or would that cause longevity issues of its own?
People on the silentpcreview forums claimed that sometimes the disable switch would cause the timer to just stay at its default setting of 8 seconds instead of actually disabling it. I set the idle time to 5 minutes over 24 hours ago and the drives haven't parked again since.
 
Savagedlight said:
I realize it's a dirty workaround, but I've set the perl one-liner to run every five minutes. It really does make ZFS' performance improve for a while. By improve, I mean "jump up from about 20MB/s writes to about 230MB/s writes". :)

It seems to me like there is *something* related to the filesystems (cache, perhaps?) which causes memory to be marked as Inact even after that data is no longer needed - and the system simply doesn't want to *use* Inact memory, even when there's no 'free' memory left.

Right now, while copying from UFS2+SoftUpdates to ZFS:
Code:
Mem: 14M Active, 2487M Inact, 5155M Wired, 202M Cache, 828M Buf, 56M Free
(yes, I bumped the ARC size to 4G and kernel size to start at 5G)
You could try setting maxvnodes to a much lower level.
On a 4GB machine, a 2800M ARC leaves around 1GB free, and with maxvnodes set to 12000 a maximum of roughly 500MB gets used by UFS, which leaves enough room for proper ZFS performance.
Worked for me.


11.13.3.1 kern.maxvnodes

A vnode is the internal representation of a file or directory. So increasing the number of vnodes available to the operating system cuts down on disk I/O.
Normally this is handled by the operating system and does not need to be changed. In some cases where disk I/O is a bottleneck and the system is running out of vnodes, this setting will need to be increased.
The amount of inactive and free RAM will need to be taken into account.

To see the current number of vnodes in use:

Code:
# sysctl vfs.numvnodes
vfs.numvnodes: 91349

To see the maximum vnodes:

Code:
# sysctl kern.maxvnodes
kern.maxvnodes: 100000

If the current vnode usage is near the maximum, increasing kern.maxvnodes by a value of 1,000 is probably a good idea.
Keep an eye on the number of vfs.numvnodes. If it climbs up to the maximum again, kern.maxvnodes will need to be increased further.
A shift in your memory usage as reported by top(1) should be visible. More memory should be active.
http://www.freebsd.org/doc/handbook/configtuning-kernel-limits.html
 
The one thing that's for sure is that ZFS on FreeBSD competes for your RAM with other filesystems.

Meaning: whenever top shows memory as being "Inactive", it effectively means that ZFS cannot use this memory for its caches.

So what happens when you copy a large amount of data from a UFS/ext2/msdos filesystem to a ZFS filesystem is that more and more memory becomes "Inactive", while the ZFS caches are shrunk to a minimum. Even after the copy job finishes, that memory stays "Inactive". You have to unmount the UFS/ext2/msdos filesystem to make the "Inactive" memory usable for ZFS again.

That doesn't mean there are no other problems. I don't use ZFS on a permanently running machine, and I don't use raidz. But you should check with top whether you have large amounts of "Inactive" memory, and if you do, unmount the non-ZFS filesystems (if you can) and see whether performance returns to normal.
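(A minimal way to test that theory, assuming the UFS filesystem can be unmounted for a moment - /usr/home here is just the mount point mentioned earlier in this thread:)
Code:
# umount /usr/home
# top -SP   # check whether Inact shrinks and Free grows afterwards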
 
Seems like Matty's last post did the trick for me.
ZFS performance seems to be good after adding these lines to /etc/sysctl.conf:
Code:
kern.maxvnodes=10000
kern.minvnodes=1000
(I figured it made sense to change minvnodes to something less than maxvnodes, since it defaulted to 25k for me)
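(To confirm the new limits took effect and to keep an eye on actual vnode usage, something like this should work:)
Code:
# sysctl kern.maxvnodes kern.minvnodes vfs.numvnodes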
 
garrettmoore said:
Hi,

I'm having problems with ZFS performance. When my system comes up, read/write speeds are excellent (testing with dd if=/dev/zero of=/tank/bigfile and dd if=/tank/bigfile of=/dev/null); I get at least 100MB/s on both reads and writes, and I'm happy with that.

The longer the system is up, the worse my performance gets. Currently my system has been up for 4 days, and read/write performance is down to about 10MB/s at best.

The system is only accessed by 3 clients: myself, my roommate, and our HTPC. Usually, only one client will be doing anything at a time, so it is not under heavy load or anything.

* rTorrent - read and write, usually active, not doing much for reads (not doing any major seeding)
* SABnzbd+ - write only, not always active
* Lighttpd - running ruTorrent (web interface for rTorrent); nothing else
* samba - all of our clients are running Windows, so we use samba to network-mount the zpool

Welcome!

Encountered this problem - the culprit was rtorrent (it eventually marked all RAM as Active and left no memory for ZFS - hence the slowdown).

Translate and read this link: http://www.opennet.ru/openforum/vsluhforumID1/87116.html

And use this patch for rtorrent: http://www.opennet.ru/openforum/vsluhforumID1/87116.html # 14

Good Luck!
 
garrettmoore said:
I'm not sure what in userland could be causing the issue. The only things I've installed are rTorrent, lighttpd, samba, smartmontools, vim, bash, Python, Perl, and SABNZBd. There is nothing that *should* be consuming any serious amount of memory.

One thing I can say for sure is that with rtorrent, if you do not LIMIT the memory rtorrent has available to it on a ZFS system, it will not work well.

I had this EXACT same issue on a server; by limiting both the ARC and rtorrent's memory I was able to get consistent performance.
 
Dorlas said:
Welcome!

Encountered this problem - the culprit was rtorrent (it eventually marked all RAM as Active and left no memory for ZFS - hence the slowdown).

Translate and read this link: http://www.opennet.ru/openforum/vsluhforumID1/87116.html

And use this patch for rtorrent: http://www.opennet.ru/openforum/vsluhforumID1/87116.html # 14

Good Luck!

You shouldn't have to patch rtorrent.
It's QUITE simple to just limit rtorrent to a maximum amount of memory with a line like this:

Code:
max_memory_usage = 1024M
 