ZFS Wired memory keeps growing

Hi, I'm posting this in storage because the problem described did not happen prior to using ZFS.

I have several FreeBSD 9.3-RELEASE systems, kept up to date using freebsd-update(8) and pkg-update(8). They are running on SuperMicro hardware and AMD processors (6 to 12 core CPUs) and have 16-32GB memory.

Server1 (32 GB) primarily runs Dovecot for IMAP and Postfix for SMTP, although it also runs spamassassin/amavisd/clamav, BIND, MySQL, Apache+mod_php and flow-capture. It also runs a handful of jails for anti-spam and some FAMP applications. Server2 (20 GB) primarily runs FAMP applications, a handful of jails, and BIND.

Both of these had been running on UFS for several years and had no problems related to wired memory. Both were upgraded to ZFS in the last 6 months. Each has 4 drives in a mirrored pool and an SSD for L2ARC, SLOG and swap. Server1 has vfs.zfs.arc_max set to 2 GB and server2 has it set to 1 GB.

Since the upgrade, wired memory on server1, as reported by top(1), grows by about 1 GB per day. Once it reaches about 28 GB the system starts swapping, and within a day or two it becomes unusable. Server2, however, does not exhibit this problem; its wired memory does not keep growing. I have to reboot server1 about once a month because of this issue.

Prior to the last reboot of server1, I shut down all jails and processes on the machine except ssh, and the wired memory did not decrease. Output from server1 prior to the reboot is listed below.

Suggestions are appreciated. Also let me know if this is better posted to another forum.

===================================================
Code:
# top
last pid: 85846;  load averages:  0.02,  0.43,  0.99  up 28+19:18:48    06:50:00
162 processes: 1 running, 156 sleeping, 5 stopped
Mem: 152M Active, 750M Inact, 26G Wired, 561M Cache, 3310M Buf, 4136M Free
ARC: 2048M Total, 1139M MFU, 107M MRU, 41M Anon, 247M Header, 541M Other
Swap: 8192M Total, 79M Used, 8112M Free


Code:
# vmstat -m
         Type InUse MemUse HighUse Requests  Size(s)
         cdev     9     3K       -        9  256
      CAM DEV    19    38K       -     1755  2048
        sigio    22     2K       -    22567  64
     filedesc  1989  2034K       - 25118472  16,32,64,128,256,512,1024,2048,4096
      kdtrace  4297   917K       - 21308170  64,256
         kenv    85    11K       -      101  16,32,64,128
       kqueue  1444  1643K       - 245467159  256,512,2048,4096
      CAM XPT    69     7K       -     1937  16,32,64,128,256,1024,2048
    proc-args   751    59K       - 368462218  16,32,64,128,256
        hhook     2     1K       -        2  256
       kbdmux     7    18K       -        7  16,512,1024,2048
      ithread   138    23K       -      138  32,128,256
       prison    16    33K       -       18  16,4096
       KTRACE   100    13K       -      100  128
          LED     2     1K       -        2  16,128
       linker   236    67K       -      258  16,32,64,128,256,512,1024,2048,4096
        lockf  2800   251K       - 3668556859  64,128,256,512,1024,2048,4096
   loginclass     4     1K       -   249769  64
      entropy  1024    64K       -     1024  64
       ip6ndp     9     1K       -       10  64,128
       ip6opt     4     1K       -  9342118  256
         temp    43    18K       - 95611931  16,32,64,128,256,512,1024,2048,4096
       devbuf 17394 34470K       -    17596  16,32,64,128,256,512,1024,2048,4096
        cache     1     1K       -        1  32
       module   494    62K       -      494  128
     mtx_pool     2    16K       -        2  
    CAM queue    39    10K       -     7013  16,32,64,512
          osd   118     2K       -   645666  16,32,64
      CAM SIM    12     3K       -       12  256
     pmchooks     1     1K       -        1  128
      acpidev    54     4K       -       54  64
      subproc  3891  4882K       - 12327428  512,4096
         proc     2   128K       -        2  
      session   187    24K       -   522831  128
         pgrp   224    28K       -   552746  128
         cred  3078   481K       - 954170796  64,256
      uidinfo   118    31K       -  3641121  128
       plimit   571   143K       - 16386354  256
    sysctltmp     0     0K       - 10970865  16,32,64,128,4096
    sysctloid  5166   254K       -     5287  16,32,64,128
       sysctl     0     0K       -  9630567  16,32,64
      tidhash     1   128K       -        1  
      callout    11  5632K       -       11  
         umtx 10734  1342K       -    23460  128
     p1003.1b     1     1K       -        1  16
         SWAP     4  1097K       -        4  64
       bus-sc   109   198K       -     4771  16,32,64,128,256,512,1024,2048,4096
          bus  1334   113K       -    16571  16,32,64,128,256,512,1024
      devstat    28    57K       -       28  32,4096
 eventhandler    92     8K       -       92  64,128
         kobj   334  1336K       -     1031  4096
      Per-cpu     1     1K       -        1  32
     SCSI ENC    26   132K       -   704906  16,32,64,256,2048
       feeder     7     1K       -        7  32
         rman   248    29K       -      630  16,32,128
       DEVFS1   191    96K       -     2024  512
         sbuf     1     1K       -   113236  16,32,64,128,256,512,1024,2048,4096
       DEVFS3  1938   485K       -    17353  256
       DEVFS2   191    24K       -     5051  16,32,64,128
        stack     0     0K       -        2  256
    taskqueue   121    11K       -      185  16,32,64,128
       Unitno    32     2K       - 28289163  32,64
          iov     0     0K       - 3288981224  16,32,64,128,256,512,1024
       select  4990   624K       -    11283  128
     ioctlops     0     0K       - 767987696  16,32,64,128,256,512,1024
          msg     4    30K       -        4  2048,4096
          sem     4   106K       -        4  2048,4096
          shm     1    20K       -        3  2048
          tty    60    60K       -     1886  1024,2048
          pts    41    11K       -     1859  256
     mbuf_tag     0     0K       -     7169  32,128
        shmfd     1     8K       -        2  1024
   DEVFS_RULE    55    26K       -      106  64,512
          pcb   282   165K       - 236719788  16,32,128,1024,2048,4096
       soname   629    75K       - 1478789101  16,32,64,128
          acl     0     0K       -   109295  4096
     vfscache     1  8192K       -        1  
   cl_savebuf     0     0K       -  4812969  64
     vfs_hash     1  4096K       -        1  
        DEVFS   199     5K       -      222  16,32,128
       vnodes    13     2K       -       30  64,256
       DEVFSP     0     0K       -      209  64
        mount   909    39K       -    15440  16,32,64,128,256,1024
  vnodemarker     0     0K       -  5760596  512
          BPF     7     1K       -       78  16,64,128,256,512,4096
  ether_multi    40     3K       -       46  16,32,64
       ifaddr   152    50K       -      152  32,64,128,256,512,4096
        ifnet     8    15K       -        8  128,2048
        clone     6    24K       -        6  4096
       arpcom     4     1K       -        4  16
      lltable    99    29K       -    10689  256,512
     routetbl   179    28K       -  1801083  32,64,128,256,512
         igmp     7     2K       -        7  256
  ip_moptions   126    20K       -      126  64,256
     in_multi     2     1K       -        2  256
   in_mfilter    63    63K       -       63  1024
    sctp_iter     0     0K       -       50  256
     sctp_ifn     2     1K       -        2  128
     sctp_ifa    67     9K       -       67  128
     sctp_vrf     1     1K       -        1  64
    sctp_a_it     0     0K       -       50  16
    hostcache     1    28K       -        1  
     syncache     1    96K       -        1  
 ip6_moptions     4     1K       -        8  32,256
    in6_multi    25     3K       -       25  32,256
  in6_mfilter     2     2K       -        4  1024
          mld     7     1K       -        7  128
          rpc     2     1K       -        2  256
audit_evclass   180     6K       -      219  32
      jblocks     2     1K       -        2  128,256
     savedino     0     0K       -  1061499  256
        sbdep     0     0K       -    80214  64
      jsegdep    47     3K       - 230875518  64
         jseg    87    11K       -  5410275  128
    jfreefrag     0     0K       - 21231527  128
      jnewblk     0     0K       - 176280997  128
       jmvref     0     0K       -       37  128
      jremref     0     0K       - 16681477  128
      jaddref     0     0K       - 16681517  128
      freedep     0     0K       -   230592  64
     freework     7     1K       - 13300412  16,128
    newdirblk     0     0K       -       42  64
       dirrem    28     4K       - 16681477  128
       diradd     4     1K       - 16681517  128
     freefile    11     1K       -  6902026  64
     freeblks     6     2K       -  6453455  256
     freefrag     0     0K       - 21231527  128
     indirdep     0     0K       -   249469  128
       newblk     1  8192K       - 176280998  256
    bmsafemap     2     9K       -  9940390  256
     inodedep    70  4131K       - 17916957  512
      pagedep    33   520K       -  4795103  256
  ufs_dirhash     1     1K       -      564  16,32,64,128,512
    ufs_quota     1  4096K       -        1  
    ufs_mount     3    13K       -        3  512,4096
    vm_pgdata     3  4097K       -        3  128
      UMAHash     5  1082K       -       33  512,1024,2048,4096
    pfs_nodes    21     6K       -       21  256
  pfs_vncache     0     0K       -    22970  64
     pci_link    38     3K       -       38  16,128
      memdesc     1     4K       -        1  4096
         GEOM   443    69K       -     2678  16,32,64,128,256,512,1024,2048
     atkbddev     2     1K       -        2  64
       aacbuf   264    73K       -      321  32,64,128,512
   CAM periph    12     3K       -      900  16,32,64,128,256
    raid_data     0     0K       -      450  32,128,256
         UART     6     4K       -        6  16,512,1024
md_nvidia_data     0     0K       -       74  512
  md_sii_data     0     0K       -       74  512
     acpiintr     1     1K       -        1  64
       acpica  2613   265K       -    66254  16,32,64,128,256,512,1024,2048
     acpitask     1    16K       -        1  
       apmdev     1     1K       -        1  128
CAM dev queue    12     1K       -       12  32
   madt_table     0     0K       -        1  4096
      acpisem    17     3K       -       17  128
     CAM path    22     1K       -     2670  32
      io_apic     1     2K       -        1  2048
       isadev     6     1K       -        6  128
          MCA    12     2K       -       12  128
          msi    17     3K       -       17  128
     nexusdev     3     1K       -        3  16
      CAM CCB    11    22K       -       58  2048
       USBdev    17     4K       -       17  64,128,512
          USB    24    34K       -       26  16,32,64,128,256,4096
      scsi_cd     0     0K       -        6  16
   kstat_data     6     1K       -        6  64
      solaris 2989444 23584478K       - 33934776661  16,32,64,128,256,512,1024,2048,4096
  IpFw/IpAcct   162    56K       -     1311  16,32,64,128,256,512,1024,2048
     ipfw_tbl    23     6K       -     5246  256
     dummynet     3     1K       -        3  256,512
     dummynet     3     3K       -        3  512,1024
  nullfs_node    68     5K       -   363870  64
  nullfs_hash     1  4096K       -        1  
 nullfs_mount     8     1K       -        9  32
  fdesc_mount     8     1K       -        9  16


# vmstat -z 
ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP

UMA Kegs:               208,      0,     102,       0,     102,   0,   0
UMA Zones:             1920,      0,     102,       0,     102,   0,   0
UMA Slabs:              568,      0,  425396,   28918,1090374753,   0,   0
UMA RCntSlabs:          568,      0,    3590,    1233, 4279233,   0,   0
UMA Hash:               256,      0,       0,       0,       5,   0,   0
16 Bucket:              152,      0,      28,     147,     300,   0,   0
32 Bucket:              280,      0,      31,      67,     438,   0,   0
64 Bucket:              536,      0,      49,      77,     772,  81,   0
128 Bucket:            1048,      0,    2487,       0,10057607,6212912,   0
VM OBJECT:              232,      0,   51693,  222355,571603465,   0,   0
MAP:                    240,      0,       8,      24,       8,   0,   0
KMAP ENTRY:             128, 2140519,   19909,   20923,2994482057,   0,   0
MAP ENTRY:              128,      0,   51926,   46094,1498486887,   0,   0
fakepg:                 120,      0,       0,     124,      42,   0,   0
mt_zone:               4112,      0,     336,      68,     336,   0,   0
16:                      16,      0,   49782,  372402,10707331930,   0,   0
32:                      32,      0, 2001565,   23081,5789638082,   0,   0
64:                      64,      0,   73833,  453015,7917751389,   0,   0
128:                    128,      0,   92278,  125454,12010379811,   0,   0
256:                    256,      0,   32478,   87777,3795185736,   0,   0
512:                    512,      0,  396038,  391742,3501362459,   0,   0
1024:                  1024,      0,   10915,    2409,197647209,   0,   0
2048:                  2048,      0,    9145,    1993,266652150,   0,   0
4096:                  4096,      0,   95221,    6303,687473294,   0,   0
Files:                   80,      0,   10552,   13658,1336116957,   0,   0
rl_entry:                40,      0,    3001,     695,    3001,   0,   0
TURNSTILE:              136,      0,    5368,    2332,   11731,   0,   0
umtx pi:                 96,      0,       0,       0,       0,   0,   0
MAC labels:              40,      0,       0,       0,       0,   0,   0
PROC:                  1192,      0,     839,    2212,12324378,   0,   0
THREAD:                1160,      0,    3459,    1908, 8983796,   0,   0
SLEEPQUEUE:              80,      0,    5368,    2346,   11731,   0,   0
VMSPACE:                400,      0,     821,    2572,12324361,   0,   0
cpuset:                  72,      0,     291,     659,     536,   0,   0
audit_record:           960,      0,       0,       0,       0,   0,   0
mbuf_packet:            256, 13010940,     263,    2169,3471104399,   0,   0
mbuf:                   256, 13010940,     276,    2122,10609670444,   0,   0
mbuf_cluster:          2048, 2032958,    2432,    2796, 3841664,   0,   0
mbuf_jumbo_page:       4096, 1016479,      15,     961,909732577,   0,   0
mbuf_jumbo_9k:         9216, 301179,       0,       0,       0,   0,   0
mbuf_jumbo_16k:       16384, 169413,       0,       0,       0,   0,   0
mbuf_ext_refcnt:          4,      0,     232,    1784,928650984,   0,   0
g_bio:                  248,      0,      39,    3711,3950598183,   0,   0
ttyinq:                 160,      0,     735,    1521,   29115,   0,   0
ttyoutq:                256,      0,     392,    1123,   15527,   0,   0
ata_request:            328,      0,       0,    1248,456305323,   0,   0
ata_composite:          336,      0,       0,       0,       0,   0,   0
vtnet_tx_hdr:            24,      0,       0,       0,       0,   0,   0
FPU_save_area:          512,      0,       0,       0,       0,   0,   0
taskq_zone:              48,      0,       0,    2592,37850138,   0,   0
VNODE:                  504,      0,  171275,   46701,853879454,   0,   0
VNODEPOLL:              112,      0,     414,    4437, 1009149,   0,   0
NAMEI:                 1024,      0,       1,    1755,5493991062,   0,   0
S VFS Cache:            108,      0,   40645,  172832,489907711,   0,   0
STS VFS Cache:          148,      0,       0,       0,       0,   0,   0
L VFS Cache:            328,      0,  142259,   11053,445475139,   0,   0
LTS VFS Cache:          368,      0,       0,       0,       0,   0,   0
NCLNODE:                568,      0,       0,       0,       0,   0,   0
DIRHASH:               1024,      0,       0,     540,    1696,   0,   0
pipe:                   728,      0,     941,    1944,10549554,   0,   0
Mountpoints:            824,      0,     111,      97,     118,   0,   0
range_seg_cache:         64,      0,  526017,  161383,474243547,   0,   0
zio_cache:              920,      0,      70,    4370,13115140732,   0,   0
zio_link_cache:          48,      0,      70,    4466,6121891838,   0,   0
lz4_ctx:              16384,      0,       0,    1727,209144141,   0,   0
sa_cache:                80,      0,  170671,  124934,846538044,   0,   0
dnode_t:                744,      0,  414796,  271059,814607573,   0,   0
dmu_buf_impl_t:         224,      0,  462904,  749162,1335972833,   0,   0
arc_buf_hdr_t:          216,      0, 1041610,     932,289556791,   0,   0
arc_buf_t:               72,      0,   78698,   44052,731117164,   0,   0
zil_lwb_cache:          192,      0,      19,    2061,38877251,   0,   0
zfs_znode_cache:        368,      0,  170671,   49909,846538044,   0,   0
ksiginfo:               112,      0,    3043,    2534, 5101536,   0,   0
itimer:                 344,      0,       0,       0,       0,   0,   0
KNOTE:                  128,      0,    4492,    6238,8813175059,   0,   0
socket:                 680, 1047756,    2530,    7172,376304529,   0,   0
ipq:                     56,  63567,       0,    2205,   82578,   0,   0
udp_inpcb:              392, 1047760,      90,    2110,217303073,   0,   0
udpcb:                   16, 1047816,      90,    2094,217303073,   0,   0
tcp_inpcb:              392, 1047760,     924,    7746,54044998,   0,   0
tcpcb:                  976, 1047756,     729,    5583,54044998,   0,   0
tcptw:                   72,  27800,     195,    7655,20379511,   0,   0
syncache:               152,  15375,       0,    2250,31240638,   0,   0
hostcache:              136,  15372,    2830,    3218, 1638402,   0,   0
sackhole:                32,      0,       0,    2020, 8362012,   0,   0
tcpreass:                40, 127092,       1,    5207,11534579,   0,   0
sctp_ep:               1384, 1047754,       0,       0,       0,   0,   0
sctp_asoc:             2296,  40000,       0,       0,       0,   0,   0
sctp_laddr:              48,  80064,       0,    1008,      66,   0,   0
sctp_raddr:             704,  80000,       0,       0,       0,   0,   0
sctp_chunk:             136, 400008,       0,       0,       0,   0,   0
sctp_readq:             104, 400032,       0,       0,       0,   0,   0
sctp_stream_msg_out:    104, 400032,       0,       0,       0,   0,   0
sctp_asconf:             40, 400008,       0,       0,       0,   0,   0
sctp_asconf_ack:         48, 400032,       0,       0,       0,   0,   0
ripcb:                  392, 1047760,       0,     750,   23086,   0,   0
unpcb:                  240, 1047760,    1694,    2370,104932902,   0,   0
rtentry:                200,      0,     141,     182,     141,   0,   0
IPFW dynamic rule:      120,   4123,       0,       0,       0,   0,   0
selfd:                   56,      0,    9213,    1938,3500259534,   0,   0
SWAPMETA:               288, 4065919,   33371,    1170,29265989,   0,   0
FFS inode:              168,      0,     307,    2047, 6937126,   0,   0
FFS1 dinode:            128,      0,       0,       0,       0,   0,   0
FFS2 dinode:            256,      0,     307,    1898, 6937126,   0,   0
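
For anyone reading a dump like the one above, a rough way to see which UMA zones hold the most memory is to rank them by SIZE * (USED + FREE). This is only a sketch: the function name uma_top_zones is made up for illustration, and the footprint it prints is an approximation that ignores per-slab overhead.

```shell
#!/bin/sh
# Rank UMA zones by approximate footprint: SIZE * (USED + FREE), in MiB.
# Reads `vmstat -z` output on stdin; prints the N largest zones (default 10).
# uma_top_zones is a hypothetical helper name, not part of FreeBSD.
uma_top_zones() {
    n="${1:-10}"
    awk -F: '
        # Data rows look like "NAME:  SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP".
        # The header and blank lines contain no colon, so NF < 2 for them.
        NF >= 2 {
            split($2, f, ",")
            # Ignore anything without a positive numeric SIZE field.
            if (f[1] + 0 <= 0) next
            bytes = f[1] * (f[3] + f[4])
            printf "%d %s\n", int(bytes / 1048576 + 0.5), $1
        }
    ' | sort -rn | head -n "$n"
}
```

On the affected host this would be run as `vmstat -z | uma_top_zones 5`, with the first column being MiB and the second the zone name.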
 
ZFS is eating your memory. zfs-stats -a will show you what ARC is doing, but it certainly is not being limited to 2G.

Code:
      solaris 2989444 23584478K       - 33934776661  16,32,64,128,256,512,1024,2048,4096
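
To put a number on that line, here is a small sketch that converts the solaris MemUse from KiB to MiB and subtracts the ARC total that top(1) reported (2048M). The helper names solaris_mib and arc_gap_mib are made up for illustration, and comparing the two figures directly assumes that the solaris malloc type is where the ZFS allocations are accounted.

```shell
#!/bin/sh
# Print the MemUse of the "solaris" malloc type from `vmstat -m` output
# (read on stdin), converted from KiB to MiB and rounded.
# solaris_mib and arc_gap_mib are hypothetical helper names.
solaris_mib() {
    awk '$1 == "solaris" { sub(/K$/, "", $3); printf "%d\n", int($3 / 1024 + 0.5) }'
}

# Print how far one total exceeds another; both arguments are in MiB.
arc_gap_mib() {
    echo $(( $1 - $2 ))
}
```

Fed the line quoted above, `solaris_mib` prints 23032, and `arc_gap_mib 23032 2048` prints 20984, so roughly 20 GB is held under the solaris type beyond what top(1) attributes to ARC.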

Take a look at this thread, it might be related.

https://forums.freebsd.org/threads/41880/

One thing that'd be helpful is to build charts showing the exhaustion profile. This other thread ran in circles some of the time because there were no pictures clearly illustrating the problem.
If you run something like this in a cron job and save the results then anyone can see what is happening over time.

Code:
#!/usr/bin/perl -w
# Track possible Wired memory exhaustion.
use strict;
#sysctl hw | fgrep mem | awk '{print $1" "$2 / (1024*1024)"M"}'
#vmstat -m | perl -ne 'if (/(\d+)K/) { $sum += $1; } END { print int(0.5+$sum/1024),"M\n"; }'
#vmstat -m | fgrep solaris
#vmstat -z | perl -ne 'if (/(\d+),\D+(\d+),\D+(\d+),\D+(\d+),/) { $sum += ($1*($3+$4)); } END {print int(0.5+($sum/(1024*1024))),"M\n";}'

my @titles;
my %data;

fetchMem(\@titles, \%data);
fetchTop(\@titles, \%data);
fetchKmalloc(\@titles, \%data);
fetchUma(\@titles, \%data);
my @vals = map { $data{$_} } @titles;

$" = ",";
print "@titles\n@vals\n";
exit;

sub fetchMem {
        my ($titles, $data) = @_;
        # hw.physmem: 4268482560
        # hw.usermem: 978481152
        my $mem = `sysctl hw.physmem hw.usermem`;
        while ($mem =~ /^hw\.(\w+):\s+(\d+)$/mg) {
                push @$titles, $1;
                $data->{$1} = convertToMeg($2 . "b");
        }
}

sub fetchTop {
        my ($titles, $data) = @_;
        # Mem: 341M Active, 248M Inact, 3134M Wired, 69M Cache, 417M Buf, 126M Free
        # ARC: 2440M Total, 276M MFU, 2054M MRU, 1874K Anon, 29M Header, 79M Other
        my $top = `top -d 1 0`;
        my @mem = qw(Active Inact Wired Cache Buf Free);
        my @memUse;
        foreach my $m (@mem) {
                if ($top =~ /(\w+) $m/) {
                        push @memUse, convertToMeg($1);
                } else {
                        push @memUse, 0;
                }
        }
#       my @memUse = map { convertToMeg($_) } $top =~ /^Mem: (\w+) Active, (\w+) Inact, (\w+) Wired, (\w+) Cache, (\w+) Buf, (\w+) Free$/m;
        push @$titles, @mem;
        @{$data}{@mem} = @memUse;

        my @arc = qw(Total MFU MRU Anon Header Other);
        my @arcUse = map { convertToMeg($_) } $top =~ /^ARC: (\w+) Total, (\w+) MFU, (\w+) MRU, (\w+) Anon, (\w+) Header, (\w+) Other$/m;
        push @$titles, @arc;
        @{$data}{@arc} = @arcUse;
}

sub fetchKmalloc {
        my ($titles, $data) = @_;
        #         Type InUse MemUse HighUse Requests  Size(s)
        #         cdev     7     2K       -        7  256
        my $kmalloc = `vmstat -m`;
        my $sum = 0;
        while ($kmalloc =~ /\s+\d+\s+(\d+)K\s+/mg) {
                $sum += $1;
        }
        my $label = "Kmalloc";
        push @$titles, $label;
        $data->{$label} = convertToMeg($sum . "K");
        # Solaris (ARC) usage
        if ($kmalloc =~ /^\s*solaris\s+\d+\s+(\d+K)/m) {
                my $label = "Solaris";
                push @$titles, $label;
                $data->{$label} = convertToMeg($1);
        }
}

sub fetchUma {
        my ($titles, $data) = @_;
        # ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
        # UMA Kegs:               208,      0,     124,      12,     124,   0,   0
        my $uma = `vmstat -z`;
        my $sum = 0;
        while ($uma =~ /:\s+(\d+),\s+\d+,\s+(\d+),\s+(\d+)/gm) {
                $sum += $1 * ($2 + $3);
        }
        my $label = "Uma";
        push @$titles, $label;
        $data->{$label} = convertToMeg($sum . "b");
}

sub convertToMeg {
        my $v = shift;
        if ($v =~ /^(\d+)(\D)$/) {
                if ("M" eq $2) {
                        # Megabytes
                        return $1;
                } elsif ("K" eq $2) {
                        # Kilobytes
                        return int(0.5 + ($1 / 1024));
                } elsif ("b" eq $2) {
                        # Bytes
                        return int(0.5 + ($1 / (1024*1024)));
                } elsif ("G" eq $2) {
                        # Gigabytes
                        return $1*1024;
                } else {
                        warn "Invalid value: $v\n";
                        return $1;
                }
        } else {
                warn "Invalid value: $v\n";
        }
        return -1;
}
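
Since the script prints a title row on every run, a cron job needs something to keep the header from repeating in the saved results. A minimal sketch of one way to do that follows; the function name append_sample and the file paths in the usage note are made up for illustration.

```shell
#!/bin/sh
# Append one sample to a CSV log. Reads the Perl script's two-line output
# (titles, then values) on stdin. The header is written only when the log
# is empty or missing; each data row is prefixed with the given timestamp.
# append_sample is a hypothetical helper name.
append_sample() {
    log="$1"
    stamp="$2"
    IFS= read -r titles
    IFS= read -r vals
    [ -s "$log" ] || printf 'time,%s\n' "$titles" >> "$log"
    printf '%s,%s\n' "$stamp" "$vals" >> "$log"
}
```

Assuming the Perl script is saved as /root/wired.pl, a cron-driven wrapper could call `/root/wired.pl | append_sample /var/log/wired.csv "$(date +%Y-%m-%dT%H:%M:%S)"` every few minutes. The resulting file can be loaded straight into a spreadsheet or gnuplot to chart Wired against the ARC and Solaris columns.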
 
Upgraded to 10.3-RELEASE about a week ago. Wired memory increases and decreases and has not gone above 10G. I think it's fixed. Thanks
 
FreeBSD 10.3 has another issue that does not occur on FreeBSD 10.2: an L2ARC device on a partition goes missing after a zpool import or a server reboot.

I'm sticking with FreeBSD 10.2 until they fix this issue.
 