Inherited zfs mailserver..

amd64 (Xeon 3450) with 8GB of RAM and four 250GB SATA drives in a RAID 1+0. I upgraded it to 9.2-RELEASE the other day, hoping to fix some of what I think are ZFS-related problems.

I found zfs-stats in ports.

Code:
zfs-mon:
ZFS real-time cache activity monitor
Seconds elapsed: 161

Cache hits and misses:
                                  1s    10s    60s    tot
                     ARC hits:   167    255    287    330
                   ARC misses:     0      0      1      6
         ARC demand data hits:    42     53     47     59
       ARC demand data misses:     0      0      0      0
     ARC demand metadata hits:   125    202    238    268
   ARC demand metadata misses:     0      0      0      4
       ARC prefetch data hits:     0      0      0      0
     ARC prefetch data misses:     0      0      0      0
   ARC prefetch metadata hits:     0      0      2      3
 ARC prefetch metadata misses:     0      0      0      2
                  ZFETCH hits:   800   1248   1159   1295
                ZFETCH misses:    34     49     85    224

Cache efficiency percentage:
                          10s    60s    tot
                  ARC: 100.00  99.65  98.21
      ARC demand data: 100.00 100.00 100.00
  ARC demand metadata: 100.00 100.00  98.53
    ARC prefetch data:   0.00   0.00   0.00
ARC prefetch metadata:   0.00 100.00  60.00
               ZFETCH:  96.22  93.17  85.25
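
For anyone wondering how the "Cache efficiency percentage" rows relate to the hit and miss counters above them: they appear to be simply hits / (hits + misses). That formula is an assumption (it is not taken from the zfs-mon source), but a quick sketch reproduces the "tot" column from the counters shown here:

```python
def hit_ratio(hits: int, misses: int) -> float:
    """Return the cache hit percentage, or 0.0 when there was no traffic."""
    total = hits + misses
    if total == 0:
        return 0.0
    return 100.0 * hits / total


# "tot" column from the zfs-mon output above: (hits, misses)
counters = {
    "ARC": (330, 6),
    "ARC demand metadata": (268, 4),
    "ARC prefetch metadata": (3, 2),
    "ZFETCH": (1295, 224),
}

for name, (hits, misses) in counters.items():
    print(f"{name}: {hit_ratio(hits, misses):.2f}")
```

With only a few hundred requests in 161 seconds, these percentages say little about behaviour under load; a sample taken during a busy period would be more telling.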

zfs-stats -a:
http://pastebin.com/CBYLGp56

egrep -v \# /boot/loader.conf:
Code:
zfs_load="YES"
ahci_load="YES"
geom_mirror_load="YES"

accf_http_load="YES"
accf_data_load="YES"

The machine is also being backed up with BackupPC.

I have read that rsync and ZFS do not play nice together, and Maildir and ZFS do not play nice together.

This is a mailserver for a company that has over 1k users on it. (qmail and dovecot.)

I am looking for a way to make it more stable. What happens is that the entire machine becomes unresponsive, and only a "walk over and power down" will get it working again. Apparently this has been happening to them for a few months now, but no one ever said anything and the exiting admin neglected to bring it up either. :p

Thinking about moving the whole thing to a similar machine with UFS if I can not get this stable.

I see there are a ton of "tuning the ARC cache" guides, but I cannot find out how you decide which values to set and why. Most people seem to choose values at random and "hope for the best."

Looking to not build a future on hope.

I am looking for suggestions on how to determine whether ZFS (the ARC or something else) is my problem, and then what to do about it. Again, my understanding is that "ZFS is the problem" in this situation.
 
For starters, I would set the maximum ARC size to about 4 GB. That is plenty to give you the benefits of caching while leaving enough memory to run the services and applications on the system, and it helps prevent memory exhaustion. You can fine-tune it later if the need arises.

Code:
vfs.zfs.arc_max="4096M"
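
One way to arrive at a number like this without guessing is to start from what everything else on the box needs and give the ARC the remainder. The sketch below is only that arithmetic; the function name and the 4096 MiB reservation are hypothetical, and the reservation should come from measuring qmail, dovecot, the web stack, and the memory-backed /tmp on the real machine.

```python
def arc_max_setting(total_ram_mib: int, reserved_mib: int) -> str:
    """Build a loader.conf line that leaves reserved_mib for everything except the ARC."""
    if reserved_mib >= total_ram_mib:
        raise ValueError("reservation must be smaller than total RAM")
    arc_mib = total_ram_mib - reserved_mib
    return f'vfs.zfs.arc_max="{arc_mib}M"'


# 8 GB machine, with an assumed 4 GB kept free for services (hypothetical figure)
print(arc_max_setting(8192, 4096))
```

If the services turn out to need less, the reservation can be lowered and the ARC grown accordingly.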
 
I am running a not-so-different configuration without any problems. Only one comes to mind: the filesystem is snapshotted daily and becomes very unresponsive when the free space goes down to somewhere between 100 and 200 GB (yes, gigabytes). There are 4 x 2 TB drives in RAIDZ2 giving about 2 TB usable space.
 
Code:
Filesystem             Size    Used   Avail Capacity  Mounted on
/dev/mirror/rootfsa      2G    908M    911M    50%    /
devfs                  1.0k    1.0k      0B   100%    /dev
procfs                 4.0k    4.0k      0B   100%    /proc
basefs                 125G     30k    125G     0%    /basefs
basefs/home            125G     74M    125G     0%    /home
basefs/usr             424G    299G    125G    71%    /usr
basefs/var             149G     23G    125G    16%    /var
/dev/md0               495M      3M    492M     1%    /tmp
linprocfs              4.0k    4.0k      0B   100%    /usr/compat/linux/proc
Does the /tmp on md0 play a role in this at all? Other than stealing directly from system memory?

PHP sessions are there:

Code:
ls /tmp | wc -l
756
 
Hi,

I have used ZFS with a Maildir-based server (running Exim and Dovecot) with 1000+ users and have never had any stability issues. I would also consider the possibility that the issue is not related to ZFS, as it is very stable and has been for some time. It's true that Maildir systems may not perform great with ZFS, but that isn't the same as saying you should expect your system to hang or crash. I could provide you with some tips for improving performance, but it seems sensible to concentrate on the stability issues first.
Have you looked in the messages log after recovering from a system hang? Can you post any relevant info from there?

thanks, Andy.
 
If it's about general stability, you might want to memtest your memory.
And look at the insides of the power supply, any bulging capacitors will not help with a stable system. Perhaps try to test if it's stable with a program like Prime95.
Maybe update the BIOS to the newest version (newer AHCI firmware?).
 
So things to share..

I did update the BIOS and made sure AHCI was selected; there is also random information about

  • Enable/Disable Drive cache (BIOS)
  • Enable/Disable 32-bit drive access (BIOS)

I have made some simple additions to /boot/loader.conf
Code:
accf_http_load="YES"
accf_data_load="YES"

vfs.zfs.cache_flush_disable="1"
vfs.zfs.prefetch_disable="1"
vfs.zfs.arc_max="1536M"
vfs.zfs.arc_min="512M"
vm.kmem_size_max="8G"
vm.kmem_size="6G"

The data and http modules are for nginx; as this machine is also serving webmail..

The cache_flush_disable and prefetch_disable settings caused a huge performance hit, but I now have zero stalls and system hangs, which were the major problem. When the system hung, local keyboard I/O would also stop.

Minor additions to /etc/sysctl.conf.local
Code:
kern.randompid=1
net.inet.ip.random_id=1
vfs.zfs.prefetch_disable=1

kern.random.sys.harvest.ethernet=0
kern.random.sys.harvest.point_to_point=0
kern.random.sys.harvest.interrupt=0

kern.maxfiles=262144
kern.maxfilesperproc=65536
vfs.read_max=128

kern.ipc.somaxconn=1024
net.inet.tcp.hostcache.expire=1


I also read that ZFS likes/wants/needs free space.. my /usr slice is 71% full.

I also found that enabling lzjb or lz4 might help, as lzjb is a default in 9.1. My ZFS pool was created in 2010. Another suggestion was to back up the whole pool and move it to a fresh install.

I would be a fan of that but it takes *forever* to move data off of the drive due to the stalling..

Also since the BIOS update I have these:
Code:
kernel: bge0: watchdog timeout -- resetting
kernel: bge0: link state changed to DOWN
kernel: bge0: link state changed to UP
kernel: bge0: watchdog timeout -- resetting
kernel: bge0: link state changed to DOWN
kernel: bge0: link state changed to UP

Code:
kernel: bge0: <HP NC107i PCIe Gigabit Server Adapter, ASIC rev. 0x5784100> mem 0xdf900000-0xdf90ffff irq 19 at device 0.0 on pci30
kernel: bge0: CHIP ID 0x05784100; ASIC REV 0x5784; CHIP REV 0x57841; PCI-E
kernel: miibus0: <MII bus> on bge0

Code:
zfs get all | grep creation
basefs       creation              Tue Jun 15 21:53 2010  -
basefs/home  creation              Wed Jun 16 10:29 2010  -
basefs/usr   creation              Tue Jun 15 21:53 2010  -
basefs/var   creation              Tue Jun 15 21:53 2010  -

Code:
zfs get all | grep compression
basefs       compression           off                    default
basefs/home  compression           off                    default
basefs/usr   compression           off                    default
basefs/var   compression           off                    default

I have also changed these ZFS settings. Again, they all 'seemed' to show signs of improvement and, while not on the FreeBSD ZFS page, seemed to be new or suggested defaults.
Code:
zfs set atime=off basefs
zfs set primarycache=metadata basefs
zfs set recordsize=16k basefs
zfs set sync=disabled basefs
zfs set checksum=fletcher4 basefs

Thank you in advance for any assistance or suggestions you may have.
 
Do you have the possibility of adding a separate NIC to the system? I would immediately replace the bge(4) NIC with an Intel server-class gigabit NIC; they are not praised without reason.
 
I don't see the results of "zpool status". It will show you read, write, and checksum error counts. That is the first thing I would run to see about the health of ZFS. I would also run SMART status on all the individual hard drives.

Have you ever run "zpool scrub"? If so, can you list the output? It conveys useful performance information.

IMHO setting sync=disabled is a desperation move and should not ever be necessary/used in production.

You don't have dedup set do you?
 
Code:
pool: basefs
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        basefs      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0s3  ONLINE       0     0     0
            ada2s3  ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            ada1s3  ONLINE       0     0     0
            ada3s3  ONLINE       0     0     0

errors: No known data errors
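
If you want to check these counters regularly rather than by eye, the config table can be parsed with a few lines of script. This is a minimal sketch that works on the text of the status output; the function name is made up for illustration, and it only handles plain integer counters (larger counts may be printed abbreviated, which this does not cover).

```python
def device_errors(status_text: str) -> dict:
    """Map each pool/vdev/device name to its (READ, WRITE, CKSUM) counters."""
    errors = {}
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_config = True  # rows after this header belong to the config table
            continue
        if in_config and len(fields) == 5 and all(f.isdigit() for f in fields[2:]):
            errors[fields[0]] = tuple(int(f) for f in fields[2:])
    return errors


sample_status = """\
        NAME        STATE     READ WRITE CKSUM
        basefs      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0s3  ONLINE       0     0     0
            ada2s3  ONLINE       0     0     0

errors: No known data errors
"""

bad = {name: c for name, c in device_errors(sample_status).items() if any(c)}
print(bad or "no read/write/checksum errors reported")
```

All counters at zero, as here, means ZFS itself has not seen a failed read, write, or checksum on any device; it does not rule out a slow or marginal drive, which is where the SMART data comes in.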

zpool get all
Code:
NAME    PROPERTY               VALUE                  SOURCE
basefs  size                   456G                   -
basefs  capacity               71%                    -
basefs  altroot                -                      default
basefs  health                 ONLINE                 -
basefs  guid                   17021171353384153787   default
basefs  version                -                      default
basefs  bootfs                 -                      default
basefs  delegation             on                     default
basefs  autoreplace            off                    default
basefs  cachefile              -                      default
basefs  failmode               wait                   default
basefs  listsnapshots          off                    default
basefs  autoexpand             off                    default
basefs  dedupditto             0                      default
basefs  dedupratio             1.00x                  -
basefs  free                   129G                   -
basefs  allocated              327G                   -
basefs  readonly               off                    -
basefs  comment                -                      default
basefs  expandsize             0                      -
basefs  freeing                0                      default
basefs  feature@async_destroy  enabled                local
basefs  feature@empty_bpobj    enabled                local
basefs  feature@lz4_compress   active                 local
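
The capacity figure follows directly from the allocated and size rows above. A small sketch of that arithmetic, in case you want to watch it over time; the 80% warning level is an assumption (a commonly quoted rule of thumb, not something established in this thread), and the helper names are made up.

```python
UNITS = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}
WARN_AT = 80.0  # assumption: rule-of-thumb threshold, adjust to taste


def to_bytes(value: str) -> float:
    """Convert a ZFS-style size such as '327G' or '74.1M' to bytes."""
    if value[-1] in UNITS:
        return float(value[:-1]) * UNITS[value[-1]]
    return float(value)


def percent_used(allocated: str, size: str) -> float:
    """Return how full the pool is, as a percentage of its total size."""
    return 100.0 * to_bytes(allocated) / to_bytes(size)


used = percent_used("327G", "456G")  # allocated and size from the output above
print(f"{used:.1f}% used")
if used >= WARN_AT:
    print("pool is getting full")
```

At roughly 72% the pool is below that threshold, so fullness on its own looks like an unlikely explanation for hard hangs.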

zfs get all

Code:
NAME         PROPERTY              VALUE                  SOURCE
basefs       type                  filesystem             -
basefs       creation              Tue Jun 15 21:53 2010  -
basefs       used                  327G                   -
basefs       available             122G                   -
basefs       referenced            30K                    -
basefs       compressratio         1.00x                  -
basefs       mounted               yes                    -
basefs       quota                 none                   default
basefs       reservation           none                   default
basefs       recordsize            16K                    local
basefs       mountpoint            /basefs                default
basefs       sharenfs              off                    default
basefs       checksum              fletcher4              local
basefs       compression           lz4                    local
basefs       atime                 off                    local
basefs       devices               on                     default
basefs       exec                  on                     default
basefs       setuid                on                     default
basefs       readonly              off                    default
basefs       jailed                off                    default
basefs       snapdir               hidden                 default
basefs       aclmode               discard                default
basefs       aclinherit            restricted             default
basefs       canmount              on                     default
basefs       xattr                 off                    temporary
basefs       copies                1                      default
basefs       version               5                      -
basefs       utf8only              off                    -
basefs       normalization         none                   -
basefs       casesensitivity       sensitive              -
basefs       vscan                 off                    default
basefs       nbmand                off                    default
basefs       sharesmb              off                    default
basefs       refquota              none                   default
basefs       refreservation        none                   default
basefs       primarycache          metadata               local
basefs       secondarycache        all                    default
basefs       usedbysnapshots       0                      -
basefs       usedbydataset         30K                    -
basefs       usedbychildren        327G                   -
basefs       usedbyrefreservation  0                      -
basefs       logbias               latency                default
basefs       dedup                 off                    default
basefs       mlslabel                                     -
basefs       sync                  disabled               local
basefs       refcompressratio      1.00x                  -
basefs       written               30K                    -
basefs       logicalused           324G                   -
basefs       logicalreferenced     15K                    -
basefs/home  type                  filesystem             -
basefs/home  creation              Wed Jun 16 10:29 2010  -
basefs/home  used                  74.1M                  -
basefs/home  available             122G                   -
basefs/home  referenced            74.1M                  -
basefs/home  compressratio         1.00x                  -
basefs/home  mounted               yes                    -
basefs/home  quota                 none                   default
basefs/home  reservation           none                   default
basefs/home  recordsize            16K                    inherited from basefs
basefs/home  mountpoint            /home                  local
basefs/home  sharenfs              off                    default
basefs/home  checksum              fletcher4              inherited from basefs
basefs/home  compression           lz4                    inherited from basefs
basefs/home  atime                 off                    inherited from basefs
basefs/home  devices               on                     default
basefs/home  exec                  on                     default
basefs/home  setuid                on                     default
basefs/home  readonly              off                    default
basefs/home  jailed                off                    default
basefs/home  snapdir               hidden                 default
basefs/home  aclmode               discard                default
basefs/home  aclinherit            restricted             default
basefs/home  canmount              on                     default
basefs/home  xattr                 off                    temporary
basefs/home  copies                1                      default
basefs/home  version               5                      -
basefs/home  utf8only              off                    -
basefs/home  normalization         none                   -
basefs/home  casesensitivity       sensitive              -
basefs/home  vscan                 off                    default
basefs/home  nbmand                off                    default
basefs/home  sharesmb              off                    default
basefs/home  refquota              none                   default
basefs/home  refreservation        none                   default
basefs/home  primarycache          metadata               inherited from basefs
basefs/home  secondarycache        all                    default
basefs/home  usedbysnapshots       0                      -
basefs/home  usedbydataset         74.1M                  -
basefs/home  usedbychildren        0                      -
basefs/home  usedbyrefreservation  0                      -
basefs/home  logbias               latency                default
basefs/home  dedup                 off                    default
basefs/home  mlslabel                                     -
basefs/home  sync                  disabled               inherited from basefs
basefs/home  refcompressratio      1.00x                  -
basefs/home  written               74.1M                  -
basefs/home  logicalused           74.0M                  -
basefs/home  logicalreferenced     74.0M                  -
basefs/usr   type                  filesystem             -
basefs/usr   creation              Tue Jun 15 21:53 2010  -
basefs/usr   used                  303G                   -
basefs/usr   available             122G                   -
basefs/usr   referenced            303G                   -
basefs/usr   compressratio         1.00x                  -
basefs/usr   mounted               yes                    -
basefs/usr   quota                 none                   default
basefs/usr   reservation           none                   default
basefs/usr   recordsize            16K                    inherited from basefs
basefs/usr   mountpoint            /usr                   local
basefs/usr   sharenfs              off                    default
basefs/usr   checksum              fletcher4              inherited from basefs
basefs/usr   compression           lz4                    inherited from basefs
basefs/usr   atime                 off                    local
basefs/usr   devices               on                     default
basefs/usr   exec                  on                     default
basefs/usr   setuid                on                     default
basefs/usr   readonly              off                    default
basefs/usr   jailed                off                    default
basefs/usr   snapdir               hidden                 default
basefs/usr   aclmode               discard                default
basefs/usr   aclinherit            restricted             default
basefs/usr   canmount              on                     default
basefs/usr   xattr                 off                    temporary
basefs/usr   copies                1                      default
basefs/usr   version               5                      -
basefs/usr   utf8only              off                    -
basefs/usr   normalization         none                   -
basefs/usr   casesensitivity       sensitive              -
basefs/usr   vscan                 off                    default
basefs/usr   nbmand                off                    default
basefs/usr   sharesmb              off                    default
basefs/usr   refquota              none                   default
basefs/usr   refreservation        none                   default
basefs/usr   primarycache          metadata               inherited from basefs
basefs/usr   secondarycache        all                    default
basefs/usr   usedbysnapshots       0                      -
basefs/usr   usedbydataset         303G                   -
basefs/usr   usedbychildren        0                      -
basefs/usr   usedbyrefreservation  0                      -
basefs/usr   logbias               latency                default
basefs/usr   dedup                 off                    default
basefs/usr   mlslabel                                     -
basefs/usr   sync                  disabled               inherited from basefs
basefs/usr   refcompressratio      1.00x                  -
basefs/usr   written               303G                   -
basefs/usr   logicalused           300G                   -
basefs/usr   logicalreferenced     300G                   -
basefs/var   type                  filesystem             -
basefs/var   creation              Tue Jun 15 21:53 2010  -
basefs/var   used                  24.0G                  -
basefs/var   available             122G                   -
basefs/var   referenced            24.0G                  -
basefs/var   compressratio         1.00x                  -
basefs/var   mounted               yes                    -
basefs/var   quota                 none                   default
basefs/var   reservation           none                   default
basefs/var   recordsize            16K                    inherited from basefs
basefs/var   mountpoint            /var                   local
basefs/var   sharenfs              off                    default
basefs/var   checksum              fletcher4              inherited from basefs
basefs/var   compression           lz4                    inherited from basefs
basefs/var   atime                 off                    local
basefs/var   devices               on                     default
basefs/var   exec                  on                     default
basefs/var   setuid                on                     default
basefs/var   readonly              off                    default
basefs/var   jailed                off                    default
basefs/var   snapdir               hidden                 default
basefs/var   aclmode               discard                default
basefs/var   aclinherit            restricted             default
basefs/var   canmount              on                     default
basefs/var   xattr                 off                    temporary
basefs/var   copies                1                      default
basefs/var   version               5                      -
basefs/var   utf8only              off                    -
basefs/var   normalization         none                   -
basefs/var   casesensitivity       sensitive              -
basefs/var   vscan                 off                    default
basefs/var   nbmand                off                    default
basefs/var   sharesmb              off                    default
basefs/var   refquota              none                   default
basefs/var   refreservation        none                   default
basefs/var   primarycache          metadata               inherited from basefs
basefs/var   secondarycache        all                    default
basefs/var   usedbysnapshots       0                      -
basefs/var   usedbydataset         24.0G                  -
basefs/var   usedbychildren        0                      -
basefs/var   usedbyrefreservation  0                      -
basefs/var   logbias               latency                default
basefs/var   dedup                 off                    default
basefs/var   mlslabel                                     -
basefs/var   sync                  disabled               inherited from basefs
basefs/var   refcompressratio      1.00x                  -
basefs/var   written               24.0G                  -
basefs/var   logicalused           24.1G                  -
basefs/var   logicalreferenced     24.1G                  -
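
With this much output it is easy to lose track of which properties were actually changed by hand. The rows whose SOURCE column reads "local" are the ones set explicitly, and a rough sketch like the following can pull them out of the text. The function name is hypothetical and the parsing assumes the four-column layout shown above.

```python
def local_properties(zfs_get_output: str) -> dict:
    """Return {(dataset, property): value} for rows whose SOURCE column is 'local'."""
    found = {}
    for line in zfs_get_output.splitlines():
        parts = line.split(None, 2)  # NAME, PROPERTY, then "VALUE ... SOURCE"
        if len(parts) < 3 or parts[0] == "NAME":
            continue
        value_and_source = parts[2].rsplit(None, 1)
        if len(value_and_source) == 2 and value_and_source[1] == "local":
            found[(parts[0], parts[1])] = value_and_source[0].strip()
    return found


sample_get = """\
NAME         PROPERTY              VALUE                  SOURCE
basefs       creation              Tue Jun 15 21:53 2010  -
basefs       recordsize            16K                    local
basefs       mlslabel                                     -
basefs       sync                  disabled               local
basefs/home  recordsize            16K                    inherited from basefs
"""

for (dataset, prop), value in local_properties(sample_get).items():
    print(f"{dataset}: {prop}={value}")
```

Run against the full listing, this would show recordsize, checksum, compression, atime, primarycache, and sync set locally on basefs, which is a handy checklist if some of them need to be reverted later.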

zpool scrub basefs
Code:
  pool: basefs
 state: ONLINE
  scan: scrub in progress since Wed Jan 22 09:11:15 2014
        15.0M scanned out of 327G at 1.37M/s, 67h57m to go
        0 repaired, 0.00% done
config:

        NAME        STATE     READ WRITE CKSUM
        basefs      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0s3  ONLINE       0     0     0
            ada2s3  ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            ada1s3  ONLINE       0     0     0
            ada3s3  ONLINE       0     0     0

errors: No known data errors
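
As a sanity check on the estimate in the scrub output: the "to go" figure is just the remaining data divided by the current rate. The sketch below redoes that arithmetic from the rounded numbers shown, so it lands near, not exactly on, the reported 67h57m. The 50 MiB/s comparison rate is purely illustrative and not a measurement from this machine.

```python
def scrub_hours_left(total_gib: float, scanned_mib: float, rate_mib_s: float) -> float:
    """Estimate the hours remaining for a scrub at a constant scan rate."""
    remaining_mib = total_gib * 1024 - scanned_mib
    return remaining_mib / rate_mib_s / 3600


# Figures from the status output above: 15.0M of 327G scanned at 1.37M/s
print(f"about {scrub_hours_left(327, 15.0, 1.37):.0f} hours at the reported rate")

# Hypothetical rate for comparison only
print(f"about {scrub_hours_left(327, 15.0, 50.0):.1f} hours at 50 MiB/s")
```

If the rate is still in the low single digits of MB/s once the scrub has been running for a while, that points at the disks or the controller rather than at ARC tuning.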

The sync disable was out of total desperation; yes.

With regards to the s/bge/em I might be able to come up with one..

(thanks again for all the interest and responses..)
 
The zpool scrub report was taken very early, so the exceedingly low MB/s figure may not mean anything. It would be interesting to see how high the scrub rate rises as time goes on. A report taken five minutes later should tell the story.

When the machine is unresponsive are you able to log in, either via ssh or at the local keyboard, and run top for clues as to what is bogging it down?
 