ZFS, istgt and kern.maxswzone value

Hi all,

I have some problems with my FreeBSD box. I want to use it as a SAN, with a ZFS-on-root raidz2 pool and istgt. Performance is very poor: I can't copy a large number of files without the server freezing and the iSCSI initiator losing the connection. In my dmesg, I have a kern.maxswzone error. I've tried to correct it with different values (equal to or greater than the total swap space), but it continues to give me the error message at boot time, and performance has not really improved (only slightly).

If I restart istgt, the system seems freer and more usable, which is why I suspect a swap or cache problem.

I think I've made a lot of mistakes in my configuration, but I don't know where to investigate.
This is how my system (up to date) is configured:

Code:
root@beastie:~ # uname -a
FreeBSD beastie 9.2-RELEASE-p3 FreeBSD 9.2-RELEASE-p3 #0: Sat Jan 11 03:25:02 UTC 2014     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64

root@beastie:~ # grep memory /var/run/dmesg.boot
real memory  = 17179869184 (16384 MB)
avail memory = 16489336832 (15725 MB)

root@beastie:~ # sysctl kern.maxswzone
kern.maxswzone: 402653184

root@beastie:~ # swapinfo
Device          1K-blocks     Used    Avail Capacity
/dev/gpt/swap0   67108864        0 67108864     0%
/dev/gpt/swap1   67108864        0 67108864     0%
/dev/gpt/swap2   67108864        0 67108864     0%
/dev/gpt/swap3   67108864        0 67108864     0%
/dev/gpt/swap4   67108864        0 67108864     0%
/dev/gpt/swap5   67108864        0 67108864     0%
Total           402653184        0 402653184     0%

root@beastie:~ # zpool status
  pool: zroot
 state: ONLINE
  scan: none requested
config:

        NAME                                            STATE     READ WRITE CKSUM
        zroot                                           ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gpt/disk0                                   ONLINE       0     0     0
            gptid/458950a2-617e-11e3-9217-00151745b57c  ONLINE       0     0     0
            gptid/69a3f6e7-617e-11e3-9217-00151745b57c  ONLINE       0     0     0
            gptid/87078ae9-617e-11e3-9217-00151745b57c  ONLINE       0     0     0
            gptid/b021b357-617e-11e3-9217-00151745b57c  ONLINE       0     0     0
            gptid/c808b07a-617e-11e3-9217-00151745b57c  ONLINE       0     0     0

errors: No known data errors

root@beastie:~ # more /etc/sysctl.conf
# $FreeBSD: release/9.2.0/etc/sysctl.conf 112200 2003-03-13 18:43:50Z mux $
#
#  This file is read when going to multi-user and its contents piped thru
#  ``sysctl'' to adjust kernel values.  ``man 5 sysctl.conf'' for details.
#
kern.maxvnodes=250000
# Uncomment this to prevent users from seeing information about processes that
# are being run under another UID.
#security.bsd.see_other_uids=0

root@beastie:~ # more /boot/loader.conf
vfs.zfs.tgx.timeout="5"
kern.maxswzone="402653184"
zfs_load="YES"
vfs.root.mountfrom="zfs:zroot/ROOT/default"

Any help would be much appreciated. Thanks in advance, and sorry for my bad English ;-)
 
scolyo said:
I have some problems with my FreeBSD box. I want to use it as a SAN, with a ZFS-on-root raidz2 pool and istgt. Performance is very poor: I can't copy a large number of files without the server freezing and the iSCSI initiator losing the connection. In my dmesg, I have a kern.maxswzone error. I've tried to correct it with different values (equal to or greater than the total swap space), but it continues to give me the error message at boot time, and performance has not really improved.
What is the specific error you are getting in dmesg(8) for kern.maxswzone?
Why have you also modified kern.maxvnodes and vfs.zfs.tgx.timeout?
 
Thanks for answering :)

In dmesg, I have this error:

Code:
warning: total configured swap (33554432 pages) exceeds maximum recommended amount (22369776 pages).
warning: increase kern.maxswzone or reduce amount of swap.
warning: total configured swap (50331648 pages) exceeds maximum recommended amount (22369776 pages).
warning: increase kern.maxswzone or reduce amount of swap.
warning: total configured swap (67108864 pages) exceeds maximum recommended amount (22369776 pages).
warning: increase kern.maxswzone or reduce amount of swap.
warning: total configured swap (83886080 pages) exceeds maximum recommended amount (22369776 pages).
warning: increase kern.maxswzone or reduce amount of swap.
warning: total configured swap (100663296 pages) exceeds maximum recommended amount (22369776 pages).
warning: increase kern.maxswzone or reduce amount of swap.
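
For reference, the page counts in these warnings can be converted to sizes. This is only a rough sketch: it assumes the 4 KiB base page size of amd64, and the helper name pages_to_gib is made up for illustration.

```python
# Convert the page counts from the dmesg warnings into GiB.
# Assumption: 4 KiB pages (the amd64 base page size).
PAGE_SIZE = 4096
GIB = 1024 ** 3


def pages_to_gib(pages: int) -> float:
    """Return the size in GiB of the given number of memory pages."""
    return pages * PAGE_SIZE / GIB


# "maximum recommended amount" from the warnings
recommended_max = pages_to_gib(22369776)

# "total configured swap" as each further swap device is added
configured = [
    pages_to_gib(p)
    for p in (33554432, 50331648, 67108864, 83886080, 100663296)
]

# swapinfo reports the total in 1K-blocks
swapinfo_total = 402653184 * 1024 / GIB

print(f"recommended max: {recommended_max:.1f} GiB")
for size in configured:
    print(f"configured:      {size:.0f} GiB")
print(f"swapinfo total:  {swapinfo_total:.0f} GiB")
```

Under that assumption, the last warning and the swapinfo total describe the same amount of swap, several times larger than the recommended maximum.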

As for kern.maxvnodes and vfs.zfs.tgx.timeout, I followed the information on this page:

https://wiki.freebsd.org/ZFSTuningGuide

Did I misunderstand something?
Thanks
 
scolyo said:
Code:
root@beastie:~ # swapinfo
Device          1K-blocks     Used    Avail Capacity
/dev/gpt/swap0   67108864        0 67108864     0%
/dev/gpt/swap1   67108864        0 67108864     0%
/dev/gpt/swap2   67108864        0 67108864     0%
/dev/gpt/swap3   67108864        0 67108864     0%
/dev/gpt/swap4   67108864        0 67108864     0%
/dev/gpt/swap5   67108864        0 67108864     0%
Total           402653184        0 402653184     0%
Is your system really configured for 384 GB of swap (64 GB x 6)? Why?
 
Because I'm dumb...

My system has 16 GB of RAM, but I plan to upgrade to 32 GB. When I did the install, I wanted to configure the system with 64 GB of swap striped across the 6 disks, but I created 64 GB of swap on each disk...

I'm trying to correct this...
 
Hi @scolyo!

OK, so your swap is a little misconfigured. If you are planning on having 64 GB RAM, the absolute max swap you'd need is 128 GB, and that's divided between the partitions that make it up, in your case about 21-22 GB of swap per disk. You'll have to swapoff the partitions that are configured right now, repartition the disks, and then swapon the new partitions again to get rid of those messages.

But regarding your performance issues, let's start from the top. Please provide the output of:

# gpart show
# zdb | grep ashift

/Sebulon
 
Hi Sebulon,

Thanks for answering.

Well, I reconfigured my swap, but the system is still crashing under "heavy" load.

So, to be (almost) sure there is no problem, I launched memtest. It has been running for 235 hours, and no errors have been detected. This is the first time I've used it; that's a bit long, no?
 
Hi Sebulon,

Here is the output of the two commands you asked for:
Code:
root@beastie:~ # gpart show
=>        34  7814037101  ada0  GPT  (3.7T)
          34           6        - free -  (3.0k)
          40         216     1  freebsd-boot  (108k)
         256    20971520     2  freebsd-swap  (10G)
    20971776   113246208        - free -  (54G)
   134217984  7679819144     3  freebsd-zfs  (3.6T)
  7814037128           7        - free -  (3.5k)

=>        34  7814037101  ada1  GPT  (3.7T)
          34           6        - free -  (3.0k)
          40         216     1  freebsd-boot  (108k)
         256    20971520     2  freebsd-swap  (10G)
    20971776   113246208        - free -  (54G)
   134217984  7679819144     3  freebsd-zfs  (3.6T)
  7814037128           7        - free -  (3.5k)

=>        34  7814037101  ada2  GPT  (3.7T)
          34           6        - free -  (3.0k)
          40         216     1  freebsd-boot  (108k)
         256    20971520     2  freebsd-swap  (10G)
    20971776   113246208        - free -  (54G)
   134217984  7679819144     3  freebsd-zfs  (3.6T)
  7814037128           7        - free -  (3.5k)

=>        34  7814037101  ada3  GPT  (3.7T)
          34           6        - free -  (3.0k)
          40         216     1  freebsd-boot  (108k)
         256    20971520     2  freebsd-swap  (10G)
    20971776   113246208        - free -  (54G)
   134217984  7679819144     3  freebsd-zfs  (3.6T)
  7814037128           7        - free -  (3.5k)

=>        34  7814037101  ada4  GPT  (3.7T)
          34           6        - free -  (3.0k)
          40         216     1  freebsd-boot  (108k)
         256    20971520     2  freebsd-swap  (10G)
    20971776   113246208        - free -  (54G)
   134217984  7679819144     3  freebsd-zfs  (3.6T)
  7814037128           7        - free -  (3.5k)

=>        34  7814037101  ada5  GPT  (3.7T)
          34           6        - free -  (3.0k)
          40         216     1  freebsd-boot  (108k)
         256    20971520     2  freebsd-swap  (10G)
    20971776   113246208        - free -  (54G)
   134217984  7679819144     3  freebsd-zfs  (3.6T)
  7814037128           7        - free -  (3.5k)

root@beastie:~ # zdb | grep ashift
            ashift: 12

I'm also including the output of the zfs get recordsize command, because as I understand it the ashift parameter is linked to the block size:

Code:
root@beastie:~ # zfs get recordsize
NAME                       PROPERTY    VALUE    SOURCE
zroot                      recordsize  128K     default
zroot/ROOT                 recordsize  128K     default
zroot/ROOT/default         recordsize  128K     default
zroot/data                 recordsize  128K     default
zroot/home                 recordsize  128K     default
zroot/iscsi                recordsize  128K     default
zroot/iscsi/idisk1         recordsize  -        -
zroot/iscsi/idisk2         recordsize  -        -
zroot/iscsi/idisk3         recordsize  -        -
zroot/tmp                  recordsize  128K     default
zroot/usr                  recordsize  128K     default
zroot/usr/local            recordsize  128K     default
zroot/usr/obj              recordsize  128K     default
zroot/usr/ports            recordsize  128K     default
zroot/usr/ports/distfiles  recordsize  128K     default
zroot/usr/ports/packages   recordsize  128K     default
zroot/usr/src              recordsize  128K     default
zroot/var                  recordsize  128K     default
zroot/var/crash            recordsize  128K     default
zroot/var/db               recordsize  128K     default
zroot/var/db/pkg           recordsize  128K     default
zroot/var/empty            recordsize  128K     default
zroot/var/log              recordsize  128K     default
zroot/var/mail             recordsize  128K     default
zroot/var/run              recordsize  128K     default
zroot/var/tmp              recordsize  128K     default

Thanks in advance :)
 
Don't use that much swap, it's going to be wasted space. The rule of thumb is indeed "swap = 2 x RAM" but you don't want a system with 64 GB of swap. With that much memory you're probably never going to touch any of it. But you don't want to turn off swap completely either. Modern operating systems will always swap some things. Swap usage in and of itself isn't a problem, it's excessive swapping that will cause performance issues. If you ever find yourself running out of memory (very unlikely unless you have a really badly behaving application) you need to increase the amount of RAM. Increasing swap is just too much of a performance hit. I think 8 or 16 GB would be more than enough swap.
 
134217984 / 4096 = 32768(.0625)

Is that really 4k aligned? Perhaps someone could give a second opinion. "ashift" on the other hand was 12, so all good there.
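
As a possible second opinion, here is a rough sketch of the check. It assumes that gpart show lists offsets in 512-byte sectors rather than bytes (the output above shows 20971520 sectors as 10G, which fits), so the start sector is converted to a byte offset before testing it against 4096. The function name is_aligned is made up for illustration.

```python
# Check whether partition start offsets fall on a 4 KiB boundary.
# Assumption: gpart show reports offsets in 512-byte sectors.
SECTOR_SIZE = 512
ALIGNMENT = 4096


def is_aligned(start_sector: int,
               sector_size: int = SECTOR_SIZE,
               alignment: int = ALIGNMENT) -> bool:
    """Return True if the byte offset of start_sector is a multiple of alignment."""
    return (start_sector * sector_size) % alignment == 0


# Start sectors taken from the gpart show output above
partitions = {
    "freebsd-boot": 40,
    "freebsd-swap": 256,
    "freebsd-zfs": 134217984,
}

for name, start in partitions.items():
    print(f"{name:13s} start={start:>10d} aligned={is_aligned(start)}")
```

With 512-byte sectors, a start sector is 4k aligned when it is divisible by 8, which is a quicker test than dividing the sector number by 4096.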

Perhaps you could try and decrease ARC:
/boot/loader.conf
Code:
vfs.zfs.arc_max="11G"

That requires a reboot to take effect. Then there's all the other "evil" tuning you have there; take that out to see if it makes any difference.

We'll get to istgt after the system is stable again.

/Sebulon
 
Hi all, sorry for the long delay,

I've modified the ARC parameter in the loader.conf file, but it didn't really improve things.

I have some istgt error messages on the console:

Code:
root@beastie:~ # tail -f /var/log/messages
May  7 06:38:40 beastie istgt[1007]: Logout(discovery) from iqn.1991-05.com.microsoft:storage.ntiufm.fr (10.8.52.200) on (10.8.52.22:3260,1), ISID=400001370000, TSIH=2, CID=1, HeaderDigest=off, DataDigest=off
May  7 06:39:02 beastie istgt[1007]: Login from iqn.1991-05.com.microsoft:storage.ntiufm.fr (10.8.52.200) on iqn.2013-12.com.example:disk3 LU3 (10.8.52.22:3260,1), ISID=400001370000, TSIH=3, CID=1, HeaderDigest=off, DataDigest=off
May  7 07:13:15 beastie istgt[1007]: istgt_iscsi.c:1261:istgt_iscsi_write_pdu_internal: ***ERROR*** writev() failed (errno=32,iqn.1991-05.com.microsoft:storage.ntiufm.fr,time=0)
May  7 07:13:15 beastie istgt[1007]: istgt_iscsi.c:3924:istgt_iscsi_task_response: ***ERROR*** iscsi_write_pdu() failed
May  7 07:13:15 beastie istgt[1007]: istgt_iscsi.c:5392:sender: ***ERROR*** iscsi_task_response() CmdSN=215009 failed on iqn.2013-12.com.example:disk3,t,0x0001(iqn.1991-05.com.microsoft:storage.ntiufm.fr,i,0x400001370000)
May  7 07:13:16 beastie istgt[1007]: Login from iqn.1991-05.com.microsoft:storage.ntiufm.fr (10.8.52.200) on iqn.2013-12.com.example:disk3 LU3 (10.8.52.22:3260,1), ISID=400001370000, TSIH=4, CID=1, HeaderDigest=off, DataDigest=off
May  7 07:13:54 beastie istgt[1007]: istgt_iscsi.c:1261:istgt_iscsi_write_pdu_internal: ***ERROR*** writev() failed (errno=32,iqn.1991-05.com.microsoft:storage.ntiufm.fr,time=0)
May  7 07:13:54 beastie istgt[1007]: istgt_iscsi.c:3924:istgt_iscsi_task_response: ***ERROR*** iscsi_write_pdu() failed
May  7 07:13:54 beastie istgt[1007]: istgt_iscsi.c:5392:sender: ***ERROR*** iscsi_task_response() CmdSN=225009 failed on iqn.2013-12.com.example:disk3,t,0x0001(iqn.1991-05.com.microsoft:storage.ntiufm.fr,i,0x400001370000)
May  7 07:13:54 beastie istgt[1007]: Login from iqn.1991-05.com.microsoft:storage.ntiufm.fr (10.8.52.200) on iqn.2013-12.com.example:disk3 LU3 (10.8.52.22:3260,1), ISID=400001370000, TSIH=5, CID=1, HeaderDigest=off, DataDigest=off
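
As a side note, the errno=32 in the writev() errors can be looked up by number. A minimal sketch follows; on FreeBSD (and Linux) 32 maps to EPIPE, "Broken pipe". Reading that as the initiator having dropped the connection before istgt finished writing is an assumption, not something confirmed in this thread.

```python
import errno
import os

# errno value copied from the istgt "writev() failed" log lines
code = 32

# Map the number to its symbolic name and the system's message text
name = errno.errorcode.get(code, "unknown")
message = os.strerror(code)

print(f"errno {code}: {name} ({message})")
```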

There is something I don't understand: when the copy stops (I use robocopy; it's a Windows 2003 server), if I launch a command, even a simple "ls", on the FreeBSD box, the copy restarts...

Nothing seems wrong in the log

Thanks again

/Ludo
 
Hi,

I've found something; there is a new message:
Code:
May  7 13:08:48 beastie kernel: sonewconn: pcb 0xfffffe0116113188: Listen queue overflow: 4 already in queue awaiting acceptance
May  7 13:08:48 beastie last message repeated 405 times

Is it a NIC problem?

Still investigating
 