Horrible iSCSI (istgt) performance
olav
March 19th, 2011, 19:46
I get around 1MB/s on average on a gigabit network.
I've created a ZFS volume, which I've exported as a block device with istgt:
zfs create -V 330G -b 512 tank/vol
Jumbo frames are enabled, with the MTU set to 7000. It also runs on a separate network/VLAN.
I've tried two different switches; both give the same performance.
I guess it's a configuration issue?
/usr/local/etc/istgt/istgt.conf
[Global]
Comment "Global section"
# node name (not include optional part)
NodeBase "iqn.backupbay.com"

# files
PidFile /var/run/istgt.pid
AuthFile /usr/local/etc/istgt/auth.conf

# directories
# for removable media (virtual DVD/virtual Tape)
MediaDirectory /var/istgt

# syslog facility
LogFacility "local7"

# socket I/O timeout sec. (polling is infinity)
Timeout 30
# NOPIN sending interval sec.
NopInInterval 20

# authentication information for discovery session
DiscoveryAuthMethod Auto
#DiscoveryAuthGroup AuthGroup9999

# reserved maximum connections and sessions
# NOTE: iSCSI boot is 2 or more sessions required
MaxSessions 32
MaxConnections 8

# maximum number of sending R2T in each connection
# actual number is limited to QueueDepth and MaxCmdSN and ExpCmdSN
# 0=disabled, 1-256=improves large writing
MaxR2T 128

# iSCSI initial parameters negotiate with initiators
# NOTE: incorrect values might crash
MaxOutstandingR2T 16
DefaultTime2Wait 2
DefaultTime2Retain 60
FirstBurstLength 262144
MaxBurstLength 1048576
MaxRecvDataSegmentLength 262144

# NOTE: not supported
InitialR2T Yes
ImmediateData Yes
DataPDUInOrder Yes
DataSequenceInOrder Yes
ErrorRecoveryLevel 0

[UnitControl]
Comment "Internal Logical Unit Controller"
#AuthMethod Auto
AuthMethod CHAP Mutual
AuthGroup AuthGroup1
# this portal is only used as controller (by istgtcontrol)
# if it's not necessary, no portal is valid
#Portal UC1 [::1]:3261
Portal UC1 127.0.0.1:3261
# accept IP netmask
#Netmask [::1]
Netmask 127.0.0.1

# You should set IPs in /etc/rc.conf for physical I/F
[PortalGroup1]
Comment "SINGLE PORT TEST"
# Portal Label(not used) IP(IPv6 or IPv4):Port
#Portal DA1 [2001:03e0:06cf:0003:021b:21ff:fe04:f405]:3260
Portal DA1 192.168.3.36:3260

[InitiatorGroup1]
Comment "Initiator Group1"
#InitiatorName "iqn.1991-05.com.microsoft:saturn"
# special word "ALL" match all of initiators
InitiatorName "ALL"
Netmask 192.168.3.0/24

# TargetName, Mapping, UnitType, LUN0 are minimum required
[LogicalUnit1]
Comment "Hard Disk Sample"
# full specified iqn (same as below)
#TargetName iqn.2007-09.jp.ne.peach.istgt:disk1
# short specified non iqn (will add NodeBase)
TargetName disk1
TargetAlias "Data Disk1"
# use initiators in tag1 via portals in tag1
Mapping PortalGroup1 InitiatorGroup1
# accept both CHAP and None
AuthMethod Auto
AuthGroup AuthGroup2
#UseDigest Header Data
UseDigest Auto
UnitType Disk
# SCSI INQUIRY - Vendor(8) Product(16) Revision(4) Serial(16)
UnitInquiry "FreeBSD" "iSCSI Disk" "0123" "10000001"
# Queuing 0=disabled, 1-255=enabled with specified depth.
#QueueDepth 32

# override global setting if need
#MaxOutstandingR2T 16
#DefaultTime2Wait 2
#DefaultTime2Retain 60
#FirstBurstLength 262144
#MaxBurstLength 1048576
#MaxRecvDataSegmentLength 262144
#InitialR2T Yes
#ImmediateData Yes
#DataPDUInOrder Yes
#DataSequenceInOrder Yes
#ErrorRecoveryLevel 0

# LogicalVolume for this unit on LUN0
# for file extent
#LUN0 Storage /tank/iscsi/istgt-disk1 10GB
# for raw device extent
#LUN0 Storage /dev/ad4 Auto
# for ZFS volume extent
LUN0 Storage /dev/zvol/tank/vol Auto
I was hoping for at least 50MB/s with iSCSI. The same ZFS pool can easily read and write at 70-80MB/s over Samba (without jumbo frames).
mamalos
March 19th, 2011, 20:24
Try to isolate the problem. Run your experiment without the use of vlans or other virtual devices. Run vmstat and/or iostat on both machines and see if something peculiar is happening. Test it without jumbo frame support using the normal MTU (1500). Try exporting a ufs filesystem of the same physical disk. Crosscheck performance to see where the bottleneck comes from. Isolate your problem.
Check your logs!
By the way, have you checked your dmesg in case it shows something about sysctl tuning?
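One way to cross-check where the bottleneck sits is to run the same sequential write against different backing stores (the iSCSI disk on the initiator, a local UFS file, a file on the pool) and compare the numbers. The following is only a minimal sketch of such a test in Python; the function name, the default sizes and the temporary path are assumptions made for illustration, and it is not a replacement for dd, iostat or gstat.

```python
import os
import tempfile
import time


def measure_write_mb_s(path, total_bytes=64 * 1024 * 1024, block_size=1024 * 1024):
    """Write total_bytes sequentially to path and return throughput in MB/s."""
    # Random data, so that a compressing filesystem cannot shrink the writes.
    block = os.urandom(block_size)
    written = 0
    start = time.monotonic()
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())  # count the time needed to reach stable storage
    elapsed = max(time.monotonic() - start, 1e-9)
    return written / elapsed / 1e6


if __name__ == "__main__":
    # Hypothetical target: a temporary file; point it at the mount to be tested.
    with tempfile.TemporaryDirectory() as tmp:
        rate = measure_write_mb_s(os.path.join(tmp, "probe.bin"), 8 * 1024 * 1024)
        print(f"{rate:.1f} MB/s")
```

Running it once per candidate path with identical sizes keeps the comparison fair; a large gap between two paths points at the layer that differs between them.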
olav
March 20th, 2011, 02:09
I haven't tuned sysctl; I use the default settings. None of the servers seem to have any I/O or CPU issues. I've tested without jumbo frames and got the same result.
I'm going to try exporting an empty file on the UFS partition as a volume.
I can't find anything in the logs either.
Interestingly, though, gstat shows the iSCSI device at 100% busy when I use dd.
olav
March 20th, 2011, 11:59
Something is really wrong. After rebooting the initiator, I can't connect to the iSCSI target at all.
I find this error message on the target:
Mar 20 11:54:53 zpool istgt[3262]: istgt_iscsi.c:4328:istgt_iscsi_send_nopin: ***ERROR*** before Full Feature
Mar 20 11:54:53 zpool istgt[3262]: istgt_iscsi.c:4817:worker: ***ERROR*** iscsi_send_nopin() failed
The initiator says
recvpdu: Socket is not connected
recvpdu failed
olav
March 24th, 2011, 09:16
The extremely slow iSCSI was caused by a broken network cable. I replaced it and now it runs very well! However, I get a maximum of 50MB/s, and the target is far from stressed. Are there other things I can do to improve performance?
dennylin93
March 24th, 2011, 09:36
There are many ways of tuning ZFS. There isn't a best method as setups vary. Try searching the mailing lists since many people have posted tuning stuff (especially Jeremy Chadwick).
AndyUKG
March 24th, 2011, 15:06
I think the first thing to do would be to try without ZFS. I'd be interested to see what performance you get using a disk device directly, i.e. /dev/ada1 or whatever.
I also get terrible istgt performance with ZFS, but don't currently have the kit to do the testing I propose above as my istgt system is already in production.
Thanks, Andy.
olav
March 24th, 2011, 23:27
I get 80-90MB/s with the Ubuntu initiator. I don't have a (fast) raw hard drive at the moment to test with.
I do have problems with my switch though, it doesn't seem to like jumbo frames. I need to test it more. A direct cable to each network card works fine with jumbo frames (but I don't get any performance benefits). But if anyone else is thinking about buying the NetGear GS724T, don't do it.
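A rough back-of-the-envelope calculation suggests why jumbo frames bring little throughput gain on a gigabit link: the per-frame overhead is already small at the standard MTU. The sketch below assumes plain IPv4 and TCP without options, no VLAN tag, and ignores iSCSI's own headers; the constant and function names are made up for this example.

```python
ETHERNET_OVERHEAD = 38  # preamble 8 + header 14 + FCS 4 + inter-frame gap 12
IP_TCP_HEADERS = 40     # IPv4 header 20 + TCP header 20, no options


def max_payload_mb_s(mtu, link_bits_per_s=1_000_000_000):
    """Upper bound on TCP payload throughput (MB/s) for a given MTU."""
    payload = mtu - IP_TCP_HEADERS      # bytes of user data per frame
    on_wire = mtu + ETHERNET_OVERHEAD   # bytes the frame occupies on the wire
    return link_bits_per_s / 8 * payload / on_wire / 1e6


for mtu in (1500, 7000):
    print(mtu, round(max_payload_mb_s(mtu), 1))
# prints:
# 1500 118.7
# 7000 123.6
```

Under these assumptions the ceiling moves by only about 5 MB/s between MTU 1500 and MTU 7000, so any benefit from jumbo frames would mostly come from fewer packets to process, not from extra bandwidth.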
AndyUKG
March 25th, 2011, 13:39
I don't have a (fast) raw harddrive at the moment to test with.
I think any disk would do to test, if the problem is just ZFS related you will see a huge jump in performance even with a slow disk.
kometen
April 12th, 2011, 13:38
Hi.
I've installed a server with ZFS version 28 and FreeBSD 8.2 prerelease. The pool consists of 11 disks in raidz2. The disks are 2 TB SATA disks on an Areca controller. The server has 12 GB of RAM and a Xeon E5620 @ 2.4 GHz. I installed net/istgt from ports. The only thing I tuned was 'QueueDepth 64'.
I have mounted the iSCSI disk from Windows 7/2008r2 and VMware 4.1, and performance peaks at 118 MB/s, which is very nice on plain gigabit NICs. :-)
[LogicalUnit99]
Comment "disk99"
TargetName disk99
TargetAlias "Data Disk99"
Mapping PortalGroup1 InitiatorGroup1
AuthMethod CHAP
AuthGroup AuthGroup1
UseDigest Auto
UnitType Disk
QueueDepth 64
LUN0 Storage /dev/zvol/data/iscsi/disk99 Auto
Regards
Claus
AndyUKG
April 12th, 2011, 15:45
Hi Claus,
So what has caused the jump in performance? Just the V28 ZFS you think?
Thanks Andy.
kometen
April 13th, 2011, 19:35
Hi.
The only setting I changed was the queue depth, which was commented out in the example istgt.conf file. I did not try ZFS ver. 15, which is the default in 8.2, but patched it to ver. 28 before creating the pool.
regards
Claus
AndyUKG
April 14th, 2011, 10:29
Hi Claus,
I don't see any noticeable difference when varying the size of queue depth. Do you actually see those speeds when moving real data around or are you seeing that from a windows benchmark utility? I see very good performance using benchmarking utils, but when actually moving real data the performance is very low...
If you really have got good performance then the only obvious explanation would be ZFS v28, I am using v15 currently,
thanks Andy.
kometen
April 15th, 2011, 08:26
Hi.
I copied my Download folder on Win 7 using both Samba and iSCSI. The smaller files don't utilize the network completely (of course), but the larger ISO files have no problem moving at approx. 100 MB/s, sometimes more, sometimes less.
The folder has 43.3 GB of data.
regards
Claus
peetaur
June 7th, 2012, 08:50
The solution, as stated by a few, is to set QueueDepth (I found 64 works well, but only tested the VirtualBox initiator).
Why is this thread not marked solved? Is it because you wanted to know why the different initiators will perform well on the same default unset QueueDepth? Or did different initiators not benefit from QueueDepth? (I ask for my own personal knowledge, as I am starting to use istgt)
frijsdijk
June 11th, 2012, 08:05
I too can get no faster than about 50MB/sec. In this case, a Windows host with the iSCSI initiator runs Veeam backup software for VMware and writes to a disk that is connected via a 1Gbit link (dark fiber) to the FreeBSD host running istgt.
Iperf gets more traffic over the link (can max out the 1Gbit link), but istgt seems to get no faster than about half the 1Gbit link.
Probably because it's not running in the kernel but in user space?
peetaur
June 14th, 2012, 10:18
I tested istgt with a file on zfs; it went only 4.5 MB/s! A zvol went over 100. Is this what you did? (and due to zvol hangs, I'm reverting to NFS for now.)
throAU
June 20th, 2012, 02:44
As an aside, have you found any decent documentation for istgt?
I gave up with it because I couldn't make sense of the documentation. I'm by no means a storage guru, but I had run an iSCSI SAN for 2-3 years at that point...
peetaur
June 20th, 2012, 09:07
One online howto made istgt segfault. :D Editing the sample config files worked fine, but I found no documentation whatsoever... just examples by 3rd parties. I wanted documentation so I would know what every setting meant, but found none.
Besides the lack of documentation, if you make a typo in the istgt.conf file and try to reload/refresh the config, istgt can hang. So I don't think I want to use istgt at all. But there seems to be no good alternative. If there is one, I would love to hear about it.
phoenix
June 20th, 2012, 19:33
Something to keep in mind when using NTFS-on-iSCSI-on-ZFS: when you create the zvol, make sure that you match its block size (volblocksize, set with -b; recordsize applies to filesystems, not zvols) to the NTFS block size, and (if possible) set the MTU of the link to be the same. If you don't, then you will be doing a lot of extra roundtrips for all reads and writes to the zvol.
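To make the block-size point concrete, here is a simplified model that counts how many volume blocks a single write touches, and how many of those are only partly covered (a partly covered block generally has to be read before it can be rewritten). It is just an illustrative sketch: the function name and the example sizes are assumptions, and it says nothing about ZFS internals such as caching.

```python
def blocks_touched(offset, length, block_size):
    """Return (total, partial) volume blocks hit by a write of length bytes at offset."""
    end = offset + length
    first = offset // block_size
    last = (end - 1) // block_size
    total = last - first + 1
    # Only the first and last block of the range can be partly covered.
    partial = sum(
        1
        for b in {first, last}
        if b * block_size < offset or (b + 1) * block_size > end
    )
    return total, partial


# One 4 KiB NTFS cluster written to volumes with different block sizes:
print(blocks_touched(0, 4096, 512))      # (8, 0): eight small blocks per cluster
print(blocks_touched(0, 4096, 4096))     # (1, 0): exact match
print(blocks_touched(4096, 4096, 8192))  # (1, 1): half a block, read-modify-write
print(blocks_touched(2048, 4096, 4096))  # (2, 2): misaligned, two partial blocks
```

The first case corresponds to the `-b 512` volume from the first post if NTFS uses 4 KiB clusters, which is an assumption here, since the thread does not state the cluster size.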
peetaur
June 27th, 2012, 09:19
I tested istgt with a file on zfs; it went only 4.5 MB/s! A zvol went over 100.
I tested this again, with VERY different results.
Previously the initiator machine was using FreeBSD with VirtualBox's built in initiator, and 1Gbps. And as I said before, it was only around 4.5 MB/s (writing).
This time, I used a Linux machine, and it runs as fast as zvols (95-110 MB/s over 1Gbps, and 210-230 over 10 Gbps). And for comparison, netcat over the 10Gbps goes about 550 MB/s, and 2 threads goes 950 MB/s; I'm not sure what to do to speed it up for a single thread or iSCSI.
But this doesn't mean there is anything wrong with FreeBSD. It might simply be preserving data integrity better than vbox on Linux does. If I test this with Linux KVM, which I know does not have the data integrity issues that vbox has, and get the slow result, I would blame ZFS. If I get the fast result, I would blame FreeBSD or its changes to vbox.
dave
June 29th, 2012, 08:36
Don't forget to add your NICs into the equation. There are good NICs and bad. Just because it says gigabit doesn't mean you will get the same from NIC to NIC. Some rely on the CPU, some don't.
peetaur
June 29th, 2012, 08:47
If you are referring to my tests:
I doubt that a CPU or a cheap network card would be the difference between 4.5 MB/s and 40-80 MB/s. And the test clearly showed the zil at 100% load with the FreeBSD + vbox machine. And other tests (eg. scp) always show fast enough results.
dave
June 29th, 2012, 08:54
I meant that more as a general reply because I noted that no one had mentioned it yet. I was not referring specifically to your tests.
olav
June 30th, 2012, 08:00
If you are referring to my tests:
I doubt that a CPU or a cheap network card would be the difference between 4.5 MB/s and 40-80 MB/s. And the test clearly showed the zil at 100% load with the FreeBSD + vbox machine. And other tests (eg. scp) always show fast enough results.
Actually, when it comes to iSCSI there is a HUGE difference between a good and a BAD NIC. I tested first with a Realtek NIC and then tried an Intel. I went from 5MB/s to 125MB/s.
tinusb
August 9th, 2012, 18:39
Hey all! Is there any possibility you can post some of your istgt.conf files as examples? I have a similar problem with bad performance. It's not a very powerful PC, but I expect better performance than what I'm getting...
iostat reports about 8MB/sec stable throughput....
dave
August 9th, 2012, 20:30
When you say "not a very powerful pc" - what are we talking here?
Do you get different throughput using scp?
Is your CPU spiking when you do your tests? If so, you may want to look for a better network card, one that doesn't have to interrupt the CPU.
Do your drives support the speed you are expecting? What about the bus that the drives are connected to?
tinusb
August 10th, 2012, 08:15
Hi,
Haven't checked using scp yet...
SCP copy speed is:
2900kb/sec
No spiking of the CPU. Drives are SATA2, connected via a SATA1 bay, so can only do SATA1 - with normal on board controller.
Specs are:
CPU AMD A8-3850 CPU (2900Mhz)
RAM 8GB Mushkin DDR-1333
MOBO Gigabyte A75-D3H
HDD WD10EARX WD10EZRX WD10EZRX (thus 3x 1TB WD Green)
NIC Realtek 8168/8111 Gigabit; and INTeL/Pro1000 (Have the INTeL cards in our own machines, and do manage to get 85MB/sec tho)
When I take a look at the performance graphs in ESXi, I get a maximum throughput at times of about 300Mbps, so only 30% utilization of a Gbit connection? I do understand there are overheads, etc...
Regards
Tinus
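Since the thread mixes megabits per second (the ESXi graphs) with megabytes per second (most other posts), a small conversion helper makes the figures comparable. This is just illustrative arithmetic with made-up function names, using decimal units (1 MB = 8 Mbit).

```python
def mbps_to_mb_s(mbps):
    """Convert megabits per second to megabytes per second."""
    return mbps / 8


def link_utilization(mb_s, link_mbps=1000):
    """Fraction of the link used by a payload rate given in MB/s."""
    return mb_s * 8 / link_mbps


print(mbps_to_mb_s(300))     # 37.5 (MB/s behind the 300Mbps peak)
print(link_utilization(85))  # 0.68 (the 85MB/sec seen on the Intel machines)
```

So the 300Mbps peak in the ESXi graphs corresponds to 37.5 MB/s, well below the 85MB/sec reported for the machines with Intel cards.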
dave
August 10th, 2012, 14:50
Actually, when it comes to iSCSI there is a HUGE difference between a good and a BAD NIC. I tested first with a Realtek NIC and then tried an Intel. I went from 5MB/s to 125MB/s.
Is it worth a try to switch the NIC to intel?
olav
August 11th, 2012, 11:23
Absolutely. I recommend the cheap Intel desktop ones
http://www.ebay.com/itm/Intel-Pro-1000-PT-1Gb-Ethernet-PCIe-Card-EXPI9300PT-55-/380460165058?pt=US_Internal_Network_Cards&hash=item58953167c2
http://www.amazon.com/Intel-Gigabit-Network-Adapter-EXPI9301CTBLK/dp/B001CY0P7G/ref=sr_1_10?s=electronics&ie=UTF8&qid=1344680554&sr=1-10&keywords=intel+desktop+1000
tinusb
August 11th, 2012, 16:15
Tried with Intel cards, no difference.
My own setup has Intel NICs with an Adaptec 51645 card and the drives as JBOD, and it performs like crazy! But I can't believe that the maximum I can get over iSCSI is only about 3mb/sec?
peetaur
August 13th, 2012, 08:23
Just to repeat the same thing again, you DID update QueueDepth to 64, right? Of everything I tested, these settings are all that mattered.
Here are some example snippets:
[Global]
...
# QueueDepth is limited by this number, and I don't know if it is per connection or across all (due to zero documentation available) so I raised it a bunch.
MaxR2T 256
...
[LogicalUnit1]
...
# 64 was perfectly sufficient for VirtualBox clients, but something else I tried ... maybe Proxmox, had errors until doubling it again.
QueueDepth 128
...
larrypatrickmaloney
August 15th, 2012, 22:07
I've gotta chime in on this...
So there are some 'alternate' ways to do iSCSI.
First of all, make sure you've got your NICs set up with optimal MTUs (for FreeBSD, best to set it to 8244).
Second, make sure you create your block devices with 4k or 8k alignment.
Now for the good stuff...
Run multiple istgts on your target. :)
istgt is single-thread bound.
To get around this, launch multiple istgt instances, with each one providing a single block device.
Once you see the target devices on the initiator side, STRIPE ON THE INITIATOR. :)
Try it.
Larry
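For readers unfamiliar with striping, the idea is that consecutive chunks of the logical disk alternate between the target devices, so a large sequential transfer keeps several istgt instances busy at once. The sketch below only illustrates that address mapping; in practice the initiator's software RAID-0 layer does this, and the stripe size and target count used here are hypothetical values, not settings taken from the thread.

```python
def stripe_map(logical_offset, stripe_size, n_targets):
    """Map a byte offset on the striped volume to (target index, offset on that target)."""
    stripe_index = logical_offset // stripe_size
    target = stripe_index % n_targets
    target_offset = (stripe_index // n_targets) * stripe_size + logical_offset % stripe_size
    return target, target_offset


# Two targets with a 64 KiB stripe (both values are assumptions):
STRIPE = 64 * 1024
for off in (0, STRIPE, 2 * STRIPE, 2 * STRIPE + 100):
    print(off, stripe_map(off, STRIPE, 2))
# prints:
# 0 (0, 0)
# 65536 (1, 0)
# 131072 (0, 65536)
# 131172 (0, 65636)
```

With this layout a 1 MiB sequential read is split into 16 stripes, eight per target, which is what lets two single-threaded daemons work on the same transfer in parallel.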