Believe it or not, this is the second time I'm writing this. I had nearly completed the post when, for some reason, I thought it best to press CTRL+W without having saved a draft (don't ask why, I don't know):
http://i.stack.imgur.com/jiFfM.jpg
OK, lesson learned... So, I wanted to write about something very interesting that I have yet to find explained in detail anywhere: what is needed and what to expect. I have found many articles online that describe iSCSI multipathing, but not with LACP in the mix; of the attempts that do combine them, there are many failures and few successes reported. This is what I aim to add here.
Here are some of the sources I've plowed through online:
http://n4f.siftusystems.com/index.php/2013/07/03/iscsi-multipathing-mpio/
http://nex7.blogspot.se/2013/03/ipmp-vs-lacp-vs-mpio.html
http://forums.freenas.org/index.php...esxi-setup-via-iscsi-having-some-issues.8557/
http://arstechnica.com/civis/viewtopic.php?t=1184984
http://etherealmind.com/iscsi-netwo...ipathing-hba-ha-high-availability-redundancy/
http://agnosticcomputing.com/2014/0...formance-oriented-zfs-box-for-hyper-v-vmware/
http://agnosticcomputing.com/2014/03/26/labworks-11-2-i-heart-the-arc-lets-pull-some-drives/
http://agnosticcomputing.com/2014/04/16/labworks-21-4-converged-hyper-v-switching-like-a-boss/
http://agnosticcomputing.com/2014/0...me-convergedswitching-for-hyper-v-now-please/
Storage specs
Code:
1x Supermicro X8SIL-F
2x Supermicro AOC-USAS2-L8i
2x Supermicro CSE-M35T-1B
1x Intel Core i5 650 3.2 GHz
4x 2 GB 1333 MHz DDR3 ECC RDIMM
8x 2 TB HDD (Mix of Seagate, Samsung, Western Digital)
FreeBSD-10.0-RELEASE-p2
Hypervisor specs
Code:
1x Supermicro X8SIL-F
1x Intel Xeon X3470 2.93 GHz
2x 2 GB 1333 MHz DDR3 ECC RDIMM
1x CF to SATA Converter
1x 32 GB Compact Flash card (for boot/root)
CentOS release 6.5
Storage setup
Code:
Pool layout:
  pool: pool1
 state: ONLINE
  scan: scrub repaired 0 in 3h53m with 0 errors on Mon Jun 9 05:53:39 2014
config:

        NAME             STATE     READ WRITE CKSUM
        pool1            ONLINE       0     0     0
          mirror-0       ONLINE       0     0     0
            gpt/disk2    ONLINE       0     0     0
            gpt/disk3    ONLINE       0     0     0
          mirror-1       ONLINE       0     0     0
            gpt/disk4    ONLINE       0     0     0
            gpt/disk5    ONLINE       0     0     0
          mirror-2       ONLINE       0     0     0
            gpt/disk6    ONLINE       0     0     0
            gpt/disk7    ONLINE       0     0     0
          mirror-3       ONLINE       0     0     0
            gpt/disk8    ONLINE       0     0     0
            gpt/disk9    ONLINE       0     0     0

errors: No known data errors
Block mode:
ashift: 12
ashift: 12
ashift: 12
ashift: 12
Partition layout:
2048 3907027080 1 disk2 (1.8T)
2048 3907027080 1 disk3 (1.8T)
2048 3907027080 1 disk4 (1.8T)
2048 3907027080 1 disk5 (1.8T)
2048 3907027080 1 disk6 (1.8T)
2048 3907027080 1 disk7 (1.8T)
2048 3907027080 1 disk8 (1.8T)
2048 3907027080 1 disk9 (1.8T)
Volume creation:
# zfs create -o compress=lz4 -b 128k -s -V 500g pool1/hypervisor_1
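The -s flag makes the zvol sparse (thin provisioned) and -b 128k sets the volblocksize. If you want to double-check what you actually ended up with, something like this should do:
# zfs get volsize,volblocksize,compression,refreservation pool1/hypervisor_1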
iSCSI target config:
/etc/ctl.conf
Code:
portal-group pg1 {
    discovery-auth-group no-authentication
    listen 172.16.10.11:3260
    listen 172.16.11.11:3261
}

target iqn.2014-06.bar.foo:storage.foo.bar:hypervisor-1 {
    auth-group no-authentication
    portal-group pg1
    lun 0 {
        path /dev/zvol/pool1/hypervisor_1
        size 500G
    }
}
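For this to take effect, ctld needs to be enabled and started; a minimal sketch, assuming a stock FreeBSD 10 install:
# sysrc ctld_enable=YES
# service ctld start
# sockstat -4 -l | grep ctld
The sockstat line is just to confirm that ctld is actually listening on both portal addresses.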
Network setup
The switch used was a Netgear GS108T v2.
- Port 1,2,3,4 configured for jumbo frames
- Port 1,2 configured as LACP for storage (LAG 1)
- Port 3,4 configured as LACP for hypervisor (LAG 2)
- Create VLAN 10,11
- Set tagged VLAN 1,10,11 on LAG 1,2 (I can't remember if this was strictly needed, but I did it just in case.)
- Configure LAG 1,2 for jumbo frames as well.
Storage network config
/etc/rc.conf
Code:
...
ifconfig_em0="mtu 9000 up"
ifconfig_em1="mtu 9000 up"
cloned_interfaces="lagg0 vlan1 vlan10 vlan11"
ifconfig_lagg0="up laggproto lacp laggport em0 laggport em1 lagghash l3,l4"
ifconfig_vlan1="inet 192.168.0.4 netmask 255.255.255.0 vlan 1 vlandev lagg0 mtu 1500"
ifconfig_vlan10="inet 172.16.10.11 netmask 255.255.255.0 vlan 10 vlandev lagg0 mtu 9000"
ifconfig_vlan11="inet 172.16.11.11 netmask 255.255.255.0 vlan 11 vlandev lagg0 mtu 9000"
...
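These settings take effect at boot; to apply them right away, restart the network (not over SSH, for obvious reasons) and then check that the lagg has actually negotiated LACP:
# service netif restart && service routing restart
# ifconfig lagg0
In the ifconfig output, both laggports should show ACTIVE,COLLECTING,DISTRIBUTING once LACP has negotiated with the switch.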
Hypervisor network config
/etc/modprobe.d/bonding.conf
Code:
alias bond0 bonding
options bond0 max_bonds=8 mode=802.3ad xmit_hash_policy=layer3+4 miimon=100 downdelay=0 updelay=0
/etc/sysconfig/network-scripts/ifcfg-eth0
Code:
NM_CONTROLLED="no"
BOOTPROTO="none"
DEVICE="eth0"
ONBOOT="yes"
USERCTL="no"
MASTER="bond0"
SLAVE="yes"
/etc/sysconfig/network-scripts/ifcfg-eth1
Code:
NM_CONTROLLED="no"
BOOTPROTO="none"
DEVICE="eth1"
ONBOOT="yes"
USERCTL="no"
MASTER="bond0"
SLAVE="yes"
/etc/sysconfig/network-scripts/ifcfg-bond0
Code:
DEVICE="bond0"
NM_CONTROLLED="no"
USERCTL="no"
BOOTPROTO="none"
BONDING_OPTS="mode=4 miimon=100 xmit_hash_policy=layer3+4"
TYPE="Ethernet"
MTU="9000"
/etc/sysconfig/network-scripts/ifcfg-bond0.1
Code:
DEVICE="bond0.1"
VLAN="yes"
BOOTPROTO="none"
NM_CONTROLLED="no"
BRIDGE="Public"
MTU="1500"
/etc/sysconfig/network-scripts/ifcfg-Public
Code:
TYPE="Bridge"
NM_CONTROLLED="no"
BOOTPROTO="none"
DEVICE="Public"
ONBOOT="yes"
IPADDR="192.168.0.9"
NETMASK="255.255.255.0"
/etc/sysconfig/network-scripts/ifcfg-bond0.10
Code:
DEVICE="bond0.10"
VLAN="yes"
BOOTPROTO="none"
NM_CONTROLLED="no"
BRIDGE="Jumbo_iSCSI_1"
MTU="9000"
/etc/sysconfig/network-scripts/ifcfg-Jumbo_iSCSI_1
Code:
TYPE="Bridge"
NM_CONTROLLED="no"
BOOTPROTO="none"
DEVICE="Jumbo_iSCSI_1"
ONBOOT="yes"
IPADDR="172.16.10.10"
NETMASK="255.255.255.0"
/etc/sysconfig/network-scripts/ifcfg-bond0.11
Code:
DEVICE="bond0.11"
VLAN="yes"
BOOTPROTO="none"
NM_CONTROLLED="no"
BRIDGE="Jumbo_iSCSI_2"
MTU="9000"
/etc/sysconfig/network-scripts/ifcfg-Jumbo_iSCSI_2
Code:
TYPE="Bridge"
NM_CONTROLLED="no"
BOOTPROTO="none"
DEVICE="Jumbo_iSCSI_2"
ONBOOT="yes"
IPADDR="172.16.11.10"
NETMASK="255.255.255.0"
# chkconfig NetworkManager off
# chkconfig network on
# service NetworkManager stop
# service network restart
Note: you can't do this over SSH, of course, since restarting the network cuts your connection.
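Once the network is back up, it's worth confirming that the bond really came up in 802.3ad mode and that jumbo frames make it all the way across to the storage (8972 bytes of payload plus 28 bytes of IP/ICMP headers equals the 9000 byte MTU):
# cat /proc/net/bonding/bond0
# ping -M do -s 8972 -c 3 172.16.10.11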
iSCSI initiator configuration:
# yum install -y iscsi-initiator-utils device-mapper-multipath
# iscsiadm -m discovery -t sendtargets -p 172.16.10.11:3260
Code:
172.16.10.11:3260,-1 iqn.2014-06.bar.foo:storage.foo.bar:hypervisor-1
# iscsiadm -m discovery -t sendtargets -p 172.16.11.11:3261
Code:
172.16.11.11:3261,-1 iqn.2014-06.bar.foo:storage.foo.bar:hypervisor-1
# iscsiadm -m node --targetname "iqn.2014-06.bar.foo:storage.foo.bar:hypervisor-1" --portal "172.16.10.11:3260" --login
# iscsiadm -m node --targetname "iqn.2014-06.bar.foo:storage.foo.bar:hypervisor-1" --portal "172.16.11.11:3261" --login
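To confirm that both sessions are established, and to make sure they come back by themselves after a reboot (the CentOS open-iscsi default is usually already automatic, so the update lines may well be redundant):
# iscsiadm -m session
# iscsiadm -m node -T "iqn.2014-06.bar.foo:storage.foo.bar:hypervisor-1" -p "172.16.10.11:3260" --op update -n node.startup -v automatic
# iscsiadm -m node -T "iqn.2014-06.bar.foo:storage.foo.bar:hypervisor-1" -p "172.16.11.11:3261" --op update -n node.startup -v automatic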
MPIO config:
/dev/sda here is my CF boot/root device, so the iSCSI volumes came in as /dev/sdb and /dev/sdc. To verify this, run:
# tail /var/log/messages
Then grab the WWID that multipath.conf needs:
# scsi_id -g -u /dev/sdb
Code:
1FREEBSD_MYDEVID_0
/etc/multipath.conf
Code:
blacklist {
    devnode "sda"
}

defaults {
    user_friendly_names yes
}

multipaths {
    multipath {
        wwid "1FREEBSD_MYDEVID_0"
        alias hypervisor-1
        path_grouping_policy multibus
        path_selector "round-robin 0"
        no_path_retry 5
    }
}
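Also make sure multipathd is running and starts at boot, or the multipath map will never show up; on CentOS 6 that would be:
# chkconfig multipathd on
# service multipathd start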
Check multipath status:
# multipath -ll
Code:
hypervisor-1 (1FREEBSD_MYDEVID_0) dm-2 FREEBSD,CTLDISK
size=500G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 5:0:0:0 sdc 8:32 active ready running
  `- 4:0:0:0 sdb 8:16 active ready running
Using fdisk to partition /dev/mapper/hypervisor-1, it's basically just "n, [enter], [enter], [enter], [enter], w". Done! Then create a filesystem in the new partition (note the p1 suffix; the filesystem goes into the partition, not the whole device):
# mkfs.ext4 /dev/mapper/hypervisor-1p1
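If you want it mounted automatically at boot, remember the _netdev mount option so the mount waits for networking and iSCSI; a hypothetical /etc/fstab line (the mount point is just an example):
Code:
/dev/mapper/hypervisor-1p1  /srv/vmstore  ext4  _netdev,defaults  0  0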
Then you are free to mount that babe anywhere you want and use it for whatever. I'm using it as a VM store, and these numbers are from a virtual FreeBSD server with an 80 GB UFS drive running bonnie++:
Code:
Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
virtual.machine 32G   754  99 178318  47 62653  34  1346  98 196696  37 131.0   6
Latency             19592us    926ms   501ms   36170us     355ms     437ms
Version  1.97       ------Sequential Create------ --------Random Create--------
virtual.machine     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 27388  67 +++++ +++ +++++ +++ 18330  46 +++++ +++ +++++ +++
Latency               100ms    3078us    3026us     167ms   34878us     113us
Read 'em and weep! Except, for the purpose of just measuring the network throughput, I had set:
# zfs set sync=disabled pool1/hypervisor_1
just to show its potential.

Last words: the caveat
For reasons unknown to me (please chip in if you know), watching the physical eth0 and eth1 in CentOS with the iftop utility, both iSCSI addresses go out through the same eth0 interface, in which case throughput is capped at 1 Gb/s. Rebooting the hypervisor and looking again, both addresses go out through eth1 instead. Keep rebooting and soon enough you will see address 172.16.11.10 going out through eth0 and 172.16.10.10 through eth1 (or vice versa), and then you get 2 Gb/s throughput. No idea why that occurs, and as I said, if you know, do share.
/Sebulon