advice for big ZFS Supermicro iscsi storage - around 90 SAS2 hdds

Hello to everyone,

We're just a few steps away from signing a contract for purchasing the two configurations below. The budget has already been approved, so we can't add more components to these configurations, but we can still change specific components if needed. My request to you guys is to review the configurations and spot any misconfiguration or parts that will not perform well together.

First Supermicro server - used for the development environment and short-term backup. We have a 2TB Oracle application with many different development projects (between 15 and 20) running simultaneously, and we need a separate environment for each of them. We just snapshot and clone the main DB and then keep the cloned DB live until the end of the project. No deduplication will be used, and we don't want to use it because performance is important to us. We also keep snapshots of the main DB as a backup for no more than a month. All these snapshots will be sent to the second storage server, where they will be kept for no more than a year. On the second storage server performance is not important; we will use it only as an archive, instead of our tape library. Both servers will use iSCSI to present space to the virtual servers. Before the end of this year we may buy one more enclosure for the first server and more HDDs for the second.
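
To illustrate the workflow we have in mind (pool, dataset and snapshot names below are made up just for the example; nothing is final yet):

[CODE]# main Oracle DB lives on a zvol that istgt exports over iSCSI
zfs create -V 2T tank/oracle

# per-project environment: snapshot the main DB and clone it
zfs snapshot tank/oracle@proj-a
zfs clone tank/oracle@proj-a tank/oracle-proj-a

# monthly archiving: incremental send of the backup snapshots to the second server
zfs send -i tank/oracle@2013-02 tank/oracle@2013-03 | ssh archive-box zfs recv -d backup[/CODE]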

FreeBSD 9.1/10 will be used on both servers.

Here space is NOT important, performance IS important
Chassis
1x Chassis Supermicro 847E26-R1K28LPB, 1280W redundant power supply, 36x 3.5" SAS2/SATA3 hot-swappable drive bays, SAS2 dual expander, 7x hot-swappable cooling fans, 4U chassis
1x Motherboard Supermicro X9DRi-LN4F+, Intel C602 chipset, up to 768GB DDR3, 24x DDR3 DIMM slots, 4x PCI-E 3.0 x16, 1x PCI-E 3.0 x8, 1x PCI-E 3.0 x4 (in x8 slot), 4x Gigabit LAN (Intel i350), 8x SATA2 and 2x SATA3, RAID 0/1/5/10, IPMI 2.0 and KVM
2x CPU Intel Xeon E5-2650 (8 cores), 2.0GHz, 20MB L3 cache, s2011
2x Cooler Supermicro SNK-P0048P, 2U+ passive heatsink for Intel X9
8x RAM 16GB DDR3-1600 2Rx4 ECC REG RoHS
2x Cables CBL-0281L IPASS to IPASS SAS cable, 75cm, Pb-free
9x SAS2 HDD Seagate Cheetah 600GB, SAS2, 15K.7, 3.5 inch (formatted 512 bytes/sector)
2x for OS - SSD OCZ Vertex 4, 2.5", 128GB, VTX4-25SAT3-128G
1x for L2ARC - SSD OCZ Deneva 2 C Sync MLC, 2.5", 240GB, D2CSTK251M21-0240
1x HBA, LSI SAS2/SATA controller 9271-8i
First additional enclosure
1x Chassis Supermicro CSE-847E26-RJBOD1, 1400W redundant Gold Level PSU, 45x 3.5" hot-swap HDD bays (24 front + 21 rear), E26 (6Gb/s) expander, 7x 8cm hot-swap cooling fans
40x SAS2 HDD Seagate Cheetah 600GB, SAS2, 15K.7, 3.5 inch (formatted 512 bytes/sector)
2x Cables CBL-0166L SAS EL2/EL1 cascading cable (external), 68cm
1x HBA, LSI SAS 9207-8e
Second additional enclosure
1x Chassis Supermicro CSE-847E26-RJBOD1, 1400W redundant Gold Level PSU, 45x 3.5" hot-swap HDD bays (24 front + 21 rear), E26 (6Gb/s) expander, 7x 8cm hot-swap cooling fans
40x SAS2 HDD Seagate Cheetah 600GB, SAS2, 15K.7, 3.5 inch (formatted 512 bytes/sector)
2x Cables CBL-0166L SAS EL2/EL1 cascading cable (external), 68cm
1x HBA, LSI SAS 9207-8e

ZFS configuration - one pool of ZFS mirrors across all HDDs (39*2 HDDs in mirror pairs + 2 hot spares).
The 9 HDDs in the chassis will be used for our MS Exchange backup, and we want them in a separate pool.
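
Roughly, the pool creation we have in mind would look like this (device names are placeholders and only the first few mirror pairs are shown; the rest would be added the same way):

[CODE]# main pool: striped mirrors across the JBODs, plus two hot spares
zpool create tank \
    mirror da0 da1 \
    mirror da2 da3 \
    mirror da4 da5 \
    spare da78 da79
# further pairs: zpool add tank mirror da6 da7   (and so on, up to 39 pairs)

# separate pool on the 9 Cheetahs in the head chassis for the Exchange backups
# (final layout not decided; four mirror pairs + one spare shown only as an example)
zpool create exchange \
    mirror da80 da81 \
    mirror da82 da83 \
    mirror da84 da85 \
    mirror da86 da87 \
    spare da88[/CODE]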

Second Supermicro server - long-term backup. Here space IS important, performance is NOT important.

1x Chassis Supermicro 847E26-R1K28LPB, 1280W redundant power supply, 36x 3.5" SAS2/SATA3 hot-swappable drive bays, SAS2 dual expander, 7x hot-swappable cooling fans, 4U chassis
1x Motherboard Supermicro X9DRi-LN4F+, Intel C602 chipset, up to 768GB DDR3, 24x DDR3 DIMM slots, 4x PCI-E 3.0 x16, 1x PCI-E 3.0 x8, 1x PCI-E 3.0 x4 (in x8 slot), 4x Gigabit LAN (Intel i350), 8x SATA2 and 2x SATA3, RAID 0/1/5/10, IPMI 2.0 and KVM
2x CPU Intel Xeon E5-2650 (8 cores), 2.0GHz, 20MB L3 cache, s2011
2x Cooler Supermicro SNK-P0048P, 2U+ passive heatsink for Intel X9
8x RAM 16GB DDR3-1600 2Rx4 ECC REG RoHS
2x Cables CBL-0281L IPASS to IPASS SAS cable, 75cm, Pb-free
22x SATA HDD Hitachi Ultrastar A7K3000 2TB, HUA723020ALA640, 7200 RPM, SATA III 6Gbit/s (formatted 512 bytes/sector)
2x for OS - SSD OCZ Vertex 4, 2.5", 128GB, VTX4-25SAT3-128G
1x for L2ARC - SSD OCZ Deneva 2 C Sync MLC, 2.5", 240GB, D2CSTK251M21-0240
1x HBA, LSI SAS2/SATA controller 9271-8i

ZFS configuration - one pool with raidz (3*7 HDDs + 1 hot spare)
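
Again only a sketch, with placeholder device names:

[CODE]# archive pool: three 7-disk raidz vdevs plus one hot spare (22 drives total)
zpool create archive \
    raidz da0 da1 da2 da3 da4 da5 da6 \
    raidz da7 da8 da9 da10 da11 da12 da13 \
    raidz da14 da15 da16 da17 da18 da19 da20 \
    spare da21[/CODE]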


Regards: Ivan
 
Hi,

As far as I can see it should be OK, but I am not that good with hardware configurations :)
Maybe the others can provide a better opinion :)
 
@zag

Since the system is going to be used as a database store, I was thinking about IOPS and write latency. I took a quick look at the Wikipedia IOPS page, took the IOPS number for a 15k SAS drive and multiplied that by your number of disks: (39*2)*175 = 13,650 IOPS total for that system. Compare that to the specs of a "D2RSTK251E19-0200", which boasts 80,000 IOPS: your system could make good use of that kind of SLOG, preferably mirrored. And when it comes to comparing the write latency of an SSD to any normal HDD, the SSD wins by miles.

Remember to [CMD]zfs set sync=always foo/bar[/CMD], since istgt doesn't obey iSCSI SYNCHRONIZE CACHE commands.
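
Something along these lines, with made-up device and dataset names:

[CODE]# add a mirrored SLOG to the pool
zpool add tank log mirror da90 da91

# make sure the iSCSI-backed dataset honours synchronous semantics
zfs set sync=always tank/oracle[/CODE]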

/Sebulon
 
You want Intel DC S3700 series SSDs for low latency, durability and capacitor protection. The older 320 series is also very good. OCZ and/or SandForce simply do not belong in a datacenter.
 
Why are you using a 36-bay server for your head unit (first server)?

Using a 2U SuperMicro case (I forget the model number; it has 24x 2.5" bays along the front) is better. That way, you have just the motherboard, CPUs, RAM, SSDs, and HBAs installed in that case. Everything else is attached via external cables to the storage units (the JBODs). Also, why do you have extra drives plugged into this box (the 9 SAS Cheetahs)?

This is the setup we're using, although we're using SATA drives instead of SAS (it's just a backup server, so speed is less important than storage space).

Also, you don't mention what VM hypervisor you're using, or guest OSes, but you will definitely want to compare performance between:
  1. iSCSI export to the VM host, passed into the guest
  2. iSCSI export direct to the guest VM
  3. NFS export to the guest VM
Depending on the application, the last option should be much faster, and will provide a lot more flexibility on the storage end (separate ZFS filesystem for each VM, with full access to the filesystem from the storage box; much nicer for backups). However, it depends on what VM tech is being used, and what OS is in the guest.
Edit: Here's the head unit we're using: SuperMicro SC216E26-R1200LPB (24x 2.5" bays). It works especially well with motherboards that have onboard multi-lane SAS controllers, as those plug into the backplane in this server and leave the PCIe-based HBAs for the external enclosures. Thus, the "head unit" only has the OS and the L2ARC/SLOG devices installed, and no actual data storage; the external JBODs hold all the bulk data storage.
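
As a rough sketch of option 3 (dataset names and export options are just examples from our kind of setup):

[CODE]# one ZFS filesystem per VM, shared over NFS
zfs create tank/vms
zfs create tank/vms/vm01
zfs set sharenfs="-maproot=root -network 10.0.0.0/24" tank/vms/vm01

# backups on the storage box then become simple per-VM snapshots
zfs snapshot tank/vms/vm01@nightly[/CODE]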
 
waksmundzki said:
You want Intel DC S3700 series SSDs for low latency, durability and capacitor protection. The older 320 series is also very good. OCZ and/or SandForce simply do not belong in a datacenter.

Does that statement come from personal experience? As for us, we've been using a Deneva 2 R as SLOG and a Vertex 3 as L2ARC in production for over a year now, in two of our storage systems, without any issues whatsoever. Now, the DC S3700 is an outstanding SSD. Intel knows that, and charges you accordingly ;)

/Sebulon
 
phoenix said:
Why are you using a 36-bay server for your head unit (first server)?

Using a 2U SuperMicro case (I forget the model number; it has 24x 2.5" bays along the front) is better. That way, you have just the motherboard, CPUs, RAM, SSDs, and HBAs installed in that case. Everything else is attached via external cables to the storage units (the JBODs). Also, why do you have extra drives plugged into this box (the 9 SAS Cheetahs)?

This is the setup we're using, although we're using SATA drives instead of SAS (it's just a backup server, so speed is less important than storage space).

Also, you don't mention what VM hypervisor you're using, or guest OSes, but you will definitely want to compare performance between:
  1. iSCSI export to the VM host, passed into the guest
  2. iSCSI export direct to the guest VM
  3. NFS export to the guest VM
Depending on the application, the last option should be much faster, and will provide a lot more flexibility on the storage end (separate ZFS filesystem for each VM, with full access to the filesystem from the storage box; much nicer for backups). However, it depends on what VM tech is being used, and what OS is in the guest.
Edit: Here's the head unit we're using: SuperMicro SC216E26-R1200LPB (24x 2.5" bays). It works especially well with motherboards that have onboard multi-lane SAS controllers, as those plug into the backplane in this server and leave the PCIe-based HBAs for the external enclosures. Thus, the "head unit" only has the OS and the L2ARC/SLOG devices installed, and no actual data storage; the external JBODs hold all the bulk data storage.

On this storage we are not planning to have any hypervisors/virtual machines. We have other storage - EMC VNX - which holds the virtual machines (they run on VMware 5.1). The space from the Supermicro is planned to be used only as separate partitions on Windows servers for holding Oracle databases via iSCSI.

The 9 disks in the chassis will be used as our Exchange backup solution (5-10 snapshots per day, with a retention period of 30 days to a year) and we want them to be separate from the big pool used for the databases. We are expecting huge growth in Exchange data in the following years, and that's why we prefer to have more available bays in order to expand.
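
The snapshot rotation itself is simple enough to script from cron, something like this (dataset name is only an example):

[CODE]# taken a few times per day
zfs snapshot exchange/mailstore@`date +%Y%m%d-%H%M`

# snapshots that have aged out of the retention window get destroyed
zfs destroy exchange/mailstore@20130101-0900[/CODE]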


Thanks to everyone for their opinions.
Regards: Ivan
 
Sebulon said:
Does that statement come from personal experience? As for us, we've been using a Deneva 2 R as SLOG and a Vertex 3 as L2ARC in production for over a year now, in two of our storage systems, without any issues whatsoever.

My mysqld slave boxes would get data corruption on OCZ/SandForce after about 7-9 months of heavy usage. After the fifth failed drive the ISP just installed Intels. Under lighter load (nginx/php-fpm) I had situations where OCZ/SandForce drives would simply disappear and the server required a cold reboot for them to come back. My point is simply that the downtime and admin time needed to restore services are not worth the potential savings of cheap SSDs.
 
waksmundzki said:
My mysqld slave boxes would get data corruption on OCZ/SandForce after about 7-9 months of heavy usage. After the fifth failed drive the ISP just installed Intels. Under lighter load (nginx/php-fpm) I had situations where OCZ/SandForce drives would simply disappear and the server required a cold reboot for them to come back. My point is simply that the downtime and admin time needed to restore services are not worth the potential savings of cheap SSDs.

I don't blame you after having that kind of experience. It's a shame that our experiences with OCZ SSDs are so different. I remember a Vertex 3 also just "dropping out" like that, before we put the system into production. I fixed it with a firmware update and it has been a trustworthy servant ever since :) I've also read about a rather nasty firmware bug affecting some 320s, making them suddenly snap down to 8MB total in size. I'm guessing you've stayed clear of that one.

/Sebulon
 