Performance problems with Western Digital SAS drives

We are experiencing weird write performance problems with Western Digital 4 TB SAS drives (WD4001FYYG) under FreeBSD. Read performance is fine, but (sequential) write performance does not get above 10-14 MB/s, which is way below the expected 100-150 MB/s. This was tested by doing a dd with a block size of 1 MB to the disk device on an idle disk.
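
Roughly, the test looks like this (da2 stands for one of the WD SAS drives; the count is only there to bound the run so dd prints its own throughput figure):

dd if=/dev/zero of=/dev/da2 bs=1m count=4096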

Our setup is as follows:

FreeBSD 9.1
LSI 9211-4i controller
3x WD 4 TB SAS drives
2x Kingston V300 SSD drives (boot/OS partition)
SuperMicro X9DRI-F-B mainboard

The problem does not seem to be with the drives themselves as their write performance under Linux is fine (150 MB/s).

The latest drivers have been installed, and we performed the following additional tests to analyse the problem:
  • Different SAS disk: we tested a Seagate SAS disk; it performed to spec (write speeds ~150 MB/s)
  • Test with SATA WD disk: we tested a SATA Western Digital RE drive; it performs fine too (write speeds ~150 MB/s)
  • Different controller driver versions: we installed the latest version but also older versions; the problem persisted on the WD drives
  • Different controller: we tried a different LSI SAS controller; it worked fine with other drives but not with the WD drives
  • Different FreeBSD version: we installed FreeBSD 8, which did not make a difference
We also tested with just a single SAS disk connected, to rule out some sort of conflict.

Right now we are pretty much stumped as to why we get such dreadful performance. All the hardware manufacturers list everything as being compatible (controller, drives and OS). Does anyone here have an idea what might be going on or how this could be fixed?
 
This might be due to the drive using 4K clusters but lying to the OS about it. Try to align your partitions on 4K and see if that improves anything.
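
If you want to rule it out, something along these lines (da2 and the partition type are only examples, assuming a blank disk; adjust to your layout) creates a GPT partition aligned to 4K:

Code:
gpart create -s gpt da2
gpart add -t freebsd-zfs -a 4k da2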
 
SirDice said:
This might be due to the drive using 4K clusters but lying to the OS about it. Try to align your partitions on 4K and see if that improves anything.

Thanks for your reply! We have tried that as well (both 512/4k clusters) but that did not improve things.
 
Make sure the disks' cache settings are equal (enabled), for example via `camcontrol modepage`. With default kernel settings dd writes data in 128 KB blocks, and if the disk's write cache is disabled you get roughly one block per revolution: 128 KB * 7200 / 60 = 15360 KB/s ~= 15 MB/s. The other option is to queue more writes to the device simultaneously, as most filesystems do, but dd doesn't.
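
For example, to check the caching page on one of the drives (da2 is just an example device name):

camcontrol modepage da2 -m 0x08 | grep WCE

WCE: 0 means the drive's write cache is disabled.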
 
Polynidos said:
Thanks for your reply! We have tried that as well (both 512/4k clusters) but that did not improve things.

But cluster size is not alignment. It's possible to use 4K clusters and still be misaligned. Please show the output of `gpart show` for that drive.
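
As a quick check (da2 is only a guess at the device name): the first column of `gpart show` is the starting LBA of each partition, and with 512-byte sectors it has to be divisible by 8 to be 4K-aligned.

gpart show da2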
 
Thanks for the suggestions. I will ask my colleague to check those settings (I don't have access to the server right now). But for now a few comments/questions.

The drives seem to use 512-byte sectors, according to the WD product specs (http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-771386.pdf). BTW, we use the following command to test write speeds:

dd if=/dev/zero of=/dev/da2 bs=1m

I have tested a few concurrent dd processes but that does not improve write speed. However, using the same command on a Seagate SAS drive gives me proper write performance.

I appreciate your input and will try to get you the answers ASAP. However, if you write directly to the disk like that, will you not circumvent any cluster/alignment issues?
 
Yes, if writes start at the beginning of the disk, alignment should not be a problem. Some RAID controllers put metadata at the beginning of the disk, but usually of a size that won't interfere with alignment.

WD has had serious quality control issues for a while. Between that and this, I'll pick a different brand the next time I need disks.
 
Polynidos said:
BTW, we use the following command to test write speeds:

dd if=/dev/zero of=/dev/da2 bs=1m

Note that the kernel limits all I/O from user level to raw devices to MAXPHYS at a time, which is 128 KB by default. As a result, bs=1m will hardly differ from bs=128k. What you can try is bs=64k to see whether it affects performance: if the write cache is disabled, that should cut performance roughly in half; if the write cache is enabled, the difference should be minimal.
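
Something like this should make the difference visible (da2 as in your earlier test; the counts only bound the runs so both write the same amount):

Code:
dd if=/dev/zero of=/dev/da2 bs=64k count=8192
dd if=/dev/zero of=/dev/da2 bs=128k count=4096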

Polynidos said:
I have tested a few concurrent dd processes but that does not improve write speed.

Several concurrent dd processes will give you random access instead of linear, potentially even hurting performance instead of improving it.
 
mav@ said:
Make sure the disks' cache settings are equal (enabled), for example via `camcontrol modepage`. With default kernel settings dd writes data in 128 KB blocks, and if the disk's write cache is disabled you get roughly one block per revolution: 128 KB * 7200 / 60 = 15360 KB/s ~= 15 MB/s. The other option is to queue more writes to the device simultaneously, as most filesystems do, but dd doesn't.

Thanks! This suggestion helped us solve the problem. Since we were using WD RAID Edition disks, the write cache on the disk was disabled by default. The Seagate SAS disk we tested had write cache enabled. After enabling write cache on the WD disks the write performance increased to ~175 MB/s.

For the record, we turned on the write cache using the following command:

echo "WCE: 1" | camcontrol modepage da0 -m 0x08 -e

where da0 of course is the disk that we are enabling write cache on.
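
The change can be verified by reading the caching page back; it should now report WCE: 1:

camcontrol modepage da0 -m 0x08 | grep WCE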

Just another question about best practice. We are going to use ZFS with a SLOG (ZIL) on a supercapacitor-backed SSD, because we want to prevent data loss on a sudden power outage. Is it then better to disable the write cache (and can we then increase MAXPHYS for better write performance), or is it OK to enable the disk write cache?
 
ZFS actively uses cache flush commands when needed, so AFAIK disabling the cache is not required for data consistency, as long as the device implements flushing correctly.

ZFS issues many simultaneous read/write requests to hide the penalty of disabled caches and similar cases, so increasing MAXPHYS could slightly reduce system load but will probably not affect performance much. On the other hand, ZFS has its own 128 KB limit somewhere inside, so even if you increase MAXPHYS, ZFS (unlike UFS) will not use it automatically.
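
For reference, the separate log device you mention is added as its own vdev, e.g. (pool name and partition label are placeholders only):

zpool add tank log gpt/slog0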
 
Polynidos said:
::snip::

For the record, we turned on the write cache using the following command:

echo "WCE: 1" | camcontrol modepage da0 -m 0x08 -e

where da0 of course is the disk that we are enabling write cache on.

::snip::

I know this is an old thread, but I wanted to say thanks and confirm, for anyone searching on this topic, that this fixed the problem with my configuration. I was getting about 120 MB/s sequential read with 256k blocks with dd, but only about 25 MB/s write. This is on an IBM M1115 SAS controller crossflashed to an LSI 9211-8i with Fujitsu MBA3300RC drives, both through an expander and directly attached. Same situation: CentOS both read and wrote at 120 MB/s on the same hardware. The script below shows the WCE status for my SCSI drives, then enables it on all of them, and then shows the status again:

Code:
#!/usr/local/bin/bash

# Show the current WCE status for all da devices
for file in /dev/da? /dev/da??; do echo "$file"; camcontrol modepage "$file" -m 0x08 | grep WCE; done
# Enable the write cache on all of them
for file in /dev/da? /dev/da??; do echo "$file"; echo "WCE: 1" | camcontrol modepage "$file" -m 0x08 -e; done
# Show the status again to confirm the change
for file in /dev/da? /dev/da??; do echo "$file"; camcontrol modepage "$file" -m 0x08 | grep WCE; done
 