Gmirror read speed issue

Good day!

I have a system with four 1 TB HDDs, and I decided to mirror its partitions (like there) under a freshly installed FreeBSD 9.0-RC3. After that was done I benchmarked the read speed. Now I wonder: why is the read speed from four mirrored disks not even close to a 4x multiple?

Configuration:

[cmd=]gmirror status[/cmd]
Code:
gingema# gmirror status
            Name    Status  Components
  mirror/gm-boot  COMPLETE  ada0p1 (ACTIVE)
                            ada1p1 (ACTIVE)
                            ada2p1 (ACTIVE)
                            ada3p1 (ACTIVE)
mirror/gm-rootfs  COMPLETE  ada0p2 (ACTIVE)
                            ada1p2 (ACTIVE)
                            ada2p2 (ACTIVE)
                            ada3p2 (ACTIVE)
  mirror/gm-swap  COMPLETE  ada0p3 (ACTIVE)
                            ada1p3 (ACTIVE)
                            ada2p3 (ACTIVE)
                            ada3p3 (ACTIVE)
gingema#
[cmd=]gpart show[/cmd]
Code:
=>        34  1953525101  ada0  GPT  (931G)
          34         128     1  freebsd-boot  (64k)
         162    83886080     2  freebsd-ufs  (40G)
    83886242     2097152     3  freebsd-swap  (1.0G)
    85983394  1867541741        - free -  (890G)

=>        34  1953525101  ada1  GPT  (931G)
          34         128     1  freebsd-boot  (64k)
         162    83886080     2  freebsd-ufs  (40G)
    83886242     2097152     3  freebsd-swap  (1.0G)
    85983394  1867541741        - free -  (890G)

=>        34  1953525101  ada2  GPT  (931G)
          34         128     1  freebsd-boot  (64k)
         162    83886080     2  freebsd-ufs  (40G)
    83886242     2097152     3  freebsd-swap  (1.0G)
    85983394  1867541741        - free -  (890G)

=>        34  1953525101  ada3  GPT  (931G)
          34         128     1  freebsd-boot  (64k)
         162    83886080     2  freebsd-ufs  (40G)
    83886242     2097152     3  freebsd-swap  (1.0G)
    85983394  1867541741        - free -  (890G)

Benchmark:
  • Gmirror round-robin algorithm
    # gmirror configure -b round-robin gm-rootfs
    # dd if=/dev/mirror/gm-rootfs of=/dev/null bs=2M count=2000
    Code:
    4194304000 bytes transferred in 31.368543 secs (133710514 bytes/sec)

    # gstat -f 'ada.p2' -I 10s
    Code:
     L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
        1    258    257  32832    0.7      2     42    0.3   17.1| ada0p2
        0    258    257  32832    0.7      2     42    0.3   17.8| ada1p2
        0    258    257  32832    0.7      2     42    0.3   19.0| ada2p2
        0    258    257  32832    1.1      2     42    0.5   27.1| ada3p2

  • Gmirror split algorithm (block size 2048 bytes)
    # gmirror configure -b split -s 2048 gm-rootfs
    # dd if=/dev/mirror/gm-rootfs of=/dev/null bs=2M count=2000
    Code:
    4194304000 bytes transferred in 36.875156 secs (113743356 bytes/sec)

    # gstat -f 'ada.p2' -I 10s
    Code:
     L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
        0    875    874  27961    0.5      1     38    0.3   41.1| ada0p2
        1    875    874  27958    0.5      1     38    0.4   40.0| ada1p2
        0    875    874  27961    0.5      1     38    0.6   42.7| ada2p2
        0    875    874  27961    0.5      1     38    0.3   41.5| ada3p2

  • Gmirror split algorithm (default block size 4096 bytes)
    # gmirror configure -b split -s 4096 gm-rootfs
    # dd if=/dev/mirror/gm-rootfs of=/dev/null bs=2M count=2000
    Code:
    4194304000 bytes transferred in 36.932793 secs (113565849 bytes/sec)

    # gstat -f 'ada.p2' -I 10s
    Code:
     L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
        0    886    886  28344    0.5      0      0    0.0   42.2| ada0p2
        1    886    886  28341    0.5      0      0    0.0   40.8| ada1p2
        0    886    886  28344    0.5      0      0    0.0   41.0| ada2p2
        0    886    886  28344    0.5      0      0    0.0   43.5| ada3p2

  • Gmirror split algorithm (block size 16384 bytes)
    # gmirror configure -b split -s 16384 gm-rootfs
    # dd if=/dev/mirror/gm-rootfs of=/dev/null bs=2M count=2000
    Code:
    4194304000 bytes transferred in 36.476575 secs (114986235 bytes/sec)

    # gstat -f 'ada.p2' -I 10s
    Code:
     L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
        0    888    888  28427    0.5      0      0    0.0   43.0| ada0p2
        0    888    888  28427    0.5      0      0    0.0   41.5| ada1p2
        0    888    888  28427    0.5      0      0    0.0   41.3| ada2p2
        1    888    888  28427    0.5      0      0    0.0   41.2| ada3p2

  • Gmirror load algorithm
    # gmirror configure -b load gm-rootfs
    # dd if=/dev/mirror/gm-rootfs of=/dev/null bs=2M count=2000
    Code:
    4194304000 bytes transferred in 30.788696 secs (136228700 bytes/sec)

    # gstat -f 'ada.p2' -I 10s
    Code:
     L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
        0      1      0      0    0.0      1     42    0.3    0.0| ada0p2
        1   1058   1056 135175    0.8      1     42    0.3   87.1| ada1p2
        0      1      0      0    0.0      1     42    0.3    0.0| ada2p2
        0      1      0      0    0.0      1     42    0.3    0.0| ada3p2

So can anybody explain this behaviour?
 
Why do you expect the read speed to increase fourfold?

Mirroring does not improve read or write speeds; if anything, it slows them down.
 
Mirroring (RAID 1) is for redundancy; striping (RAID 0) is for speed. The balancing algorithm for gmirror(8) can affect read speeds: round-robin just reads from each drive in turn, while load or split could give an improvement over single-drive speed.
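For instance, switching the algorithm on a live mirror and checking that it took effect (using the gm-rootfs mirror from the setup above):
Code:
# gmirror configure -b load gm-rootfs
# gmirror list gm-rootfs | grep Balance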
 
Tests with dd measure linear speed in a single stream. For modern disks, linear speed is a function of the rotation speed and the linear data density: you can't read data faster than the platter turns. If you use the "split" method to make each disk handle only part of each request, each disk still has to wait for the platter to pass over the areas read by the other three disks. What you can get in this setup is 4x read speed if you run four dd processes at the same time, as sketched below. The gmirror "load" algorithm is well optimized for such multiple concurrent reads.
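For example (skip= is counted in units of bs, so the four streams read non-overlapping 1 GB regions; with the load algorithm each stream should end up served by its own disk):
Code:
# gmirror configure -b load gm-rootfs
# dd if=/dev/mirror/gm-rootfs of=/dev/null bs=2M count=500 &
# dd if=/dev/mirror/gm-rootfs of=/dev/null bs=2M count=500 skip=500 &
# dd if=/dev/mirror/gm-rootfs of=/dev/null bs=2M count=500 skip=1000 &
# dd if=/dev/mirror/gm-rootfs of=/dev/null bs=2M count=500 skip=1500 &
# wait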

A simple real-life example: nine women won't make a child in one month, but after nine months you may have nine of them.

If you don't need 4x redundancy, you may want to set up a combination of gstripe and two gmirrors to create RAID 10, as sketched below. It will give you 2x linear speed in one stream and 4x in two streams, at the cost of dropping to 2x redundancy. The only limitation is that you can't boot from that volume, only use it for data.
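A minimal sketch of that layout, assuming you carve a hypothetical fourth partition out of the ~890 GB of free space on each disk (the names gm-data0, gm-data1 and st-data are arbitrary):
Code:
# gpart add -t freebsd-ufs ada0
# gpart add -t freebsd-ufs ada1
# gpart add -t freebsd-ufs ada2
# gpart add -t freebsd-ufs ada3
# gmirror label -v gm-data0 ada0p4 ada1p4
# gmirror label -v gm-data1 ada2p4 ada3p4
# gstripe label -v -s 131072 st-data /dev/mirror/gm-data0 /dev/mirror/gm-data1
# newfs -U /dev/stripe/st-data
Add geom_mirror_load="YES" and geom_stripe_load="YES" to /boot/loader.conf so both GEOM classes are available at boot.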

Also, if your SATA controller has one of the supported RAID BIOSes, you can set up RAID 10 there and use the new geom_raid module to handle it in FreeBSD. In that case you may even boot from the RAID 10 volume.
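For example, with an Intel-format RAID BIOS it might look like this (the array name "data" is arbitrary; see graid(8) for the metadata formats your controller supports):
Code:
# kldload geom_raid
# graid label Intel data RAID10 ada0 ada1 ada2 ada3
# graid status
# echo 'geom_raid_load="YES"' >> /boot/loader.conf
The array then appears as /dev/raid/r0 and can be partitioned like a plain disk.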
 
ZFS is a very special beast, mixing huge caches with its own implementation of RAID levels, tightly integrated into the file system.

In theory, for standard RAID 1, by building a custom kernel and configuring a huge read-ahead (more than four times the length of an HDD track) it is possible to reach a situation where a linear read is split between the drives, giving almost the full 4x speed in one stream. In practice, though, that is unreachable. To get there in practice, some fancy RAID levels exist with specially engineered data layouts that can give full 4x performance in a single-stream linear read, but they degrade badly with other access patterns or with degraded arrays.
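For what it's worth, the stock read-ahead knob can be raised at run time without a custom kernel, though it only affects clustered file system reads (not raw dd from the device) and won't get you anywhere near 4x. vfs.read_max is measured in file system blocks, and 128 here is just an illustration:
Code:
# sysctl vfs.read_max=128
# echo 'vfs.read_max=128' >> /etc/sysctl.conf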
 