Extremely slow HDD write throughput

Hello, I am setting up FreeNAS (based on FreeBSD 11.1-STABLE) and have a problem with disk writes. It seems to be FreeBSD related, because everything looks normal if I use Ubuntu on the same hardware, and I can reproduce it with a FreeBSD live CD:
Code:
root@freenas:~ # dd if=/dev/zero of=/dev/da4 bs=1M
^C5111+0 records in
5110+0 records out
5358223360 bytes transferred in 369.763618 secs (14490943 bytes/sec)
root@freenas:~ # dd if=/dev/da4 of=/dev/null bs=1M
^C125744+0 records in
125744+0 records out
131852140544 bytes transferred in 670.443633 secs (196664021 bytes/sec)


ubuntu@ubuntu:~$ sudo dd if=/dev/zero of=/dev/sde bs=1M count=10K
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB, 10 GiB) copied, 63.386 s, 169 MB/s
ubuntu@ubuntu:~$ sudo dd if=/dev/sde of=/dev/null bs=1M count=10K
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB, 10 GiB) copied, 55.7659 s, 193 MB/s

diskinfo looks fine, but this might only be doing a read test?
Code:
root@freenas:~ # diskinfo -t /dev/da4
/dev/da4
        512             # sectorsize
        4000787030016   # mediasize in bytes (3.6T)
        7814037168      # mediasize in sectors
        4096            # stripesize
        0               # stripeoffset
        486401          # Cylinders according to firmware.
        255             # Heads according to firmware.
        63              # Sectors according to firmware.
        HGST H7240B520SUN4.0T   # Disk descr.
        001502M0PSKL        NHG0PSKL    # Disk ident.
        id1,enc@n50050cc10203efb6/type@0/slot@c # Physical path
        Not_Zoned       # Zone Mode

Seek times:
        Full stroke:      250 iter in   4.790527 sec =   19.162 msec
        Half stroke:      250 iter in   3.465465 sec =   13.862 msec
        Quarter stroke:   500 iter in   4.048862 sec =    8.098 msec
        Short forward:    400 iter in   0.588699 sec =    1.472 msec
        Short backward:   400 iter in   1.981070 sec =    4.953 msec
        Seq outer:       2048 iter in   0.121217 sec =    0.059 msec
        Seq inner:       2048 iter in   0.421541 sec =    0.206 msec

Transfer rates:
        outside:       102400 kbytes in   0.573899 sec =   178429 kbytes/sec
        middle:        102400 kbytes in   0.591255 sec =   173191 kbytes/sec
        inside:        102400 kbytes in   1.113098 sec =    91995 kbytes/sec

And the SMART info:
Code:
root@freenas:~ # smartctl -x /dev/da4
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               HGST
Product:              H7240B520SUN4.0T
Revision:             M54J
Compliance:           SPC-4
User Capacity:        4,000,787,030,016 bytes [4.00 TB]
Logical block size:   512 bytes
Physical block size:  4096 bytes
Formatted with type 1 protection
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000cca2430146bc
Serial number:        001502M0PSKL        NHG0PSKL
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Thu Sep 27 12:19:39 2018 PDT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled
Read Cache is:        Enabled
Writeback Cache is:   Disabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     36 C
Drive Trip Temperature:        85 C

Manufactured in week 02 of year 2015
Specified cycle count over device lifetime:  50000
Accumulated start-stop cycles:  8
Specified load-unload count over device lifetime:  600000
Accumulated load-unload cycles:  13
Elements in grown defect list: 0

Vendor (Seagate) cache information
  Blocks sent to initiator = 2059298825830400

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0       3128      13801.399           0
write:         0        0         0         0      11284      20562.991           0
verify:        0        0         0         0       1164          0.000           0

Non-medium error count:        0

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Completed                   -      99                 - [-   -    -]
# 2  Background short  Completed                   -      90                 - [-   -    -]
# 3  Background short  Completed                   -      88                 - [-   -    -]

Long (extended) Self Test duration: 34237 seconds [570.6 minutes]

Background scan results log
  Status: waiting until BMS interval timer expires
    Accumulated power on time, hours:minutes 454:31 [27271 minutes]
    Number of background scans performed: 3,  scan progress: 0.00%
    Number of background medium scans performed: 3

Protocol Specific port log page for SAS SSP
relative target port id = 1
  generation code = 1
  number of phys = 1
  phy identifier = 0
    attached device type: expander device
    attached reason: power on
    reason: unknown
    negotiated logical link rate: phy enabled; 6 Gbps
    attached initiator port: ssp=0 stp=0 smp=1
    attached target port: ssp=0 stp=0 smp=1
    SAS address = 0x5000cca2430146bd
    attached SAS address = 0x50050cc10b158ebf
    attached phy identifier = 14
    Invalid DWORD count = 0
    Running disparity error count = 0
    Loss of DWORD synchronization = 0
    Phy reset problem = 0
    Phy event descriptors:
     Invalid word count: 0
     Running disparity error count: 0
     Loss of dword synchronization count: 0
     Phy reset problem count: 0
relative target port id = 2
  generation code = 1
  number of phys = 1
  phy identifier = 1
    attached device type: no device attached
    attached reason: unknown
    reason: power on
    negotiated logical link rate: phy enabled; unknown
    attached initiator port: ssp=0 stp=0 smp=0
    attached target port: ssp=0 stp=0 smp=0
    SAS address = 0x5000cca2430146be
    attached SAS address = 0x0
    attached phy identifier = 0
    Invalid DWORD count = 0
    Running disparity error count = 0
    Loss of DWORD synchronization = 0
    Phy reset problem = 0
    Phy event descriptors:
     Invalid word count: 0
     Running disparity error count: 0
     Loss of dword synchronization count: 0
     Phy reset problem count: 0

My hardware:
Dell R620 as head unit:
Dual E5-2690 v2
128GB RAM
H710P
LSI 9207-8e
Dell network card with 2x X540 and 2x i350
1x Intel S3500 80GB as boot drive
NetApp DS4243 as DAS:
24x 3.5" bays
Swapped in an HB-SBB2-E601-COMP IO module because of the price of MiniSAS-to-QSFP cables
12x 4TB HGST NL-SAS drives (HUS726040AL5210)
I am pretty naive with, let's say, all Unix systems, but this still seems strange. Any idea how to troubleshoot this?
 
Strange. Makes no sense. While I don't have LSI SAS controllers on my FreeBSD system at home, I know that they are capable of way more than the 14 MB/s you're seeing, as is FreeBSD itself (I get >100 MB/s with the on-board SATA controller).

Suggestion: The likely culprit is high up in the chain. Since you are willing to overwrite your disks anyway, just correctly partition one of them, configure a file system on it (UFS or ZFS, your choice), and run the read/write test against a file system (again, big files, large IOs, using dd as you were doing). You will have to be careful to measure the raw disk speed and not the buffer cache (using the right combination of mount and sync). If everything runs at a reasonable speed, the problem was somewhere on the dd end.
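Roughly something like this (untested, the device and partition names are just an example, and it of course destroys whatever is on da4):
Code:
gpart destroy -F da4                  # wipe any existing partition table (harmless error if blank)
gpart create -s gpt da4
gpart add -t freebsd-ufs -a 1m da4
newfs -U /dev/da4p1
mount /dev/da4p1 /mnt
# write a big file, let umount flush it, and time the whole thing
/usr/bin/time sh -c 'dd if=/dev/zero of=/mnt/bigfile bs=1M count=20480 && umount /mnt'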

No, follow leebrown66's advice first.
 
Strange. Makes no sense. While I don't have LSI SAS controllers on my FreeBSD system at home, I know that they are capable of way more than the 14 MB/s you're seeing, as is FreeBSD itself (I get >100 MB/s with the on-board SATA controller).

Suggestion: The likely culprit is high up in the chain. Since you are willing to overwrite your disks anyway, just correctly partition one of them, configure a file system on it (UFS or ZFS, your choice), and run the read/write test against a file system (again, big files, large IOs, using dd as you were doing). You will have to be careful to measure the raw disk speed and not the buffer cache (using the right combination of mount and sync). If everything runs at a reasonable speed, the problem was somewhere on the dd end.
Hi, thanks for the reply. Could you elaborate on "measure the raw disk speed and not the buffer cache (using the right combination of mount and sync)"? I really know only a little about FreeBSD.
 
It's a setting in the LSI firmware, but I can't for the life of me remember what, maybe write caching?
I had the same problem, 20/200 MB/s write/read, with an LSI SAS9300-4i4e. Sorry I can't be more specific; that machine is in production now and I can't reboot it to look at the settings.
 
It's a setting in the LSI firmware, but I can't for the life of me remember what, maybe write caching?
I had the same problem, 20/200 MB/s write/read, with an LSI SAS9300-4i4e. Sorry I can't be more specific; that machine is in production now and I can't reboot it to look at the settings.
Great to hear that somebody has had the same problem before, although my Dell server has some problem letting me into the LSI BIOS.
I managed to remove the H710P and get into the 9207's configuration utility. Do any of these look familiar?
main.jpg
globalsetting.jpg
controller.jpg
advanced1.jpg
advanced2.jpg
 
OK, I am thinking that this may be related to either disk cache or something called command queuing: https://lists.freebsd.org/pipermail/freebsd-scsi/2014-September/006487.html
Especially this:
Code:
Disabling caches globally heavily
affects performance, and in most cases is overkill. It means that
_every_ request will go to the media before the operation complete.
For disks without command queuing (like legacy ATA) that usually meant
_one_ I/O request per platter revolution. Do you want disk doing 120
IOPS peak? If you write huge file in 128K chunks, you will get limited
by 120/s * 128K = 15MB/s! Command queuing (NCQ for SATA) significantly
improved this situation since OS can now send more operations down to
the disk to give it more flexibility, in significant part compensating
disabled cache. But number of simultaneously active tags in disk, HBA
or application can be limited, creating delays.

Now when I do the dd test, gstat gives me:
Code:
dT: 1.064s  w: 1.000s
L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    1    111      0      0    0.0    111  14195    8.9   98.9| da3
The write IOPS top out at ~112 during the write test, which matches the description above. Now I need a way to verify/change the on-disk cache / command queuing settings...
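For reference, that snapshot is just gstat filtered to the disk under test while dd runs in another terminal, something like (with the regex adjusted to the right disk):
Code:
gstat -f '^da3$' -I 1s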

Update: I think the tags are set to the maximum, but maybe somehow dd only uses a queue depth of one (I am assuming the L(q) in the gstat output means outstanding commands)?

Code:
root@freenas:~ # camcontrol tags /dev/da3 -v
(pass3:mps0:0:9:0): dev_openings  255
(pass3:mps0:0:9:0): dev_active    0
(pass3:mps0:0:9:0): allocated     0
(pass3:mps0:0:9:0): queued        0
(pass3:mps0:0:9:0): held          0
(pass3:mps0:0:9:0): mintags       2
(pass3:mps0:0:9:0): maxtags       255

I also tried running three dd processes at the same time with tmux, and was able to push the IOPS to ~500 and the throughput to ~50MB/s. It may be either dd or something in FreeBSD limiting the queue depth to 1 per process.
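Instead of tmux this can also be scripted; a rough sketch, with the 1MB-block offsets picked arbitrarily so the three writers don't overlap:
Code:
# three concurrent 2GB writers to different regions of da3
for off in 0 4096 8192; do
    dd if=/dev/zero of=/dev/da3 bs=1M seek=${off} count=2048 &
done
wait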
 
Sorry, no. Is that the latest firmware (2015.08.03 seems old)?

Oh hang on, you say it's a H710, not an actual LSI card. Dell have their own "special" firmware on those cards which usually have reduced functionality.

You could try to flash it with the LSI firmware (make sure to save the original), but I can't help with that. The freebsd-hardware list might be a better place to ask.

I chickened out and got a bona fide LSI card when I purchased my last Dell, on the recommendation from that list.
 
Sorry, no. Is that the latest firmware (2015.08.03 seems old)?

Oh hang on, you say it's a H710, not an actual LSI card. Dell have their own "special" firmware on those cards which usually have reduced functionality.

You could try to flash it with the LSI firmware (make sure to save the original), but I can't help with that. The freebsd-hardware list might be a better place to ask.

I chickened out and got a bona fide LSI card when I purchased my last Dell, on the recommendation from that list.

OK, let me clarify: there is an H710P connected to the R620's internal backplane, and an LSI 9207-8e connected to the DAS, where all the HDDs are attached. For some reason I cannot get into the LSI BIOS with the H710P installed. And yes, I believe it's the latest firmware.

See my last post for more progress. It's either the command queue depth or the on-disk cache that got messed up.


BTW, does every post have to be manually approved by mods? That's really cumbersome; maybe it's just because I am a new member?
 
Dell have their own "special" firmware on those cards which usually have reduced functionality.

You could try to flash it with the LSI firmware (make sure to save the original), but I can't help with that.
You can try. The card might refuse the LSI firmware. It might also brick itself. Been there, done that, got the T-shirt. Fortunately, the place where I worked was a large LSI customer, and when we bricked cards in the prototype lab (by the dozen), we would bring them to LSI engineering (which is fortunately only 1/2 hour away by car), and they would fix them for us.

I'm still amazed that the card is capable of creating a factor 10 slowdown in writes. That takes real talent. Color me amazed.
 
Hi, I think I found one way of solving it: the on-disk write cache was disabled on these drives.
Code:
camcontrol modepage /dev/da4 -m 0x08 -e
and change WCE to 1 to enable the write cache. That brought the speed up to what I expected:
Code:
root@freenas:~ # dd if=/dev/zero of=/dev/da4 bs=1M count=1K
1024+0 records in
1024+0 records out
1073741824 bytes transferred in 5.725325 secs (187542510 bytes/sec)
root@freenas:~ # dd if=/dev/da4 of=/dev/null bs=1M count=1K
1024+0 records in
1024+0 records out
1073741824 bytes transferred in 5.690740 secs (188682301 bytes/sec)
root@freenas:~ # dd if=/dev/zero of=/dev/da3 bs=1M count=1K
1024+0 records in
1024+0 records out
1073741824 bytes transferred in 73.990152 secs (14511956 bytes/sec)
root@freenas:~ # dd if=/dev/da3 of=/dev/null bs=1M count=1K
1024+0 records in
1024+0 records out
1073741824 bytes transferred in 5.365380 secs (200124085 bytes/sec)
Note that the cache was enabled only on da4; da3 was left disabled.
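To read the setting back without going through the interactive editor, something like this should show the WCE bit:
Code:
camcontrol modepage da4 -m 0x08 | grep -i WCE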

Now the problem is: is this the right/safe solution? I would think the on-disk cache was disabled for a good reason, like preventing data loss on power outages, etc. I don't think these drives have supercapacitors to flush the cache in an emergency. Maybe I should turn it off and explore the command queuing approach instead?
 
I'm still amazed that the card is capable of creating a factor 10 slowdown in writes. That takes real talent. Color me amazed.
I was freaked out when I ran the first tests. Having persuaded my boss (who leans heavily toward Dell) and dropped $k's on an MD1420 (24-bay enclosure) and an R330 server to drive it, I knew it was a bit of a risk, but I was not looking forward to telling my boss we could have just bought a $100 SATA disk and done better.

I am pretty sure it was write-caching I had to turn on, so I can only assume it was writing single sectors and defeating the NCQ'ing.

Well, that puppy is still on 11.1, so it's up for an upgrade in the next couple of days. When I reboot it, I'll take pictures of the settings. Not that any of this helps the OP.
 
For large write IOs, the on-disk write cache should not make that big a difference. Writing a whole MB takes about 5-10 ms (at the roughly 100-200 MB/s bandwidth of the head). Right after the first 1MB IO finishes, dd turns around and issues another one. Without write cache in the drive, the second IO might be delayed by one rotation of the platter (because the IO stack took some time to finish the first IO and start the second one, and the platter might have rotated too far to start the second one in the meantime). That would add 8.3 ms (for a typical 7200 RPM drive) to the second IO while it waits for the platter to rotate back to the correct position. Compared to the 5-10 ms for the IO itself, the extra 8.3 ms might reduce the throughput to about half. But it cannot account for a factor of 10.

That is, unless someone in the IO stack is cutting the IOs into tiny pieces: Do the same math as above, but for 128 KB IOs for example, and add a 8.3ms penalty after each IO, and you get a factor of 10. Unfortunately, a misconfigured HBA (or HBA and OS kernel driver combination) is capable of cutting IOs into tiny pieces. In Linux, you can check and configure the maximum IO size (it's in /sys/block/... somewhere), and in FreeBSD there is a sysctl for that (that I can't remember). But it is insane to assume that a stock FreeBSD installation has that sysctl already misconfigured. One should check anyway; one rational explanation for the factor 10 slowdown is that the HBA is slicing big IOs into tiny pieces, and delaying each piece by one rotation.
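On the Linux side the knobs are, if I remember correctly, under /sys/block/<dev>/queue, for example:
Code:
cat /sys/block/sde/queue/max_hw_sectors_kb   # hardware limit, read-only
cat /sys/block/sde/queue/max_sectors_kb      # current limit, writable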
 
For large write IOs, the on-disk write cache should not make that big a difference. Writing a whole MB takes about 5-10 ms (at the roughly 100-200 MB/s bandwidth of the head). Right after the first 1MB IO finishes, dd turns around and issues another one. Without write cache in the drive, the second IO might be delayed by one rotation of the platter (because the IO stack took some time to finish the first IO and start the second one, and the platter might have rotated too far to start the second one in the meantime). That would add 8.3 ms (for a typical 7200 RPM drive) to the second IO while it waits for the platter to rotate back to the correct position. Compared to the 5-10 ms for the IO itself, the extra 8.3 ms might reduce the throughput to about half. But it cannot account for a factor of 10.

That is, unless someone in the IO stack is cutting the IOs into tiny pieces: Do the same math as above, but for 128 KB IOs for example, and add a 8.3ms penalty after each IO, and you get a factor of 10. Unfortunately, a misconfigured HBA (or HBA and OS kernel driver combination) is capable of cutting IOs into tiny pieces. In Linux, you can check and configure the maximum IO size (it's in /sys/block/... somewhere), and in FreeBSD there is a sysctl for that (that I can't remember). But it is insane to assume that a stock FreeBSD installation has that sysctl already misconfigured. One should check anyway; one rational explanation for the factor 10 slowdown is that the HBA is slicing big IOs into tiny pieces, and delaying each piece by one rotation.
You are absolutely right, the max IO was set to 128K:
Code:
root@freenas:~ # dd if=/dev/zero of=/dev/da3 bs=32K count=1K
1024+0 records in
1024+0 records out
33554432 bytes transferred in 8.636776 secs (3885064 bytes/sec)
root@freenas:~ # dd if=/dev/zero of=/dev/da3 bs=64K count=1K
1024+0 records in
1024+0 records out
67108864 bytes transferred in 8.969837 secs (7481614 bytes/sec)
root@freenas:~ # dd if=/dev/zero of=/dev/da3 bs=128K count=1K
1024+0 records in
1024+0 records out
134217728 bytes transferred in 9.347974 secs (14357949 bytes/sec)
root@freenas:~ # dd if=/dev/zero of=/dev/da3 bs=256K count=1K
1024+0 records in
1024+0 records out
268435456 bytes transferred in 18.637970 secs (14402612 bytes/sec)
root@freenas:~ # dd if=/dev/zero of=/dev/da3 bs=512K count=1K
1024+0 records in
1024+0 records out
536870912 bytes transferred in 37.292697 secs (14396141 bytes/sec)
Note that throughput scales with bs up to 128K and then flattens out.

I believe what you are referring to is MAXPHYS. But it looks like it's a compile-time setting, not a tunable. This discussion is about a year old, so things probably haven't changed:
http://freebsd.1045724.x6.nabble.com/Time-to-increase-MAXPHYS-td6189400.html
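So raising it on 11.x would apparently mean rebuilding the kernel with something like the line below in the kernel config; I have not tried this myself:
Code:
options MAXPHYS=(1024*1024)    # raise the maximum physical I/O size to 1MB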
 
Now the problem is: is this the right/safe solution? I would think the on-disk cache was disabled for a good reason, like preventing data loss on power outages, etc. I don't think these drives have supercapacitors to flush the cache in an emergency. Maybe I should turn it off and explore the command queuing approach instead?
I just checked my drives and they all have WCE turned on. I'm guessing that setting on my controller must be what it initializes the disks with.

I suppose it is a sane default, unless the disk manufacturer likes to make assumptions like (1) there is a UPS and (2) there is software to shut down all the drives when the UPS signals. That's what I do: unmount all the filesystems and issue a camcontrol stop to everything before issuing a shutdown now. I couldn't figure out how to issue a shutdown command to the enclosure itself; there isn't even a power button on the thing.
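In shell terms the shutdown sequence is roughly this (the disk list is obviously site-specific):
Code:
umount -a -t ufs                 # unmount all the filesystems
for d in da15 da14; do           # spin down every disk in the enclosure
    camcontrol stop $d
done
shutdown -p now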
 
I just checked my drives and they all have WCE turned on. I'm guessing that setting on my controller must be what it initializes the disks with.

I suppose it is a sane default, unless the disk manufacturer likes to make assumptions like (1) there is a UPS and (2) there is software to shut down all the drives when the UPS signals. That's what I do: unmount all the filesystems and issue a camcontrol stop to everything before issuing a shutdown now. I couldn't figure out how to issue a shutdown command to the enclosure itself; there isn't even a power button on the thing.
Yeah, I suppose that's what happened. I discovered that the WCE setting gets reset every time the drives are power cycled, and then performance tanks. I would love to see the settings on your HBA. Though a cron job applying the WCE setting every 30 minutes would also do the job, I would prefer the more elegant solution of having FreeBSD set it correctly and automatically.
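The cron variant would be roughly this in /etc/crontab (the disk names are just placeholders for my drives):
Code:
*/30  *  *  *  *  root  for d in da3 da4; do echo "WCE: 1" | /sbin/camcontrol modepage $d -m 0x08 -e; done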

I did a little testing and found out that ZFS is aware of the on-disk cache. It flushes the cache at least on sync writes, and I can only imagine it also does so after each transaction group. That's good enough for me. https://forums.freenas.org/index.php?threads/on-disk-cache-and-zfs-performance.70267/
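The test was roughly along these lines (pool name and sizes are just illustrative); with WCE enabled on the disk, the sync writes still ended up on stable storage at honest speeds, i.e. ZFS was issuing cache flushes:
Code:
zpool create testpool da4
zfs set sync=always testpool          # force every write to be a sync write
dd if=/dev/zero of=/testpool/syncfile bs=1M count=1024
zpool destroy testpool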
 
Would love to see the settings on your HBA
I should be able to do that tomorrow after I tear down all my VMs.

Though a cron job applying the WCE setting every 30 minutes would also do the job, I would prefer the more elegant solution of having FreeBSD set it correctly and automatically.
I am going to add this to my /etc/rc.d so I can just shut down when the UPS signals. I added a variable to set WCE on startup.

Code:
#!/bin/sh

# PROVIDE: enclosure
# REQUIRE: disks
# BEFORE: fsck
# KEYWORD: nojail

. /etc/rc.subr

name="enclosure"
desc="Enclosure"
rcvar="enclosure_enable"
start_cmd="enclosure_start"
stop_cmd="enclosure_stop"
camcontrol_cmd="/sbin/camcontrol"

enclosure_start()
{
    for disk in ${enclosure_disks}; do
        echo "Starting ${disk}"
        ${camcontrol_cmd} start ${disk}
        if [ "${enclosure_wce_enable}" = "YES" ]; then
            echo WCE: 1 | ${camcontrol_cmd} modepage ${disk} -m 0x08 -e
        fi
    done
}

enclosure_stop()
{
    for disk in ${enclosure_disks}; do
        echo "Stopping ${disk}"
        ${camcontrol_cmd} stop ${disk}
    done
}

load_rc_config ${name}

# Defaults, overridden by /etc/rc.conf
: ${enclosure_enable:="NO"}
: ${enclosure_disks:=""}
: ${enclosure_wce_enable:="NO"}

run_rc_command "$1"

/etc/rc.conf sample:
Code:
enclosure_enable="YES"
enclosure_disks="da15 da14"
enclosure_wce_enable="YES"
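Once the script is installed (e.g. as /etc/rc.d/enclosure with mode 555), it can be exercised by hand, something like:
Code:
service enclosure start
camcontrol modepage da15 -m 0x08 | grep -i WCE   # confirm WCE is now 1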
 
Here are the LSI and the PERC H330 hosting the OS. I see there's a setting for 'cache' on the PERC. I now remember I had to enable WCE for each drive, one at a time.
Thanks for the pics. Just to make sure: is the disk cache option on the H330 only? And does this setting affect WCE, or do you have to enable it in FreeBSD anyway?

It is still strange though that this problem only occurs on FreeBSD, and only on some drives.

BTW, what does camcontrol stop actually do? The manual says
Code:
Send the SCSI Start/Stop Unit (0x1B) command to the given device with the start bit cleared
which just restates the name of the command.
 
I reorganized that folder; it was confusing. There were both LSI and 330 pictures in there; they are labelled now.

The H330 only has one disk, the OS drive. The cache is disabled and it gets 20MB/sec writes. camcontrol won't talk to that disk, and mfiutil seems only able to control caching on a volume, which I'm not sure applies since I have it in "non-raid" mode. There's a "switch to HBA" option on the 330 that I noticed last night, so I'm now confused as to the difference between RAID, non-RAID and HBA. Performance on that disk isn't important to me as it's a single disk to boot the OS and there's not much activity after that.

Only the LSI can I control WCE from FreeBSD, which is where the 16 disks are connected to.

camcontrol stop seems to prevent any further commands from reaching the disk until a start is issued; I can't do a modepage, for example. However, it doesn't seem to prevent writing to the disk, so it seems pretty useless in retrospect.

Well I'm certainly more confused than when I started.
 
I reorganized that folder; it was confusing. There were both LSI and 330 pictures in there; they are labelled now.
Thanks, that's a lot clearer. So after you enable these cache options, you don't have to set WCE inside FreeBSD? I cannot find these options on mine; I guess they were introduced with SAS3 cards.

The H330 only has one disk, the OS drive. The cache is disabled and it gets 20MB/sec writes. camcontrol won't talk to that disk, and mfiutil seems only able to control caching on a volume, which I'm not sure applies since I have it in "non-raid" mode. There's a "switch to HBA" option on the 330 that I noticed last night, so I'm now confused as to the difference between RAID, non-RAID and HBA. Performance on that disk isn't important to me as it's a single disk to boot the OS and there's not much activity after that.
If it's the same as on 12th-gen Dell (my R620), then non-RAID can co-exist with RAID virtual disks, while "switch to HBA" I guess would make it just an HBA, presumably with IT firmware. I also found that in FreeBSD, non-RAID disks get attached via the mfi driver rather than the mrsas driver. camcontrol does not work with mfi devices, but you can get around that by using /dev/passX instead of /dev/mfi*; you can find the X with camcontrol devlist. Hope this helps.
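Something like this, assuming the pass-through devices show up for your controller (pass0 is just a placeholder):
Code:
camcontrol devlist -v                    # find which passX corresponds to the disk
camcontrol modepage pass0 -m 0x08 -e     # then talk to it via the pass device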


camcontrol stop seems to prevent any further commands from reaching the disk until a start is issued; I can't do a modepage, for example. However, it doesn't seem to prevent writing to the disk, so it seems pretty useless in retrospect.

Well I'm certainly more confused than when I started.
That's interesting; maybe it's just trying to flush outstanding writes? Have you tried starting a new write after camcontrol stop?
 
So after you enable these cache options, you don't have to set WCE inside FreeBSD? I cannot find these options on mine; I guess they were introduced with SAS3 cards.
Yes, although I had to set it manually for every disk the LSI is attached to. I'm just glad this was only 16 disks.

If it's the same as on 12th-gen Dell (my R620), then non-RAID can co-exist with RAID virtual disks, while "switch to HBA" I guess would make it just an HBA, presumably with IT firmware.
It's an R330, so 13th gen. When I get a chance I'll switch it to HBA; I prefer the FreeBSD GEOM framework anyway.

I also found that in FreeBSD, non-RAID disks get attached via the mfi driver rather than the mrsas driver. camcontrol does not work with mfi devices, but you can get around that by using /dev/passX instead of /dev/mfi*; you can find the X with camcontrol devlist. Hope this helps.
I couldn't see a /dev/pass device for the mfi, only the MD1420 disks and the external USB ones.
 