Solved unbuffered disk I/O (like OSX's /dev/rdisk)

What do I need to do on FreeBSD to get direct read/write access to a disk device?

I usually write FreeBSD memstick images or wipe my pendrives on my Apple laptop, because I noticed that dd is significantly faster on OS X using /dev/rdisk1 than on FreeBSD (where there is no rdisk entry under dev). Using /dev/disk1 on OS X instead of /dev/rdisk1 gives me the same speed as /dev/da0 on my FreeBSD desktops, so the magic is not in the OS.

Wiping an 8GB pendrive using dd if=/dev/zero of=/dev/da0 bs=32m on FreeBSD takes ages, while doing the same job on OS X using dd if=/dev/zero of=/dev/rdisk1 bs=32m completes much faster, because of the unbuffered I/O provided by the raw device rdisk.

I am just wondering if I could achieve a comparable write speed on FreeBSD.
Do we have a sysctl() tunable or any other hack one can use to achieve something similar to what the OS X rdisk devices provide?
 
You can use direct IO to get the same unbuffered behaviour. That is done by passing the "O_DIRECT" flag to the open(2) system call. Unfortunately, the stock dd that ships with FreeBSD does not seem to have an option for direct IO, at least none that I could quickly find (that option is available on some modern Linux versions). So you would have to write yourself a tiny C (or Perl or Python or ...) program that performs that function; that's extremely easy.
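A minimal sketch of such a program in Python, assuming the platform exposes os.O_DIRECT (Linux and FreeBSD define the flag, macOS does not). The function name write_zeros and its parameters are made up for illustration. Direct IO generally wants aligned buffers and transfer sizes, so the buffer comes from an anonymous mmap, which is page-aligned. Whether O_DIRECT changes anything on a given device node is something to measure, not assume.

```python
import mmap
import os


def write_zeros(path, total_bytes, block_size=4 * 1024 * 1024, direct=True):
    """Write total_bytes of zeros to an existing path; return bytes written.

    Hypothetical helper, not part of any library. With direct=True the
    O_DIRECT flag is added when the platform defines it.
    """
    flags = os.O_WRONLY
    if direct:
        # Falls back to a plain open where os.O_DIRECT does not exist.
        flags |= getattr(os, "O_DIRECT", 0)
    buf = mmap.mmap(-1, block_size)  # anonymous, page-aligned, zero-filled
    fd = os.open(path, flags)
    written = 0
    try:
        while written < total_bytes:
            chunk = min(block_size, total_bytes - written)
            n = os.write(fd, memoryview(buf)[:chunk])
            if n == 0:
                break  # nothing more could be written
            written += n
    finally:
        os.close(fd)
        buf.close()
    return written
```

Calling write_zeros("/dev/da0", 4034969600) as root would overwrite the whole stick used later in this thread, so double-check the device name first. The final chunk is assumed to be a multiple of the sector size, which direct IO typically requires.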

I'm somewhat surprised that buffered IO causes such a performance hit. Usually, it actually gets better performance for sequential operations, such as what you're doing with zeroing a drive. Strange.
 
I'm somewhat surprised that buffered IO causes such a performance hit. Usually, it actually gets better performance for sequential operations, such as what you're doing with zeroing a drive. Strange.

It depends on the nature of that sequential operation. If you read something from the device, and a little later you read some more, then the buffer improves performance (the same applies to writes). In my particular case, however, the buffer is completely useless. For one thing, I write far more sequential data than the buffer can hold. The contents of the buffer therefore need to be replaced constantly, so the buffer speeds up nothing. In fact, it slows the process down by being an unnecessary extra gear in the process.
 
Here is what I could research for this topic.

Wiping clean a 4GB USB 2.0 stick on macOS, using raw-disk round1:
Code:
root@demomac ~ # date;dd if=/dev/zero of=/dev/rdisk2 bs=4m;date
Sun Aug 16 18:57:16 CEST 2020
dd: /dev/rdisk2: short write on character device
dd: /dev/rdisk2: Input/output error
963+0 records in
962+1 records out
4034969600 bytes transferred in 586.373191 secs (6881231 bytes/sec)
Sun Aug 16 19:07:04 CEST 2020
This took about 10 minutes.

Let me do that one more time, to confirm that the speed/time is consistent.
macOS raw-disk round2:
Code:
root@demomac ~ # date;dd if=/dev/zero of=/dev/rdisk2 bs=4m;date
Sun Aug 16 19:07:44 CEST 2020
dd: /dev/rdisk2: short write on character device
dd: /dev/rdisk2: Input/output error
963+0 records in
962+1 records out
4034969600 bytes transferred in 570.989026 secs (7066632 bytes/sec)
Sun Aug 16 19:17:15 CEST 2020
Again, took about 10 minutes.

Now the same task, but using /dev/disk instead of /dev/rdisk.
Note the lack of the initial "r" letter in the device name!
macOS std-disk round1:
Code:
root@demomac ~ # date;dd if=/dev/zero of=/dev/disk2 bs=4m;date
Sun Aug 16 19:29:18 CEST 2020
dd: /dev/disk2: end of device
963+0 records in
962+1 records out
4034973696 bytes transferred in 8379.199622 secs (481546 bytes/sec)
Sun Aug 16 21:48:59 CEST 2020
This took about 140 minutes.

macOS std-disk round2:
Code:
root@demomac ~ # date;dd if=/dev/zero of=/dev/disk2 bs=4m;date
Sun Aug 16 22:09:46 CEST 2020
dd: /dev/disk2: end of device
963+0 records in
962+1 records out
4034973696 bytes transferred in 8439.354806 secs (478114 bytes/sec)
Mon Aug 17 00:30:27 CEST 2020
Again, took about 140 minutes.

So zeroing out the entire storage area of a 4GB USB 2.0 stick takes about 10 minutes when accessed through the direct-IO /dev/rdisk device, compared to about 140 minutes through the standard buffered-IO /dev/disk device. Using /dev/rdisk for this particular kind of task is therefore roughly 14 times faster than using /dev/disk. That is significant.
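The ratio can be checked directly from the dd output. This is only a small sketch: the byte counts and timings are copied from the four macOS runs above, and the helper names mean_rate and speedup are made up for illustration.

```python
# (bytes transferred, seconds) copied from the four macOS dd runs above.
RDISK_RUNS = [(4034969600, 586.373191), (4034969600, 570.989026)]
DISK_RUNS = [(4034973696, 8379.199622), (4034973696, 8439.354806)]


def mean_rate(runs):
    """Average throughput in bytes per second over several runs."""
    return sum(nbytes / secs for nbytes, secs in runs) / len(runs)


def speedup(fast_runs, slow_runs):
    """How many times faster the first set of runs is than the second."""
    return mean_rate(fast_runs) / mean_rate(slow_runs)


if __name__ == "__main__":
    print(f"rdisk: {mean_rate(RDISK_RUNS) / 1e6:.2f} MB/s")
    print(f"disk:  {mean_rate(DISK_RUNS) / 1e6:.2f} MB/s")
    print(f"speedup: {speedup(RDISK_RUNS, DISK_RUNS):.1f}x")
```

Averaged over both rounds the factor comes out between 14 and 15, in line with the rough 10 versus 140 minute figures.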

And I was hoping to zero out my drives at this significantly higher speed on FreeBSD too, where there is no /dev/rdisk equivalent (imagine something like /dev/rada0 or /dev/rda0). Hence my original post, asking for a way to use direct IO on FreeBSD disk devices instead of the standard buffered-IO devices.
I went on to investigate Ralph's suggestion, writing my own block-copy tool that uses the "O_DIRECT" option of open(2), and ended up wasting a lot of time building something that does not work properly.
Then I spent a lot of time browsing through the ports collection in the hope of finding an existing tool that I could modify. And I found one that does the job well and does not even need any modification. Behold misc/cstream.

==== FreeBSD ====
Messages on console when the USB stick is plugged in.
This is the very same 4GB USB 2.0 stick I used with the Mac tests above.
Code:
ugen0.2: <AI210 Mass Storage> at usbus0
umass0 on uhub1
umass0: <AI210 Mass Storage, class 0/0, rev 2.00/1.00, addr 1> on usbus0
umass0:  SCSI over Bulk-Only; quirks = 0xc100
umass0:2:0: Attached to scbus2
da0 at umass-sim0 bus 0 scbus2 target 0 lun 0
da0: <AI Mass Storage > Removable Direct Access SPC-2 SCSI device
da0: 40.000MB/s transfers
da0: 3848MB (7880800 512 byte sectors)
da0: quirks=0x2<NO_6_BYTE>

BSD dd round1:
Code:
root@demobsd:~ # date;dd if=/dev/zero of=/dev/da0 bs=4m;date
Mon Aug 17 08:04:11 CEST 2020
dd: /dev/da0: short write on character device
dd: /dev/da0: end of device
963+0 records in
962+1 records out
4034969600 bytes transferred in 608.828416 secs (6627433 bytes/sec)
Mon Aug 17 08:14:20 CEST 2020
This took about 10 minutes. What? I expected it to take much longer, around 140 minutes.

BSD dd round2:
Code:
root@demobsd:~ # date ; dd if=/dev/zero of=/dev/da0 bs=4m ; date
Mon Aug 17 08:23:14 CEST 2020
dd: /dev/da0: short write on character device
dd: /dev/da0: end of device
963+0 records in
962+1 records out
4034969600 bytes transferred in 606.762631 secs (6649997 bytes/sec)
Mon Aug 17 08:33:21 CEST 2020
Again, took about 10 minutes.
That is what I was hoping to achieve in the first place.
Turns out, I did not need any jiggery-pokery for that.


For comparison, here is what cstream can do.
FreeBSD cstream with direct output:
Code:
root@demobsd:~ # date ; cstream -i /dev/zero -b 4m -o /dev/da0 -OD -n 4034969600 ; date
Mon Aug 17 22:50:30 CEST 2020
Mon Aug 17 23:00:36 CEST 2020
This took about 10 minutes. So far so good.

FreeBSD cstream without direct output:
Code:
root@demobsd:~ # date ; cstream -i /dev/zero -b 4m -o /dev/da0 -n 4034969600 ; date
Mon Aug 17 23:01:11 CEST 2020
Mon Aug 17 23:11:17 CEST 2020
Also took about 10 minutes.
So direct output (or the lack of it) makes no difference here.

FreeBSD cstream with direct input:
Code:
root@demobsd:~ # date ; cstream -i /dev/zero -ID -b 1m -o /dev/da0 -n 4034969600 ; date
Tue Aug 18 00:07:09 CEST 2020
Tue Aug 18 00:17:15 CEST 2020
Yet again, 10 minutes. Not faster, not slower.

Conclusion:
I falsely assumed that writing to the FreeBSD /dev/adaN or /dev/daN devices with dd is as slow as writing to macOS's /dev/diskN devices, and not as fast as writing to the /dev/rdiskN devices. It turns out that writing to /dev/daN and /dev/adaN with dd on FreeBSD actually gives performance equal to using /dev/rdiskN on my Mac. Meaning, I can wipe my USB sticks under FreeBSD without it taking over 10 times longer than it would on OS X.
 
Thx for the update on this topic! Somehow related:
  • RTFM mknod(8): As of FreeBSD 4.0, block devices were deprecated in favour of character devices. As of FreeBSD 5.0, device nodes are managed by the device file system devfs(5), making the mknod utility superfluous.
  • zfs(8) volmode=default | geom | dev | none. Default set via sysctl(8) knob vfs.zfs.vol.mode (1 | 2 | 3 for geom | dev | none).
 
[…] Using /dev/disk1 on OS X instead of /dev/rdisk1 gives me the same speed as /dev/da0 on my FreeBSD desktops, so the magic is not in the OS.
Wiping an 8GB pendrive using dd if=/dev/zero of=/dev/da0 bs=32m on FreeBSD takes ages, while doing the same job on OS X using dd if=/dev/zero of=/dev/rdisk1 bs=32m completes much faster, because of the unbuffered I/O provided by the raw device rdisk.

That’s not correct. FreeBSD’s standard disk devices already are raw devices (also called character devices), just like the “rdisk” devices on MacOS, i.e. /dev/da0 on FreeBSD is the same as /dev/rdisk1 on MacOS (not /dev/disk1). You don’t have to do anything special, just use /dev/da0 or whatever. (*)
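One way to see this for yourself is to ask the system what kind of node a device path is. The following is a small sketch using Python's standard os and stat modules; the helper name node_kind is made up for illustration, and the device paths in the loop are just the ones mentioned in this thread, so they may not exist on your machine.

```python
import os
import stat


def node_kind(path):
    """Classify a filesystem node as 'character', 'block', or 'other'."""
    mode = os.stat(path).st_mode
    if stat.S_ISCHR(mode):
        return "character"
    if stat.S_ISBLK(mode):
        return "block"
    return "other"


if __name__ == "__main__":
    # Paths taken from this thread; missing ones are reported, not fatal.
    for dev in ("/dev/null", "/dev/da0", "/dev/disk2", "/dev/rdisk2"):
        try:
            print(dev, node_kind(dev))
        except OSError as exc:
            print(dev, "not available:", exc.strerror)
```

If the explanation above holds, /dev/da0 on FreeBSD and /dev/rdisk2 on the Mac should both be reported as character devices, and /dev/disk2 on the Mac as a block device.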

Historically, FreeBSD inherited the distinction between buffered devices (“block devices”) and raw devices (“character devices”) from its ancestors in the 1990s. However, there was no good reason to keep the block devices, because file systems and the VM system do their own buffering anyway. And even if you needed a buffered block device (for whatever reason), you could easily put an appropriate GEOM layer on top of a raw device, such as gcache(8). Consequently, block devices were removed in FreeBSD 4.0, and since then there are only raw devices. Apparently, MacOS still has block devices (I don’t know why).

(*) If you still get abysmal speed, then something else must be wrong; in this case I would suspect that the USB part of the kernel might be the culprit. Note that – unfortunately – FreeBSD does not support the newer UASP protocol, in particular for USB3 SSDs. This means that transfers are limited to about 120 MB/s (and consume a lot of CPU, often blocking the whole system for a while), even if the drive supports 500 MB/s, like the popular Samsung T5, for example.
 
If anyone out there wants to implement UASP on FreeBSD, I would be happy to supply some hardware. (This would be in the form of me ordering something on eBay / Amazon, and having it delivered to the developer who wants to accept the workload for implementing this enhancement in FreeBSD.)

If I can get hold of one (they've been discontinued), the StarTech PEXUSB312EIC is a very good card to support. It includes USB 3.1 support, and it also includes an integrated USB 3.0 hub for the four USB3.0 ports. It supports UASP. The StarTech PEXUSB312C2 is also very good - as is its successor, the StarTech PEXUSB312C3. These also support UASP, and all three cards represent the ASmedia 1142, 2142 and 3142 chipset line.

For the disk side, I use the Fantec HDD Sneaker 2 (model number: 1968) - it uses the ASmedia ASM235, which supports UASP. I also called up Fantec to check, and they confirmed that it supports UASP, even though that selling point is not currently on their sales literature. I get transfer rates (normal Windows 7 large file copies) of ~170-180MB/second with an old 12TB IronWolf Pro disk, from internal (SATA2-connected) WD Black 6TB disks, with UASP support in the ASMedia Windows 7 driver. Not SSD speeds, to be sure - but not bad for spinning rust over USB 3.1 - and considerably more than the 120MB/second rate that would be the limit without UASP.
 
Doesn't OpenBSD also have rdisk devices?
Note, on Linux I sometimes have to type sync afterwards, because otherwise the dd write has only gone to the memory buffer.
 
Wiping clean a 4GB USB 2.0 stick on macOS, using raw-disk round1:
Code:
root@demomac ~ # date;dd if=/dev/zero of=/dev/rdisk2 bs=4m;date
Sun Aug 16 18:57:16 CEST 2020
dd: /dev/rdisk2: short write on character device
dd: /dev/rdisk2: Input/output error
963+0 records in
962+1 records out
4034969600 bytes transferred in 586.373191 secs (6881231 bytes/sec)
Sun Aug 16 19:07:04 CEST 2020
This took about 10 minutes.

What does "short write on character device" mean?
That the last write was smaller than 4m?
 
As you said, the last successful write was smaller than the expected size. Most likely because the device ended there. The next "IO error" was probably caused by the end of the device too. If you look at the final number of bytes transferred, it is not divisible by 4M.
 
Most likely because the device ended there.
That particular short-write was exactly because the storage area of my device reached its end there. For that (last) 4MB of zeros, there was not enough room (the remaining space to be zeroed-out was less than 4MB). Consequently, the amount of data written out successfully, was SHORTER than the size of the 4MB block of zeros.
Mind you, the same message may appear under very different circumstances. For example, I have seen it while attempting to write data to damaged Flash-drives, even though I was not yet near the (supposed) end of the storage area. I have also seen this on dodgy flash devices when they overheated during write.
So that message just tells you that what dd managed to write out successfully was less (shorter) than what was read in (generated, or otherwise prepared to be written out). WHY that happened can have multiple reasons. However, dd displays the number of records successfully read in and the number of records successfully written out. You also know the size of those records and the total amount you intended to write out. These allow you to derive (or estimate) how much of the intended data was NOT written out (how far in things failed), which usually shows well enough whether this is an expected, ignorable issue (as it was in my case) or an error.
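As a worked example of that derivation, here is a small sketch using the numbers from this thread: the sector count from the da0 console message and dd's bs=4m block size. The helper name split_records is made up for illustration.

```python
SECTOR = 512               # bytes per sector, from the da0 console message
BLOCK = 4 * 1024 * 1024    # dd's bs=4m


def split_records(total_bytes, block_size=BLOCK):
    """Return (full_records, leftover_bytes) for a transfer of total_bytes."""
    return divmod(total_bytes, block_size)


device_bytes = 7880800 * SECTOR        # sector count reported for da0
full, leftover = split_records(device_bytes)

if __name__ == "__main__":
    print(device_bytes)    # 4034969600, the byte count dd reported
    print(full, leftover)  # 962 full records and a 49152-byte partial one
```

The 962 full records plus one partial record match dd's "962+1 records out", and the total equals the device size, so in this case the short write simply marks the end of the device.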
 