FreeBSD 13.0: Observing corruption in SGL

Hello,

On FreeBSD 13.0, while running I/O with a 1 MB block size, I observed corruption in the SGL received.

I ran the I/O with the fio tool (version 3.28).

Command used:

fio --filename=/dev/da0: -direct=1 -iodepth=32 -ioengine=posixaio -rw=randrw -bs=1024k -numjobs=8 -runtime=30 -group_reporting -name=stress
While running I/O with a 1 MB block size, bus_dmamap_load_ccb() returned EINPROGRESS. After that status was returned,
the callback function was called, and the SGL it received was corrupted.
The I/O buffer size was 0x100000 (1,048,576 bytes, 1 MiB), but the total mapped SGL length was 0x3002c0 (3,146,432 bytes, about 3 MiB).
On FreeBSD 12.2, the maximum block size that fio can run with is 128k.
If we increase the block size, we get the error below.

Code:
root@freebsd12:~ # fio --filename=/dev/da0: -direct=1 -iodepth=32 -ioengine=posixaio -rw=randrw -bs=129k -numjobs=8 -runtime=30 -group_reporting -name=stress
stress: (g=0): rw=randrw, bs=(R) 129KiB-129KiB, (W) 129KiB-129KiB, (T) 129KiB-129KiB, ioengine=posixaio, iodepth=32
...
fio-3.28
Starting 8 processes
fio: pid=1991, err=45/file:engines/posixaio.c:180, func=xfer, error=Operation not supported
fio: pid=1985, err=45/file:engines/posixaio.c:180, func=xfer, error=Operation not supported
fio: pid=1986, err=45/file:engines/posixaio.c:180, func=xfer, error=Operation not supported
fio: pid=1987, err=45/file:engines/posixaio.c:180, func=xfer, error=Operation not supported
fio: pid=1990, err=45/file:engines/posixaio.c:180, func=xfer, error=Operation not supported
fio: pid=1988, err=45/file:engines/posixaio.c:180, func=xfer, error=Operation not supported
fio: pid=1989, err=45/file:engines/posixaio.c:180, func=xfer, error=Operation not supported
fio: pid=1984, err=45/file:engines/posixaio.c:180, func=xfer, error=Operation not supported

stress: (groupid=0, jobs=8): err=45 (file:engines/posixaio.c:180, func=xfer, error=Operation not supported): pid=1984: Tue Oct 5 14:11:05 2021
 cpu : usr=16.13%, sys=22.58%, ctx=44, majf=0, minf=48
 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
 submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
 complete : 0=50.0%, 4=50.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
 issued rwts: total=2,6,0,0 short=0,0,0,0 dropped=0,0,0,0
  latency : target=0, window=0, percentile=100.00%, depth=32

Is there an issue with SGL handling in FreeBSD 13.0?
Can anyone help?
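
The size mismatch reported above can be checked with a few lines of arithmetic. The following is only a minimal sketch in Python, not driver code: the two sizes are the ones quoted in the post, while the helper names (`sgl_total`, `sgl_matches`, `describe`), the 4 KiB page size, and the example segment addresses are hypothetical and chosen purely for illustration. A list is modelled as `(address, length)` pairs, and a well-formed list should add up to exactly the buffer size.

```python
# Sizes reported in the post, in bytes.
BUFFER_SIZE = 0x100000    # I/O buffer size
MAPPED_TOTAL = 0x3002C0   # total mapped SGL length that was observed


def sgl_total(segments):
    """Sum the lengths of a list of (address, length) segments."""
    return sum(length for _addr, length in segments)


def sgl_matches(segments, expected_len):
    """True when the segments add up to exactly the expected transfer size."""
    return sgl_total(segments) == expected_len


def describe(nbytes):
    """Format a byte count as hex, decimal and MiB."""
    return f"{nbytes:#x} = {nbytes} bytes = {nbytes / 2**20:.3f} MiB"


# Hypothetical well-formed list: page-sized segments covering the buffer.
# The 4 KiB page size and the base address are assumptions for illustration.
PAGE = 4096
good_sgl = [(0x10000000 + i * PAGE, PAGE) for i in range(BUFFER_SIZE // PAGE)]

if __name__ == "__main__":
    print("buffer:", describe(BUFFER_SIZE))
    print("mapped:", describe(MAPPED_TOTAL))
    print("excess:", describe(MAPPED_TOTAL - BUFFER_SIZE))
    print("well-formed list matches:", sgl_matches(good_sgl, BUFFER_SIZE))
```

A similar sum-and-compare check inside the driver's load callback would be one way to catch a bad list at the point it is delivered, though where exactly to place it depends on the driver.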
 
That question/statement would have to be the most confusing mumbo-jumbo I've seen in a while; but that's just me. :rolleyes:
Firstly, fio is a port.
Second, "the observed corruption"? Where, how, why? Man, my brain hurts just reading it.

So anyway, in the little excerpt you give for the unrelated issue, you know the maximum block size for this port is 128k, then you specify 129k and wonder why it fails? Well, colour me pink!

This has zero to do with FreeBSD in the first instance, and you should log a PR with the port maintainer of this Linux tool. I'm surprised it even compiles, let alone runs, on FreeBSD, incidentally.

I'm going to now take some headache pills...
 
Your post is very unclear. What is SGL? What is the corruption you saw: what happened, and what did you expect? How did you see kernel-internal functions like bus_dma_...?
 
OP: I would think that -bs=129k is part of the problem... fio(1) suggests that the block size should be a power of 2. If it's not, I'd suggest you look at the -bs_unaligned option to avoid errors. fio(1) is awfully long, so I'd suggest using Ctrl-F to look for the options you need.
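
To make the power-of-2 point concrete, here is a small sketch in Python that classifies the block sizes mentioned in this thread. It assumes fio's `k` suffix means 1024 bytes (the output above prints 129KiB), a 512-byte sector size, and the 128k ceiling the OP reported on FreeBSD 12.2; the function names are made up for illustration and nothing here comes from fio itself.

```python
KIB = 1024


def is_power_of_two(n):
    """A positive integer is a power of two when exactly one bit is set."""
    return n > 0 and (n & (n - 1)) == 0


def check_block_size(bs, sector=512, limit=128 * KIB):
    """Report basic properties of a block size given in bytes.

    sector and limit are assumptions: 512-byte sectors and the 128k
    ceiling observed on FreeBSD 12.2 in this thread.
    """
    return {
        "bytes": bs,
        "power_of_two": is_power_of_two(bs),
        "sector_multiple": bs % sector == 0,
        "within_limit": bs <= limit,
    }


if __name__ == "__main__":
    for label, size in (("128k", 128 * KIB), ("129k", 129 * KIB), ("1024k", 1024 * KIB)):
        print(label, check_block_size(size))
```

Under those assumptions 129k is still a whole number of sectors but is neither a power of 2 nor within the 128k ceiling, so the check alone cannot say which of the two properties caused the failure.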
 
Observed corruption:
The I/O buffer size was 0x100000 (1,048,576 bytes, 1 MiB), but the total mapped SGL length was 0x3002c0 (3,146,432 bytes, about 3 MiB).
This has plenty to do with FreeBSD - fio(1) stands for Flexible I/O. Even if it is a tool that originated with the Linux guys, so did Git. I think we can allow for the possibility that on FreeBSD, fio works a little differently and needs to be run with different options to obtain the same results. FWIW, fio is a fantastic I/O benchmarking tool, one that has the potential to prove FreeBSD's superiority over Linux.

FWIW, in FreeBSD 13, default block size seems to be 4096k, as per fio(1)

Your post is very unclear. What is SGL? What is the corruption you saw: what happened, and what did you expect? How did you see kernel-internal functions like bus_dma_...?
I thought the post was plenty clear - what OP was doing, the observed output, and what was expected. Just a quick look at the fio manpage for a few options was enough to tell me the story. As for kernel-internal functions - that's an implementation detail hidden by fio.

BTW, I/O is not even my area of expertise. This is just something that required a little reading of the manual and paying attention to the details - just like the rest of FreeBSD.

😩😤
 
Phooey. All it proves is a buffer difference. That could be a bug in the application OR it could be a bug in FreeBSD, but NONE of the OP's message shows that. That's why I said, in the first instance, if it is a bug, report it to the port maintainer.

He seems to state that it works under one version but not another, thereby implying it's FreeBSD not the port. The port should be the first point to start looking at a culprit, not FreeBSD.

Oh, and it's posted in the FreeBSD Development section.

Case dismissed.
 
Not being well-versed in options and parsing precedence can lead to unexpected results and errors. It is very possible that the OP misused the term 'corruption' in describing the issue. But it is also possible to mess up (corrupt) the output with an improperly constructed list of options. By 'corrupted output', I mean output that cannot be valid. Whether that's a bug in the utility or elsewhere is hard to say without deeper analysis of the problem.

For comparison: Apache (www/apache24) also has a truckload of options and rules of precedence for parsing directives. And if you specify port 443 in too many places, the browser will complain about invalid SSL certs, even if you have valid ones installed. Relation to this thread is that benchmarks/fio can very well be operating using a very similar logic pattern.

I already pointed out one potential difference between FreeBSD 12 and FreeBSD 13 in fio(1): Default block size of 128k for v. 12, versus 4096k for v. 13.

I do agree that it's better to focus on the port when troubleshooting the error.
 