This is an odd one. I have a FreeBSD 11.4 server acting as a NAS for my small network. It is equipped with a 10G NIC connected to a 10G switch; a Linux client, also with a 10G NIC, is connected to the same switch.
The ZFS config is 4 HDDs in a 2x2 mirror.
Code:
[root@343-guilty-spark /share/tmp]# zpool status -v
  pool: zroot
 state: ONLINE
  scan: resilvered 2.25T in 0 days 14:32:10 with 0 errors on Thu Sep 3 01:02:39 2020
config:

	NAME            STATE     READ WRITE CKSUM
	zroot           ONLINE       0     0     0
	  mirror-0      ONLINE       0     0     0
	    ada1p3.eli  ONLINE       0     0     0
	    ada0p3.eli  ONLINE       0     0     0
	  mirror-1      ONLINE       0     0     0
	    ada2p1.eli  ONLINE       0     0     0
	    ada3p1.eli  ONLINE       0     0     0
The ZFS filesystem /share is exported to the network and mounted by the Linux machine using NFSv3:
Code:
192.168.1.15:/share on /share type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.1.15,mountvers=3,mountport=855,mountproto=udp,local_lock=none,addr=192.168.1.15,_netdev)
The problem is as follows. If I initiate a large transfer that fsync()s before closing the file, I get a multiple-minute IO stall on the FreeBSD box, but ONLY if the file is above ~5.6GB. It is extremely reproducible. On the Linux machine, this DD command demonstrates the issue:
Code:
user@linux:~/share/tmp$ dd conv=fsync if=/dev/zero of=test2 bs=4k count=1370000
1370000+0 records in
1370000+0 records out
5611520000 bytes (5.6 GB, 5.2 GiB) copied, 8.76319 s, 640 MB/s
As you can see, this worked fine (aside: block size doesn't seem to matter). I ran this command at least a dozen times without incident. Adding a hundred megs or so triggers the broken behavior:
Code:
user@linux:~/share/tmp$ dd conv=fsync if=/dev/zero of=test2 bs=4k count=1400000
1400000+0 records in
1400000+0 records out
5734400000 bytes (5.7 GB, 5.3 GiB) copied, 191.214 s, 30.0 MB/s
What I experience during the transfer is:
1. Initially, the file transfer proceeds apace. I have individual activity LEDs for each HDD on the FreeBSD machine; all show high activity.
2. When the transfer is almost complete, all disk activity stops dead. Not a single HDD LED flicker. All attempts to access files stall, including directly from a shell on the FreeBSD machine. Network activity on the client drops to zero; the FreeBSD server still seems to show high network activity, but no packets reach the client.
3. About two minutes pass.
4. Network and disk activity resume, the remainder of the data is transferred, and the operation completes successfully once all data is sent.
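For anyone who wants to reproduce this without dd, the same write-then-fsync pattern can be sketched in a few lines of Python (a minimal reproducer, not the exact dd internals; the path and sizes below are placeholders to adjust for your mount):

```python
import os
import time

def write_and_fsync(path, size, chunk=4096):
    """Write `size` bytes of zeros to `path`, then fsync before close --
    roughly what dd's conv=fsync does. Returns (write_secs, fsync_secs)."""
    buf = b"\0" * chunk
    with open(path, "wb") as f:
        t0 = time.monotonic()
        written = 0
        while written < size:
            n = min(chunk, size - written)
            f.write(buf[:n])
            written += n
        t1 = time.monotonic()
        os.fsync(f.fileno())  # the stall, if any, happens here
        t2 = time.monotonic()
    return t1 - t0, t2 - t1
```

Running this on the NFS mount with sizes just below and just above the ~5.6GB threshold should show the fsync time jumping from near-instant to minutes, which isolates the stall to the fsync itself rather than the writes.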
If I remove conv=fsync, the transfer completes nice and quick every time:
Code:
user@linux:~/share/tmp$ dd if=/dev/zero of=test2 bs=4k count=1400000
1400000+0 records in
1400000+0 records out
5734400000 bytes (5.7 GB, 5.3 GiB) copied, 8.95905 s, 640 MB/s
Checking the Linux machine's dmesg, I see many lines like this:
Code:
[271233.699379] call_decode: 15859 callbacks suppressed
[271233.699751] nfs: server 192.168.1.15 not responding, still trying
[271234.533861] nfs: server 192.168.1.15 OK
[271532.292673] nfs: RPC call returned error 13
If I run the same dd command on the FreeBSD box directly (as gdd, from the coreutils port), everything is fine:
Code:
[user@343-guilty-spark /share/tmp]# gdd conv=fsync if=/dev/zero of=test2 bs=4k count=1400000
1400000+0 records in
1400000+0 records out
5734400000 bytes (5.7 GB, 5.3 GiB) copied, 16.432 s, 349 MB/s
Any clues? Could this have something to do with an NFS ID rollover, given that the machines are using a fast interconnect?
Aside: The resilver is probably not relevant; I replaced a disk that I didn't trust. I have no reason to believe any of the other disks are failing.