Why is stdout locked against read?

When I do this
cmd > outfile &
the outfile should slowly grow along with the produced output. And with ls -l I can observe that growth.
However, when I now do
cat outfile
in order to read the produced output, that cat will block with 0 bytes of output until cmd has terminated, and only then produce the entire output.
I am quite sure this was not the behaviour from the beginning, and I would very much like to get rid of it.

Specifically, my use case was that I wanted to watch a movie, and my movie collection is on NFS. When I'm travelling, that NFS gets routed via VPN, and, depending on location, it can become quite slow; too slow for synchronized playback, even when the raw throughput might still cope with the playback speed.
So I thought I could copy the movie to local storage and watch while copying. That didn't work, because of the behaviour described above: while cp copies the file, the outfile is locked against reading. I thought the problem was somehow with NFS, so instead of doing cp /remoteshare/file /local/file, I tried cat /remoteshare/file > /local/file, which should be independent of what gets fed to that stdout. But that doesn't work either: /local/file is locked against reading until it is no longer being written.

Finally, doing the Randal Schwartz classic (useless use of cat) seems to solve the issue:
cat /remoteshare/file | cat > /local/file
So this behaviour was quite certainly not here from the beginning, because otherwise it would have been mentioned back then as a useful use of cat.
 
I'm not quite sure that I understood this correctly, but when I want to monitor a file as it grows, I'll use tail -f, not cat.
 
Reaching the end-of-file on an ordinary file (read(2) returns a zero byte count) causes cat to terminate. This is true regardless of whether any process is also ("occasionally") writing to the file.

The rules change if the standard input is a pipe, which has a slightly different code path for the read(2). If a writer has the pipe open, the reader will stall, rather than return zero.

I believe that tail -f was invented to deal with the situation you describe, as vmisev has already suggested.

Edit: to clarify: when there is a pipe involved, the reader will only see an end-of-file (read zero bytes) after the writer exits. The writer will get SIGPIPE if the reader exits.
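
To illustrate the difference (a minimal sketch; "outfile" is just the example name from the first post): read(2) on a regular file returns 0 at end-of-file even while a writer still has the file open, whereas the same read(2) on a pipe would sleep until the writer closes its end.

Code:
#include <fcntl.h>
#include <unistd.h>

int
main(void)
{
	char buf[4096];
	ssize_t n;
	/* "outfile" is the growing file from "cmd > outfile &" above */
	int fd = open("outfile", O_RDONLY);

	if (fd == -1)
		return (1);
	while ((n = read(fd, buf, sizeof(buf))) > 0)
		;	/* consume whatever data is already there */
	/* Here n == 0: end-of-file, even though the writer may still
	 * be running. Had fd been the read end of a pipe with a live
	 * writer, read(2) would have blocked instead of returning 0. */
	close(fd);
	return (0);
}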
 
you should not rely on the filesystem size indicator of an NFS file to be correct when a file is open for writing. You are running into a block buffering issue. The filesystem information isn't updated until a complete disk block has been buffered and written. To minimize network IO, the NFS system will use "buffered IO".
 
I tried a simple test (read a line from an NFS file, write it to a local file, sleep for a second, repeat) and catting the local file worked fine. Something other than locking must be going on. You can try ktrace -di on cat to see what syscalls it is making, to debug this... Or maybe your shell does something funky? (Redirection is done by the shell.)
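
For the record, a rough C equivalent of that test (the two paths are placeholders, not real ones):

Code:
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	char buf[4096];
	/* placeholder paths -- substitute your own NFS and local file */
	FILE *in = fopen("/remoteshare/file", "r");
	FILE *out = fopen("/local/file", "w");

	if (in == NULL || out == NULL)
		return (1);
	while (fgets(buf, sizeof(buf), in) != NULL) {
		fputs(buf, out);
		fflush(out);	/* make each line visible in the file */
		sleep(1);	/* leave time to try cat on the local file */
	}
	fclose(in);
	fclose(out);
	return (0);
}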

It may be better to use rsync, link the local temporary file to a different filename, and watch it under that name (since the temporary name will disappear once the transfer is over).
 
you should not rely on the filesystem size indicator of an NFS file to be correct when a file is open for writing.
In this case the filesystem size indicator was correct, but cat just would not read that (apparently present) data.

You are running into a block buffering issue.
In this case the effect concerned the entire file (of a few hundred megabytes), and I doubt all of it would be buffered.

... on NFSv2 or NFSv3 for parallel IO, meaning multiple processes are doing IO on a file at the same time. In this case, one process writing, another process reading. NFS is designed for many things, but this is not one of its design goals.
In this case it should have been NFSv4.

Let's try to reproduce it:

Code:
pmc@cora:~ $ df . Media
Filesystem                         1K-blocks    Used     Avail Capacity  Mounted on
zl/home                            164994224 3576200 161418024     2%    /home
edge-e:/media                      206399236 2642380 203756856     1%    /media
pmc@cora:~ $ ls -l Media
lrwxr-xr-x  1 1001 staff 10 Mar  6  2023 Media -> /media/pmc
pmc@cora:~ $ /sbin/mount | grep media
edge-e:/media on /media (nfs, nfsv4acls)
pmc@cora:~ $ grep media /etc/fstab
edge-e:/media              /media          nfs     nfsv4,bgnow,rw  0       0
pmc@cora:~ $ cp Media/Movies/The_Man_Who_Fell_to_Earth-1976.avi . &
[1] 6709
pmc@cora:~ $ ls -l The_Man_Who_Fell_to_Earth-1976.avi
-rw-r--r--  1 pmc staff 2883584 Oct 21 13:05 The_Man_Who_Fell_to_Earth-1976.avi
pmc@cora:~ $ cat The_Man_Who_Fell_to_Earth-1976.avi
 --- HANGS ---
^Z
--- IGNORED ---

New Terminal:
Code:
pmc@cora:~ $ ps axl
1100 6709 6299 0  20  0  13816  2292 nfsreq   D     5   0:00.17 cp Media/Movies
1100 6713 6299 0  20  0  13808  2400 range    T+    5   0:00.00 cat The_Man_Who

Wow, here we have it: "range"

This is certainly new. But what is it, what is "range" (the wait channel shown by ps), and why doesn't it work?
A process in T state hanging indefinitely behind an (actually independent) process in D state is not what should happen.
 
I tried a simple test (read a line from an NFS file, write it to a local file, sleep for a second, repeat) and catting the local file worked fine. Something other than locking must be going on.
It is a kind of locking: rangelock_enqueue()

Code:
6713 100989 cat                 -                   mi_switch+0xc3 _sleep+0x205 rangelock_enqueue+0x161 kern_copy_file_range+0x33d sys_copy_file_range+0x78 amd64_syscall+0x118 fast_syscall_common+0xf8

You can try ktrace -di on cat to see what syscalls it is making, to debug this... Or maybe your shell does something funky? (Redirection is done by the shell.)
In this case the redirect just gets blamed by association (which shouldn't happen either): the effect appears with or without it, all the same.

It may be better to use rsync, link the local temporary file to a different filename, and watch it under that name (since the temporary name will disappear once the transfer is over).
In this case I decidedly wanted to start playing the (partial) video while it is still being copied.
 
See copy_file_range(2). Looks like cat(1) is now using in_kernel_copy() (see /usr/src/bin/cat.c:269), which calls copy_file_range(). What are the exact arguments to copy_file_range() (from the output of ktrace or truss) just before it hangs? What happens when you do cat < /remoteshare/file > /local/file? Maybe the remote fs is mounted in a way where locking is not possible? -- these are some of the things I'd do to try to figure out the underlying cause.

By git blame cat.c we find it was added on 2023-07-08 in commit 8113cc8276. git log 8113cc8276 says
cat: use copy_file_range(2) with fallback to previous behavior

This allows to use special filesystem features like server-side
copying on NFS 4.2 or block cloning on OpenZFS 2.2.
Maybe it should check that these conditions are met? That is, both files should be remote or both files should be local for it to be really worth it. In any case, IMHO this should not be the default behavior. I suggest you talk to the author & reviewer of this code! Still, it should not hang... maybe the author can shed more light.
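
For illustration, a minimal sketch of that pattern (not the actual bin/cat source, and the errno checks are just an assumed set): try copy_file_range(2) first, and fall back to a plain read(2)/write(2) loop when the in-kernel copy is not usable.

Code:
#include <errno.h>
#include <limits.h>
#include <unistd.h>

static int
copy_fd(int rfd, int wfd)
{
	char buf[64 * 1024];
	ssize_t n;

	/* Ask the kernel to copy as much as possible per call. */
	while ((n = copy_file_range(rfd, NULL, wfd, NULL, SSIZE_MAX, 0)) > 0)
		;
	if (n == 0)
		return (0);		/* in-kernel copy reached EOF */
	if (errno != EINVAL && errno != EXDEV && errno != ENOSYS)
		return (-1);		/* a real I/O error */

	/* Fallback: the classic userspace copy loop. */
	while ((n = read(rfd, buf, sizeof(buf))) > 0)
		if (write(wfd, buf, (size_t)n) != n)
			return (-1);
	return (n == 0 ? 0 : -1);
}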
 
See copy_file_range(2). Looks like cat(1) is now using in_kernel_copy() (see /usr/src/bin/cat.c:269), which calls copy_file_range().
Yep. cp and cat both do.

I now went here: https://lists.freebsd.org/archives/freebsd-fs/2025-October/004696.html
Thank You for commenting!

Maybe it should check that these conditions are met? That is, both files should be remote or both files should be local for it to be really worth it. In any case, IMHO this should not be the default behavior. I suggest you talk to the author & reviewer of this code! Still, it should not hang... maybe the author can shed more light.
Apparently this was meant for protection (among other things): we shouldn't read files while they are written.
 
One more comment. I suspect the read is not allowed since the range is locked for writing, but copying gigabytes will take a long time. Maybe a compromise, without changing kernel code, would be for cat/cp to break up a copy_file_range into smaller chunks. Or... we might discover that it doesn't really improve the situation much! But this is something you can try on your own (by changing cat).
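
A sketch of that idea (the chunk size is an arbitrary assumption): request a bounded chunk per copy_file_range(2) call, so that the range lock is taken and dropped per chunk instead of being held across the whole file.

Code:
#include <unistd.h>

#define CHUNK	(8 * 1024 * 1024)	/* 8 MiB per syscall; tunable */

static int
copy_fd_chunked(int rfd, int wfd)
{
	ssize_t n;

	/* Each call should lock only the chunk's range, giving a
	 * concurrent reader a chance to run between chunks. */
	while ((n = copy_file_range(rfd, NULL, wfd, NULL, CHUNK, 0)) > 0)
		;
	return (n == 0 ? 0 : -1);	/* 0 = EOF; on -1 check errno */
}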
 
One more comment. I suspect the read is not allowed since the range is locked for writing, but copying gigabytes will take a long time. Maybe a compromise, without changing kernel code, would be for cat/cp to break up a copy_file_range into smaller chunks. Or... we might discover that it doesn't really improve the situation much! But this is something you can try on your own (by changing cat).
I am now thinking along a somewhat different line: basically, I do not want or need extra protection from reading a file while it is being written. It seems useful to protect against two tasks writing into the same file simultaneously, because that usually results in garbage being written. But reading a file while it is written does not damage the file; only the data read may be inconsistent - which may or may not be a problem.

So I would like to get rid of this part of the new protection altogether. The difficulty is that this is coded into kern_rangelock.c - it could easily be changed there (or switched via sysctl), but these functions are used for a couple of different things, and not all of them are immediately obvious. Clarification is also needed on where this behaviour is exported and whether applications might start to depend on it. So this is rather something for a boring winter night or such...
 