Solved "mount -u -o ro" gives "device busy", but nothing is open for write

Somewhere in the mists of time, I arranged that my /usr partition be mounted read-only. On the occasions when I need to update something on /usr, I have a script that does "mount -u -o rw /usr; <the command from the script arguments>; mount -u -o ro /usr". Until a few years ago, this worked without issue. Then, the "mount -o ro" started--sometimes--failing with "device busy". (I don't recall which kernel version.) But there are no files opened for write on /usr (according to fstat) and the data in the file system itself doesn't change. I have not attempted to add -f to the "mount -u -o ro" because I have no idea what I might break. Any suggestions as to how I might track down what's going on before I issue what might be a fatal command?
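For context, a minimal sketch of such a wrapper (the OP's actual script is not shown; a script named "runwr" appears later in the thread). The MOUNT variable is a hypothetical override hook, added here only so the control flow can be exercised without touching real mounts:

```shell
# run_rw CMD...: remount /usr read-write, run CMD, then try to go
# back to read-only.  MOUNT is a hypothetical test hook; on a real
# system it defaults to the mount(8) binary.
run_rw() {
    ${MOUNT:-mount} -u -o rw /usr || return 1
    "$@"
    rc=$?
    # Always try to drop back to read-only, even if the command failed.
    ${MOUNT:-mount} -u -o ro /usr || echo "remount read-only failed" >&2
    return $rc
}
# Usage: run_rw pkg upgrade
```

The failure mode described in the thread is that final `mount -u -o ro /usr` returning "device busy".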
 
The past aside, for a moment. What's the current situation?

Which version of FreeBSD, exactly?

`freebsd-version -kru ; uname -aKU`
Sorry about that; somehow I forgot to mention that I had just upgraded to 14.1:

14.1-RELEASE
14.1-RELEASE
14.1-RELEASE
FreeBSD laptop 14.1-RELEASE FreeBSD 14.1-RELEASE LAPTOP amd64 1401000 1401000

"LAPTOP" is a rather stripped down kernel. Its file system related options not in GENERIC are:

options FUSEFS
options GEOM_ELI
options GEOM_UZIP
options LINDEBUGFS
options NULLFS
LINDEBUGFS is new; I had to add it starting with 14.0 to deal with my display hardware; this problem predates that addition.

/etc/make.conf contains:

KERNCONF=LAPTOP
MODULES_OVERRIDE=vmm linuxkpi_hdmi
WRKDIRPREFIX=/home/admin/work/Make

Since I'm digging, here's sorted mount output:

/data on /home (pefs, local, soft-updates)
/dev/ada0p5 on / (ufs, local, read-only)
/dev/ada0p6 on /usr (ufs, local)
/dev/ada0p7 on /var (ufs, local, soft-updates)
/dev/ada0p9 on /data (ufs, local, soft-updates)
/dev/md0.uzip on /usr/ports (ufs, local, read-only)
/dev/md1.uzip on /usr/src (ufs, local, read-only)
/home/adminpub/log on /var/log (nullfs, local, soft-updates)
devfs on /dev (devfs)
tmpfs on /tmp (tmpfs, local)

(I haven't rebooted since /usr got stuck r/w.) The problem predates the addition of the .uzip file systems.
 
I know it's a lot of work, but: retry with a stock (standard options) kernel, or even better with a kernel from a binary install instead of a locally compiled one.
 
Code:
 -u      The -u flag indicates that the status of an already mounted
         file system should be changed.  Any of the options discussed
         above (the -o option) may be changed; also a file system can be
         changed from read-only to read-write or vice versa.  An attempt
         to change from read-write to read-only will fail if any files
         on the file system are currently open for writing unless the -f
         flag is also specified.  The set of options is determined by
         applying the options specified in the argument to -o and
         finally applying the -r or -w option.
 
Upgraded before 19th June 2024?

The absence of a patch level is eyebrow-raising.
The tarballs have a date of May 31 on them. I did see the SSH security advisory but since I don't run sshd I didn't worry about it beyond making the suggested mitigation. I can certainly grab a new set of tarballs and reinstall. Reinstalls are quite easy for me, so if that's recommended, point me at some new tarballs and I'll do it. But that won't fix the current problem, since it predates the current release by quite a bit. (I just finally got annoyed enough to do something about it. :) )

Since my running system won't let me mount -u -o ro /usr, my current plan is to do a reverse startup: turn off everything in reverse order and try the remount after each step. I might get a clue from that. But it would be nice to have a tool like fstat that does more digging.
 
... An attempt to change from read-write to read-only will fail if any files on the file system are currently open for writing ...
And that's the problem. Two possibilities: (a) Some running process has some file open for writing. But the OP has checked with fstat, and by reasoning, and there shouldn't be any. (b) No running process has a file open for writing, but the kernel wrongly believes so, due to some kernel bug. We don't know how to debug either option, other than trial and error. An experienced kernel developer could go into the kernel data structures for open files, but for the average user or user-space developer, that's a "voyage of discovery", so not efficient. I like the OP's most recent debugging proposal, less work than various sledge hammer methods.
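For possibility (a), the fstat check can be narrowed mechanically. This is a hypothetical helper (not a tool from the thread), assuming the usual fstat layout where the final column is the access mode:

```shell
# write_fds: filter fstat output down to descriptors open for writing.
# fstat's final column shows the access mode (r, w, or rw); the header
# row (NR == 1) is kept for readability.
write_fds() {
    awk 'NR == 1 || $NF ~ /w/'
}
# Usage (as root, on FreeBSD): fstat -f /usr | write_fds
```

An empty result (header only) would support the conclusion that no userland process holds a write descriptor, pointing at possibility (b).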
 
And that's the problem. Two possibilities: (a) Some running process has some file open for writing. But the OP has checked with fstat, and by reasoning, and there shouldn't be any. (b) No running process has a file open for writing, but the kernel wrongly believes so, due to some kernel bug. We don't know how to debug either option, other than trial and error. An experienced kernel developer could go into the kernel data structures for open files, but for the average user or user-space developer, that's a "voyage of discovery", so not efficient. I like the OP's most recent debugging proposal, less work than various sledge hammer methods.
I've been with FreeBSD since before it was called FreeBSD and I even wrote device drivers and other kernel code back in the day (for my own use). I haven't kept my skills current, which makes it much harder to figure this (and a few other annoyances that have persisted for a few years) out. Still, if push comes to shove and I have to do some kernel debugging, I can dust off my old skills. Anyway, if I end up all the way back to single-user without being able to remount /usr, I'll reboot using GENERIC per your suggestion and then see if I can reproduce the problem.
 
First real data point: I started taking down processes and nothing made a difference--until I killed the X server. At which point I could remount /usr r/o. So I booted GENERIC and am running that now. My next goal is to see if I can make /usr stick r/w.

Added: I just realized that I had not killed xconsole before I took down the X server. So in theory that could be the issue. But my money is on the X server.
 
It should be possible to use dtrace fbt probes to see where exactly in the kernel the error is set.
Maybe that will give some ideas what to check next.
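A rough sketch of that approach, watching for EBUSY (errno 16) coming back from the mount-update path. The probe names here are guesses, not confirmed for this kernel version; use `dtrace -l -n 'fbt::*mount*:'` to see what is actually available:

```d
/* Run while the failing "mount -u -o ro /usr" executes, e.g.:
 *   dtrace -s busy.d -c 'mount -u -o ro /usr'
 * Prints each traced kernel function that returns EBUSY, with a
 * kernel stack trace to show who set the error. */
fbt::vfs_domount:return,
fbt::ffs_mount:return
/arg1 == 16/
{
    printf("%s returned EBUSY\n", probefunc);
    stack();
}
```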
 
And that's the problem. Two possibilities: (a) Some running process has some file open for writing. But the OP has checked with fstat, and by reasoning, and there shouldn't be any. (b) No running process has a file open for writing, but the kernel wrongly believes so, due to some kernel bug. We don't know how to debug either option, other than trial and error. An experienced kernel developer could go into the kernel data structures for open files, but for the average user or user-space developer, that's a "voyage of discovery", so not efficient. I like the OP's most recent debugging proposal, less work than various sledge hammer methods.
If you have /usr mounted r/w and then, under it, have /usr/mnt mounted from an md device, is /usr still locked or not?

/dev/ada0p6 on /usr (ufs, local)
/dev/ada0p7 on /var (ufs, local, soft-updates)
/dev/ada0p9 on /data (ufs, local, soft-updates)
/dev/md0.uzip on /usr/ports (ufs, local, read-only)
 
If you have /usr mounted r/w and then, under it, have /usr/mnt mounted from an md device, is /usr still locked or not?
It should not. Mounting a child file system inside a parent file system is a strange operation, as it does not permanently modify the parent: after the child is unmounted, the parent is in the same state.
 
And if I understand/recall correctly, mount -u first unmounts the filesystem, then remounts it. So even files/directories opened read-only would prevent the unmount. If /usr/local/libexec/gvfsd-trash (installed by devel/gvfs) or something similar is sniffing around any of the filesystems to be unmounted, it prevents the unmount, and thus the remount, too.
 
And if I understand/recall correctly, mount -u first unmounts the filesystem, then remounts it. So even files/directories opened read-only would prevent the unmount. If /usr/local/libexec/gvfsd-trash (installed by devel/gvfs) or something similar is sniffing around any of the filesystems to be unmounted, it prevents the unmount, and thus the remount, too.
That is not correct. The whole point of -u is to avoid actually unmounting the file system, as an actual unmount would invalidate all open file handles.
 
That is not correct. The whole point of -u is to avoid actually unmounting the file system, as an actual unmount would invalidate all open file handles.
Unfortunately, before gvfsd-trash was modified not to sniff /media and network filesystems (like smb), it actually prevented unmounting /media/* (e.g., /media/cd0) and shares on a NAS. Unless the related code in base has been modified, I'm definitely correct in this regard. Have any such changes been made?
Before gvfsd-trash changed its behavior, I needed to run a script as a daemon to stop it every 2 seconds. Killing it caused an immediate restart and didn't help.
 
I finally got to a place where I could conveniently start/stop my system and did some testing and found a reproducible way of causing the problem. I tried this on my custom kernel and the GENERIC kernel. So, here are some sequences, each beginning with a reboot. "runwr -u <command>" remounts /usr r/w, runs the command, then remounts /usr r/o.

BOOT
runwr -u pkg add -f some.pkg
# remount succeeds

BOOT
xstart
runwr -u pkg add -f some.pkg
# remount fails

BOOT
xstart
runwr -u foo # a script that creates and deletes a file in /usr
# remount succeeds

BOOT
xstart
runwr -u pkg add -f some.pkg
# remount fails
mount -fu -o ro /usr
runwr -u pkg add -f some.pkg
# remount succeeds (!)

BOOT
xstart
xstop
runwr -u pkg add -f some.pkg
# remount succeeds

So, something in the interaction between the X server and pkg is leaving /usr with something seemingly open for write, even though fstat says not. Any thoughts on how to track it down? I haven't done kernel debugging in ages, so if that's the suggestion, please point me at a kernel debugging primer to get me started.
 
I've completely forgotten one thing.
Do you mount the /usr partition (dataset) with atime enabled?
If yes, accessing files on the /usr partition EVEN READ-ONLY CAUSES METADATA WRITES (to record file access times).
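If atime is in play, the usual mitigation is the noatime mount option. A sketch of the relevant /etc/fstab line, borrowing the device name from the mount output earlier in the thread (flags are illustrative, adjust to taste):

```
/dev/ada0p6   /usr   ufs   rw,noatime   2   2
```

Whether pending atime updates alone can hold up a rw-to-ro remount is a separate question; they are kernel metadata writes, not files held open for writing by a process.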
 
Two suggestions, first one easier, second one harder. Apply both only to the failing case.

1: To have a file open, you need a process. Let's assume (comments below) that the process that is holding a file open for write is a new process started by the pkg command. So run "ps aux", save the output. Run the pkg command. Run "ps aux" again, and compare the results. This is going to be a bit tedious, as there will be lots of changes; you may want to write a small script to do the compare. Is there a new process, left over from the pkg run?
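Suggestion 1 can be scripted along these lines (a sketch; the file names and ps column selection are illustrative, not from the thread):

```shell
# new_procs BEFORE AFTER: print entries present only in AFTER.
# Both files are "ps" snapshots taken before and after the pkg run;
# comm -13 keeps lines unique to the second (sorted) file.
new_procs() {
    sort -u "$1" > "${TMPDIR:-/tmp}/before.$$"
    sort -u "$2" > "${TMPDIR:-/tmp}/after.$$"
    comm -13 "${TMPDIR:-/tmp}/before.$$" "${TMPDIR:-/tmp}/after.$$"
    rm -f "${TMPDIR:-/tmp}/before.$$" "${TMPDIR:-/tmp}/after.$$"
}
# Usage:
#   ps -axo user,command > before.txt
#   runwr -u pkg add -f some.pkg
#   ps -axo user,command > after.txt
#   new_procs before.txt after.txt
```

Restricting the ps output to user and command (dropping pid, CPU and memory fields) keeps the diff from drowning in volatile columns.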

2: Let's assume that the opening of the file for write is done by pkg or one of its child processes (comments below). Run pkg under truss. Use truss's -f flag so that all child processes it starts are traced as well. Save the (very lengthy) output from truss. Then go through that output and find all the places where a file is opened, and match them to all the places where files are closed. Is anything left over? Again, that will be tedious, and you will need to use a little script.
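The open/close matching for suggestion 2 might look like the following sketch. It assumes truss output lines of roughly the form `pid: openat(AT_FDCWD,"/path",O_WRONLY|...) = fd (0xN)` and `pid: close(fd) = 0 (0x0)`; the exact format varies by FreeBSD release, so treat the patterns as a starting point:

```shell
# unclosed TRUSSLOG: list files opened for writing (O_WRONLY/O_RDWR)
# and never closed, from "truss -f -o TRUSSLOG cmd" output.
unclosed() {
    awk '
    /openat?\(/ && /O_WRONLY|O_RDWR/ && / = [0-9]/ {
        pid = $1
        # the path is the quoted string in the syscall arguments
        if (match($0, /"[^"]*"/)) path = substr($0, RSTART + 1, RLENGTH - 2)
        # the returned fd is the number after the last " = "
        n = split($0, a, " = "); split(a[n], r, " "); fd = r[1]
        open[pid fd] = path
    }
    /close\(/ && / = 0/ {
        pid = $1
        if (match($0, /close\([0-9]+\)/))
            fd = substr($0, RSTART + 6, RLENGTH - 7)
        delete open[pid fd]
    }
    END { for (k in open) print open[k] }
    ' "$1"
}
# Usage: truss -f -o /tmp/pkg.truss pkg add -f some.pkg
#        unclosed /tmp/pkg.truss
```

Keying the table on pid plus fd keeps fd reuse across processes from colliding; a close followed by a reopen of the same fd simply overwrites the entry.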

Comment: Both of these suggestions are under the assumption that the file that's being held open was opened by the pkg command, or by one of its child processes that has failed to exit. But there is a more insidious possibility: While pkg runs, another process (that was already running) grabs a file and holds it open. My educated guess is: that process is part of X. For example, you could have a file manager running, which sees one of the directory changes caused by pkg, and then looks at it, and forgets to close it. Debugging this would be harder, as you would have to attach truss to all processes related to X, and there are too many for that to be practical. Simpler suggestion: Strip your X setup down as much as possible, then enable things one at a time.

EDIT: I confused dtrace and truss; I always use truss, but dtrace can do the same thing.
 