C O_NOATIME option in open

Linux has an option O_NOATIME on the open(2) call, which indicates that this open shall not change the atime of a file due to this open/read operation. It is intended for indexing and backup operations. As far as I know, FreeBSD does not have any functionality like that? Did I miss something?

I'm not talking about the NOATIME option on mount, or on file system creation (which is a good idea, and makes the whole question above irrelevant if set). I'm just fixing up an old file copy program, and I'm wondering whether I can make it a little bit more efficient on FreeBSD systems, for just those where the admin forgot to disable atime on the file system.
 
If there's nothing in open(2), there can't be such a feature. It wouldn't make sense as some "global setting", and after opening a file, it would be too late.

But then, I'd consider this a misfeature anyways. The purpose of "atime" is to know when a file was accessed, not to know it "except if some program doesn't want you to know for whatever reason".
 
I'm with zirias. Misfeature.

FWIW I share the concern about backups messing up atime. But I solve this by saving and restoring the atime around the backup run.
 
[…] shall not change […]
I have reviewed the O_NOATIME flag for open. It seems to always have been a discretionary flag: “Could you not update the access time, please?” 😬 Case in point: an NFS mount won’t respect it, yet the open succeeds regardless.
[…] not change the atime of a file due to this open/read operation. […]
The documentation of O_NOATIME only mentions read, i.e. inspection of the payload. 🕚 Merely opening a file does not change any time stamps.
[…] I'm just fixing up an old file copy program, and I'm wondering whether I can make it a little bit more efficient on FreeBSD systems, […]
Today, cp(1) uses copy_file_range(2). 🏃‍♀️💨
[…] But I solve this by saving and restoring the atime around the backup run.
Yeah, I wanted to mention that workaround, too. ⏱️ It costs an extra fstat(2) and futimens(2) to restore the access time.
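For anyone who wants to see that workaround spelled out, here is a minimal sketch in C. The helper name read_preserving_atime is made up for illustration, and the read loop simply discards the data where a real backup would process it. Two caveats, as far as I know: setting an explicit atime normally requires owning the file (or being privileged), and restoring it bumps the file's ctime, so the trace is moved rather than erased.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: read the whole file at `path`, then put the
 * access time back to what it was before the read.
 * Returns 0 on success, -1 on any error.
 */
int read_preserving_atime(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {      /* remember the atime before reading */
        close(fd);
        return -1;
    }

    char buf[65536];
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        ;                          /* a real backup would process buf here */

    struct timespec ts[2];
    ts[0] = st.st_atim;            /* restore the saved access time */
    ts[1].tv_sec = 0;
    ts[1].tv_nsec = UTIME_OMIT;    /* leave the modification time alone */

    int rc = (n < 0) ? -1 : 0;
    if (futimens(fd, ts) < 0)
        rc = -1;
    close(fd);
    return rc;
}
```

There is also a small window here: if another process reads the file between the fstat and the futimens, its atime update gets overwritten.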
 
I'm with zirias. Misfeature.

FWIW I share the concern about backups messing up atime. But I solve this by saving and restoring the atime around the backup run.
Misfeature but a workaround... Forgive my ignorance, but if you need a workaround, doesn't it deserve the name "missing feature" instead? Furthermore, if I understand correctly, you can fool the atime system anyway?

O_NOATIME (since Linux 2.6.8)
Do not update the file last access time (st_atime in the
inode) when the file is read(2).

This flag can be employed only if one of the following
conditions is true:

• The effective UID of the process matches the owner UID
of the file.

• The calling process has the CAP_FOWNER capability in
its user namespace and the owner UID of the file has a
mapping in the namespace.

This flag is intended for use by indexing or backup
programs, where its use can significantly reduce the
amount of disk activity. This flag may not be effective
on all filesystems. One example is NFS, where the server
maintains the access time.
It seems there are some security-like conditions for this flag to be active.
 
Thank you Kai for reminding me of copy_file_range. I'll put in a todo comment to start using that, as it makes copying even more efficient. By the way, there is a whole philosophical debate one could have about "the most efficient way of copying a file". On one hand, one can use heuristics to detect all-zero areas in files and create sparse output files. On the other hand, one can detect existing holes in files, and copy only the non-hole regions. Or one can ignore holes, and just copy everything. There are pros and cons for all these approaches, and the best choice depends on the workload, machine, and file layouts.
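For the todo comment, a rough sketch of what a copy_file_range(2) loop could look like in C. The helper name copy_fd and the 1 MiB chunk size are my own choices for illustration. I'm assuming a system that has the call at all (glibc on a reasonably recent Linux, or FreeBSD 13 and later); if the very first call fails, the sketch falls back to a plain read/write loop.

```c
#define _GNU_SOURCE            /* glibc declares copy_file_range under this */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy from the current offset of in_fd to the
 * current offset of out_fd until end of file.
 * Returns the number of bytes copied, or -1 on error.
 */
long long copy_fd(int in_fd, int out_fd)
{
    assert(in_fd >= 0 && out_fd >= 0);
    long long total = 0;

    for (;;) {
        /* NULL offsets: use and advance the descriptors' own offsets */
        ssize_t n = copy_file_range(in_fd, NULL, out_fd, NULL,
                                    (size_t)1 << 20, 0);
        if (n > 0) { total += n; continue; }
        if (n == 0)
            return total;          /* end of input */
        if (errno == EINTR)
            continue;
        if (total > 0)
            return -1;             /* failed midway: give up */
        break;                     /* failed at the start: fall back */
    }

    /* Fallback: ordinary read/write loop, handling short writes */
    char buf[65536];
    ssize_t n;
    while ((n = read(in_fd, buf, sizeof buf)) != 0) {
        if (n < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR) continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
    return total;
}
```

Whether holes survive such a copy is up to the kernel and file system, so this does not settle the sparse-file debate either way.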

About the question whether "not updating atime" is a misfeature: I actually agree that it is. But it is a "compensating misfeature", for the fact that all of the atime implementation in common file systems is a misfeature to begin with. The problem here is twofold:
  • Sometimes people look at atime, and if it increases, they think "someone looked at the content of the file". That's jumping to conclusions. If you want to audit whether files have been read, and use atime as a security mechanism: don't, because (a) users can reset the atime and cover their tracks if they have write access, and (b) you'll confuse legitimate accesses with maintenance operations such as backup/indexing. For security, use real audit logs.
  • Sometimes people look at atime, and if it increases (in particular increases regularly), they think the file will be read again in the future. But the past is not always a good predictor of the future. That's particularly true with indexing and backup operations: once a file has been processed by those, it will NOT be processed again, so the past is an anti-predictor of the future. And a good backup operation will be HSM aware, and not back up files that have already been moved to a lower tier.
In a nutshell, I think atime is useless, and should be turned off on all file systems. On my home machines, that's already the case.

Having said that, I'm updating a file copy program that is used in a backup script, and I fear it will sometimes run on file systems where atime is still enabled. The backup script will look at all the metadata in the file systems (file names, link counts, size/mtime), and then it will read a lot of the files to verify their content. I want that read operation to be as efficient as possible. If I can stop those reads from forcing the file system to update times (purely for performance reasons), great; if I can't disable atime updates, fine, it will just run a little slower. For similar reasons, I'm using fadvise(...sequential, ...noreuse, ...dontneed) in the copy program, to tell the underlying file system to prefetch read data, but not keep caching it once it has been read. If the file system implements that and it helps with performance: fabulous. If it doesn't, at least I've tried to do my best. To my knowledge, few file systems implement posix_fadvise anyway.
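Roughly, the fadvise part looks like the following sketch (not the actual program; the helper name read_once_with_hints is invented here). The return values of posix_fadvise are deliberately ignored, since these are only hints and the read must work with or without them.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: read fd from its current offset to end of file,
 * telling the kernel that the data is read sequentially and only once.
 * Returns the number of bytes read, or -1 on a read error.
 */
long long read_once_with_hints(int fd)
{
    assert(fd >= 0);

    /* offset 0 with len 0 means "the whole file" */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_NOREUSE);

    char buf[65536];
    long long total = 0;
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;                /* content verification would go here */
    if (n < 0)
        return -1;

    /* done with the data: the cache may drop it */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    return total;
}
```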

So in summary: On FreeBSD, there is no O_NOATIME flag on open, so I just won't use it there. For my own machines, that's fine.
 
In a nutshell, I think atime is useless, and should be turned off on all file systems.
Personally, I fully agree with that. Unfortunately, it's too late to get rid of it entirely. 😉

I'd still be against implementing something like O_NOATIME in FreeBSD. Thanks to the more detailed info presented by Kai Burghardt and Emrion, I now know there's some complex logic behind whether it actually applies at all, but IMHO, that only makes it worse, being a convoluted hack around trying to "fix" atime semantics.

I'd still say if someone really thinks atime is useful for them, they most likely don't want any hard to understand "loopholes". And everyone else should just remember to always disable that thing in the mount options.
 