mv -h works only with files, not with directories (otherwise the scheme would be: create another symlink targeting the directory, change the first symlink to point to the other symlink, then mv the directory to replace that other symlink).

There is not, as far as I know. I don't think every Linux file system has it either; if I remember right, it was being faked (either in user space or in the VFS) for some file systems in Linux, which makes all of the atomicity guarantees into a joke. It's actually hard to implement. The regular rename() system call is already difficult, because it needs to be atomic and side-effect free in the case of failure; renameat2() is even harder, and it is very rarely used.
Can you explain what you are trying to accomplish? Maybe there is a better solution? Really, the only atomic multi-object operation in a file system is rename(), but often one can work around all this by using atomic create with lock files instead.
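For what it's worth, the lock-file trick can be sketched in plain sh; the lock path and the critical section here are just placeholders. mkdir(2) is atomic - it either creates the directory or fails because it already exists - so exactly one process can win:

lockdir=/tmp/deploy.lock
if mkdir "$lockdir" 2>/dev/null; then        # atomic: succeeds for exactly one process
    trap 'rmdir "$lockdir"' EXIT             # release the lock on exit
    # ... do the non-atomic multi-step work here ...
else
    echo "another instance holds the lock" >&2
    exit 1
fi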
Not on a live webserver. If a high traffic webserver is serving multiple sites, and you want to alter just one of these site directories, the (apparently naive) way to do that would be:

mv sitedir tmp
mv newsitedir sitedir

It turns out that in the window of time between these two moves, the site appears broken. If your traffic is high enough, this is a lot of people who think the site is down or broken. So I was looking for a better way. I do know you can get fancier and try to block the site while you are updating it, or use a load balancer and do rolling one-at-a-time upgrades; however, I'm pretty sure there are other use cases which would ...
This opens up horrific consequences for a chroot jail.
Isn't rename just changing a directory entry to be a different name? It seems to me to be trivial except for possibly the failure handling.

Rename affects the following entities: the entry in the source directory (which has to be removed), the entry in the target directory (which has to be created), and any object that already exists at the target name (which has to be unlinked).
How about this, would this work? I'm assuming the server will follow soft-links, and wants to read content from sitedir:
ln -s old_sitedir link_to_old_sitedir # old_sitedir is a real directory, contains the old content
ln -s new_sitedir link_to_new_sitedir # new_sitedir is also a real directory, also has content
ln -s link_to_old_sitedir sitedir
# Start the server, it will work and read from sitedir -> old
# Now, when you want to switch to the new one:
mv link_to_new_sitedir sitedir

This atomically overwrites the link, but the underlying directories remain accessible.
Well, there are a whole lot of cases that are not related to the rename problem which also need to be thought through. For example: someone loaded index.html from the old site. While they're staring at it, the sitedir is replaced with the new version. Then they click on a link on index.html. You have to make sure that following that old link does something sensible. I don't know what that sensible thing is; that depends on context.
The problem is all in the atomicity and the error handling: after you're done, either the object is still in the old directory at the old name and the object being replaced is still at the new name, or the object is in the new directory at the new name and the replaced object is gone - never a mix of the two.
...
After that has succeeded, they rename the temporary file to atomically replace the old file; and they know exactly (from the error code that the rename() call returned) whether the old file or the new file is now in place and readable, tertium non datur. If you write a file this way, then at any point there is a correct and readable file in place - maybe the old one, maybe the new one, but never both, or a mix, or none.
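In shell terms that pattern looks roughly like this (a sketch; the file name is an example and generate_page stands in for whatever produces the content - a C program would also fsync(2) the temporary file before the rename):

tmp="index.html.tmp.$$"                               # unique temporary name in the same file system
generate_page > "$tmp" || { rm -f "$tmp"; exit 1; }   # on failure, the old file is untouched
mv -f "$tmp" index.html                               # rename(2): readers see the old or the new file, never a mix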
My concern related to chroot is that moving a directory is a common jailbreak technique. Adding a new way to move directories is a new attack vector.
It would appear you are claiming that file renames are more atomic than directory renames. Is this correct?

No, the rename algorithm doesn't care at all whether the object it is renaming is a file, directory, soft-link, or purple flying elephants - even if you take a file and rename it so it replaces (unlinks) a directory. In a correctly written file system, a rename will either atomically succeed or completely fail. (I know there is one small exception: during the rename there may be a period where the object is visible at *both* the old and the new location, and there may be another exception that I forgot about.)
...the webserver atomicity issues you indicate will be solved by a symlink; I can't easily prove that it will be.

Yes, that problem is really gnarly. People can have arbitrarily old documents on the screen, and then they click on a link. What do you do? If the parent document is old, and they click on a link to a file of which a newer version exists? I think the answer is "it depends", and can't be solved by a computer person without input from a web designer. For example, the old page may be for a widget, and the link is to buy a mounting bracket for the widget. The web page is being replaced because there is a new model of widget, which is smaller but even stronger. If they are looking at the old web page, then the link needs to lead them to the correct mounting bracket, because clearly the new bracket won't fit the old widget. On the other hand, a link that shows "today's stock price" should probably always go to the most recent version of the stock price page. It's complicated.
You can have two deployments - an A and a B. You can toggle between A and B. You must of course remember which to use. That can also be solved in a script.
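Scripted, such an A/B toggle could look like this (a minimal sketch with invented names; it relies on mv -h replacing the sitedir symlink in a single rename(2)):

current=$(readlink sitedir)                   # currently deploy_A or deploy_B
case "$current" in
    deploy_A) next=deploy_B ;;
    *)        next=deploy_A ;;
esac
ln -s "$next" sitedir.new                     # build the replacement link off to the side
mv -h sitedir.new sitedir                     # atomic cut-over to the other deployment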
That's correct. That would be a possible approach. It only fails when you want to potentially revert an arbitrary number of versions.

Deployment strategy is a whole other animal. I don't believe in keeping old deployments around for very long on the target machine. A local copy is nice for a fast rollback. If you need to roll back multiple versions, your certification process is broken. You archive the files elsewhere, so a rollback is just a (re)deployment of an older version. Since they're offline, you can compare versions side by side to see what changed (or is about to change). This can be done by junior staff without granting them direct production access.
And the solution as proposed by ralphbsz does work, but it does work only a single time: you have an "old" entry and a "new" entry, and instead of swapping them, you switch the processing entirely to the new entry. Fine, problem solved - but what do we do the next time we need this? Create a "new2" entry, and then "new3", and so on?
So, this works when doing it manually; it works one time, maybe a second and a third. But it is not code-able, it is not useful for automated deploy.
All right. I didn't even think that far - I would have got that one free of cost, if the move were possible.
Yes, you do use a "new2" and then a "new3" only they're not called that. This is where you use a timestamp. And those can be used in a programmatic, automated fashion. Each new directory to which you cut over processing will have the timestamp of its creation in its name. But the symlink won't. The application will always use the same symlink to access the data, but the buckets of data being accessed are segregated by creation time.
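Automated, that might look something like the following (a sketch; the staging path is a placeholder for however the content arrives):

ts=$(date +%Y%m%d-%H%M%S)                     # timestamp names each release directory
cp -R /staging/newsite "site-$ts"             # stage the new content
ln -s "site-$ts" sitedir.new
mv -h sitedir.new sitedir                     # the application keeps reading through sitedir
# older site-* directories remain on disk for rollback until cleaned up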
Linux has this system call, which ostensibly performs an atomic swap of two directories:

renameat2()

Is there anything equivalent in FreeBSD? If not, why not?
I do know symlinks can be used to do this. For various reasons, I cannot use that strategy.
Why not? It's a linux-ism. FreeBSD prefers to stick to POSIX rather than every thought bubble out of GNU/Linux...
I think if you can't accomplish this hack in FreeBSD then you're not thinking about the problem correctly, whether that be use of symbolic links or whatever. Approach the problem differently?
Good luck getting it to work with a non-local file system.
Sometimes the cry of "approach the problem differently" ignores the effort someone actually has spent in looking at the problem. You should know this problem has already been "approached differently" even as the original post was being typed. I have a workaround, it's not 100% perfect, but it mostly works for our use cases.
I do not agree that the use case of atomically swapping two directories is specific to linux, nor do I want to give linux so much power over FreeBSD by declaring "linux-isms are bad" in the way you appeared to do. This is a general problem that I believe is useful to solve, regardless of the operating system you are using.
As expected, my beliefs do not drive development priorities.
Fair enough, but listen to this: "As expected, my beliefs do not drive development priorities."
Get used to it. Another thing that I would much appreciate is open(2) with O_NOATIME, and I also have little hope that might appear - it would allow removing stuff that wasn't used for some time while still having backups work normally (my backup tool supports it), but it's probably also a linuxism.
Why not use utimes(2) or the more modern utimensat(2) to reset it?

Simple answer: because the backup software will kick in on an updated ctime - and rightfully so, to catch moves - so this gets either no moves saved or an endless loop of full backups. (The matter is nicely discussed here and below.) Setting the timestamps back with utimes() is itself a change to the inode, so it bumps ctime, and ctime cannot be set from user space.
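That side effect is easy to demonstrate on FreeBSD (testfile is an arbitrary name; touch -a goes through utimensat(2) under the hood):

touch testfile
stat -f 'atime=%a ctime=%c' testfile          # note both timestamps
touch -a -t 202001010000 testfile             # set only the access time back
stat -f 'atime=%a ctime=%c' testfile          # atime changed, but ctime advanced anyway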
I've got to say that documentation you linked to is rather confusing. Why is backup software using atime as a trigger?
Then it says you'll avoid a race condition if 2 or more pieces of software are accessing file X and one's not modifying atime. In other words, this is a problem of GNU/Linux's making by having noatime flag on open and allowing programs to use it.