Suggestion on port for file sync

I need to be able to synchronize changes to a specific file (and in another implementation a directory) between multiple servers running FreeBSD. The file should be updated as soon as it's changed on any one of the servers. It does not need to happen synchronously, but file permissions/ownership need to be retained. I can tolerate a delay; the main requirement is that it happens automatically without me having to kick off a command manually. I'd also like to be able to set up a hook or watch so that when the file is updated, a command is executed.

I don't think net/rsync is the solution here, unless there is a way to make it behave in the way I described above. I looked at net/syncthing a few years ago. Last I checked it has a slight issue where it complains about being run as root. I was thinking I could run multiple instances of it, but it seems the rc script doesn't support that at the moment. GlusterFS is obviously not an option, since there are some low-level issues with it regarding its strong dependence on epoll that the upstream devs do not want to address (and I haven't had the time to port it to kqueue/kevent).
 
I would still recommend rsync, though, since it's a fast and reliable tool for syncing files.
It transfers only the changes, unlike cp, where the file first needs to be removed and then transferred again in full.
And it comes with lots of options so you can fine-tune it.
For a start,
/usr/local/bin/rsync -auq yoursource.file destination.file could already bring you a good way down the road.
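If the destination is on another server and the permissions/ownership must survive the trip, the same call over ssh could look like this (host name and paths are just placeholders; note that preserving ownership needs root on the receiving side):

Code:
# -a preserves permissions, ownership and times; -z compresses over the wire
/usr/local/bin/rsync -az /path/to/yoursource.file otherserver:/path/to/destination.file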

The simplest idea would be to execute rsync via cron, e.g. every minute (cron's finest granularity).
This may not be exactly "updated as soon as it's changed", but it could be prompt enough.
On a single file rsync works quickly.
Downside: rsync could also pick up the file while it's still in an unfinished editing state, if you save it before you're finished.
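For example, a crontab(5) entry along these lines could do it (host and file names are placeholders):

Code:
# every minute: -a keeps permissions/ownership, -u skips files that are newer at the destination, -q keeps cron mail quiet
* * * * * /usr/local/bin/rsync -auq /path/to/yoursource.file otherserver:/path/to/destination.file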

Or you write a small script (I rarely put bare commands in cron; I mostly use scripts) that cron executes frequently, which checks whether the file was modified and, if so, runs rsync.
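Such a script could be as simple as this sketch (paths and the state file are invented for illustration; stat -f %m is the FreeBSD stat syntax):

Code:
#!/bin/sh
# only run rsync when the file's modification time has changed since the last run
FILE=/path/to/yoursource.file
STAMP=/var/db/yourfile.mtime

new=$(stat -f %m "$FILE")
old=$(cat "$STAMP" 2>/dev/null)

if [ "$new" != "$old" ]; then
    /usr/local/bin/rsync -auq "$FILE" otherserver:/path/to/destination.file && echo "$new" > "$STAMP"
fi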

Another more sophisticated but also more complex way (depending on what you already have and know about these topics, of course) could be to use a version control system like CVS, SVN, or Git. (Since it's only one file, not a software project, I'd recommend choosing whichever is easiest to set up/understand/get working, not the most efficient/popular one.)
This way all copies are updated once you have finished editing and committed the file.
Downside: you have to set up version control for the file on all machines.
And it could be overkill for your purpose.
 
If using cloud storage is an option, running net/rclone as rclone sync -u for upload and download via cron on each server, between the server and the cloud storage, might do the trick. (I am personally doing this to sync the ssh public key between servers.) Rclone has a bisync mode, but I feel it is still unstable.
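As a rough sketch, the cron entries on each server could look something like this (the remote name "mycloud" and the paths are placeholders, not my real setup):

Code:
# every 5 minutes: push local changes up, then pull newer remote copies down
# -u (--update) skips files that are newer on the destination side
*/5 * * * * /usr/local/bin/rclone sync -u /path/to/dir mycloud:sync/dir
*/5 * * * * /usr/local/bin/rclone sync -u mycloud:sync/dir /path/to/dir
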
Or, although the choice of cloud storage is then limited to Microsoft's OneDrive, the --monitor mode of net/onedrive is quite good, I think. It detects local changes instantly via libinotify. Remote changes are detected by polling, and you can change the interval with the --monitor-interval option.
 
I'd also like to be able to set up a hook or watch so that when the file is updated, a command is executed.
Sorry, I missed this phrase. Neither rclone nor onedrive has a hook to execute a command when a change is detected, but if the command you want to run is meant to notify you of a change, onedrive can send a desktop notification via the libnotify interface.
 
I have not used these programs myself, but you might combine net/unison and sysutils/incron to sync files and directories and run some commands when there are changes.
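I have not tested this, but from their documentation a sketch could be an incrontab entry that fires unison whenever something in the watched directory is written; the paths, event mask, and host below are assumptions on my part, so check incrontab(5) and the unison manual:

Code:
# incrontab entry: when a file in the watched directory is written and closed, push it out with unison
/path/to/dir IN_CLOSE_WRITE /usr/local/bin/unison -batch /path/to/dir ssh://otherserver//path/to/dir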
 
Sorry, I missed this phrase. Neither rclone nor onedrive has a hook to execute a command when a change is detected, but if the command you want to run is meant to notify you of a change, onedrive can send a desktop notification via the libnotify interface.
The hook is a nice-to-have. I can write a monitor to reload the appropriate service to re-read the file.
 
I would still recommend rsync, though, since it's a fast and reliable tool for syncing files.
It transfers only the changes, unlike cp, where the file first needs to be removed and then transferred again in full.
And it comes with lots of options so you can fine-tune it.
For a start,
/usr/local/bin/rsync -auq yoursource.file destination.file could already bring you a good way down the road.

The simplest idea would be to execute rsync via cron, e.g. every minute (cron's finest granularity).
This may not be exactly "updated as soon as it's changed", but it could be prompt enough.
On a single file rsync works quickly.
Downside: rsync could also pick up the file while it's still in an unfinished editing state, if you save it before you're finished.

Or you write a small script (I rarely put bare commands in cron; I mostly use scripts) that cron executes frequently, which checks whether the file was modified and, if so, runs rsync.

Another more sophisticated but also more complex way (depending on what you already have and know about these topics, of course) could be to use a version control system like CVS, SVN, or Git. (Since it's only one file, not a software project, I'd recommend choosing whichever is easiest to set up/understand/get working, not the most efficient/popular one.)
This way all copies are updated once you have finished editing and committed the file.
Downside: you have to set up version control for the file on all machines.
And it could be overkill for your purpose.
You may be on to something with the version control. In this specific case, that would be a plus. I could make the edits outside the server, a daemon could constantly poll for changes on version control, download the new file, and reload the service.
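A rough sketch of that poll-and-reload loop, assuming a git repository with an origin/main branch and a made-up service name, could look like:

Code:
#!/bin/sh
# poll the central repository; when the tracked file changes upstream, pull and reload the service
cd /path/to/repo || exit 1
while :; do
    git fetch -q origin
    if ! git diff --quiet HEAD origin/main -- path/to/file; then
        git pull -q origin main
        service yourservice reload
    fi
    sleep 60
done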
 
Maybe cpdup(1) is what you need. If you run cpdup(1) on a file or directory and send it to another system, cpdup(1) removes, adds, or changes whatever is needed at the destination to match the source. I use it in combination with cron(8) for long-term file synchronization, such as backups on my laptop created using tar(1), databases for programs like keepassxc, etc.

Of course, you need to configure your SSH server, and the details depend on your needs. This task is meant to be automated without user intervention, so you should probably use public keys. In my case, since I use SSH certificates, I have created a certificate that never expires, and it is used by an unprivileged user, both at source and destination.
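For reference, the kind of crontab(5) entry I mean, using the host:path form cpdup can reach over ssh (host, paths, schedule, and user name are just examples; cpdup(1) lists the flags you may want for unattended runs):

Code:
# nightly: mirror the directory to the other host over ssh as the unprivileged sync user
0 3 * * * /usr/local/bin/cpdup /path/to/dir syncuser@otherhost:/path/to/dir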
 
Maybe cpdup(1) is what you need. If you run cpdup(1) on a file or directory and send it to another system, cpdup(1) removes, adds, or changes whatever is needed at the destination to match the source. I use it in combination with cron(8) for long-term file synchronization, such as backups on my laptop created using tar(1), databases for programs like keepassxc, etc.

Of course, you need to configure your SSH server, and the details depend on your needs. This task is meant to be automated without user intervention, so you should probably use public keys. In my case, since I use SSH certificates, I have created a certificate that never expires, and it is used by an unprivileged user, both at source and destination.

I have not used these programs myself, but you might combine net/unison and sysutils/incron to sync files and directories and run some commands when there are changes.
Hmm, I've not heard of this. It looks like it might also solve another problem I have: keeping mirrors of my static web assets in sync.
 
I looked at net/syncthing a few years ago. Last I checked it has a slight issue where it complains about being run as root.
Are you trying to update multiple users’ files? If so, see this: https://github.com/syncthing/syncthing/pull/5479

Syncthing is most likely the right tool for what you want to do. Essentially, what you want is very similar to a cache coherency protocol for multiprocessor systems, where any of the processors can update the data and everyone must always get the latest version when they read that data, not a stale copy.
 
I thought about it again.
I think the tool that does the actual cloning/syncing/copying/updating is one thing; it must of course be chosen well,
depending on the type, size, and number of files, as well as the number of targets and the transfer rates...

but the main catch seems to be updating automatically as soon as you have finished editing.
A cron job, as I recommended, would not really serve that well.

A version control system could also be a lot of effort, and not a good idea depending on the particular situation, since you'd need a central repository available 24/7 and all "targets" permanently polling for updates - I don't think that idea was really so good.
But of course you know the exact situation far better, and will know what to choose after getting several ideas.

However, another idea:
Don't open the editor for the file directly, but from within a sh script.
If it's a text file you could do something like this

Code:
#!/bin/sh

# edit the file
chmod u+w /path/to/file      # allow writing
vim /path/to/file
chmod a-w /path/to/file      # remove write permissions again

# update the file on all servers
while read targetfileaddress; do
    rsync -a /path/to/file "$targetfileaddress"    # -a preserves permissions/ownership
done < file_with_target_addresses

exit 0
(You may of course use an editor other than vim if you wish.)

The idea is that, instead of having processes running permanently and checking frequently whether there is something to do,
the shell script simply waits until the editor is closed and then starts updating the file on all servers,
so it only runs when there actually is something to do.
You may also toggle the write permissions, so the file cannot accidentally be edited any other way.

Additionally, you can add/remove target file paths in an independent file.

BTW: rsync can preserve (or not preserve) file attributes like permissions, and is capable of a lot of other things, especially for jobs over the network (in any case it's worth a peek into its man page).

If you create an alias in your .cshrc (or whatever shell you're using)
alias klonk /path/to/the/script

you may just type "klonk" (you'll find a better name), edit your file, and then the rest runs automatically, exactly when there is a new version of the file.

And maybe think about additionally checking for errors before any updating starts,
because no tool can check whether your file is bug-free, or whether you clone/sync/copy/update/commit rubbish.
If your targets may cause serious problems when you accidentally push out some garbage,
you could consider putting the sync loop into an if-condition that only proceeds if the test was okay.
Of course, this test routine is your job.
If it's some kind of source code, you could let the script run a compiler over it,
and only continue if the compiler did not produce any messages, i.e. no errors... something like that.

So, I'm out of ideas, and that's it from me.
I wish you success.
 