iNotify for FreeBSD?

zirias@ · Apr 4, 2016

Andrew Schmidt said:
filewatcherd doesn't seem to watch the fs, doesn't seem to be kernel based, or like watchman, uses kqueue.
[...] Please forgive me if I'm off base.

No, you aren't; seeing this thread, I wanted to comment roughly the same about devel/libinotify, which Carpetsmoker already did briefly:

All these tools/libraries have to use what the OS gives them, which on FreeBSD is either just polling (bad) or kqueue (nice). In the case of devel/libinotify, the API exposed is that of Linux' inotify, which is great for quick porting of Linux software -> it will use kqueue without actually changing a lot of code. But in the corner cases where the problem is actually a limitation of kqueue itself, of course none of these suggestions will do any good.

Andrew Schmidt · Apr 6, 2016

So all is lost? Should I switch my server back to Linux? I'm not asking to be "snarky" or whatever, but what are the chances of a new API being developed that is useful as a FS monitor?

I would MUCH rather stick with FreeBSD. MUCHO

zirias@ · Apr 6, 2016

Andrew Schmidt said:
So all is lost? Should I switch my server back to Linux? I'm not asking to be "snarky" or whatever, but what are the chances of a new API being developed that is useful as a FS monitor?

Well, I haven't looked into kqueue so far, but I have used inotify in a project of my own ... that's because I am new to FreeBSD. So I can only write on this topic what I have read about it.

From what I understand, kqueue IS useful as a FS monitor and this is the purpose of this API. The only drawback compared to Linux' inotify is the need for multiple open file descriptors, whereas inotify works on one single file descriptor that is configured to report events for whatever you are interested in. This drawback is only relevant when monitoring a huge number of files/directories simultaneously. This thread has already seen the hint that the kernel allows you to raise the limits here and that the default settings already give you a high number -- so are you sure you actually need something that's not possible with kqueue?

Andrew Schmidt · Apr 6, 2016

Zirias said:
Well, I haven't looked into kqueue so far, but I have used inotify in a project of my own ... that's because I am new to FreeBSD. So I can only write on this topic what I have read about it.

From what I understand, kqueue IS useful as a FS monitor and this is the purpose of this API. The only drawback compared to Linux' inotify is the need for multiple open file descriptors, whereas inotify works on one single file descriptor that is configured to report events for whatever you are interested in. This drawback is only relevant when monitoring a huge number of files/directories simultaneously. This thread has already seen the hint that the kernel allows you to raise the limits here and that the default settings already give you a high number -- so are you sure you actually need something that's not possible with kqueue?

I want to monitor my Plex library. I'll probably have to switch back to Linux. A viable FS monitor kqueue is not. It may be good for watching a folder or maybe a few, but an entire ZFS volume?

This is a basic function of any modern operating system, just not FreeBSD?

arp242 · Apr 6, 2016

Andrew Schmidt said:
I want to monitor my Plex library. I'll probably have to switch back to Linux. A viable FS monitor kqueue is not. It may be good for watching a folder or maybe a few, but an entire ZFS volume?

This is a basic function of any modern operating system, just not FreeBSD?

Again, it *does* work for FreeBSD, it just uses up file descriptors. The default limit is about 12k, which is rather low − conservative upper limits are not necessarily a bad thing, by the way.
The current default for fs.inotify.max_user_watches is 524k. I'm not sure what a feasible upper limit is for the maximum number of open file descriptors, but you should be able to set it to something in the order of hundreds of thousands, if not millions.

Does inotify scale *better* here? Sure. But I don't think that kqueue is as dysfunctional as you make it out to be ...

tobik@ · Apr 6, 2016

Carpetsmoker said:
The default limit is about 12k, which is rather low − conservative upper limits are not necessarily a bad thing, by the way.

The default limit for kern.maxfiles is much higher normally since it's scaled based on how much memory you have (same for kern.maxfilesperproc): http://fxr.watson.org/fxr/source/kern/subr_param.c#L260

For example here on my ThinkPad with 4 GB kern.maxfiles is 124965 and on a server with 20 GB it's 651785.

Andrew, obviously you seem to have hit some kind of a practical problem/limitation with kqueue. Can you tell us more?

Andrew Schmidt · Apr 8, 2016

tobik said:
The default limit for kern.maxfiles is much higher normally since it's scaled based on how much memory you have (same for kern.maxfilesperproc): http://fxr.watson.org/fxr/source/kern/subr_param.c#L260

For example here on my ThinkPad with 4 GB kern.maxfiles is 124965 and on a server with 20 GB it's 651785.

Andrew, obviously you seem to have hit some kind of a practical problem/limitation with kqueue. Can you tell us more?

I can't tell you more, I don't have the knowledge, my limitation. Plex devs seem to think kqueue just won't work and can't/won't implement support for it.

I keep seeing issues with descriptor limits, possibly differences in media locations, and the fact that folders being monitored must be open. I'm not sure what that last part means, I don't use the GUI, so everything seems "open" to me.

I, myself, can't find much info about kqueue except for monitoring sockets. I haven't (yet) tried experimenting with it, due to the fact I'm a single father with a full-time job... Haven't had much time.

sremick · Apr 12, 2016

For what it's worth: Andrew isn't alone here, and there are many other FreeBSD users (either directly, or by virtue of using Plex on FreeNAS, which is an extremely popular option) who are champing at the bit for this. Unfortunately there is a strong sense among the Plex devs (who certainly know a lot more about inotify and the inner workings and needs of Plex than I) that kqueue just is not a workable option for this need. From reading the historical posts in this thread, it seems Plex is not unique in this either. It would be really great if, instead of end-users being caught in the middle, frustrated at the loss of functionality and trying to relay info back and forth, there could be a direct dialog between FreeBSD devs and Plex devs ("gbooker02" would be a prime contact) to sort this all out. There must be a workable solution by one team or another, but right now it really seems we're in a stalemate due to miscommunications/misunderstandings between the two projects about what the options are and what can and cannot be done with them.

Plex on FreeBSD already plays second fiddle to Linux due to lack of Gracenote support. I don't hold any hope of an answer there in that 3-way problem. But the kqueue/inotify issue is strictly between Plex and FreeBSD, so it seems that surely there could be an answer if we could only get the proper people talking to each other...

Adrian Williamson · Apr 12, 2016

Whilst I can't pretend to know anything about inotify or kqueue, I am one of those 'Plex on FreeNAS' users who would love it if Plex could automatically detect changes to my media dataset and update its library accordingly.

gbooker · Apr 13, 2016

I created an account over here to hopefully help with the dialog.

From my assessment, using kqueue to monitor for file changes requires opening an fd for every file and directory contained within the directory tree. I've contacted a few Plex users to ask the size of their libraries as well as the amount of RAM they have to better assess the feasibility. In one example I obtained, the user has over 500,000 files and directories with 16G of RAM. If he were on FreeBSD (not Windows) and kqueue-based file change monitoring were done on his library, the process would exceed the maximum number of open files. Several others come close to the limit, and this is among a sample of about 10 users I've personally asked. There are several "build threads" in Plex's forums that suggest those users would either come close to or exceed the limit as well. Add to this that Plex has recently added support for photo libraries and that many of Plex's users are avid or professional photographers, and it is easy to see how these directory trees can grow to be enormous in terms of file count. Given how detrimental hitting these limits would be to the application as a whole, this would also require reading the limit as well as a scan to see if the limit is in danger of being hit before enabling the function. Furthermore, if enabled, the count would need to be monitored to disable it if the library were to grow dramatically (such as first-run setup adding all the libraries) and be again in danger of hitting the limit.
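A minimal sketch of the pre-flight check described above: count every file and directory in a tree and compare the total against the descriptor limit before enabling per-descriptor monitoring. The function names and the `headroom` fraction are made up for illustration, and the process's soft `RLIMIT_NOFILE` is only assumed to be the relevant limit (on FreeBSD, kern.maxfiles and kern.maxfilesperproc also apply).

```python
import os
import resource


def count_watch_targets(root):
    """Count every file and directory under root (root included)."""
    total = 1  # the root directory itself
    for _dirpath, dirnames, filenames in os.walk(root):
        total += len(dirnames) + len(filenames)
    return total


def kqueue_watch_is_safe(root, headroom=0.5):
    """Return (needed, soft_limit, ok).

    ok is True when the tree needs no more than `headroom` of the soft
    RLIMIT_NOFILE, leaving the rest for the application itself.
    The 0.5 default is an arbitrary illustrative choice.
    """
    needed = count_watch_targets(root)
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return needed, soft, True
    return needed, soft, needed <= soft * headroom
```

Calling `kqueue_watch_is_safe("/path/to/library")` on a large tree is itself a full traversal, so an application would probably run it once at startup and again whenever the library grows noticeably.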

On the practice of using kqueue: When using kqueue, the code is not told a path for the change but rather the fd. This means the application must keep a mapping from fd -> path. Furthermore, the information about the change is somewhat limited, meaning that if the fd corresponds to a directory, that directory must nearly always be rescanned. This means there must be a mapping from path -> fd to determine if a file/directory within the scanned directory is already monitored or not. (In reality these maps would be to a common data structure rather than just path <-> fd.) This is a large amount of accounting that must be done by the application. So large in fact that an initial stab at implementing this monitoring for FreeBSD produced code about as large as the monitoring for MacOS, Linux, and Windows combined (as well as including the common functions used across all platforms). Some of this can be saved by using `udata` inside `kevent` but at most that would be one map and very few lines of code.
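The bookkeeping described in this paragraph can be sketched roughly as follows. This is only an illustration of the two maps and the rescan-and-diff step, not Plex's code: the class and method names are invented, the kqueue registration itself is left out (it is marked in a comment), and newly found subdirectories are not descended into.

```python
import os


class WatchTable:
    """Two-way fd <-> path bookkeeping for a per-descriptor watcher."""

    def __init__(self):
        self.path_by_fd = {}
        self.fd_by_path = {}

    def add(self, path):
        """Open path and record it in both maps.

        A real watcher would also register the descriptor with kqueue
        (EVFILT_VNODE) here; that step is omitted in this sketch.
        """
        if path in self.fd_by_path:
            return self.fd_by_path[path]
        fd = os.open(path, os.O_RDONLY)
        self.path_by_fd[fd] = path
        self.fd_by_path[path] = fd
        return fd

    def remove(self, path):
        """Forget path and close its descriptor."""
        fd = self.fd_by_path.pop(path)
        del self.path_by_fd[fd]
        os.close(fd)

    def rescan(self, dir_fd):
        """Handle 'something changed in this directory' for dir_fd.

        The event only identifies the descriptor, so look up the path,
        list the directory again, and diff against what is already
        watched. Returns (added, removed) as sorted lists of paths.
        """
        directory = self.path_by_fd[dir_fd]
        current = {os.path.join(directory, n) for n in os.listdir(directory)}
        known = {p for p in self.fd_by_path
                 if os.path.dirname(p) == directory}
        added = sorted(current - known)
        removed = sorted(known - current)
        for path in added:
            self.add(path)
        for path in removed:
            self.remove(path)
        return added, removed
```

Even in this reduced form, every watched entry costs one open descriptor plus two dictionary entries, which is the accounting overhead the paragraph refers to.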

On the philosophy of using kqueue for FS monitoring: I see how kqueue has great purposes, but it strikes me that its intent was to monitor sockets more so than files. Extending this use to directories/files seems to me like a bit of a hack, but I suppose it works well enough for a small set. When it is scaled to such a large number as would be used in several users' Plex libraries, it looks far more like it is being used for a task it was never designed to handle. Even more so when these libraries are mostly quiescent with occasional additions/subtractions/modifications of files/directories. Holding hundreds of thousands of fds open for what is likely tens of changes a day seems excessive. It strikes me that FreeBSD really needs some kernel API that's truly designed for file system monitoring, rather than using an existing API that seems ill-suited for this scale.

On inotify: Personally, I'd prefer FSEvents over inotify, as inotify still requires opening an fd-like object for every directory. It does give information about changes to files contained within the directory, though, and provides path information on the changed item. This does reduce the amount of accounting the application must perform, but there's still some for every directory. FSEvents monitors an entire directory tree at once and provides rich information in its callback. This is much easier on the developer.

mnd999 · Apr 15, 2016

This sounds like something that might need to go to freebsd-current to get the attention of those in the know.

rigoletto@ · Jan 2, 2018

I know I am necro-posting but I am digging in the subject a bit on IRC (#freebsd) and I am leaving it here for eventual future interested people.

It seems kqueue(2) is just an event notification framework, but it would be "very easy" (for someone who knows how to do that, not me) to implement an FS monitoring system (like FSEvents) on top of kqueue(2).

In other words, if someone implemented an FSEvents/inotify equivalent, he/she would just need to "connect" it to kqueue(2) to have a proper filesystem monitoring tool.

tobik@ · Jan 2, 2018

But isn't this what devel/libinotify does?

rigoletto@ · Jan 2, 2018

tobik@

Based on what I understood, I think devel/libinotify is just an API (kqueue <-> inotify) wrapper, but it would be possible ("very easy") to write a proper FS monitoring system (which would just need to use a few file descriptors - like FSEvent/inotify) and connect it to the kqueue(2) framework to "asynchronously dispatch information to userspace".

tobik@ · Jan 2, 2018

Ah ok.

Btw, who claims that it is "very easy" and why haven't they done it already?

rigoletto@ · Jan 2, 2018

This is the question I do not have the answer to. Probably it is people who have no interest in this kind of function.

EDIT:

Btw, I have no idea how filesystem monitoring works, but it seems lang/go is used quite a lot for this kind of purpose.

Snurg · Jan 2, 2018

Plex is not the only application that needs to watch a lot of directories/files.

Thus just my interested layman's question regarding the practicability of the kqueue approach, which was allegedly designed for watching sockets:
How long does it take to traverse half a million files, only to open the file handles kqueue requires to work?

Personally I subjectively feel the FSEvents approach looks much more elegant.

rigoletto@ · Jan 2, 2018

Well, just for the record, I downloaded the Linux kernel (4.14.11). The inotify stuff is in fs/notify/inotify/*. There are 5 files, including the Makefile, with a total of 1061 lines (including comments).

I do not know if it actually is what it seems to be, but it does not seem like that much code indeed (C code, I guess). Well, there may be a lot more code somewhere, and I also can't help in evaluating the complexity of that implementation.

EDIT:

Btw, everything in the fs/notify folder adds up to 20+ files and 4588 lines of code in total.

poorandunlucky · Jan 5, 2018

Andrew Schmidt said:
I want to monitor my Plex library. I'll probably have to switch back to Linux. A viable FS monitor kqueue is not. It may be good for watching a folder or maybe a few, but an entire ZFS volume?

This is a basic function of any modern operating system, just not FreeBSD?

Plex detects changes just fine on my system... (?)

I use minidlna now, though...

Also, sorry to butt-in like this, but could someone just kind of summarize what the difference between kqueue and inotify is?

I was under the impression kqueue was fine to monitor filesystem changes... even if it has "open files" which I'm sure is just a number, and it just watches for any access at the kernel level... I'm not even sure how else something could detect changes to a folder... I mean, a folder and a file are the same, no? just an inode... no? I'm just not sure where this is all going...

rigoletto@ · Jan 5, 2018

poorandunlucky

See: Thread 38162/page-2#post-317478

ralphbsz · Jan 5, 2018

The code in the inotify directory is the tip of the iceberg. The real work happens in the file systems: those need to generate the events that inotify then channels to consumers. Not all file systems do that. Some file systems use other mechanisms to get the same effect. One that's popular for large systems is to run DMAPI (which wasn't intended for getting file change notifications, but for backup and HSM), and issue DMAPI events. Another approach is to not attempt to notify on every change, but instead implement fast scanning of the whole file system metadata; for large and networked systems it can actually be more efficient to regularly scan all the file system than to continuously monitor for all changes.

So the real number of lines of code for the whole family of file change notification is much higher, and not in a single place.

Snurg · Jan 5, 2018

ralphbsz said:
... consumers.

I know you didn't mean it that way, but I think maybe it's true.
If I am right, this immediate notification is mainly a typical desktop/laptop feature, thus "consumer" functionality.
And that would explain why it has no high priority for the developers of a server OS.
...

ralphbsz said:
So the real number of lines of code for the whole family of file change notification is much higher, and not in a single place.

It is actually very complex. The whole stuff, the inode->path associations etc., that are described by gbooker above. This is to be done with kqueue, an API allegedly made for another purpose.

I guess Apple has taken a quite different, maybe more sophisticated approach: when a file is opened, there is a check whether it matches the FSEvents rules.
If and only if it matches, it becomes necessary to store and watch the inode until the file gets closed.
But this would require deep integration into the filesystem, instead of just putting a layer over it. Much work, which possibly only a small fraction of FreeBSD users would use.

rigoletto@ · Jan 5, 2018

Snurg said:
If I am right, this immediate notification is mainly a typical desktop/laptop feature, thus "consumer" functionality.

But not just that! For instance, this is very useful for syncing services, especially large ones.

EDIT:

I heard iXsystems is porting the ZFS native encryption and it should (hopefully) land in 12-RELEASE, in about a year.

This functionality (notify) seems to be something of interest to them too, especially due to FreeNAS. However, I am not aware of any work by them on that.

ralphbsz · Jan 5, 2018

When I said "consumer" above, I didn't mean desktop/laptop as opposed to enterprise/server computing, but I meant the party that uses the change notification events (eats them, or consumes them).

And indeed, these types of mechanisms are used in some large enterprise systems. Not to update a little file browser window on your laptop screen (duh). That's why there are some industrial-strength versions of notifications in some systems. They tend to not use inotify or kqueue, but their own mechanisms, for efficiency's sake.

poorandunlucky · Jan 6, 2018

ralphbsz said:
[...] Some file systems use other mechanisms to get the same effect. One that's popular for large systems is to run DMAPI (which wasn't intended for getting file change notifications, but for backup and HSM), and issue DMAPI events. Another approach is to not attempt to notify on every change, but instead implement fast scanning of the whole file system metadata; for large and networked systems it can actually be more efficient to regularly scan all the file system than to continuously monitor for all changes. [...]

This.

For things like Plex libraries, IMO just scanning directories' metadata periodically seems quite enough... I don't think mechanisms like kqueue or inotify were meant to monitor for such trivial changes...

https://wiki.netbsd.org/tutorials/kqueue_tutorial/

Indeed, it wasn't meant to monitor file system changes... I concur that maybe it's a hack to do it on a small scale, but it's definitely not meant to be used to monitor a collection of files for changes, or folders for new/deleted/modified files...

Mercurial does that pretty well, AFAIK... why not just use a script like that?
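
A rough sketch of the periodic-scan idea raised above: take a snapshot of the tree's metadata, take another one later, and diff the two. The function names are invented for illustration, and comparing only modification time and size is an assumption made to keep it short; this is not a description of how Mercurial or Plex actually do it.

```python
import os


def snapshot(root):
    """Map each path under root to (mtime_ns, size) from one lstat call."""
    state = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except FileNotFoundError:
                continue  # vanished between listing and stat
            state[path] = (st.st_mtime_ns, st.st_size)
    return state


def diff(old, new):
    """Return (added, removed, modified) path lists between snapshots."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified = sorted(p for p in new.keys() & old.keys()
                      if new[p] != old[p])
    return added, removed, modified
```

Run from a timer (say, `diff(previous, snapshot(root))` every few minutes), this holds no descriptors open between scans, at the price of noticing changes only at the next scan.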