Originally Posted by GroupInode
I'm trying to find out how the information is retrieved from the dinode and printed on the console or to a file. Once I know how it is retrieved, I'll try to add a field to the dinode structure, last_modified_by, i.e. it would hold the username of whoever last modified that file. I'll probably also try to modify ls.c to display the information on the console.
OK, interesting project and very helpful to know what your motivation is...
First the easy stuff... If by "retrieved from dinode" and "printed" you mean how inode information gets shown to the user, then yes, the most common method is the ls command. BUT it is far, far from the only mechanism. Many, many apps directly access file and inode information, e.g. tar, dump, chmod, even sh, just to name a few, never mind large apps like DBs, webservers, etc. So, for example, if you want to back up your new info with tar or dump, you will have to modify these utilities. Also consider that, if you are successful, you will certainly want to modify find(1) so that you can search for all files that were last modified by specific users.
The simplest description of ls(1) is that it reads directories (opendir(), readdir(), etc.), then does stat() calls to grab the inode info for each file.
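To make that loop concrete, here is a minimal sketch in C of the opendir()/readdir()/stat() pattern. It is not the actual ls.c source; the function name list_dir and the output format are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper, not taken from ls.c: print the inode number,
 * owner uid/gid and mtime of every entry in `path`.
 * Returns the number of entries stat'ed, or -1 if the directory
 * cannot be opened. */
int list_dir(const char *path)
{
    assert(path != NULL);

    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        char full[4096];
        struct stat st;

        /* Build "path/name"; skip names that do not fit the buffer. */
        int n = snprintf(full, sizeof full, "%s/%s", path, entry->d_name);
        if (n < 0 || (size_t)n >= sizeof full)
            continue;

        /* stat() is what copies the inode fields out to user space. */
        if (stat(full, &st) != 0)
            continue;

        printf("%10llu uid=%lu gid=%lu mtime=%lld %s\n",
               (unsigned long long)st.st_ino,
               (unsigned long)st.st_uid,
               (unsigned long)st.st_gid,
               (long long)st.st_mtime,
               entry->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

Calling list_dir(".") from a main() prints one line per directory entry. Any new inode field would have to be carried out through struct stat (or some new interface) before a program like this could display it.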
As for "username", I presume and hope that you actually mean uid/gid, since it makes no sense to store usernames in the inodes: names are looked up from the uid only when they are displayed. Luckily a uid/gid pair will fit in the spare entries of a dinode, so you are set there. I would strongly recommend not changing the size of a dinode.
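As a rough illustration of why only the numeric id needs to live in the inode, the sketch below does the uid-to-name translation in user space with getpwuid(), falling back to the number when no passwd entry exists. The helper name format_owner is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: write the user name for `uid` into buf, or the
 * decimal uid if the passwd database has no entry for it.
 * Returns buf. */
char *format_owner(uid_t uid, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);

    struct passwd *pw = getpwuid(uid);
    if (pw != NULL)
        snprintf(buf, size, "%s", pw->pw_name);
    else
        snprintf(buf, size, "%lu", (unsigned long)uid);
    return buf;
}
```

A modified ls could pass a stored "last modified by" uid through a function like this in the same way it would handle the owner uid from st_uid.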
OK, now for the harder stuff. Again, my knowledge is old and rusty -- it's been years/decades since I've pored over the kernel. But I'll take a stab... My gut reaction is that this is an interesting idea (record the uid/gid of the last inode modifier), but it is not going to be easy to do 100% airtight, and that is likely why it hasn't been done yet.
To be more explicit, I think what you want is to record the uid/gid each time an inode's mtime (and maybe ctime) is updated. Sounds straightforward (but...). The first thing I would do is grep for all occurrences of mtime and ctime in the kernel and study them.
Here's where it gets more subtle: 1) user programs change inodes in several different ways, many involving creating file descriptors (e.g. open()). By the time a filedes is created, info about the uid/gid is lost. There is no uid/gid associated with a file table entry. 2) user programs aren't the only way in which inodes are modified.
For 1) consider what happens when you use the su(1) command. It is an SUID program, so merely running it means that a) a new process is created with a NEW set of uid/gid, yet b) it inherits filedes from the parent shell, which has the original uid/gid. Both sets of filedes point to the same file table entries, which in turn point to the same inodes. So you cannot depend on the file table to tell you the proper uid/gid to update the inode with. The only real source of appropriate uid/gid info is going to be struct user, the record of the active process that the kernel is acting on behalf of. I haven't even begun to consider what happens with threads, since I don't really know what the current thread implementations look like. This means you will have to be very careful in studying the kernel to determine, at each point where mtime is changed, whether the info in struct user is actually the right info to use.
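The sharing of file table entries can be seen from user space. The sketch below is only an illustration: it uses fork() rather than su (changing uid would need root), and the function name offset_after_child_write is made up. The child writes through the inherited descriptor and the parent then sees its own file offset moved, which shows both descriptors refer to one file table entry.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: open a temp file, fork, let the child write `len`
 * bytes through the inherited descriptor, then return the file offset
 * as seen by the parent. Returns -1 on any error. */
long offset_after_child_write(const char *text, size_t len)
{
    assert(text != NULL);

    char path[] = "/tmp/fdshare-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);               /* file disappears once fd is closed */

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: same descriptor number, same file table entry. */
        ssize_t w = write(fd, text, len);
        _exit(w == (ssize_t)len ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        close(fd);
        return -1;
    }

    /* Parent never wrote, yet its offset reflects the child's write. */
    off_t pos = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return (long)pos;
}
```

If the child had changed its uid before writing, the kernel would still see the same file table entry, so whatever identity is recorded has to come from the process doing the write, not from the descriptor.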
On the other hand, you will probably want to avoid, if possible, hacking every FS type (think UFS vs ZFS) to get the job done, unless you really don't care about anything other than UFS. So working at the vnode/vattr level (va_mtime) would be a strong candidate I would think... You would have to add muid/mgid fields to struct vattr. If you are lucky, maybe something simple at the vnode layer would work, like checking for changes to va_mtime in vput() or similar (just guessing here). Otherwise you may still have to hack every filesystem you care about.
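To sketch the vattr idea in the abstract, here is a toy user-space model, not real kernel code. The types toy_vattr and toy_cred and the function toy_update_mtime are invented for illustration; only the field name va_mtime and the muid/mgid idea come from the discussion above. The real struct vattr and credential structures differ between kernels.

```c
#include <assert.h>
#include <sys/types.h>
#include <time.h>

/* Toy stand-ins (assumptions, not real kernel structures). */
struct toy_cred {
    uid_t uid;
    gid_t gid;
};

struct toy_vattr {
    time_t va_mtime;   /* last modification time          */
    uid_t  va_muid;    /* proposed: uid of last modifier  */
    gid_t  va_mgid;    /* proposed: gid of last modifier  */
};

/* Record the caller's credentials whenever mtime actually changes.
 * Returns 1 if the modifier fields were updated, 0 if mtime was
 * unchanged and nothing was recorded. */
int toy_update_mtime(struct toy_vattr *va, time_t new_mtime,
                     const struct toy_cred *cr)
{
    assert(va != NULL && cr != NULL);

    if (va->va_mtime == new_mtime)
        return 0;

    va->va_mtime = new_mtime;
    va->va_muid  = cr->uid;
    va->va_mgid  = cr->gid;
    return 1;
}
```

The hard part in a real kernel is not this update but deciding, at every place mtime changes, which credentials are the right ones to pass in.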
You might say, why bother? Why not just tackle the problem at the syscall level? I will guess/posit that doing that will miss many paths and situations in which inodes get changed. Now you may not care about those situations, and a 90% solution/accuracy may be good enough. This second idea is simply to add code, in each syscall routine that might change an inode, that updates the muid/mgid (let's call it the modifying uid/gid). I would consider this a much messier idea, and here is one possible major hole: consider the workings of a filesystem server like NFS or Samba. (I read that NFS is now stateful, which may make things even worse.) I believe such server daemons do the appropriate authentication themselves and then operate on local filesystems as root. Therefore it is likely that the proper muid/mgid info is never in the kernel at all. The syscalls issued by the user may be from another machine! It will be messy to carry that info around and inject it back in. You will probably have to hack every inode-modifying server daemon to get this all 100% correct -- unless you don't care about this level of accuracy.
Again, my knowledge is very old and rusty, so sorry for any inaccuracies. But I hope this helps. Overall I'll guess that what you are trying to do will be moderately difficult to get 80-90% right and very difficult to get 100% right.