Other Learning more about filesystems

Hello! While reading chapter 11 of Lucas's Absolute FreeBSD I got fascinated by filesystems. I have been using cd, cp, mv and rm for 30 years now, but I know next to nothing about the inner workings of files. I'd like to learn more – are there some good resources that people who are experts on the topic could recommend? Offline books or online tutorials?
  • I have no intention at all of developing filesystems, so I am asking about theoretical introductions and conceptual explanations, not about technical nitty-gritty.
  • It would be perfect, however, if the recommended resource came with some simple playground or tutorial to create a toy model of a filesystem. Learning by doing works best.
  • I am mostly interested in learning more about UFS and ZFS, so the book Unix Filesystems by Pate that covers a lot of historical filesystems seems less relevant – but then again, maybe this is exactly what I need?
Thank you!
 
This is a giant and complex field, big enough to have dedicated scientific conferences (FAST, MSS, ...) and journals (ToS ...). Embarrassingly, I don't know of a basic textbook about "what is a file system, how do you build a simple one, what are further challenges". A few more resources: (a) Tanenbaum's "Modern Operating Systems" book has a chapter on file systems, which makes a fine starting point. (b) Kirk McKusick teaches a "ZFS in depth" class, for example at Usenix conferences, and I've seen the slides online somewhere. (c) Steve Pate wrote an overview book a few decades ago, when VxFS was considered "one of the best" file systems, and while it is strongly biased towards "his baby", it has a fine general section. (d) I know there is a book about ZFS from about 10 or 15 years ago, which begins with a nice overview of the design choices (in particular log structured versus extent based) before going into the details of how to administer ZFS. Sadly, I don't remember which of the ZFS books it is (there seem to be half a dozen now).

The "black book" that cracauer pointed out above is great, and every serious FreeBSD user should have it and read it. The problem with it is that it goes into way too much detail for most uses. The way to read it is to read the first few pages of each chapter, and the first few paragraphs of each section, and after that initial pass start drilling down where needed or interested.
 
Questions worth pondering:
  1. what makes a (minimal) filesystem?
  2. how is a filesystem accessed (e.g. API/syscalls in a Unix-like OS)?
  3. how might this API be implemented?
  4. how might an implementation deal with concurrent access from multiple user processes?
  5. how to provide secure access?
  6. what additional features might you want out of a filesystem?
  7. why do operating systems invariably move to more complex filesystems?
  8. how may filesystems be accessed remotely?
  9. how may multiple disks be used to hold a single filesystem?
  10. how may a distributed filesystem be built?
  11. what needs to be done to scale filesystems in terms of storage (Petabytes!) and concurrent access (zillions!)?
  12. how to do backups and restore?
  13. how are disk or network failures handled?
If you take the time to figure out (at least partially) some of these questions on your own, referring to the papers and books others have pointed out only to fill in more context, you will learn a lot more, more deeply and faster. You can even create your own "toy" filesystem and evolve it! And write a series of blog posts if you do this :)
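To make questions 1 to 3 a bit more concrete, here is one possible starting point for such a toy: a minimal in-memory sketch in Python. It is not how UFS or ZFS work; it only shows the three ingredients most classic Unix filesystems share, namely fixed-size data blocks with a free list, inodes that record which blocks belong to a file, and directories that map names to inode numbers. Every name in it (ToyFS, Inode, BLOCK_SIZE and the method names) is made up for illustration.

```python
# Toy in-memory filesystem: blocks + inodes + directories.
# All class and method names are hypothetical, chosen for this sketch only.

BLOCK_SIZE = 16  # tiny on purpose, so even short files span several blocks


class Inode:
    def __init__(self, is_dir):
        self.is_dir = is_dir
        self.size = 0      # file length in bytes
        self.blocks = []   # numbers of the data blocks, in file order
        self.entries = {}  # name -> inode number (used by directories only)


class ToyFS:
    def __init__(self, nblocks=64):
        self.blocks = [bytes(BLOCK_SIZE)] * nblocks  # the "disk"
        self.free = list(range(nblocks))             # free-block list
        self.inodes = {0: Inode(is_dir=True)}        # inode 0 is the root
        self.next_ino = 1

    def _lookup(self, path):
        """Resolve a path to an inode number, one component at a time."""
        ino = 0
        for name in [p for p in path.split("/") if p]:
            node = self.inodes[ino]
            if not node.is_dir or name not in node.entries:
                raise FileNotFoundError(path)
            ino = node.entries[name]
        return ino

    def _create(self, path, is_dir):
        parent_path, _, name = path.rstrip("/").rpartition("/")
        if not name:
            raise ValueError("empty file name")
        parent = self.inodes[self._lookup(parent_path)]
        if not parent.is_dir:
            raise NotADirectoryError(parent_path)
        if name in parent.entries:
            raise FileExistsError(path)
        ino = self.next_ino
        self.next_ino += 1
        self.inodes[ino] = Inode(is_dir)
        parent.entries[name] = ino
        return ino

    def mkdir(self, path):
        return self._create(path, is_dir=True)

    def create(self, path):
        return self._create(path, is_dir=False)

    def write(self, path, data):
        """Replace the whole file with data (no partial writes in this toy)."""
        node = self.inodes[self._lookup(path)]
        if node.is_dir:
            raise IsADirectoryError(path)
        needed = (len(data) + BLOCK_SIZE - 1) // BLOCK_SIZE
        if needed > len(self.free) + len(node.blocks):
            raise OSError("no space left on toy device")
        self.free.extend(node.blocks)  # give the old blocks back
        node.blocks = []
        for off in range(0, len(data), BLOCK_SIZE):
            bno = self.free.pop(0)
            # Pad the last chunk so every block has the same size.
            self.blocks[bno] = data[off:off + BLOCK_SIZE].ljust(BLOCK_SIZE, b"\0")
            node.blocks.append(bno)
        node.size = len(data)

    def read(self, path):
        node = self.inodes[self._lookup(path)]
        if node.is_dir:
            raise IsADirectoryError(path)
        # Concatenate the blocks, then cut off the padding of the last one.
        return b"".join(self.blocks[b] for b in node.blocks)[:node.size]

    def listdir(self, path):
        node = self.inodes[self._lookup(path)]
        if not node.is_dir:
            raise NotADirectoryError(path)
        return sorted(node.entries)
```

To try it, create a ToyFS(), call mkdir("/docs"), create("/docs/a.txt") and write("/docs/a.txt", b"..."), then read it back. Obvious next steps map directly onto the list above: persist the blocks to a real file, add permissions to the inode, handle a crash in the middle of write, and so on.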
 
I also think it's worth mentioning the VFS: how the OS supports multiple file systems and provides an abstract layer over files, so that you don't really care which file system a file lives on.
See the vnode paper for SunOS.
The BSD VFS is similar to the one in SunOS.
Linux uses a slightly different design (superblocks, no vnode -> only inodes etc.)
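The core idea of such a layer can be sketched in a few lines of Python. This is only an illustration of the dispatch principle, not the real vnode interface of SunOS, BSD or Linux: every filesystem implements the same small set of operations, and a mount table picks the filesystem responsible for a path. All names (FileSystem, MemFS, ConstFS, VFS) are hypothetical.

```python
from abc import ABC, abstractmethod


class FileSystem(ABC):
    """Operations every concrete filesystem must provide (hypothetical set)."""

    @abstractmethod
    def read(self, relpath): ...

    @abstractmethod
    def write(self, relpath, data): ...


class MemFS(FileSystem):
    """Read-write filesystem that keeps file contents in a dict."""

    def __init__(self):
        self.files = {}

    def read(self, relpath):
        return self.files[relpath]

    def write(self, relpath, data):
        self.files[relpath] = bytes(data)


class ConstFS(FileSystem):
    """Read-only filesystem with fixed contents."""

    def __init__(self, files):
        self.files = dict(files)

    def read(self, relpath):
        return self.files[relpath]

    def write(self, relpath, data):
        raise PermissionError("read-only filesystem")


class VFS:
    """Mount table: routes each path to the filesystem mounted above it."""

    def __init__(self):
        self.mounts = {}

    def mount(self, point, fs):
        self.mounts[point.rstrip("/") or "/"] = fs

    def _resolve(self, path):
        best = None
        for point in self.mounts:
            prefix = point if point.endswith("/") else point + "/"
            if path == point or path.startswith(prefix):
                # The longest matching mount point wins.
                if best is None or len(point) > len(best):
                    best = point
        if best is None:
            raise FileNotFoundError(path)
        return self.mounts[best], path[len(best):].lstrip("/")

    def read(self, path):
        fs, rel = self._resolve(path)
        return fs.read(rel)

    def write(self, path, data):
        fs, rel = self._resolve(path)
        fs.write(rel, data)
```

With a MemFS mounted on "/" and a ConstFS mounted on "/ro", the caller uses the same read and write calls for both and only notices the difference when a write to "/ro" is refused. Real VFS layers have many more operations (lookup, create, getattr, ...) and work on vnodes or inodes rather than path strings.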
 