What type of tree data structures FreeBSD use for its kernel?

bsdnoob · Jun 11, 2022

I'm an end user of FreeBSD. I came to know that linux kernel use red black tree but as I'm interested only in FreeBSD, I want to know what kind of tree structures used in freebsd kernel and its file system I mean What are the trees I need to learn to became freebsd system developer. I tried to find it but the sheer size/volume of documentation is a big impediment to a noob like me and it seems linux books and docs are everywhere. Could you please give me the information about it.
Thank you.
P.S I tried to find it out in Developer's & Porter's Handbook and TDIOFOS but failed to understand how it has been depicted in those books.

ralphbsz · Jun 11, 2022

Read the black book: Design and Implementation of the FreeBSD Operating System, by Kirk McKusick. Then read the source code.

mer · Jun 11, 2022

"...its file system"
Right there you've taken a big bite of the apple.
Just staying with 2 of the most used filesystem UFS and ZFS and you have different implementations. But they expose similar interfaces up through the VFS layers so that a user "opens a file" and doesn't care exactly how that's done.
I believe different type of tree structures are used through out the kernel, each one specifically chosen for it's use cases.
One needs more than trees to understand the forest.

I agree with ralphbsz
With any kernel FreeBSD, Linux, Windows, BeOS, etc, figure out where you want to help or what interests you the most. Maybe it's device drivers, maybe it's filesystems, maybe it boot loaders, maybe it's memory management and concentrate there.

Above just my opinions.

ralphbsz · Jun 11, 2022

mer said:
One needs more than trees to understand the forest.

Chainsaws.

bsdnoob · Jun 11, 2022

mer said:
"...its file system"
Right there you've taken a big bite of the apple.
Just staying with 2 of the most used filesystem UFS and ZFS and you have different implementations. But they expose similar interfaces up through the VFS layers so that a user "opens a file" and doesn't care exactly how that's done.
I believe different type of tree structures are used through out the kernel, each one specifically chosen for it's use cases.
One needs more than trees to understand the forest.

I agree with ralphbsz
With any kernel FreeBSD, Linux, Windows, BeOS, etc, figure out where you want to help or what interests you the most. Maybe it's device drivers, maybe it's filesystems, maybe it boot loaders, maybe it's memory management and concentrate there.

Above just my opinions.

Well, I've nibbled this; "I believe different type of tree structures are used through out the kernel, each one specifically chosen for it's use cases."

mer · Jun 11, 2022

bsdnoob said:
Well, I've nibbled this; "I believe different type of tree structures are used through out the kernel, each one specifically chosen for it's use cases."

Absolutely.
"What do I need to know to be kernel developer" == "How do I eat an elephant"

Andrey Lanin · Jun 11, 2022

mer said:
... How do I eat an elephant"

In parts

bsdnoob said:
I want to know what kind of tree structures used in freebsd kernel and its file system...

Kernel source code is the answer.

bsdnoob · Jun 11, 2022

mer said:
Absolutely. "What do I need to know to be kernel developer" == "How do I eat an elephant"

Quite right. ...and "How do I put the leftover elephant into a refrigerator."

bsdnoob · Jun 11, 2022

Andrey Lanin said:
Kernel source code is the answer.

Yep, but what about those uninitiated noobs who want to learn and in order to do so, ask this kind of question out of simple curiosity?

ralphbsz · Jun 11, 2022

Noobs? The trick is to not remain one. I would NOT start by reading the actual kernel source code. Because that will show you a dozen different ways of how it is done, but without understanding why it is done that way, and what the history is, you won't really understand. It's like going to the store, and getting a can of pre-cooked elephant tail soup: Yes, you are eating elephant; but you still don't know how to hunt, prepare, cook, eat, and store leftover elephant.

The difference is: You need to learn about the tradeoffs, and decision processes. For example, one type of tree implementation might be complex, difficult to read the source, and hard to update without introducing bugs, but that complexity might be needed for decent performance. This for example happens in database implementations, and the BtrFS immutable/CoW B-tree is an example of the use of a complex but efficient structure in the kernel. On the other hand, you might find boringly trivial trees in the kernel, in places where correctness and simplicity is more important. Reading the source code, you will learn what the result of engineers evaluating these tradeoffs was, but you don't know why the decisions were made.

Furthermore, if you read just the source code, you are looking at a palimpsest: There is really old code still in use (which probably wouldn't be written this way today, and the tradeoffs were different 40 years ago), there is old code that has been partially modified, and there is new code. You might find code that has been written by different people with different styles, which will be confusing.

I would start with a good operating systems textbook. My personal recommendation is to start with Tanenbaum's book (which has coding examples and is more entry level), and also read Silberschatz' book (the one with a dinosour on the cover) for an in-depth treatment. After doing that, go read Kirk McKusick's book about how FreeBSD does it (make sure to get the new edition, it's mostly black, the previous ones were mostly red). Somewhere in this process, open the actual source code, and it will become much clearer.

Deleted member 70435 · Jun 11, 2022

my first recommendation to him is to read Michael W. Lucas' book one thing at a time he will have a great understanding of how FreeBSD works. every book, it is essential even what the FreeBSD project has available.

Michael W. Lucas: books, biography, latest update

Follow Michael W. Lucas and explore their bibliography from Amazon's Michael W. Lucas Author Page.

www.amazon.com

msplsh · Jun 11, 2022

bsdnoob said:
Yep, but what about those uninitiated noobs who want to learn and in order to do so, ask this kind of question out of simple curiosity?

Couple things
1. Not a simple question
2. Doesn't "teach" you anything by just telling you the answer
3. Can't do anything useful with the answer
4. No actionable items for learning

gpw928 · Jun 11, 2022

You may find it easier to start with John Lions' Commentary on V6 Unix. It's well documented, and, once mastered, makes extrapolation to modern Unix versions easier.

Holger · Jun 12, 2022

When I started with FreeBSD, I was also like “Yeah, I will become a kernel developer, gimme that source code”. But then I realized, I wouldn't even know how to properly compile and install that source code.

Then I found out I would not know how to recover from a wrong change in the kernel source code.

Then I found out that I actually know nothing about this operating system. –

What has been most profitable for me up to this point (in terms of growth and learning experience), was to start simply using FreeBSD as a daily driver, with no excuses. This way I became humble quite quickly:

Can I handle basic OS and Desktop setup and usage?
Can I handle updates and recovery (using, e.g., ZFS)?
Do I know how to install software and maybe even patch it, if there is a problem?

The last point was a big one for me, involving using ports, applying patches, compile them in a safe environment (jails), etc.

I am still dreaming of becoming a kernel dev (I just want to this darn USB-card reader to get working ...) but until then I am having lots of fun doing (seemingly) simple things (like getting Google Chrome to run, yikes).

bsdnoob · Jun 12, 2022

Holger said:
...gimme that source code.

It reminds me of ESR, hope he's well & doing fine.

Vadim Alexandrov said:
my first recommendation to him is to read Michael W. Lucas' book one thing at a time he will have a great understanding of how FreeBSD works. every book, it is essential even what the FreeBSD project has available.

Michael W. Lucas: books, biography, latest update

Follow Michael W. Lucas and explore their bibliography from Amazon's Michael W. Lucas Author Page.

www.amazon.com

I've got the book. I follow it. I started with Lehey and Smith's FreeBSD complete reference.

Erichans · Jun 12, 2022

Browsing Lehey and Smith's FreeBSD complete reference it looks to me that that book is sort of in between Absolutely FreeBSD (more user oriented) and The Design and Implementation of the FreeBSD Operating System, 2nd Edition (more ..., well just as the title states), as to the depth of description of FreeBSD and its (kernel) structures. It is also from 2003: a lot older than the other two books; some things have changed (the ULE scheduler became the default scheduler as of FreeBSD 8). Have a look at the free Chapter 4: Process Management or pdf of The Design and Implementation of the FreeBSD Operating System, 2nd Edition. For more related stuff see also: FreeBSD Development: Books, Papers, Slides

Some might consider reading the source code is the ultimate reference. I think, especially for such a complex piece of software, it is more useful to first get the essentials of the inner workings of an OS; more specifically the FreeBSD OS and its kernel. With that knowledge you'll probably be much more effective in understanding the kernel source code. The effectiveness gets amplified if, at this moment, you're not familiar with the basics of OS architecture & internals and not (very) fluent in C.

As process management is at the core of the fair distribution of CPU time and memory management is also an important OS aspect, the following may be interesting:

Design elements of the FreeBSD VM system or pdf version
An easy-to-follow description of the design of the FreeBSD virtual memory system
Memory Management in FreeBSD 12.0 by Mark Johnston, BSD Taiwan in 2017 - slides only
Exploring Swap on FreeBSD by Mark Johnston (written for Klara Systems), January 14, 2021
Though more oriented towards swapping, it describes also how threads are being moved from one queue to another; as such it describes process management as well as memory management.
An Overview of Scheduling in the FreeBSD Kernel by Marshall Kirk McKusick - EuroBSDcon 2021; video & slides
An Overview of Scheduling in the FreeBSD Kernel by Marshall Kirk McKusick - BSD Can 2020; video & slides
This is basically the same talk as in #4 but, there are interesting differences, see below.
ULE: A Modern Scheduler For FreeBSD by Jeff Roberson - BSDCon 2003
The FreeBSD ULE scheduler by Marshall Kirk McKusick and Jeff Roberson - Science, Systems and FreeBSD, September / October 2014 - The FreeBSD Journal, Vol 1, Issue No. 5, page 20-26
The FreeBSD ULE scheduler - just the article as mentioned above, from Jeff Roberson's web page.
The Battle of the Schedulers: FreeBSD ULE vs. Linux CFS by Justinien Bouron, Sebastien Chevalley, Baptiste Lepers, and Willy Zwaenepoel - USENIX ATC 2018; pdf - slides - audio of presentation

While both articles #6 & #7 describe the ULE scheduler and #6 might look more of an extensive (academic-like) article, I personally find it less cramped (for the lack of a better word) than the article in the FreeBSD Journal.

The 4BSD scheduler has long been the default scheduler of FreeBSD. As of FreeBSD 8, the new ULE scheduler took its place as the default scheduler. The 4BSD scheduler is still part of FreeBSD and you can use it instead of the ULE scheduler! You might consider using it on a small (embedded) system, like the Raspberry Pi; see the (part of the) Q&A session below.

If you're really interested in schedulers in general and their implementation details and differences in particular, #8 should also be interesting. Basically #4 and #5 are the same talk. However, because he had more time for his talk in Canada, Kirk McKusick discusses the aspects of the comparison between the Linux scheduler and the FreeBSD ULE scheduler in his talk at BSD Can for some 10+ minutes. That comparison is not in #4 (EuroBSDcon 2021); however, in EuroBSDcon 2021 the online Q&A session is appended following the presentation. In the Q&A he discusses some interesting aspects of the 4BSD scheduler, the ULE scheduler and the Linux scheduler; the last two also in the context of the comparison of #8.

Some interesting lines from Kirk McKusick during the Q&A (in video, from ca. 47:27 min):

The benefit of the 4BSD scheduler is, it's drop dead simple. It's like 100 lines of code and 40 of them are comments [...] The 4BSD scheduler works very well for small embedded systems, anything up to about 4 CPUs, you just don't need the complexity that comes from ULE but, as soon as you come over 4 CPUs, ULE is just the winning strategy. […]

[Q]: For a system like the Raspberry Pi with 4 cores 4BSD might be good?
[A]: It's certainly worth trying. [...] It is pretty evident that it's either going to work well for you or it's not. For many many years, people that were doing embedded systems just, they wanted a small footprint in the kernel and certainly 4BSD is a much smaller footprint scheduler than, than ULE is.

I actually, when I gave the version of this talk at BSD Can, I had an hour, so I actually discuss a paper that was written that appeared in a USENIX conference where they compared the schedulers of Linux versus FreeBSD. And one of the data points was, you know 4BSD scheduler is a 100 lines of code, the ULE scheduler is 2000 lines of code, the Linux scheduler is 20000 lines of code. So you know, right there you can sort of see the scaling of complexity. It was kind of an interesting paper and I do recommend it, or you can go back and look at this BSD Can where I talk about this for the better part of 10 or 15 minutes. But the way they did the comparison is that they went into Linux and they removed the Linux scheduler and put ULE in its place. [...] and as they pointed out, it was easier to put ULE into Linux than to put the Linux scheduler into FreeBSD, because ULE is so much smaller and compact.

Anyway, the bottom line is that for the most part the two schedulers do about the same. There are a few cases where Linux does dramatically better, there's a few more cases where FreeBSD does dramatically better. The real issue is that the Linux scheduler has gazillions of special cases in it to deal with certain workloads. And every now and then you get a workload where the Linux scheduler misgauges, what, whether it's a special case and so it slips into using a special case which is a distinct bad choice. So they end up having somewhere, they’re just really bad, whereas ULE, it does have one where it doesn’t do as well as Linux, mostly because Linux has a special case code for that workload but, it doesn’t have any that are just horrible.

George · Jun 12, 2022

bsdnoob said:
What are the trees I need to learn to became freebsd system developer.

You need to learn the c programming language.

What type of tree data structures FreeBSD use for its kernel?

Deleted member 70435

Guest