The Case for Rust (in the base system)

Interestingly enough, there is at least one already, it's called redox-os. I haven't tried it, so I've no idea whether it's any good.
I hope posting some info on it doesn't break forum rules!

Just for interest:-


I noticed their FAQ says "Redox is still under development, so our list of supported applications is currently limited, but growing", so presumably it's still being developed.
 
Late to the discussion, but a Debian package maintainer recently dropped a Rust project because the tooling is very poor.


Rust feels like it's changing too quickly to be used in any core OS components.

... for Debian. This particular difficulty is rooted in an incompatibility between the Rust-written tools moving forward in main releases only (no stable branches with backports) and the Debian concept of not doing major updates during a Debian release's lifetime. It wouldn't have happened in FreeBSD ports, where everything tracks newish releases anyway.
 
Late to the discussion, but a Debian package maintainer recently dropped a Rust project because the tooling is very poor.


Rust feels like it's changing too quickly to be used in any core OS components.
So standardizing is important.
Once (de-jure) standardized internationally by, for example, ISO/IEC and/or IEEE, one can state "This code is written to the ISO/IEC nnnnn:20xx spec." and toolchains supporting the standard should sanely build it. I believe toolchains for standardized programming languages SHALL NOT drop support for older standards (at least the non-optional specs) unless explicitly discarded by the standard itself.
 
He was asking for a defined interface spec so that if the interface changes, the C code has to change AND the Rust code. The interface change precipitates all code change, not C code change precipitating Rust code change.

In this way, just "writing C" does not break Rust, which is what they were trying to emphasize, but writing C that breaks the agreed upon interface does.
Yeah, fair enough, I was just trying to put it in general terms :)
 
Yeah, fair enough, I was just trying to put it in general terms :)
Though do note that having to suddenly start agreeing on interfaces (which could otherwise be private) just because Rust needs bindings on top of it all will still place a serious burden on the developers.

Many of the Rust developers get upset if these additional interfaces (I believe lifetime objects also comes under this) get rejected.
 
Interestingly enough, there is at least one already, it's called redox-os. I haven't tried it, so I've no idea whether it's any good.
I hope posting some info on it doesn't break forum rules!

Just for interest:-


I noticed their FAQ says "Redox is still under development, so our list of supported applications is currently limited, but growing", so presumably it's still being developed.
Redox-OS is a very very interesting project, not just because it is written in Rust. It is a Unix-like OS but it brings a lot of new ideas to the table.

Many of the new OS kernels started in the past few years are written in Rust.

Just google "site:github.com rust micro kernel"
 
Though do note that having to suddenly start agreeing on interfaces (which could otherwise be private) just because Rust needs bindings on top of it all will still place a serious burden on the developers.

Many of the Rust developers get upset if these additional interfaces (I believe lifetime objects also comes under this) get rejected.
It's quite serious. Almost fatally serious.
This would indicate we have only three ways to go:
  1. Give up introducing Rust at all and wait for another memory-safe language that defaults to the C calling convention.
  2. Wait for Rust itself to drop the additional things from its calling convention and handle them differently (shared memory?).
  3. Force the toolchain on FreeBSD to always generate the interface regardless of which language is used, which means completely changing the C calling convention.
To go with the third option, the calling convention must be standardized to promise backward compatibility forever.
Maybe proof-of-concept work (not really to be committed) would be OK in the meantime, but actual rewrites for base should wait for it.
 
It's quite serious. Almost fatally serious.
This would indicate we have only three ways to go:
  1. Give up introducing Rust at all and wait for another memory-safe language that defaults to the C calling convention.
  2. Wait for Rust itself to drop the additional things from its calling convention and handle them differently (shared memory?).
  3. Force the toolchain on FreeBSD to always generate the interface regardless of which language is used, which means completely changing the C calling convention.
To go with the third option, the calling convention must be standardized to promise backward compatibility forever.
Maybe proof-of-concept work (not really to be committed) would be OK in the meantime, but actual rewrites for base should wait for it.

Option 4. Wait until Linux implodes on itself, and let all the good C engineers migrate to FreeBSD.
 
... wait for another memory-safe language that defaults to the C calling convention.
I don't think memory safety is possible when calling C functions using ONLY the information in the C function header files. If you have a function with the signature "void foo(int* p_integers)", you have no idea whether p_integers is intended to be shared, what the thread safety of that pointer is (can the thing pointed to be changed from another thread?), whether ownership of the pointed to thing can be transferred to the function, how many integers the function expects, and so on. And when it gets to more interesting functions (such as "int* bar(int* p_array[])"), the problem gets harder. With data structures that have referential guarantees (which pervade a file system implementation), I'm not even sure that Rust's semantics can capture everything we need.
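To make the point concrete, here is a minimal sketch in Rust (the names `foo` and `foo_safe` are made up for illustration). The `extern "C"` function stands in for a C function declared as `void foo(int *p)`; a safe wrapper has to restate, in its own signature and assertions, everything the C header leaves unsaid:

```rust
use std::os::raw::c_int;

// Stand-in for a C function declared as `void foo(int *p)`.
// The header alone does not say how many ints `p` points to,
// who owns them, or whether another thread may mutate them.
extern "C" fn foo(p: *mut c_int) {
    // Hidden contract, visible nowhere in the signature:
    // `p` must point to at least 3 writable, exclusively-owned ints.
    unsafe {
        for i in 0..3 {
            *p.add(i) += 1;
        }
    }
}

// A safe Rust wrapper must encode that contract explicitly:
// the slice carries its length, and `&mut` rules out aliasing.
fn foo_safe(buf: &mut [c_int]) {
    assert!(buf.len() >= 3, "foo requires at least 3 ints");
    foo(buf.as_mut_ptr());
}

fn main() {
    let mut v = [10, 20, 30, 40];
    foo_safe(&mut v);
    assert_eq!(v, [11, 21, 31, 40]);
}
```

Nothing in the C declaration forces the wrapper to get the length or aliasing rules right; that information has to come from documentation or from reading the C source, which is exactly the problem described above.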

In theory, I would love it if the specification of EVERY function in a piece of software (whether the kernel or otherwise) were very clear on how to use any reference or pointer data type, every argument to all functions, the exact semantics of the return type, and error behavior. Rust forces us to do that, and that's a good thing. But retrofitting this into existing code, while the existing code needs to continue being maintained, modified, and enhanced, is a tough problem.

It is not a purely technical problem: in theory, if a competent (group of ...) software engineer could stop the world and prevent all other changes from happening for a short period (of just a few months or half a year), they could go through the kernel and document and clarify every single function as to what its arguments and return values exactly are, in terms of memory and thread safety, locking requirements, and so on. It would turn a 1-line function definition into a small chapter of text.

The non-technical problem that the Rust maintainer seems to have stumbled over is that doing this on a moving target (the Linux kernel) is like rebuilding a plane in flight. That requires buy-in from all parties, a clear commitment to all the extra work that needs to be done, and an understanding of the costs and risks. If that was not there before the Linux kernel Rust project started, it was going to create trouble sooner or later.
 
I haven't studied Zig yet, so I'm not sure about the pros & cons of Rust versus Zig, but if Zig fits the C calling convention better than Rust while offering the same level of memory safety, Zig seems to be the way to try.

By the way, would it make sense to add a new section to the ELF object format and move ALL non-C parts of Rust's calling convention into it?

For example,
  1. allocate memory region there
  2. put non-C interfaces into the allocated region
  3. set wanted access control using MMU for the allocated region
  4. push the address of the region (convert to physical address if needed)
  5. call function as C calling convention (cdecl)
  6. cleanup parameters other than the address of allocated region (cdecl)
  7. cleanup the address of allocated region (Rust)
  8. dealloc the non-C interface region

This way, C code just links/runs as usual (the additional region is normally not needed and can be ignored; if it is accessed, the MMU should raise an error).
And if the linked object was compiled from Rust, the additional interfaces are available.

These are quite rough thoughts, and maybe I'm missing some important points. But, not limited to Rust, moving to other languages would need to proceed on a continuous (continual) improvement basis, in a slow but steady way.
So these kinds of difficulties are unavoidable either way, unless the language has 100% backward compatibility with cdecl in its resulting objects.
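The extra-ELF-section idea is hard to prototype in a few lines, but a much simplified variant of the same principle can be sketched: keep the call itself plain and C-compatible, and carry the non-C interface data out of band where a C caller can simply ignore it. This is a rough sketch only; `ExtraMeta` and its field are hypothetical, and real Rust does not work this way:

```rust
use std::os::raw::c_int;
use std::ptr;

// Hypothetical side structure holding the non-C part of the
// interface (here just an element count). The names are made up.
#[repr(C)]
struct ExtraMeta {
    len: usize,
}

// The function keeps a C-compatible signature. A caller that knows
// nothing about ExtraMeta passes NULL and the callee falls back to
// a conservative assumption; a metadata-aware caller supplies the
// extra interface data out of band.
extern "C" fn sum(p: *const c_int, meta: *const ExtraMeta) -> c_int {
    let len = if meta.is_null() { 1 } else { unsafe { (*meta).len } };
    unsafe { (0..len).map(|i| *p.add(i)).sum() }
}

fn main() {
    let data: [c_int; 4] = [1, 2, 3, 4];
    // "C-style" call: no metadata, only the first element is trusted.
    assert_eq!(sum(data.as_ptr(), ptr::null()), 1);
    // Metadata-aware call: the full length travels alongside.
    let meta = ExtraMeta { len: 4 };
    assert_eq!(sum(data.as_ptr(), &meta), 10);
}
```

The sketch also shows the catch: the fallback path has to assume almost nothing, so plain C callers get none of the safety benefit, which is essentially the trade-off the thread is debating.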
 
Though do note that having to suddenly start agreeing on interfaces (which could otherwise be private) just because Rust needs bindings on top of it all will still place a serious burden on the developers.

Many of the Rust developers get upset if these additional interfaces (I believe lifetime objects also comes under this) get rejected.

These "additional interfaces" are always there (ownership and lifetime). It's just that in C these interfaces are not made explicit, but are expressed in code - consistent throughout the code base, if you're lucky. At least for key elements, defining and documenting these "additional interfaces" would actually increase code quality and readability, by a lot. If you think this is too much of a burden, then you're not meant to develop in a team effort.

Many C developers get upset if they get called out for being decades behind industry standards in (C) code quality. And there's no way to introduce a memory safe language without touching that topic, unfortunately. Sure there are technical difficulties to be discussed, but this is more of a cultural issue.

Using the C calling convention while being memory safe otherwise is supposed to be the domain of Zig.

I give up. Zig has its merits, but memory safety is very explicitly left to the programmer. You're trolling, right?
 
These "additional interfaces" are always there (ownership and lifetime). It's just that in C these interfaces are not made explicit, but are expressed in code - consistent throughout the code base, if you're lucky.
Indeed. And if they change, things carry on. Once they are made explicit (i.e. to babysit Rust), and they change, things break.

by a lot. If you think this is too much of a burden, then you're not meant to develop in a team effort.
Which is why the Rust guys are leaving. They don't want to create or maintain the C interfaces; they don't truly want to be a part of the team.

At least for key elements, defining and documenting these "additional interfaces" would actually increase code quality and readability, by a lot
I would recommend keeping them private. That would lead towards a cleaner architecture. You can document them but exposing them as an interface is incorrect. When you get some experience in large projects, you will come to appreciate that.

Many C developers get upset if they get called out for being decades behind industry standards in (C) code quality.
Hah, no. They are the industry standard. Some people are simply disappointed that the industry standard moves slowly and isn't trendy. It actively hurts some guys that POSIX/SUS dictates C99 (it used to be C89 until only recently!). "That's like over 20 years old!!!"
 
I give up. Zig has its merits, but memory safety is very explicitly left to the programmer. You're trolling, right?

Kinda, yes. I am mostly interested in Zig because of the macro and compile-time computing capabilities. I didn't have time to explore them. But while Zig might not offer much in the way of (de)allocation safety it does at least have real collection classes with access control and algorithms that you can apply without casting and without re-stating collection and element sizes.

I partially brought this up to point out that extensive interfacing to C doesn't have to be as hard as Rust (and most other languages) make it.
 
Or Ada, which is the most strongly standardized programming language in the world, if I recall/understand correctly. No compiler that has not passed the validation test is allowed to call itself Ada.
(I myself don't have any experience in Ada, though.)

It's an old language that has survived for many years, and, perhaps surprisingly, it is also listed as one of the memory-safe languages. And it states that it can link with other languages, including C. (Sorry, I haven't looked into the details yet.)

Putting human resources aside, the most significant drawback would be the absence of BSD-compatibly-licensed toolchains, at least in the ports tree.
Yes, there are lang/gnat12 and lang/gnat13 that can compile Ada code, but unfortunately, they are licensed under the GPL, which means they cannot be merged into base.
 
It's an old language that has survived for many years, and, perhaps surprisingly, it is also listed as one of the memory-safe languages. And it states that it can link with other languages, including C. (Sorry, I haven't looked into the details yet.)
It's actually not a bad language. I have used it (briefly) for telecoms in the UK rather than defense, which is where it was typically cited back in the day.

The issue comes with the C binding process. If you have time, check out this link, particularly the section "Adapting bindings". This provides some insight into the issue. The biggest issue for me is System.Address (a C pointer datatype). Not only is it still not memory safe, but now you have lost all type safety too. Bindings suck.

Ada also has a form of RAII, which means lifetimes are an issue too; the underlying C could potentially strip out the data from underneath Ada/Rust/C++'s assumption of RAII lifetime and you are back to some fairly complex debugging. Only now, you have to weave between different language layers.

RedoxOS or R9 are our only hope ;)
 
Indeed. And if they change, things carry on. Once they are made explicit (i.e to babysit Rust), and they change, things break.

If they change, things break in every place you forgot to adjust. Rust or not. It's actually a shortcoming of C.

Which is why the Rust guys are leaving. They don't want to create or maintain the C interfaces; they don't truely want to be a part of the team.

One Rust guy is leaving. In Linux. Because the gatekeeper there refuses to cooperate with them, not because they refuse to do additional work or maintenance.

I would recommend keeping them private. That would lead towards a cleaner architecture. You can document them but exposing them as an interface is incorrect. When you get some experience in large projects, you will come to appreciate that.

The objects in question are not private in a monolithic kernel, that's part of the current design. They have to be shared, and that requires an interface. So does making the kernel less monolithic. Someone with experience in large projects should know that.

Hah, no. They are the industry standard. Some people are simply disappointed that the industry standard moves slowly and isn't trendy. It actively hurts some guys that POSIX/SUS dictates C99 (it used to be C89 until only recently!). "Thats like over 20 years old!!!"

That's the POSIX interface, not what has to be used to implement the kernel. Apart from that you seem to have no idea what code quality means. Linux and FreeBSD make up a lot with reviews and testing, but code quality is definitely subpar if I compare it to typical embedded development today. Which is like the only remaining domain of C.
On a side note, a majority of the embedded projects I see are in C++ now, for such "trendy" features as type safety and const.
 
Kinda, yes. I am mostly interested in Zig because of the macro and compile-time computing capabilities. I didn't have time to explore them. But while Zig might not offer much in the way of (de)allocation safety it does at least have real collection classes with access control and algorithms that you can apply without casting and without re-stating collection and element sizes.

I partially brought this up to point out that extensive interfacing to C doesn't have to be as hard as Rust (and most other languages) make it.

There's some good stuff in Zig; treating types as actual parameters is something I would definitely prefer to C++-style templates. But any memory-safe language would require more effort to interface with C, which is probably one of the reasons Zig is not memory safe.
 