The Case for Rust (in the base system)

Good intention - https://www.circle-lang.org/site/intro/
But wouldn't it be better to offer this as a proposal for the next ISO/IEC 14882 standard, rather than create a non-standard compliant C++ compiler?
I agree. And in some ways it shouldn't even need to be an addition to the language standard, but simply a replacement for the "standard" library (which is crap, much like the Standard Components shipped with Bjarne's cfront were also crap).

Things such as:

Code:
inline T& vector::operator[](VecLock& _index) { _index.lock(m_criticalpin); ... } /* _index is an RAII lock */
inline VecLock::VecLock(const size_t& _index) { ... } /* VecLock has a converting constructor, so it can be used like a size_t */

This goes a long way toward 100% memory safety: simply lock memory upon access (and for the lifespan of the access). Sure, it might not be as flexible (pointers *can't* dangle), but that is still less of a refactor than a rewrite in a different language where pointers also can't dangle. Best of all, it can be stripped out at compile time for the fastest unsafe builds.

The problem is that for the last decade, the C++ community has been like a little fat child wanting more and more features rather than actually considering safety. The last real effort was Technical Report 1 (TR1), pre-C++0x and pre-C++11. That is a long time without actually actioning safety.

At work we use such a replacement (e.g. <sys/vector>, <sys/list>) and it has been remarkably successful. So much so that we are in the process of writing a paper to try to get it out there. The problem is that, being *the* industry standard, C++ is filled with noise from all sorts of random twits.
 
I think that the problem is that it cannot be “replaced”, because “compatibility” with C must remain (or is it time to forget about it?).

I'm afraid we're getting further and further away from the topic (the topic is about Rust after all) :)
 
I think that the problem is that it cannot be “replaced”, because “compatibility” with C must remain (or is it time to forget about it?).
It's interesting because no matter how much we break compatibility with C in C++, it will *still* have better compatibility with C than a language like Go/Rust/Swift, which is not close to a superset. Those languages require binding generators (SWIG/bindgen), which can only cover roughly 80% of C.

Besides, I don't think compatibility with C *or* C++ needs to be broken. std:: just needs to be frozen, and a new standard library (safe in debug builds and at runtime) developed.
 
Sometimes you have to wonder what's a 'better approach' ... they all have advantages and drawbacks.

For example, in C/C++ there is a proliferation of non-standard libs and templates, so the result is a toolkit (like x11-toolkits/qt5). The drawback is that you can get lost trying to keep track of the scope in which a particular call even applies, so debugging can be a major pain.

Or you can try to do memory safety as a standard feature in the language, like Java - but then the compiled code is not that efficient.

And even if you try to resolve Java's shortcomings with Rust, the crates rear their ugly heads in every single Rust-based project.
 
But wouldn't it be better to offer this as a proposal for the next ISO/IEC 14882 standard, rather than create a non-standard compliant C++ compiler?
Circle C++ does look to be a good way to provide a proposal in the form of a complete vision as opposed to lots of little suggestions. I mean, you can have both, right?
 
Back on topic: does Rust have a bootstrap compiler? Something written in C/C++ that could be used to compile a full-fledged Rust toolchain? This smaller compiler could fit in base and might be complete enough to compile Rust-coded kernel drivers (which, in my opinion, should be crate-less).

Edit: seems the answer is no.
 
Back on topic: does Rust have a bootstrap compiler? Something written in C/C++ that could be used to compile a full-fledged Rust toolchain? This smaller compiler could fit in base and might be complete enough to compile Rust-coded kernel drivers (which, in my opinion, should be crate-less).

Edit: seems the answer is no.
Of course there was one, in the first place. But it is maybe no longer maintained, at least the one which does NOT include any code written in Rust at all.

If I recall correctly, even C had a bootstrap compiler in the first place (at the AT&T Bell labs), written in asm (or even directly typed in [or punched in] as binary code). Without it, how could a C compiler written in C have been built?
Just the same SHALL be applicable to Rust. I'm not sure whether its first compiler was written in C or not, as any programming language which already has compilers or interpreters, and in which compilers can be written, can do the job sanely.

And once the first compiler that can cross-compile to another CPU/OS is finished, that CPU/OS can have a full compiler.

Frankly, the previous version of any compilable language can build at least the bootstrap compiler for the next version. Without it, how would the compiler developers build the new version? And FreeBSD has a port for building the bootstrap compiler for the next version: lang/rust-bootstrap.
 
It's interesting because no matter how much we break compatibility with C in C++, it will *still* have better compatibility with C than a language like Go/Rust/Swift, which is not close to a superset. Those languages require binding generators (SWIG/bindgen), which can only cover roughly 80% of C.

Besides, I don't think compatibility with C *or* C++ needs to be broken. std:: just needs to be frozen, and a new standard library (safe in debug builds and at runtime) developed.
In my humble opinion, the compatibility that must be kept is the ability to sanely link with code written in C.

The best option would be for LLVM to have a Rust frontend, using the backends already existing in the FreeBSD base. Just pull in the Rust "language" only, without external crates (the ecosystem, which is not managed by the FreeBSD project), generating the same object format as the C frontend (clang, cc).
 
Same with CMU Common Lisp, which you could only compile with CMU Common Lisp. Once upon a time it was bootstrapped from Spice Lisp, but that path was long gone for most of CMUCL's lifetime.

CMUCL's derivative brought back bootstrapping from select other CL implementations.
 
A lot of the posts here talk about the need to have a very simple and easy way to use existing C code in whatever new language we want. Supposedly because for the foreseeable future we'll want to call the existing code.

I will make one counter-argument, and tell one anecdote to strengthen it.

Counter-argument: There are two kinds of C interfaces you would want to call. The first kind is things like the POSIX standard library, with functionality like open file, connect socket, and such. Those things a new language will want to wrap into interfaces that are idiomatic in the new language anyway, so normal code won't call them. And even if it did, those things are very very safe and hard to abuse (they're designed that way).

The second kind of C code you will want to call is stuff that is written by "yourself" (in big systems that usually means in-house teams). This is exactly the kind of code you want to get away from, because it is unsafe, and too often not well done. You don't want to call this code, you want to replace it with well-written and safe code.

In reality, there is a third kind of code: well-tested libraries that you acquire from outside, for example open source or commercial (yes, commercial library code still exists, but it is not often seen by amateurs). I think this category is rare, and many things in it (in particular open source code that comes from badly run development projects; see the recent xz debacle) are really in the second category.

Now the anecdote. Three decades ago, I worked in a company whose main code base was half a million lines of very messy C code, which had been partially upgraded with C++ features (it had a few classes), and was very badly written and unstable. The company was not a software company; many "software engineers" were part-timers from other departments (be it image processing, electrical engineering, or manufacturing technicians), and it didn't have a culture of good software processes. Since I was already a C++ expert, I was explicitly hired for the great project that was going to save the code base: a new framework, explicitly designed in C++ for safety and good coding practices, intended to be friendly for less skilled software engineers (since the reality of having to use part-time engineers didn't go away), and able to call some of the existing code after vetting it.

That was a TOTAL disaster. Why? Several reasons. First, it is very difficult to write leak-free and memory-safe C++, and with the state of the language in 1997 it was virtually impossible. Second, C++ is such a hard language to master (and you need mastery to write safe code that is correct even under error conditions, while any beginner can hack up something that functions on the happy path) that running a large project with not-all-great programmers didn't work. And most importantly: every time we called any of the old code, things blew up left and right ... just like the old code was blowing up all the time in its normal production usage.

At this point, an important decision was made: We gave up on C++. We switched to Java, the best choice available at the time, but explicitly embedded a Python interpreter in the Java code. The single most important rule was: We will never ever call old code. Instead, we rewrite it in Java, while simultaneously documenting, sanitizing, and organizing it. If a part-time programmer who is not a skilled software engineer needs to use the system, they can write Python scripts. About 1% of the code base was determined to be performance critical, and that was re-coded from scratch in C and called via JNI. The way we did this is that a scout team of 5 people (I was one of them) implemented a tiny part of the 500K line system in Java from scratch, allowing a lot of scaffolding (no DB interfaces, no UI/UX) in one year, as a proof of concept, and to get the basic coding standards, software processes and generic libraries done. Then a rapidly increasing team of up to 150 engineers was added to the team, and within 3 years, they had re-implemented the whole thing into a functional state.

This started about 1997 or so. I last talked to former colleagues a few years ago (sadly, at the funeral of a colleague): the system is now up to 17M lines, covers a much larger product line, can do networked (cloud-like or scale-out) operation, is used by multiple corporations, and has been touched by thousands of engineers. The basic design decisions are holding up fine. Why? Because early on we decided to intentionally give up compatibility and calling old, badly written code, and making "clean design and good craftsmanship" the guiding principle.

From this viewpoint: I prefer a language that can NOT call existing C code.

talking of rust

in the 1980s my Dad had a Lancia Delta car when he lived by the sea in Brighton
and all the salt in the sea air caused the car to rust and you could poke your finger through the body work

Ah, Lancia, a close relative of Fiat and Alfa Romeo. I drove an Alfa Spider when I was a graduate student. It is the only car that begins rusting when its picture is printed in the sales flyer.
 
In my very own opinion there are deep differences between generic software and an operating system. Both are "programs" written in some "language", but the first uses the second and not vice versa. The first can be more easily reengineered; not the second, unless you want to make a new product separate from the original one.

talking of rust

in the 1980s my Dad had a Lancia Delta car when he lived by the sea in Brighton
and all the salt in the sea air caused the car to rust and you could poke your finger through the body work

Ah, Lancia, a close relative of Fiat and Alfa Romeo. I drove an Alfa Spider when I was a graduate student. It is the only car that begins rusting when its picture is printed in the sales flyer.

Ahahahaha I never owned a FIAT (or Lancia, Alfa, Autobianchi...Ferrari)
 
You missed some points.
First of all, FreeBSD would be categorized into your case 3.

Secondly, if code compiled by a newly imported language like Rust cannot link with C code, it means ALL CODE, INCLUDING THE WHOLE ECOSYSTEM (not disclosed, but using FreeBSD as its base or platform), SHALL BE REWRITTEN ALL AT ONCE. Otherwise, it means unworkable systems in the wild. It could be a real disaster if any important infrastructure in the world were affected. It SHALL not happen. This is why I'm repeatedly stating that Rust in the FreeBSD base must be forced to provide a cdylib at minimum. The FreeBSD base is an operating system. A platform. Not middleware nor applications.

Of course, code for independent utilities, which do NOT provide any libraries and do NOT call anything except fundamental system libraries like libc (just for system calls), would be no problem to reimplement.
 
I think we all are missing the point. What we need is new hardware which enforces memory access according to the access rights of each object itself. Otherwise, there will always be ways to trample on memory you are not allowed to trample on. We can try to make languages memory safe, make one context memory safe; the system itself will still not be memory safe. And we will not change that. We should invest the energy there instead. Change my mind.
 
I think we all are missing the point. What we need is new hardware which enforces memory access according to the access rights of each object itself. Otherwise, there will always be ways to trample on memory you are not allowed to trample on. We can try to make languages memory safe, make one context memory safe; the system itself will still not be memory safe. And we will not change that. We should invest the energy there instead. Change my mind.
Something like CHERI hardware?
 
On a much finer grid. Meaning, you have a key which is part of the pointer, and memory is assigned to that key. If you don't have the original pointer (and thus not the key), you get a trap upon accessing any memory with it. A buffer would thus be bound to the pointer from its creation, and things like stomping on return addresses or messing with heap management would fail. Still no fix for rowhammer, but a lot better than trying to guarantee this from a compiler. Because inline assembler will come up, and so will the tricks you can do with the relocation information in ELF (that s*** is Turing-complete. What the f...?)
 
I think we all are missing the point. What we need is new hardware which enforces memory access according to the access rights of each object itself. Otherwise, there will always be ways to trample on memory you are not allowed to trample on. We can try to make languages memory safe, make one context memory safe; the system itself will still not be memory safe. And we will not change that. We should invest the energy there instead. Change my mind.
It is a nice dream. The ancient Burroughs 5xxx machines actually had the first such thing: each memory cell knew whether it contained an integer or a floating-point number, and trying to use it the wrong way caused the computer to stop. This was in the 1960s, when the idea of complex data structures in memory, multiple threads, and all the modern complexities didn't exist yet.

One of the reasons it remains a dream is that sometimes you need to violate memory access protections. For example, in a von Neumann architecture, you need to load a program from disk into memory. At this point, it is an unstructured array of bytes, since that is what is stored on disks (meaning in file or block storage systems). And then you need to tell someone: this set of bytes is allowed to be executed. That action right there is like the moment Eve bit into the apple: the original sin. Sure, you could teach disks to also only store structured data, where each byte is tagged with "this can be read by process X, written by process Y, and executed by process Z". But who writes this structured data to disk? Say for example a compiler. Thereby reducing the problem to Ken Thompson's famous C compiler that recognizes when the password checking code is being compiled, and puts a backdoor for Ken in. Who watches the watchers?

So an absolute and perfect solution for memory protection in hardware is probably not even feasible. Would a partial solution (with some loopholes built in) work? Would it be an efficient use of gates, cycles, and electrical power? Of brain time?
 