This made me chuckle. Code from PhDs often isn't the best for long-term viability. Just look at the CLR's garbage collector alone (https://raw.githubusercontent.com/dotnet/coreclr/master/src/gc/gc.cpp).
That sheer volume of over-engineered code is not really maintainable (this one file is bigger than the entirety of TCC: https://github.com/TinyCC/tinycc), and they certainly haven't followed the KISS principle.
As an exercise, try getting Mono 1.0 building and running on a current platform. It would take more man-hours than you have in a lifetime. I stay away from "technology" like this.
Compare that to C, where it is possible to write a compiler in about a week. Someone much smarter than me could probably even make it production-ready in that time.
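To give a feel for the scale involved, here is a hypothetical C sketch of the smallest possible "compiler": a recursive-descent parser that translates arithmetic expressions straight into x86-64 AT&T assembly. It is a toy under obvious assumptions (one expression, no variables, no error recovery), nowhere near TCC, but the entire core fits on one screen:

    /* toy_cc.c - hypothetical toy: compiles an arithmetic expression
     * given on the command line to x86-64 AT&T assembly on stdout.
     * Grammar: expr    := term (('+'|'-') term)*
     *          term    := primary (('*'|'/') primary)*
     *          primary := NUMBER | '(' expr ')'
     * Each rule leaves its result on the machine stack. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <ctype.h>

    static const char *p;               /* cursor into the source text */
    static void expr(void);

    static void skip(void) { while (isspace((unsigned char)*p)) p++; }

    static void primary(void) {
        skip();
        if (*p == '(') {
            p++; expr(); skip();
            if (*p++ != ')') { fprintf(stderr, "expected )\n"); exit(1); }
        } else if (isdigit((unsigned char)*p)) {
            printf("    pushq $%ld\n", strtol(p, (char **)&p, 10));
        } else {
            fprintf(stderr, "expected number or (\n"); exit(1);
        }
    }

    static void binop(char op) {        /* pop two operands, push result */
        printf("    popq %%rcx\n    popq %%rax\n");
        if (op == '+') printf("    addq %%rcx, %%rax\n");
        if (op == '-') printf("    subq %%rcx, %%rax\n");
        if (op == '*') printf("    imulq %%rcx, %%rax\n");
        if (op == '/') printf("    cqto\n    idivq %%rcx\n");
        printf("    pushq %%rax\n");
    }

    static void term(void) {
        primary(); skip();
        while (*p == '*' || *p == '/') { char op = *p++; primary(); binop(op); skip(); }
    }

    static void expr(void) {
        term(); skip();
        while (*p == '+' || *p == '-') { char op = *p++; term(); binop(op); skip(); }
    }

    int main(int argc, char **argv) {
        if (argc != 2) { fprintf(stderr, "usage: %s EXPR\n", argv[0]); return 1; }
        p = argv[1];
        printf(".globl main\nmain:\n");
        expr();
        printf("    popq %%rax\n    ret\n");  /* result becomes the exit status */
        return 0;
    }

Build and run with something like:

    cc toy_cc.c -o toy_cc
    ./toy_cc "1+2*(3+4)" > expr.s && cc expr.s -o expr && ./expr; echo $?   # prints 15

Growing that into a real C compiler is of course where the week (or the career) goes, but the starting point really is this small; there is no mandatory runtime to drag along.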
Even though I mentioned language names, it is the underlying platforms that I am really talking about. VB.NET and CSharp are probably fine languages, but until they can drop the runtime complexity, I wouldn't recommend them. Then again, that is typical of PhDs: the ideas are often valid, but the execution and implementation are often atrocious (this is certainly the case for my thesis). If they can find a better (more portable) solution than a garbage collector and develop a compiler that outputs a native binary from CSharp code (similar to DotGNU; unfortunately not what AOT does), then I can see CSharp having a much longer life. Oh, and just drop VB.NET and give us back VB[6]; it was vastly superior!
I do not disagree with your underlying point. But I also think that at some point you always have to decide that you are going to trust someone smarter than you. Even if you insist on keeping the bar low in terms of required expertise, at some point the sheer volume of man-hours involved turns everything into a black box for all practical purposes. Whether the black box exists because the knowledge required is so complex or specialized that you could not re-implement it, or because modifying a toolchain/framework/environment would take more work than you could ever realistically budget for, makes no practical difference: it is the same black box. So it all depends on where you arbitrarily decide to set the bar, which in turn depends on your risk model (and that depends on your specific situation: skills, comfort zone, budget, etc.).
Personally (after doing a risk analysis, of course, and weighing whether investing in something makes sense both technologically and strategically), I generally do not mind trusting PhDs to implement very abstract things based on sane concepts. It is usually something they do very well, and because abstract problems are a common denominator, there will always be plenty of eyes on them and plenty of smart people who enjoy optimizing them. So I would not worry about the effort required to re-implement the VM, the compiler, or the like, especially since there is no reason I would ever need to: I have the source code, so I could keep using the same version for decades (putting aside the fact that there is unlikely to ever be a shortage of maintainers).
Also, to nitpick your example a little: I strongly doubt they would over-engineer their garbage collector. It is a central piece of the whole platform, they are throwing a lot of resources at extracting every possible performance improvement, and I suspect that is why the file is so large: judging from their blog posts, they do not mind handling special cases if it makes things 5% faster. Garbage collection, the VM, and the compilers are exactly what I trust them to do best, because these are complex but abstract and well-understood problems. Then comes the core library; but as for application frameworks like WCF and so on, not really.
Regarding the garbage collector, they need to support plenty of use cases and customers, and a garbage collector is a very different beast from a compiler. It is a complex optimization problem: when we say "optimize", there is the question of what to optimize for (various types of load to plan for) and how to optimize (various collection algorithms and approaches to support; Mono even lets you swap out the algorithm). So it is quite a file, for sure, but it is also quite a complex challenge even on paper. And garbage collection brings a lot of business value and features; it gives the developer many things for free. Whether (or when) runtimes relying on garbage collection are desirable is another question, though.
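To make the "complex but well-understood" part concrete, here is a minimal mark-sweep collector sketched in C (hypothetical, illustration only; the names are mine). The mechanism itself is textbook material; everything a production collector like gc.cpp layers on top, such as generations, concurrent marking, and per-workload tuning, is precisely the answer to that "optimize for what?" question:

    /* msgc.c - hypothetical minimal mark-sweep collector, for
     * illustration only. Objects carry a single outgoing pointer
     * to keep the tracing trivial. */
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct Obj {
        struct Obj *next;           /* intrusive list of all allocations */
        struct Obj *ref;            /* one outgoing reference */
        int marked;
    } Obj;

    static Obj *heap;               /* head of the all-objects list */
    static Obj *roots[8];           /* simplified root set (stack/globals) */

    static Obj *gc_alloc(void) {
        Obj *o = calloc(1, sizeof *o);
        o->next = heap;             /* register with the collector */
        heap = o;
        return o;
    }

    static void mark(Obj *o) {      /* mark phase: trace from a root */
        while (o && !o->marked) { o->marked = 1; o = o->ref; }
    }

    static void sweep(void) {       /* sweep phase: free the unmarked */
        Obj **link = &heap;
        while (*link) {
            Obj *o = *link;
            if (o->marked) { o->marked = 0; link = &o->next; }
            else           { *link = o->next; free(o); }
        }
    }

    static void gc(void) {
        for (int i = 0; i < 8; i++) mark(roots[i]);
        sweep();
    }

    int main(void) {
        roots[0] = gc_alloc();          /* reachable */
        roots[0]->ref = gc_alloc();     /* reachable via roots[0] */
        gc_alloc();                     /* unreachable: garbage */
        gc();                           /* collects only the garbage */
        int live = 0;
        for (Obj *o = heap; o; o = o->next) live++;
        printf("%d live objects\n", live);   /* prints: 2 live objects */
        return 0;
    }

And the "swap out the algorithm" point is visible at the user level too: Mono, if I recall correctly, lets you pick the collector when launching the runtime (the --gc=sgen / --gc=boehm switch), and the CLR lets you flip between workstation and server GC through configuration. Supporting those kinds of per-workload choices is exactly the sort of thing that makes gc.cpp as big as it is.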