How does FreeBSD utilize multicore processors and multi-CPU systems?

Not so sure: from my own experience, and from plenty of reports on user forums, this “good and well tested BIOS support” is really nothing more than “really good compatibility between parts of a limited set”.
Absolutely. For example, the servers I used (IBM/Lenovo X3650 M4) only recognize a very limited set of PCI cards, namely those sold by IBM/Lenovo, even if they are manufactured by Mellanox or LSI/Broadcom/Avago. And this makes perfect sense, and is not caused by marketing being evil: The motherboard has to know the power consumption, PCI lane assignment and cooling needs of the PCI card, and the only way to do that in general (as the industry has not managed to create standards for it) is to have a table for it. So a customer that stays within the vendor ecosystem gets excellent BIOS support. If you put an unsupported card into it, the BIOS can't help, and it will usually do something "safe, slow, and boring", which is the reasonable choice. Anecdote below.

About the price of these high-end servers:
I don't agree! Look at the IBM M3/M4 series on eBay: $150-250 each (with 2x PSU, a decent RAID controller, possibly good CPUs and a fair amount of RAM), plus shipping.
Yes, that's what they cost after many years on eBay. If you bought them brand-new, without a discount, they were about 100x more.

Anecdote: Take one of those servers, and put in a card the BIOS knows about: It will set the fans to a reasonable speed, monitor the temperature, and adjust the fans accordingly. Put an unknown card in: since the BIOS doesn't know the power usage and heat profile of the card, it will turn the fans to maximum, and the whole server will sound like a jet fighter on takeoff, but at least it is safe. In our lab, we managed to do the opposite: We put in a PCI card that had something like "manufacturer ID 0, serial number -1" (it was a prototype), and due to some bug, this caused the fans to turn off. A colleague and I were using the server with the new card from our office (3 floors above the data center), when the server went offline and couldn't be rebooted. We looked at our own monitoring, and saw that the last temperature reported by the PCI card was 106 degrees ... Celsius, not Fahrenheit. Sounds bad. We went down to the server room, and immediately noticed the smell of fried electronics. Open the server, pull out the prototype card: the PC board around the chip was dark brown. Big oops.
 
Well well, this weekend I gave it a try with the second socket.

Ever since I wanted ECC for my 24/7 zpool machine and finally got an affordable Xeon-EP board, that second socket had been sitting there empty. And Haswell Xeons are now 10 years old and dead cheap. (The boards are not; they are still difficult to get - because they are still good.)

Observation so far: the twin operation is inefficient energy-wise. Power consumption increases by some 25 watts just for idling. (The chip itself would then report only 9 watts.)
And there is the extra problem of getting all that energy, now turned into heat, off the board and out of the machine. I'm still using the high-tower case that came with an 80486 - because the ATX form factor hasn't changed - but fitting (and also cooling) dual Xeons and 20 disks is getting increasingly difficult.
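For what it's worth, one way to spot-check what the chip itself reports is to read the Intel RAPL counters through cpucontrol(8), which gets mentioned again further down. This is only a sketch under my own assumptions (an Intel Xeon with RAPL, the cpuctl(4) driver loaded, and Intel's documented MSR numbers), not something measured on that particular board:

kldload cpuctl                          # provides /dev/cpuctlN, one per core
cpucontrol -m 0x606 /dev/cpuctl0        # MSR_RAPL_POWER_UNIT: tells you the energy unit
cpucontrol -m 0x611 /dev/cpuctl0        # MSR_PKG_ENERGY_STATUS: cumulative package energy
sleep 10
cpucontrol -m 0x611 /dev/cpuctl0        # read again; (delta x unit) / 10 s = average watts
# For the second package, use a cpuctl device belonging to a core on that socket.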

I didn't get as far as looking into compute performance, because that second chip just died - too soon and for unknown reasons.

OT, you might already know this... but I hope you do not intend to put the (external) packet filter (often loosely called a "firewall") onto the same physical machine as other services. Don't do that. It must be on its own physical machine, dedicated solely to that purpose, with no other services on that host. In contrast, you can merge the internal PF onto the same machine as a DMZ host (with gateway services (proxy, load balancer, mail, etc.) jailed or in VMs), but not the external one.

That issue doesn't apply to me, because I run all nodes internet-facing - just like in the good old times before IPv4 exhaustion, when every coffee maker on the campus was globally reachable. With IPv6 we can do that again - and so each of my nodes has its own firewall and its own DMZ.
The background is, simply, I am not afraid. (We all know from the stock exchange that greed and fear relate to each other and are the main motivation of business people. But I am not greedy, so I do not need to fear.)

This is actually a very important observation. Amateur computers, built using desk-side cases, cheap fans (often only 1 or 2 fans), and whatever power supply is on sale at NewEgg this week, tend to be somewhat unreliable. Enterprise-class servers may have less CPU power or slower RAM, but they make up for it by having about 10 or 12 fans (each individually pretty small).

Yeah, I was looking for these fans.
The common understanding is that with a strong CPU (>100 W) you need some kind of monstrous heat dissipator (and the market is full of all kinds of these). But a 1U blade with two high-powered Xeons does not have or need any such thing, and they would not fit anyway.

So what's the secret? It must be these fans; they spin at 14,000 rpm and cost about $50 each.
Open question at this point: how noisy would that be? I might assume: noisy.

In contrast, a modern "super silent" fan does 1,400 rpm, is indeed super silent, and does not really move much air - practically nothing once run through an air filter. I, for my part, tried to replace that with a modest 8 W, 3,200 rpm model. That one was loud, so I added an inline thermo-sensing regulator. Then it was wonderfully silent - because the 8 W fan had right away killed the regulator. Anyway, I am getting nearer to the optimal solution.

Consider: up to about the Pentium II there was no real difference between server and desktop except the case - it was all just computers. Yes, there were multiprocessor platforms, but these were very special and rare constructs for high-performance demands.

Since then, things have diverged. And the desktop market has become a cult of fetish believers.
Those monstrous heat dissipators are fetishes. (BTW: for what technical purpose does one need LED lights inside a CPU cooler? None? So why are they listed first, instead of any useful specs?)

Those fetishes are sold via the narrative of "technological advance" - which is bullshit, because "technological advance" here does not mean actual technological advance. 20 years ago, the performance of the machines would double every year - so there was indeed continuous technological advance. Now it doubles maybe every ten years, and the actual technological advance in the desktop field is negligible.
"Technological advance" in this sense is now a belief system: by constant repetition people are made to believe in it without solid evidence, in order to make them continue to buy new stuff every year.

The personal computer market has become much like the car market: there has to be a new model every year, but there might not be much of an improvement, and it might also just introduce new bugs. The main feature of the new model is that it is shiny and new.
But with cars there is indeed an issue that they get more and more unreliable with increasing age, due to wear and tear. This is by no means true of computers (but people are still made to believe it is).

All of this is then part of a larger trend in our society: we have done away with traditional religion, and, nevertheless depending on something to believe in, have made our flawed understanding of technology the new belief. And we have created new priests whom we worship: the big capitalist oligarchs (Amazon, Google, Intel, Facebook etc), who in fact rule this planet now, much like the churches of old.

Originally it was us who assembled these servers. It was us who made things work. And it was not considered a science, but a craft, similar to plumbing. Now the gadgets come ready-made from Google and Amazon, and they are not only built, but also managed by Google and Amazon, which is important because Google and Amazon as your new churches need to know every step you take.
 
The common understanding is that with a strong CPU (>100 W) you need some kind of monstrous heat dissipator (and the market is full of all kinds of these). But a 1U blade with two high-powered Xeons does not have or need any such thing, and they would not fit anyway.

So what's the secret? It must be these fans; they spin at 14,000 rpm and cost about $50 each.
Open question at this point: how noisy would that be? I might assume: noisy.
It's not the spin speed, it is having a heat sink (you call it a heat dissipator) that is well matched to the speed and volume of the air flow. The small fans in 1U and 2U servers typically use relatively small heat sinks, with relatively thick fins, and pretty high air velocity. The noise comes from a combination of fan spin (the blades move very fast) and air speed. Sometimes, in 1U servers, they put 2 fans in series, to get higher air speed and therefore volume. And to get reliability through redundancy: one of the two can fail; at reduced clock speed and load, the computer can keep running.

The other magic that rack mount servers use: Really good air handling. If you open them up, you typically find all kinds of plastic pieces and deflectors, which steer the air. The idea is to get cool outside air (not preheated by some other part of the electronics) into the fan, then accelerate it past the CPU heat sink, and then get the super hot air out the box again before it cooks something else.

The whole thing is a very important and complex science. The trade has its own magazine (cunningly called "Electronics Cooling"), there are at least hundreds of people working in it (I know about a dozen of them in Silicon Valley), and it has huge economic and environmental impact. Making data center cooling 1% more efficient is probably like turning off the air conditioner for a million households in electricity usage.
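A very rough back-of-envelope check of that claim, using ballpark figures that are my assumptions and not something from this thread: if data centers draw on the order of 300 TWh of electricity per year and roughly a third of that goes into cooling, then

1% × (300 TWh/yr × 1/3) ≈ 1 TWh/yr
1 TWh/yr ÷ ~1 MWh/yr for a household air conditioner ≈ 1,000,000 households

so the order of magnitude is at least plausible.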

Now, in amateur computers, giant heat sinks (which I think, having a dirty mind, is what men use whose equipment isn't large enough), super slow moving fans that make barely a noise, and a light show that makes the memory DIMMs think they are in a 1980s disco when "Staying Alive" is playing ... that's just silly. Fun but silly.
 
It's not the spin speed, it is having a heat sink (you call it a heat dissipator) that is well matched to the speed and volume of the air flow. The small fans in 1U and 2U servers typically use relatively small heat sinks, with relatively thick fins, and pretty high air velocity. The noise comes from a combination of fan spin (the blades move very fast) and air speed. Sometimes, in 1U servers, they put 2 fans in series, to get higher air speed and therefore volume. And to get reliability through redundancy: one of the two can fail; at reduced clock speed and load, the computer can keep running.

The other magic that rack mount servers use: Really good air handling.
Yes, this is very much what I figured out from the photos. These machines do not have fans on the CPUs at all; instead they have a strong airflow from front to back, cold aisle to hot aisle.
So there are two possible approaches. One can have these big coolers with a 120 mm fan, and they will do their job of keeping the CPU temperature not far above the case-internal ambient. For a gaming rig that should almost be enough - some of the air still needs to be moved out of the case, and if things really get too hot, one can always stop the gaming.
But this is no solution for a NAS server or backend rig.

The whole thing is a very important and complex science. The trade has its own magazine (cunningly called "Electronics Cooling")

There is certainly a science to building the low-profile rackmounts. But adapting some of that insight in order to get a reliable home server is not that difficult - it is just not much talked about.

For instance, there are videos on YT about how to replace the fan inside the PSU, if a different color is required.
In the same fashion one could get an 8 W / 3,200 rpm version that fits in as well - only we don't get these from the computer shop, we get them from the electronics supplier. Also, I wouldn't attach that to the PSU's internal fan connector, because the increased wattage could kill that regulator, too.
Instead, put it on the 12 V rail and devise some simple electronics for a regulator (or just a low/high switch). Then we can read the actual calories currently eaten by each CPU and the memory with cpucontrol. We can also read which of our disks are currently spinning, with camcontrol tur. Sum this up and steer the regulator.
Then get a K-type sensor, measure the airflow, and adjust additional fans and the air routing.
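To make that concrete, here is a minimal sketch of such a userland steering loop. The specifics are my assumptions, not the poster's: coretemp(4) is loaded so the CPUs expose dev.cpu.N.temperature, the pool disks are da0-da3, and the home-made low/high switch is wired to a hypothetical pin 16 on /dev/gpioc0.

#!/bin/sh
# Crude thermal-steering loop: read CPU temperature and disk activity,
# then flip a (hypothetical) GPIO-driven low/high fan switch.
HIGH_TEMP=70          # degrees C at which we go to high fan speed (arbitrary)
GPIO_DEV=/dev/gpioc0
FAN_PIN=16            # hypothetical pin wired to the low/high relay

while :; do
    # Hottest reported core temperature, e.g. "58.0C" -> 58
    t=$(sysctl -n dev.cpu.0.temperature dev.cpu.1.temperature 2>/dev/null \
        | tr -d 'C' | sort -rn | head -1 | cut -d. -f1)

    # Count disks that answer TEST UNIT READY, as suggested above
    spinning=0
    for d in da0 da1 da2 da3; do
        camcontrol tur "$d" >/dev/null 2>&1 && spinning=$((spinning + 1))
    done

    # Crude policy: high fan speed if the CPU is hot or most disks are active
    if [ "${t:-0}" -ge "$HIGH_TEMP" ] || [ "$spinning" -ge 3 ]; then
        gpioctl -f $GPIO_DEV $FAN_PIN 1
    else
        gpioctl -f $GPIO_DEV $FAN_PIN 0
    fi
    sleep 30
done

The thresholds and the GPIO wiring are placeholders; the same loop could just as well sum up RAPL readings from cpucontrol instead of temperatures.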

A lot of the nifty-thrifty (and expensive) special server features are then covered, at nearly no cost.

There are a few minor failsafes as well, like this one: kern.poweroff_on_panic=1
Since the thermal steering runs in userland - not in firmware, and not even in the kernel - there is some runaway risk if the userland does not operate properly.
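For reference, that knob can be set at runtime and made persistent like this; the watchdogd(8) lines are an extra idea of mine for the case where userland stops running entirely, not something from the post:

sysctl kern.poweroff_on_panic=1                        # take effect immediately
echo 'kern.poweroff_on_panic=1' >> /etc/sysctl.conf    # survive the next reboot
# Optional: let the system watchdog catch a completely hung box
sysrc watchdogd_enable="YES"
service watchdogd start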
 
Absolutely. For example, the servers I used (IBM/Lenovo X3650 M4) only recognize a very limited set of PCI cards, namely those sold by IBM/Lenovo, even if they are manufactured by Mellanox or LSI/Broadcom/Avago.
Let me note:
- from my own experience and knowledge, the M3 is still the "stable as a rock", most reliable version of IBM's System x line, while the M4 (especially the earlier motherboard and firmware revisions) is the worst line in IBM's history;
- on both the IBM and Lenovo user forums there are A TON of messages describing negative experiences with the M4 by itself and in conjunction with other third-party hardware and software.

And this makes perfect sense, and is not caused by marketing being evil: The motherboard has to know the power consumption, PCI lane assignment and cooling needs of the PCI card, and the only way to do that in general (as the industry has not managed to create standards for it) is to have a table for it. So a customer that stays within the vendor ecosystem gets excellent BIOS support. If you put an unsupported card into it, the BIOS can't help, and it will usually do something "safe, slow, and boring", which is the reasonable choice.
Very smart and clever - that is certainly why I so love doing production installations with IBM.

Thank you for sharing the experience and reminding me to read more about this management.

BTW, where can I read about the exact technical details of these hardware-management technologies that IBM uses in their rack servers (not blades)?

About the price of these high-end servers:

Yes, that's what they cost after many years on eBay. If you bought them brand-new, without a discount, they were about 100x more.
Yes! We also love a reasonable discount!

Anecdote: Take one of those servers, and put in a card the BIOS knows about: It will set the fans to a reasonable speed, monitor the temperature, and adjust the fans accordingly. Put an unknown card in: since the BIOS doesn't know the power usage and heat profile of the card, it will turn the fans to maximum, and the whole server will sound like a jet fighter on takeoff, but at least it is safe. In our lab, we managed to do the opposite: We put in a PCI card that had something like "manufacturer ID 0, serial number -1" (it was a prototype), and due to some bug, this caused the fans to turn off. A colleague and I were using the server with the new card from our office (3 floors above the data center), when the server went offline and couldn't be rebooted. We looked at our own monitoring, and saw that the last temperature reported by the PCI card was 106 degrees ... Celsius, not Fahrenheit. Sounds bad. We went down to the server room, and immediately noticed the smell of fried electronics. Open the server, pull out the prototype card: the PC board around the chip was dark brown. Big oops.
Thank you for sharing!
That put a new smile on my face!

But… why didn't the UEFI shut the system down? Because:
- even on the M2 there were BIOS settings for a "Critical CPU temp";
- the system-parameter monitoring utility in the OS MUST shut down the server;
- external monitoring based on metrics/logs MUST shut down the system.

So WHY did none of these 3(!) layers save the life of the hardware?
 
BTW, where can I read about the exact technical details of these hardware-management technologies that IBM uses in their rack servers (not blades)?
I have no idea. I guess they won't tell you, but will instead ask you to install their utilities to access it and centralize the information for many machines. If you search the web, you might get lucky and find it. If you are a registered customer with a support contract, you can ask support for documentation.

When I used IBM hardware, I was an IBM employee, inside a research/development lab. We usually got information using internal channels, like "I know someone in Raleigh or Austin". Companies like IBM/Lenovo, HP and Oracle don't sell individual components. They don't compete with Tyan or SuperMicro. They provide a complete IT solution. The happiest customers are those who buy the complete service, where both all hardware and the staff that operate it come from one vendor. There are examples of large government agencies whose CIO is a vendor employee, all their computing (hundreds of staff) is done in vendor buildings, and the vendor controls all the hardware and most of the software. Those tend to be very happy customers, but that economic model doesn't work in your case.

But… why didn't the UEFI shut the system down?
The thing that overheated was not the CPU, it was a PCI card. Modern PCI cards can have very power-hungry chips on them, for example high-end SAS HBAs or Infiniband/Ethernet adapters. And the card had prototype software, and no vendor or model ID. So the BIOS was not able to see what was going on. And due to a bug in the BIOS (it was not prepared to deal with a card that claimed to have no vendor/model ID, instead of an unknown one), the cooling fans for that card were turned off. Oops.
 
whatever power supply is on sale at NewEgg this week

It seems to me these might actually be the wiser guys. There are others who go to elaborate lengths to decide which PSU might be usable, and they track them in databases and pronounce recommendations, mainly based on the understanding that a usable PSU is one with an astronomical price tag.
I would believe that a PSU is not so very much magic. We have used them since we brought the first "IBM compatibles" from Taiwan in 1986, and normally they just work (or sometimes don't). But when I look now, I could buy these things, apparently optimized for gaming, for 60 or 80 or even 100 € (if I were crazy).
Furthermore, when I then go and look into those elaborate databases, they say that the models for some 80 € are just crap and at best usable for an office desktop. Their recommendations then range somewhere around $200, give or take, and have at least 16.8 million colors and maximum protection.

The problem now is that, if one needs a PSU for some serious purpose, it is almost impossible to find a proper recommendation amid all that madness.

BTW, it is very much the same with thermal compound. I have some, bought about 40 or 50 years ago. It's still fine. That one is rather solid, and you have to distribute it carefully yourself. But it does not age. It was not cheap back then, but also not as crazily expensive as it is today, and there was only one kind, because you need only one. Back then we used this for the transistors in amplifiers and similar devices - and such stuff was intended to work for more than twenty years without needing maintenance.
 
It seems to me these might actually be the wiser guys. There are others who go to elaborate lengths to decide which PSU might be usable, and they track them in databases and pronounce recommendations, mainly based on the understanding that a usable PSU is one with an astronomical price tag.
I would believe that a PSU is not so very much magic. We have used them since we brought the first "IBM compatibles" from Taiwan in 1986, and normally they just work (or sometimes don't). But when I look now, I could buy these things, apparently optimized for gaming, for 60 or 80 or even 100 € (if I were crazy).
Furthermore, when I then go and look into those elaborate databases, they say that the models for some 80 € are just crap and at best usable for an office desktop. Their recommendations then range somewhere around $200, give or take, and have at least 16.8 million colors and maximum protection.

The problem now is that, if one needs a PSU for some serious purpose, it is almost impossible to find a proper recommendation amid all that madness.

We need the jonnyGURU site and forums back.
 