FreeBSD on IBM System z

In the first decade of the 2000s, a company called Sine Nomine Associates was working on porting OpenSolaris to IBM System z hardware (https://en.wikipedia.org/wiki/OpenSolaris_for_System_z).

By 2008, they had it working well enough to demo. Around that time, Oracle bought Sun and stopped supplying source for OpenSolaris, so Sine Nomine shut the project down.

I would think that porting FreeBSD would be similar to porting OpenSolaris. I have a fantasy that maybe one day one of those tech billionaires, instead of trying to rule the world, will pay Sine Nomine to resurrect their project and port FreeBSD, instead of OpenSolaris, to System z hardware.

If I ever become a billionaire, I promise to fund the project. Sadly, making my first billion is running behind schedule, and at age 73, I only have about three decades to achieve it.
 
> I have recently gotten a demo of Linux on there. Very slick project.
Yes, it's very impressive. I remember attending a demo (it must have been in the mid-2000s or early 2010s) of running 35,000 Linux VMs on a single physical Z host. In words: thirty-five thousand. That Z host is probably pretty big, several oversized racks. I think all modern Zs are internally a sysplex, so it was probably a coupled set of multiple systems.

When I worked at IBM, we had the "disk drive" for the Z in our lab, a Shark model codenamed "megamouth". There was a running joke in our group that if it ever went missing, they would come look at my house, since I was the only person in the group with a truck large enough to steal it. It weighed 2700 lbs. I forget how many (Seagate or Hitachi) disk drives it contained, but it must have been several hundred.

The other fun experience was on an IBM P7IH supercomputer (not a Z, but a Power machine): starting "cat /proc/cpuinfo" and watching it scroll for a minute. That machine had 1024 cores, around 2010 or 2011.
 
Having spent 19 years of my career on S/360 and S/370, it would be nice if FreeBSD were ported to System z.

Given that the machine architecture does not have a stack -- the linkage stack isn't the same thing -- C compilers tend to simulate one using a series of store, store-multiple, load, and load-multiple instructions.
 
> The other fun experience was on an IBM P7IH supercomputer (not a Z, but a Power machine): starting "cat /proc/cpuinfo" and watching it scroll for a minute. That machine had 1024 cores, around 2010 or 2011.
Not a supercomputer, but there was an interesting project that Hoplon Entertainment, a Brazilian MMO company, did with IBM around 2007. They wanted to use the DB2 database to keep track of the data for their game, so they had an IBM mainframe running DB2 hooked up to an IBM BladeCenter with a bunch of Cell Broadband Engine blades to do the physics calculations. (In 2007 the Cell Broadband Engine, which was also used in the Sony PS3, was hot stuff, before NVIDIA CUDA made everyone forget about it.)

https://www.gamedeveloper.com/game-platforms/ibm-integrates-cell-into-mainframes-for-virtual-worlds

I have never come across an article that describes the architecture in detail, or even whether they were running MVS or zLinux on the mainframe. It sounds like a fascinating project.
 
> Given that the machine architecture does not have a stack -- the linkage stack isn't the same thing -- C compilers tend to simulate a stack using a series of store, store multiple, load and load multiple instructions.
Yes, and there is even a convention to use a specific register (I think it was R4) as the stack pointer. The 360 architecture has 16 general-purpose registers, but several of them (typically 13 to 15) are used for SVCs (roughly equivalent to system calls in Unix). The details have long since left my brain.

> Not a supercomputer, but there was an interesting project that Hoplon Entertainment, a Brazilian MMO company, did with IBM around 2007. They wanted to use the DB2 database to keep track of the data for their game, so they had an IBM mainframe running DB2 hooked up to an IBM BladeCenter with a bunch of Cell Broadband Engine blades to do the physics calculations.
There were several such projects. A friend (who is from Argentina) worked on one of them, which was to add a few chassis of blade servers to the Shark and use it for "cheaper" CPU power. The project was called "Tiburon", which is Spanish for shark. The reason I say "cheaper" is that mainframes are famously expensive (typically millions of dollars), and if you measure them purely in terms of CPU power, they are exceedingly overpriced. But they are not sold as CPU servers; they are sold for doing IO and for being ridiculously, laughably reliable. So when several IBM applications (including DB2) needed extra CPU power, for example for doing complex SQL queries close to the disk drive hardware, the easiest way to get it was to install a few hundred thousand dollars' worth of (PowerPC or Cell engine) CPUs right next to the disk drives.

The underlying philosophy is that mainframes have a long history of "outsourcing" work to other parts of the system. The concept of the "channel" is that much of the IO is not actually done by the CPU but by channel processors, which run independently, are interrupt-driven, and deposit the results of IO directly in memory via DMA, or talk to each other. For example, to do a disk-to-tape copy, the CPU is not really involved: it orders two channels to work together, figure it out among themselves, and report back in an hour or so. This is why in the 90s and 00s the mainframe was capable of adding enormous network bandwidth: its network cards were not just simple "Ethernet" cards but contained much of the stack, up to the upper protocol layers. They were also physically bigger, used more power, had more CPU power, and were much more expensive than whole Intel-based PCs.

From that tradition, adding outboard CPU assists is only a small step. In the meantime, the rest of the industry has also caught up, with things like intelligent/active disks, NASD, RDMA, HBAs that execute scripts, and so on. Today every cloud server has network cards that do most of the stack (up to the HTTPS layer) without bothering the CPU.
 
> Yes, and there is even a convention to use a specific register (I think it was R4) as the stack pointer. The 360 architecture has 16 general purpose registers, but several of them (typically 13 to 15) are used for SVC (roughly equivalent to system calls in Unix). The details have long left my brain.

Yes, I spent most of my 19 years on IBM mainframes as an MVS systems programmer, including as a developer, where writing key 0, supervisor-state code was the norm for me and the other members of the development team. Registers 0 and 1 are used to pass arguments, or register 1 points to an argument list (linked list). Register 13 pointed to a save area whose second and third words were used as doubly linked list pointers. R14 was the return address, while for userland apps R15 pointed to the entry point of the function being called (SVCs had the SVC number in the instruction). R15 contained the return code on return.

The details have not left my brain. (During a project where I spent 20 hours a day writing assembler code, I had a recurring dream that I was a PSW loading instructions; hence my IRC nick is usually PSW.)

> There were several such projects. A friend (who is from Argentina) worked on one of them, which was to add a few chassis of blade servers to the Shark and use it for "cheaper" CPU power. The project was called "Tiburon", which is Spanish for shark. The reason I say "cheaper" is that mainframes are famously expensive (typically millions of dollars), and if you measure them purely in terms of CPU power, they are exceedingly overpriced. But they are not sold as CPU servers; they are sold for doing IO and for being ridiculously, laughably reliable. So when several IBM applications (including DB2) needed extra CPU power, for example for doing complex SQL queries close to the disk drive hardware, the easiest way to get it was to install a few hundred thousand dollars' worth of (PowerPC or Cell engine) CPUs right next to the disk drives.

> The underlying philosophy is that mainframes have a long history of "outsourcing" work to other parts of the system. The concept of the "channel" is that much of the IO is not actually done by the CPU, but by channel processors, which run independently, are interrupt driven, and deposit results of IO directly in memory via DMA, or talk to each other. For example, to do a disk-to-tape copy, the CPU is not really involved: it orders two channels to work together and figure it out among themselves, and report back in an hour or so. This is why in the 90s and 00s, the mainframe was capable of adding enormous network bandwidth, because its network cards were not just simple "ethernet", but they contained much of the stack, to upper protocol layers. They were also physically bigger, used more power, had more CPU power, and were much more expensive than whole Intel-based PCs.

Yup. Writing channel programs using EXCP was another adventure.

There were no "network cards." The 3705 and 3725 were computers in their own right. (I never got the chance to work on a 3745.) SNA's (Systems Network Architecture, which we called Still No Architecture when it was first announced) philosophy was to offload as much of the communication detail as possible to the 37x5 controller. Channel-attached 3270s were still a thing, though, and were blindingly fast.

> From that tradition to adding outboard CPU assists is only a small step. In the meantime, the rest of the industry also caught up, with things like intelligent/active disks, NASD, RDMA, HBAs that execute scripts, and so on. Today every cloud server has network cards that do most of the stack (up to the https layer) without bothering the CPU.
TLS offload, TSO, LRO, and checksum offload are the ifconfig options that control this. Tangentially, TSO and LRO play mind-bending tricks on in-kernel firewalls, which is why they should be disabled when using ipfw, pf, or ipfilter.

Another technology we think of as new is virtualization. Virtual machines were a thing with CP/67 and VM/370, and they did it more efficiently than our Intel processors do today. The IBM mainframe architecture's userland instructions didn't alter the state of the CPU behind the hypervisor's back the way Intel's userland instructions could (PUSHF and POPF are good examples). Any machine instruction that altered CPU state a hypervisor needed to know about was privileged. Not so in the Intel world. Popek and Goldberg wrote a paper about this.

In many ways the IBM mainframe architecture is superior to the ubiquitous machines we have sitting in front of us. In other ways it was/is inferior, for instance in its lack of atomic stack instructions.
 
Before I retired, about 10 years ago, IBM disk drives had gotten so smart that you could have a volume that backed itself up to the cloud without ever getting the CPU involved (except for the initial setup, of course).

In the '90s, when I was working on getting a security certification on some mainframe software, and had to get into low-level details, I ran into a really cool instruction called Start Interpretive Execution. It was the basis of virtual machines. You set up some tables, issue the SIE instruction, and the CPU does the work of virtualization, with very little overhead.

Way back in the day, the 370 could offload work via fake tape drives. There was a company, Data Pathing Incorporated, whose equipment we used where I worked. IIRC, they were shop-floor terminals (I worked at a truck manufacturer). But to the 370, they looked like two tape drives, one that was always being written to, and one that was always being read from. (I suspect that they used EXCP to talk to the fake tape drives, so they didn't have to worry about tape label processing and such, though maybe specifying LABEL=(,NL) would have done the job; it's been a long time.)

A funny thing about writing EXCP code. I used to be pretty good at it, and I was transferred from the computer security group I was working for to a DB2 tools group, because they had a program that did low-level I/O, and no one on the team could do that. So I fixed their problem, and worked there for another 15 years without ever touching any EXCP code again. (At least I got some RDBMS experience out of it; I had grown up with IMS, which was hierarchical.)
 
> Before I retired, about 10 years ago, IBM disk drives had gotten so smart that you could have a volume that backed itself up to the cloud without ever getting the CPU involved (except for the initial setup, of course).

> In the '90s, when I was working on getting a security certification on some mainframe software, and had to get into low-level details, I ran into a really cool instruction called Start Interpretive Execution. It was the basis of virtual machines. You set up some tables, issue the SIE instruction, and the CPU does the work of virtualization, with very little overhead.

> Way back in the day, the 370 could offload work via fake tape drives. There was a company, Data Pathing Incorporated, that they used where I worked. IIRC, they were shop-floor terminals (I worked at a truck manufacturer). But to the 370, they looked like two tape drives, one that was always being written to, and one that was always being read from. (I suspect that they used EXCP to talk to the fake tape drives, so they didn't have to worry about tape label processing and such, though maybe specifying LABEL=(,NL) would have done the job; it's been a long time.)

That was DFSMS. One would write storage rules based on various criteria, essentially overriding DD statements, TSO ALLOCATE commands, and SVC 99.

DFSMS will use DFHSM to migrate datasets (files) to archive storage. Level 1 archive storage migrates to level 2 storage, which might have been tape back in the day or virtual tape today.

> A funny thing about writing EXCP code. I used to be pretty good at it, and I was transferred from the computer security group I was working for to a DB2 tools group, because they had a program that did low-level I/O, and no one on the team could do that. So I fixed their problem, and worked there for another 15 years without ever touching any EXCP code again. (At least I got some RDBMS experience out of it; I had grown up with IMS, which was hierarchical.)
IMS was fast.

The company I worked for at the time wrote a DB2 performance reporter for Boole & Babbage (I was the only guy on the project). It reported on IRLM (IMS Resource Lock Manager, essentially a huge hash table) performance. IRLM was designed for IMS, but IBM also used it for DB2 to manage locks. The reporter was written in assembler, and its output was stored in a DB2 database. Making DB2 calls from assembler was a funky process. It wasn't my idea to write it in assembler; it was my manager's. It was marketed by Boole for a while, until their company and ours folded. That was probably my most fun job.
 
IMS is still fast, and while it doesn't have the customer base that DB2 has, its customers are large organizations that need to process tons of transactions with blinding speed (and, I suppose, don't want to take the plunge and go to TPF).

I wish there were an open-source hierarchical database, but I have yet to find one.

I remember Boole & Babbage. The first shop I worked for used one or more of their products, but I cannot remember which ones.
 