Build a High Performance Computing [HPC] cluster using FreeBSD

Hi all,
I am going to build a High Performance Computing cluster based on FreeBSD 10.1 and I don't know where the starting point is for this topic. Does anyone have guidelines? I have been searching the internet at length and can't find anything useful.
 

Define what High Performance Computing means for you. I do HPC as my day job. Linux, or more precisely RHEL, is currently more or less the only choice. The ROCKS 6.2 cluster distribution is based on CentOS 6.6. NVIDIA releases CUDA and Tesla GPU drivers for Linux, Mac, and Windows, but most people run Linux. MATLAB is likewise available only on those three platforms, and is predominantly used on RHEL or Ubuntu. Ubuntu is coming on strong, but the lack of a cluster distribution based on Ubuntu is hurting it.

What else? There are no proprietary high performance compilers for the BSDs. Hadoop and Spark are more or less Linux-only technologies.


The only HPC things I can think of that could be run on FreeBSD would be Python, R, C (with the GNU Scientific Library, GSL), C++, and Julia.
 
Hello Hosney Osman,

If you don't require any of the gimmicks [1] Oko mentioned, then FreeBSD might be a good choice for you. I'm not aware of any guides specific to HPC on FreeBSD, but HPC is a vague term, so if you could describe more of what you want to accomplish, maybe we can help out and point you to useful information. Much of what you likely need is described in the Handbook.

[1] Oko enjoys occasional lighthearted poking.
 
This one is old, but I used to have links to a couple of sites that talked about HPC specifically; I'd bet you're already aware of those. And then there's all the Beowulf stuff.
That is really old stuff. More than 10 years ago the NetBSD guys were also talking about HPC and bringing in some clustering software. At that time MATLAB ran on both FreeBSD and NetBSD via Linux emulation. I think the Portland Group compiler set had a FreeBSD edition. But that was more than 10 years ago. At that time MATLAB had an official release for Solaris, and many university labs had Sun hardware and ran the older Sun scientific computing infrastructure. Those times are long gone. Linux is now the only kid on the block in this business.
 
Thanks Oko for your reply.

My understanding now is that an HPC [High Performance Computing] system is built for a specific purpose. That was not my idea before; I thought it was something generic that could be built for any reason, whatever application ends up running on it.

What I am going to do:

Build a BI [business intelligence] stack: Oracle Database, WebLogic as the application server, and Apache as the web server.

All of my reports will be generated using PL/SQL.

So my understanding was that clustering is something handled by the operating system, unrelated to what kind of application runs on it.

So my questions:

1) Is there anything I can use to build an HPC [High Performance Computing] system for any purpose, or does it have to be built for a specific purpose?

I want to install anything and everything on it:

I am not going to build one HPC system for Oracle DB, another for MySQL, another for Cassandra, and yet another for Hadoop.

2) Is HPC something handled by the operating system, or does the application I want to use also have to support HPC?

3) Do you have any reference, article, or FAQ that explains the idea behind this topic? I have a lot of questions I need answered in order to understand it.
 
The main constraint on running FreeBSD on your cluster is compatibility with third-party software. Because Linux is more popular, developers/vendors/companies often only target Linux. Sometimes they write portable code and the application can be ported to FreeBSD. Sometimes the code depends heavily on specific features of Linux and porting to FreeBSD is difficult. If you need the Oracle applications you mentioned above, FreeBSD is probably not the best choice for you. If you are flexible and can try alternatives like databases/postgresql94-server (I don't know enough about WebLogic to suggest an alternative), then FreeBSD could work. A good source for discovering third-party software that's ready to run on FreeBSD is FreshPorts.
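
If you do go the PostgreSQL route, installing and enabling it from packages is only a few commands. A rough sketch (the version in the package name and the service steps are illustrative and may differ on your release):

pkg install databases/postgresql94-server    # install the server from the binary package repository
sysrc postgresql_enable=YES                  # enable the service in /etc/rc.conf
service postgresql initdb                    # create the initial database cluster
service postgresql start                     # start the server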
 
Thanks Jrm for your support. Do you have any answers to my questions?
 
Thanks Beastie7 for your reply. This is not my choice; it's a business decision, and I have to support the business.

Hosney, there is experimental support for GlusterFS. I'm not sure how stable it is, but you might want to keep an eye on its progress. Unfortunately, FreeBSD doesn't have its own way of consolidating resources for distributed computing (HPC stuff). Linux has Lustre, GlusterFS, Hadoop, and all that jazz.

There was an OpenZFS office discussion with one of the FreeBSD Foundation board of directors (Justin Gibbs) about adding clustering capabilities to the DMU layer in ZFS. When that will happen is up in the air, though.
 
Oracle is a no-go on FreeBSD. Based on your description, you will be wasting your time with FreeBSD unless you have a couple of FreeBSD developers on your staff who can add the desired functionality.
 
I think this is the right place to continue the conversation on this forum. I want to build a service for lab environments at the university where I will study, for math, physics, and other fields of study. For me, FreeBSD is not just one choice or option among many; I want to teach the FreeBSD ecosystem to my friends and to anyone who wants to learn it.

Thanks.
 
I've maintained both FreeBSD and RHEL/CentOS HPC clusters for 10 years, serving a wide variety of disciplines (biology, business, chemistry, engineering, physics, political science, psychology, etc).

Some takeaways from that experience:

Many technologies in HPC and HTC (high throughput computing) are vastly overhyped solutions looking for problems. People hear about the next big thing in parallel computing and assume everybody needs it to compete, but that's never the case. CUDA is a great example. It's indispensable for a few niche applications (e.g. machine learning) where it provides vastly better performance and cost/performance, but it's of no use to the vast majority of HPC users. Programs have to be completely rewritten to use CUDA and there's no justification for doing that in most cases. Our main general-purpose cluster has ~2000 CPU cores across 100 nodes that have averaged ~70% utilization and 2 GPU nodes with two boards each which sit idle much of the time. I have some colleagues in physics with their own cluster who invested in several GPU nodes that never once got used and ended up scrapped after sitting idle for ~5 years.

Machine learning itself is a software solution looking for problems. Again, indispensable to a small fraction of the population, but most people have no use for it.

Parallel filesystems are only useful on large clusters, or if unusually I/O-intensive jobs are run frequently. Most HPC jobs are CPU- and memory-bound. That said, FreeBSD has PFS options now if you need them. Gluster and Ceph are in the ports tree. Sites I'm aware of that are really serious about parallel I/O use an appliance such as Panasas, NetApp, or Isilon (all of which are based on FreeBSD, BTW).
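
If you want to try either of those, a quick sketch from the package side (exact package names change over time, so check what is available first):

pkg search glusterfs    # see which GlusterFS packages are currently available
pkg search ceph         # likewise for Ceph
pkg install glusterfs   # package name shown is an example; install whichever variant you need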

Don't get swept up in the hype about "cool" new technologies. Find out if they're really useful to you before making any decisions.

We run CentOS on our big clusters primarily due to the need for mature Infiniband drivers for MPI apps and support for commercial software. If you need to run one or a few closed-source Linux binaries, FreeBSD might be a good choice. The Linux compatibility module works fine for most scientific apps. It's just a matter of installing the right Linux shared libs, same as on RHEL/CentOS. It only tends to have issues with system software that uses more esoteric Linux system calls. If you have a wide variety of closed-source software, it's probably easier to run CentOS. I did experiment with FreeBSD Infiniband and it was almost ready for prime-time as of about a year and a half ago. I built one FreeBSD+ZFS file server in our CentOS cluster and it performed about as well as the CentOS servers (better in some benchmarks, slower in others). The Infiniband driver was still a bit glitchy at that time, but maybe that's been fixed by now.
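
For anyone who wants to try the Linux compatibility route, the basic recipe looks roughly like this on a recent release (the CentOS base package and the 64-bit module are examples; older releases only have the 32-bit linux.ko):

kldload linux64               # load the 64-bit Linux ABI module (linux.ko on older releases)
sysrc linux_enable=YES        # load it automatically at boot
pkg install linux_base-c7     # CentOS 7 userland; add whatever extra Linux shared libs the binary needs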

Outside engineering, the vast majority of the HPC software is open source and much of it is already in FreeBSD ports. There are a few challenging apps (e.g. SRA Toolkit) but most of them become more portable over time. Qiime comes to mind. It was a pain to port to FreeBSD 6/7 years ago, but Qiime 2 got a new, very clean design and it's now trivial to port all of the modules I've tried. FreeBSD ports makes it trivial to deploy the majority of popular scientific software. Contrary to popular belief, the ports tree has a lot of scientific software already in it, and the vast majority of dependencies for apps that aren't yet ported.
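
Browsing what is already packaged is easy enough from the command line; a small sketch (the port origin below is just an example):

pkg search blast                   # search the package repository for a given tool
pkg install biology/ncbi-blast+    # install by package name or by ports origin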

I've generally found FreeBSD better at holding up under load. We frequently had CentOS compute nodes and file servers freeze up while the FreeBSD cluster handled the same jobs without a hiccup. I've narrowed my focus to bioinformatics now and do all my current work on a FreeBSD cluster.

I found the canned cluster-management suites like Rocks to be buggy and incomplete in terms of routine management tasks like user-management, OS updates, etc. Rocks actually prevented us from installing critical security updates during our brief test run with it.

So we developed our own portable cluster-management tools for maintaining both the FreeBSD and CentOS clusters:

http://acadix.biz/spcm.php

I just committed this to FreeBSD ports. It's still alpha quality, but gradually progressing. It should be good enough for anyone interested in playing with a FreeBSD HPC cluster. I do most testing under VirtualBox using a NAT network (with DHCP disabled, so the head node can provide DHCP and PXE install services).
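
For anyone who wants to reproduce that kind of test setup, the VirtualBox side looks roughly like this (the network name, address range, and VM name are made up for the example):

# Create a NAT network with VirtualBox's own DHCP turned off, so the head node
# can provide DHCP and PXE install services to the compute nodes:
VBoxManage natnetwork add --netname spcm-test --network "192.168.100.0/24" --enable --dhcp off

# Attach a VM's first NIC to that NAT network:
VBoxManage modifyvm headnode --nic1 natnetwork --nat-network1 spcm-test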
 
In theory, running a large cluster with FreeBSD might be possible.

In practice, 100% of the 500 largest supercomputers in the world run Linux. In addition, there are a few supercomputers that are considerably larger than those listed in the Top500 list (they are either commercial installations or national security systems), but all the ones I know about also run Linux.

Let me repeat that statement, just so it sinks in: There is no OS other than Linux in the ~700 or 800 largest supercomputers in the world. None, zip, zilch.

That means that anyone who proposes to run a different OS on these machines will have to have really good arguments. In particular, these computers are not cheap (the cost of a Top500 machine starts in the millions of dollars), so doing something that endangers the productivity of a multi-million-dollar machine, such as running a highly unusual operating system, will require extremely good and well-tested arguments.

For smaller installations, experimenting with FreeBSD might be a good idea and feasible.
 
I don't think anyone here is talking about building a top 500 cluster. For something huge, the needs will usually be highly diverse and a RHEL-based platform will be the best choice in order to support commercial apps, CUDA, etc. as I mentioned. For smaller clusters with a narrower focus, people have a lot more freedom to choose based on the merits of various platforms for their specific needs. FreeBSD's reliability and ports system make it a much better choice for some settings. Bioinformatics is a good example. Most of the software is open source and easily runs on FreeBSD. I mentioned that the SRA Toolkit is difficult to port (due to a messy, esoteric build system), but the Linux binaries also run fine on FreeBSD.

RHEL/CentOS dominates big clusters for good reasons, but looking at open source scientific software deployment on these platforms is like watching the Three Stooges. It's a mish-mash of non-reproducible cave-man installs, containers, 3rd-party repos like EPEL, etc. I've dealt with some upstream developers who rejected simple patches that would allow their code to build with the native RHEL GCC, purely out of ego (they think delegating copy constructors and lambda functions are too cool not to use). Countless man-hours have gone to waste struggling with installs a FreeBSD user can do in 2 seconds with "pkg install". This delays or even prevents important research on a regular basis. I've seen it happen more times than I can count. We avoided most of the chaos by using pkgsrc on our CentOS clusters. It can pull in its own modern compilers as dependencies and makes installs easily reproducible, not to mention collaborative on a global scale.
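
For anyone curious, bootstrapping pkgsrc as an unprivileged user on a Linux cluster goes roughly like this (paths and the example package are illustrative):

# Get the pkgsrc tree and bootstrap it into a private prefix (~/pkg by default),
# so no root access is needed on the cluster:
git clone https://github.com/NetBSD/pkgsrc.git
cd pkgsrc/bootstrap
./bootstrap --unprivileged

# Then build something with the bmake that bootstrap installed:
cd ../math/R
~/pkg/bin/bmake install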
 
I just offered a strategy to develop FreeBSD and the other BSDs both together and separately, each with their own approaches. Since we are in the FreeBSD domain, let's get started with the basics and fundamentals. I will take responsibility for bringing teams together on their own platforms, with their own hardware, data centers, and staff.
 
I should also point out that an HPC cluster doesn't have to be a monoculture of one operating system. I experimented with a FreeBSD file server in one of our CentOS clusters and a FreeBSD visualization node in another. The latest visualization apps, terminal servers, etc. are a challenge to install on Enterprise Linux with its outdated tools and libraries. One could also have Linux compute nodes in a predominantly FreeBSD cluster for running CUDA or other proprietary software while leveraging FreeBSD ports for managing open source apps on most of the nodes. It's just a matter of using a compatible version of the job scheduler. SPCM has limited support for this already. It leverages auto-admin as a compatibility layer for sysadmin tasks like running updates, managing users, etc. so OS differences are opaque to the core SPCM scripts.
 