Other Recommended file system for virtualization cluster

alphachi

Member

Reaction score: 7
Messages: 48

Environment: FreeBSD 11.1-RELEASE, 2x 10Gb Ethernet (LACP), 2x 8Gb FC (multipath)

I want to build a FreeBSD virtualization cluster with 5+ hosts. Because bhyve doesn't support live migration at this time, Xen (Dom0) is the only option.

The simplest topology looks like:
Code:
 Server2       ...       Server9
---------   ---------   ---------
NFS(50TB)   NFS(50TB)   NFS(50TB)
     \          |          /
      \         |         /
       \        |        /
        \___ZFS(50TB)___/
            ---------
             Server1 ---FC--- SAN
Because live migration is required, this topology has two main drawbacks:
1) VMs can't use a zvol as the backend, which causes a significant performance loss.
2) The only NFS server (Server1) becomes a single point of failure.

The ideal topology looks like:
Code:
Server1   Server2     ...     Server9
-------------------------------------
   50TB (some kind of file system)
   |         |         |         |
   |         |         |         |
   |         |         |         |
   FC        FC       FC        FC
    \         \       /         /
     \         \     /         /
      \_________\   /_________/
                 SAN
What kind of shared-disk file system should I deploy? Something like OCFS2 on Linux?

Thanks!
 

Oko

Daemon

Reaction score: 770
Messages: 1,620

Let me make sure that I understand your question correctly. You want us to explain to you how to build a FreeBSD virtualization cluster by running Xen Dom0 on FreeBSD (Xen Dom0 is not ported to FreeBSD) and utilizing Oracle's proprietary OCFS2 (which runs only on Linux) on FreeBSD? Unfortunately I, as a long-time BSD user, can't answer that question. However, I can offer you at least two alternative comments on your original post.

  1. You are just bored so you are trolling. I can assure you there are much better ways to spend your time.
  2. You just read an Oracle PR document about OCFS2 but you are technologically challenged and don't realize that what you are asking is technically not feasible.
 

ralphbsz

Daemon

Reaction score: 968
Messages: 1,560

What SAN or cluster file systems are available on FreeBSD? On Linux, there are several free ones: OCFS, Gluster, Ceph, Lustre, and perhaps a few others (not all with a full POSIX interface). In addition, there are several commercial ones. I don't know which ones are available on FreeBSD.
 

alphachi

Member

Reaction score: 7
Messages: 48

Hi, Oko.

Xen Dom0 is not ported? Do you mean I read a fake handbook?

Why do you think I'm trolling? Did I say I want to use OCFS2 on FreeBSD? I'm just looking for some kind of shared-disk file system on FreeBSD that can be “parallel-mounted” by multiple hosts over FC, and OCFS2 is only an example of a shared-disk file system on Linux.
 

alphachi

Member

Reaction score: 7
Messages: 48

From wikipedia.org:

Clustered file systems include:
  • Shared-disk file systems: like Lustre and OCFS2
  • Distributed file systems: like Ceph and GlusterFS
  • Network-attached storage: like NFS and SMB
Because I want to mount the same file system on multiple hosts over FC at the same time, a shared-disk file system on FreeBSD is what I need.
 

ralphbsz

Daemon

Reaction score: 968
Messages: 1,560

How many disks do you have? What type are they? You say that you have 50TB. Assuming the default near-line SAS or SATA disks in the 3.5" form factor, that could be as few as 5 disks.

Here is why I ask: You say you are using 2 times 10-gigabit Ethernet. That's roughly 2 Gigabyte per second. The throughput you get out of one 3.5" near-line disk in real-world applications is typically no larger than 100 Mbyte/sec. In theory, with very large files and single-file workloads, they can go to 1.5x or 2x that, but in practice that's unlikely to be achieved, because with reasonable file sizes, the mixing of data and metadata, and multiple workloads, there will be seeks (meaning a partially random workload). If you really only have 5 disks, then the best throughput you can realistically hope for is 0.5 Gigabyte/sec, or a quarter of what your Ethernet can do. So you are not going to get better performance by going to a cluster file system and having the 5 clients access the storage directly. Now, if your 50TB is in 50 disks, or if you have a lot of SSDs, that's a different story.
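The estimate above can be checked with a few lines of arithmetic. This is only a rough sketch: the function names are made up for illustration, and the 0.8 usable fraction for Ethernet is an assumption chosen to match the "roughly 2 Gigabyte per second" figure (raw line rate for 2 x 10 Gbit/s is 2.5 Gigabyte/sec).

```python
# Back-of-the-envelope comparison of disk throughput and network throughput.
# Assumptions (not measurements): 100 MB/s per near-line disk, and about
# 80% of the raw Ethernet line rate usable after protocol overhead.

def disk_throughput_gb_per_s(n_disks, mb_per_s_per_disk=100):
    """Aggregate disk throughput in GB/s, assuming perfect scaling."""
    return n_disks * mb_per_s_per_disk / 1000


def network_throughput_gb_per_s(n_links, gbit_per_link=10, usable_fraction=0.8):
    """Usable network throughput in GB/s (8 bits per byte)."""
    return n_links * gbit_per_link / 8 * usable_fraction


if __name__ == "__main__":
    network = network_throughput_gb_per_s(2)
    for n_disks in (5, 50):
        disks = disk_throughput_gb_per_s(n_disks)
        limit = "disks" if disks < network else "network"
        print(f"{n_disks} disks: {disks:.1f} GB/s vs network {network:.1f} GB/s "
              f"(bottleneck: {limit})")
```

With 5 disks the disks are the bottleneck, so bypassing the NFS server gains nothing; with 50 disks the network becomes the limit, which is the "different story" case.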

With 5 clients, it's not easy to make the argument that a cluster file system or SAN file system will give you better performance.

Now on the reliability: You seem to be worried about the NAS server being a "single point of failure". But in that view you are only considering hardware failure. Now think about cluster file systems. What is their uptime, and how good is their availability? In particular when managed by people who don't have a lot of experience? Cluster file systems tend to be a bit complex, and software systems are far from perfect. In my experience, if you look at outages of large cluster installations, the first 90% are caused by human error of administrators and maintenance personnel. Of the remaining 10%, 9% are caused by software problems. Only the last 1% are caused by hardware problems.

Let me try to describe the situation in two jokes. The first one: "How many cluster storage administrators does CERN have (a major cloud and HPC site)? About 10, and they all have PhDs." The second joke comes from a former colleague, when arguing about the availability of our product: "We should not aim for 5 nines, but instead for 9 fives."
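To put numbers on the second joke, here is a small sketch that converts an availability figure into expected downtime per year. The function name is hypothetical, and a 365-day year is assumed.

```python
# Convert availability (a fraction between 0 and 1) into downtime per year.
# Assumes a 365-day year; purely illustrative arithmetic.

HOURS_PER_YEAR = 365 * 24


def downtime_hours_per_year(availability):
    """Expected hours of downtime per year for a given availability."""
    return (1 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    five_nines = 0.99999      # "5 nines": 99.999%
    nine_fives = 0.555555555  # "9 fives": 55.5555555%
    print(f"5 nines: {downtime_hours_per_year(five_nines) * 60:.1f} minutes/year")
    print(f"9 fives: {downtime_hours_per_year(nine_fives) / 24:.0f} days/year")
```

Five nines allows only a few minutes of downtime per year, while nine fives means the system is down for months, which is the point of the joke.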

What I'm really saying is this: You might want to analyze your proposed plan some more.
 

alphachi

Member

Reaction score: 7
Messages: 48

Thanks for your input.

50TB is the size of the final LUN. The storage device has two cabinets with double-live-data technology, like hardware RAID1. Each cabinet consists of many groups of hardware RAID50 and hot spares with 900GB 10K SAS disks. The storage device is cross-connected to two SAN switches with 4 fibers.

In fact, I'm the only administrator, so I'll do my best to avoid any human error.

Many business-critical services will be deployed on the cluster, so the vital feature is minimal downtime. For example, if a Dom0 needs a reboot after freebsd-update, all DomUs on it can first be live-migrated to another Dom0.
 

SirDice

Administrator
Staff member
Administrator
Moderator

Reaction score: 7,182
Messages: 29,471

Xen Dom0 is not ported? Do you mean I read a fake handbook?
What the handbook unfortunately fails to mention is that it's highly experimental, has lots of caveats and is in no way production-worthy. I wouldn't want to run any production application on it, let alone business-critical applications. Test systems may be fine.
 

alphachi

Member

Reaction score: 7
Messages: 48

That sounds awful.

No Xen and no shared-disk file system, so does it mean I have to deploy bhyve and save VM disk files to NFS like the first topology?
 

vermaden

Son of Beastie

Reaction score: 1,119
Messages: 2,727

You can use oVirt for that - https://www.ovirt.org/ - with GlusterFS as the file system.

You can also use PetaSAN - http://www.petasan.org/ - for a highly available backend.
 