Setup of RAID10 (RAID0 stripe of two RAID1 mirrors) on FreeBSD 10.1

DutchDaemon

Administrator
Staff member
Administrator
Moderator
Developer
Just a quick and unceremonious write-up of an installation I performed just now. Substitute device names at your own leisure. These are four 4 TB disks (ada0-ada3) in a QNAP. Note that these disks only constitute a dedicated RAID10 storage pool. The OS runs from a separate disk (USB in this case) and mounts the storage pool.
Code:
# load your kernel modules
kldload geom_label
kldload geom_mirror
kldload geom_stripe

# if necessary: clear any old partition data from the start of each disk
dd if=/dev/zero of=/dev/ada0 count=2
dd if=/dev/zero of=/dev/ada1 count=2
dd if=/dev/zero of=/dev/ada2 count=2
dd if=/dev/zero of=/dev/ada3 count=2

gpart create -s gpt ada0
gpart create -s gpt ada1
gpart create -s gpt ada2
gpart create -s gpt ada3

# RAID1 mirror ada0+ada1
gpart add -t freebsd-ufs -l ada0data ada0
gpart add -t freebsd-ufs -l ada1data ada1

gmirror label datastore01 /dev/gpt/ada0data /dev/gpt/ada1data

newfs -U /dev/mirror/datastore01

## ONLY FOR MIRROR TEST
## echo '/dev/mirror/datastore01 /data1 ufs rw,noatime 1 1' >> /etc/fstab
## mkdir /data1
## mount /data1
## REMOVE ABOVE AFTER TEST

# RAID1 mirror ada2+ada3
gpart add -t freebsd-ufs -l ada2data ada2
gpart add -t freebsd-ufs -l ada3data ada3

gmirror label datastore02 /dev/gpt/ada2data /dev/gpt/ada3data

newfs -U /dev/mirror/datastore02

## ONLY FOR MIRROR TEST
## echo '/dev/mirror/datastore02 /data2 ufs rw,noatime 1 1' >> /etc/fstab
## mkdir /data2
## mount /data2
## REMOVE ABOVE AFTER TEST

# RAID0 from both RAID1 mirrors

gstripe label -v datastore /dev/mirror/datastore01 /dev/mirror/datastore02

newfs -U /dev/stripe/datastore

echo '/dev/stripe/datastore /data ufs rw,noatime 2 2' >> /etc/fstab
Et voilà:
Code:
mkdir /data
mount -a
df -h | grep datastore

/dev/stripe/datastore  7.0T  8.0K  6.5T  0%  /data
In /boot/loader.conf:
Code:
geom_label_load="YES"
geom_mirror_load="YES"
geom_stripe_load="YES"
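A quick sanity check of the resulting GEOM topology (a sketch, not part of the original steps, using the device names above):
Code:
# both mirrors should report COMPLETE and the stripe UP
gmirror status
gstripe status

# labelled GPT partitions on the four disks
gpart show -l ada0 ada1 ada2 ada3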
 
I wonder if someone can shed some light on this issue (which came up somewhere else).

My question is "Why use gmirror, now that we have ZFS for dealing with arrays of large drives?"

When a drive exceeds the 2 TB limit of MBR, you end up with an issue where GPT and GEOM both want to store metadata at the end of the disk, right? (If it helps people follow: MBR labels can't cope with drives >2 TB, and the newer GPT scheme writes its backup table to the last blocks of the drive, which is exactly where GEOM, the provider for gmirror, stores its metadata too.) Anyway, conventional wisdom was: don't mix GEOM and GPT on drives larger than 2 TB, and why should you, because with ZFS the world is wonderful! This was definitely an issue with FreeBSD 9, and I got scared and have been using ZFS ever since.

The GEOM/GPT work-around was, as I understand it, to gmirror partitions rather than the entire disk. I can think of a few problems with that, like what happens if they all re-build at once?
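(For illustration only, not from the original post: the clash and the work-around side by side, using the disk and label names from the write-up above.)
Code:
# clashing layout: GPT on the raw disk *and* gmirror on the same raw disk;
# the backup GPT table and the gmirror metadata both want the disk's last sector(s)
gpart create -s gpt ada0
gmirror label datastore01 /dev/ada0 /dev/ada1

# work-around: mirror GPT partitions instead; the gmirror metadata then lives in
# the last sector of the partition, which ends before the backup GPT table
gpart create -s gpt ada0
gpart add -t freebsd-ufs -l ada0data ada0
gmirror label datastore01 /dev/gpt/ada0data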

However, I don't happen to like ZFS, so this interests me. In particular, I like to mount drives where I want them, and to know on which drive a particular file is stored.

But I can't see how you're avoiding a clash between the gstripe metadata (still stored in the last sector, is it not?) and the secondary GPT table that gpart creates there. I guess if you've got this working, it's working (and it may or may not work on <10.1). Or are you getting messages about the secondary GPT header in the log? Or is this now a non-issue with FreeBSD 10 and the documentation needs to catch up?

This is a very low-cost, low-spec, low-energy setup that needs 'reasonably high availability and redundancy', but whatever is on those disks is ultimately replaceable (temporary/rolling backups, some FTP repositories, and some quick and temporary file shares), so it doesn't have to be nuke-proof. UFS is fine, trusted, and easy on resources. The metadata issue was not on my mind when I created this quick setup, but I have used striped partitions to good effect before, so if this is still an issue (and I'd be interested to get confirmation either way), I'll split the single 6.5 TB /data up into four separate and striped /data[0123] ones without too much hassle. Thanks for the heads-up though.

Any recent developments or insights?
 
Which issue? There are a bunch of them listed.

The 2TB problem is only a problem with MBR: MBR uses 32-bit sector addresses, so with 512-byte sectors it cannot address anything beyond 2 TB. The easy solution is, of course, to stop using MBR. Larger disks are not a problem with GPT.

The metadata is kept in the last block of the logical device. The only problem is a conflict with the backup GPT table on whole devices. The backup GPT table is written to the end of the physical drive. Mirror or stripe configurations that use GPT partitions work, because the metadata for them is inside that partition.
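If you want to verify that on a live system, the metadata locations can be inspected directly (a sketch, using the provider names from the write-up above):
Code:
# gmirror/gstripe metadata sits in the last sector of the provider it was
# labelled on -- here, inside the GPT partitions and the mirror devices
gmirror dump /dev/gpt/ada0data
gstripe dump /dev/mirror/datastore01

# the backup GPT table lives at the very end of the physical disk,
# outside any partition
gpart show ada0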
 
Sorry, should have edited that down to the gist, I guess. Yeah, the metadata clash was what I was asking about, and I understand that that's not a point of worry. Thanks for that.
 