UFS create degraded raid 5 with 2 disks on freebsd

gman

New Member


Messages: 5

I am trying to create a new RAID 5 raidset (or equivalent) on FreeBSD using just 2 disks. I understand this would essentially be a degraded RAID 5 raidset, and I'm completely fine with that for my situation. I'm able to accomplish this on CentOS with the mdadm command, but unfortunately that command is not available on FreeBSD. I have tried using gvinum on FreeBSD, but it always seems to error out whenever I try to use it. With mdadm I do get an error, but it goes ahead and creates the md device, which I am then able to use and put data on. So does anyone know of a way I can achieve this on FreeBSD, similar to how I did it with mdadm on CentOS?

FWIW, this is only an interim fix for a couple of specific cases within our environment. In all cases we will be adding a third disk to the raidset once we (hopefully) get this working on two.
 

SirDice

Administrator
Staff member
Moderator

Reaction score: 8,057
Messages: 31,638

Create a fake file-based device using mdconfig(8). Use that fake device as the 'third' disk. Remove it once the RAID5 has been set up.
 
gman

SirDice said: "Create a fake file-based device using mdconfig(8). Use that fake device as the 'third' disk. Remove it once the RAID5 has been set up."
Would that "fake" file-based device actually need to be the same size as the other 2 disks, or can it be thin-provisioned? I ask because we're using 16TB drives (and some systems may have 32TB drives).
 

SirDice


It has to be the same size. Note that a "thin-provisioned" file (i.e. a sparse file) still has a large file size; it just doesn't occupy all that space on disk.
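To illustrate the difference between file size and space on disk, here is a small sketch that can be run in a scratch directory. The file name and the 1G size are arbitrary choices for the example, not anything from this thread; the same idea applies to a file as large as the real disks.

```shell
#!/bin/sh
# Create a sparse file and compare its apparent size with the space it occupies.
dir=$(mktemp -d)
img="$dir/sparse.img"            # hypothetical file name, for illustration only

truncate -s 1G "$img"            # sets the file length without writing any data

apparent=$(ls -l "$img" | awk '{print $5}')       # file length in bytes
allocated_kb=$(du -k "$img" | awk '{print $1}')   # space actually allocated, in KiB

echo "apparent size: $apparent bytes"
echo "allocated:     $allocated_kb KiB"

rm -r "$dir"
```

The first number is the full 1G length, while the second is normally close to zero on filesystems that support sparse files.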
 
gman

Any ideas on whether there's a way to do it with just the two disks, without the "fake" file? Below is a little more context.

These are our customers' systems, which currently have just two 16TB disks in a RAID 1 mirror (and the disks are now full). They want to add only one additional 16TB drive to the system, but also have the mirror converted from RAID 1 to RAID 5. I can get the two disks by using the new one we're shipping them and stealing one from the RAID 1 mirror. I would then rsync the files from the remaining RAID 1 disk to the RAID 5 consisting of the other two.
Unless I'm misreading, it looks like the file created with mdconfig would need to be large enough (both in file size and in space on disk) to accommodate the data being transferred to the RAID 5, which would be 16TB. Thus that method wouldn't work without more disk space from somewhere.

We would like to avoid shipping them an additional disk even if it'd just be temporary.

I've made this whole scenario work on CentOS, but our development team would like to stay with FreeBSD.
 
gman

If I were to use ZFS would that make this situation any more doable on FreeBSD?
 

PMc

Aspiring Daemon

Reaction score: 180
Messages: 513

In theory: yes - with ZFS there is a possible way to do this.
In practice: no, because the implementation is broken.

Method:
We have a two-way mirror, and we add a third disk of the same size.
Now we can create a three-way raid5 of half the intended size, and the data will exactly fit on that:
We break the mirror,
we leave the data on disk0,
we create a raid from one half of disk1, the other half of disk1, and half of disk2.
We copy the data to that raid. (Don't ask about the performance of that operation; it will probably be bad.)
Then we replace the volume on the second half of disk1 with half of disk0 (this will also be quite slow).
Finally we grow the raid to the intended size.
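The capacity reasoning behind this method can be checked with a little arithmetic. This is only a rough sketch: the 16 TB disk size is taken from gman's posts, the variable names are made up for illustration, and ZFS metadata and padding overhead are ignored, so a real pool will hold somewhat less than the figures printed here.

```shell
#!/bin/sh
# Rough capacity check for the half-disk raidz method (ignores ZFS overhead).
disk_tb=16                       # size of each disk, from the original question
members=3                        # raidz1 members: two halves of disk1, one half of disk2
member_tb=$((disk_tb / 2))       # each member is half a disk at first

# raidz1 spends one member's worth of space on parity,
# so usable space is (members - 1) times the member size.
usable_tb=$(( (members - 1) * member_tb ))
echo "interim raidz1 usable: ${usable_tb} TB"

# After the third member has moved to disk0 and every member is grown to a full disk:
final_tb=$(( (members - 1) * disk_tb ))
echo "final raidz1 usable:   ${final_tb} TB"
```

The interim figure matches the capacity of the original two-way mirror, which is why the data should (just about) fit before the final grow step.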

I verified, ZFS can do that (FreeBSD 11.2):
Code:
root@edge:~ # gpart add -t freebsd-zfs -s 2097152 ada2
ada2p5 added
root@edge:~ # gpart add -t freebsd-zfs -s 2097152 -b 3882672936 ada2
ada2p6 added
root@edge:~ # gpart add -t freebsd-zfs -s 2097152 -b 3886867240 ada2
ada2p7 added
root@edge:~ # gpart show ada2
=>        40  5860533088  ada2  GPT  (2.7T)
  3878478632     2097152     5  freebsd-zfs  (1.0G)
  3880575784     2097152        - free -  (1.0G)
  3882672936     2097152     6  freebsd-zfs  (1.0G)
  3884770088     2097152        - free -  (1.0G)
  3886867240     2097152     7  freebsd-zfs  (1.0G)
  3888964392  1971568736        - free -  (940G)

root@edge:~ # zpool create xxx raidz ada2p5 ada2p6 ada2p7
root@edge:~ # zpool list xxx
NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
xxx   2.75G   632K  2.75G        -         -     0%     0%  1.00x  ONLINE  -
root@edge:~ # gpart resize -i 5 -s 3145728 ada2
ada2p5 resized
root@edge:~ # gpart resize -i 6 -s 3145728 ada2
ada2p6 resized
root@edge:~ # gpart resize -i 7 -s 3145728 ada2
ada2p7 resized
root@edge:~ # gpart show ada2
=>        40  5860533088  ada2  GPT  (2.7T)
  3878478632     3145728     5  freebsd-zfs  (1.5G)
  3881624360     1048576        - free -  (512M)
  3882672936     3145728     6  freebsd-zfs  (1.5G)
  3885818664     1048576        - free -  (512M)
  3886867240     3145728     7  freebsd-zfs  (1.5G)
  3890012968  1970520160        - free -  (940G)

root@edge:~ # zpool set autoexpand=on xxx
root@edge:~ # zpool list xxx
NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
xxx   2.75G   896K  2.75G        -         -     0%     0%  1.00x  ONLINE  -

root@edge:~ # zpool online xxx ada2p5
root@edge:~ # zpool list xxx
NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
xxx   4.25G   992K  4.25G        -         -     0%     0%  1.00x  ONLINE  -
root@edge:~ # zpool status xxx
  pool: xxx
state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        xxx         ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            ada2p5  ONLINE       0     0     0
            ada2p6  ONLINE       0     0     0
            ada2p7  ONLINE       0     0     0

errors: No known data errors
Voila, raid created on same disk, and expanded.

Now for the downside: I intentionally left half of the space free between the volumes. If I did not do that, i.e. if I made the volumes adjacent to each other, like so:

Code:
root@edge:~ # gpart show ada2
=>        40  5860533088  ada2  GPT  (2.7T)
  3878478632     4194304     5  freebsd-zfs  (2.0G)
  3882672936     4194304     6  freebsd-zfs  (2.0G)
  3886867240     4194304     7  freebsd-zfs  (2.0G)
  3891061544  1969471584        - free -  (939G)
and if I then tried to expand that raid, the ZFS pool would become unreadable and I would get a very reproducible kernel crash (reproducible with a SATA drive and a USB stick, on amd64 and i386). Actually that was what I tried first, because I thought it would either work or not work. In fact, it could work, but the implementation is broken.
 

Quip

Member

Reaction score: 11
Messages: 20

gman said: "Unless I'm misreading, it looks like file created with mdconfig would need to be large enough (both in file size and in space on disk) to accommodate the data being transferred to the raid5 which would be 16TB. Thus that method wouldn't work without more disk space from somewhere."
Just create the sparse file. As you can see, it does not take anything from the available space:

Bash:
quip@test /tmp/> zfs list /tmp
NAME        USED  AVAIL  REFER  MOUNTPOINT
tank2/tmp  1.92G   150G  1.79G  /tmp

quip@test /tmp/> truncate -s 16T myfile.img

quip@test /tmp/> ls -lFGh myfile.img
-rw-r--r--  1 quip  wheel    16T May 15 12:52 myfile.img

quip@test /tmp/> zfs list /tmp
NAME        USED  AVAIL  REFER  MOUNTPOINT
tank2/tmp  1.92G   150G  1.79G  /tmp
When you have your 16TB sparse file, you can use mdconfig to create a disk device from this file:

Bash:
quip@test /tmp/> sudo mdconfig -u md1 -f myfile.img
Password:

quip@test /tmp/> ls -l /dev/md1
crw-r-----  1 root  operator  0xc8 May 15 12:54 /dev/md1

quip@test /tmp/> diskinfo -v /dev/md1
/dev/md1
        512             # sectorsize
        17592186044416  # mediasize in bytes (16T)
        34359738368     # mediasize in sectors
        0               # stripesize
        0               # stripeoffset
As you can see, the disk device md1 looks like a 16TB disk.

Then you can create a RAID from your two real disks and one fake disk, say ada0, ada1 and md1. You can use gpart on md1 just as if it were a real disk.
You can create a ZFS RAIDZ with these three devices as well.

After you have your desired RAID and filesystem on it, remove the fake device md1 (take it offline in ZFS). You will then have your RAID in a degraded state with just the two real disks.
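For the ZFS variant, the whole sequence could look roughly like the sketch below. It only prints the commands for review and does not run anything. The function name, the pool name tank, the image path and the unit number are placeholders chosen for this example, and sizing the image from diskinfo output (third field, media size in bytes) is an assumption to keep the fake device the same size as the real disks.

```shell
#!/bin/sh
# Print (do not run) the commands for a raidz1 built from two real disks plus a
# sparse-file placeholder. Every name passed in is a placeholder for illustration.
plan_degraded_raidz() {
    pool=$1; disk0=$2; disk1=$3; img=$4; unit=$5
    cat <<EOF
truncate -s "\$(diskinfo /dev/$disk0 | awk '{print \$3}')" $img
mdconfig -a -t vnode -f $img -u $unit
zpool create $pool raidz $disk0 $disk1 md$unit
zpool offline $pool md$unit
mdconfig -d -u $unit
rm $img
EOF
}

# Example: two real disks ada0 and ada1, fake third member md1.
plan_degraded_raidz tank ada0 ada1 /tmp/fake.img 1
```

Taking the fake member offline before any data is copied means the sparse file never has to hold more than the little that pool creation writes to it. When the third real disk arrives, something like `zpool replace tank md1 ada2` should bring the pool back to full redundancy, though the exact device names will differ per system.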
 