From mirror to raidz2

Greetings all,

after Terry helped me to recover my backup data, which I merged with my current data, my zpool (a mirror comprised of two 1TB drives) is at 85% capacity. I have an additional five 1TB drives and would like to build a raidz2 (4+2).

What would be the easiest way to do so? Could I somehow break the mirror, use one of its drives as one of the drives in the raidz2, and have the data propagate through the raidz2?

Kindest regards,

M
 
@mefizto

Let's assume this:
Code:
disk0 = old mirror #1
disk1 = old mirror #2
disk2
disk3
disk4
disk5
disk6
# zpool detach oldpool disk1
# zpool create -m /mnt newpool raidz2 disk{1,2,3,4,5,6}
# zfs snapshot -r oldpool@now
# zfs send -R oldpool@now | zfs recv -dF newpool
# zpool destroy oldpool
# zfs set mountpoint=/foo/bar newpool
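
Before the zpool destroy step it might be worth double-checking that everything made it over, e.g.:
Code:
# zfs list -r newpool
# zpool status newpool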

/Sebulon
 
Hi Sebulon,

thank you for the reply. What do you think about modifying your proposal in the following way, which would avoid the risk of data loss that comes from initially using one of disks 0 or 1 in the new pool:

Create a sparse file with dd(1):
# dd if=/dev/zero of=/tmp/disk.img bs=1024k seek=149k count=1

Then create a memory disk from the file:
# mdconfig -a -t vnode -f /tmp/disk.img -u 2

Create newpool using disks 2-6 + the md2:
# zpool create -m /mnt newpool raidz2 disk{2,3,4,5,6} md2

Offline the md2 so it is not written to; now I have a degraded pool:
# zpool offline newpool md2

Then continue the way you suggested and finally:
# zpool replace newpool md2 disk0
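
Once zpool status shows that the resilver onto disk0 has completed, I assume I can then remove the placeholder:
# mdconfig -d -u 2
# rm /tmp/disk.img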

Any comments would be appreciated.

Kindest regards,

M
 
@mefizto,

That would not work because you cannot rebuild a RAIDZ2 from one disk.
What Sebulon is suggesting is the safest and quickest way. I would only add to that: physically remove the drive detached from the mirror and store it somewhere safe until you are done with the procedure.
 
@gkontos

I don't get your reply. He is using an offline md device as one of the disks in the new pool. This means the source for the copy is still a redundant mirror, and the destination can withstand one disk failure (on top of the offline disk) because it's a raidz2. Once the data is moved, the mirror is destroyed and the md device is replaced by using one of the original disks. I actually think this is a good idea.

Also, as soon as the disk is detached from the mirror it no longer contains a valid pool, so there's no reason to keep it safe. Regardless, it's detached because, in the original plan, it's needed as a disk in the new pool. The important disk is the remaining disk in the original pool, which has to stay online as it's needed to copy from.

@mefizto

I think there's a truncate command which makes it easy to create a sparse file for the offline disk.

Code:
# truncate -s 1t /tmp/disk.img
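
You would then attach it with mdconfig(8) as before, e.g. to get the md2 unit:
Code:
# mdconfig -a -t vnode -f /tmp/disk.img -u 2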
 
mefizto said:
Create a sparse file with dd(1):
# dd if=/dev/zero of=/tmp/disk.img bs=1024k seek=149k count=1

Then create a memory disk from the file:
# mdconfig -a -t vnode -f /tmp/disk.img -u 2

In theory, sparse files for md(4) should work. In reality, problems are reported often enough that it should be avoided.
 
I've never actually done it but the idea of using an offline md device as a placeholder in a pool has been thrown around quite a few times on here (and I believe a few people have actually done it). You could even use a memory backed md device.

I can't say it's going to work for certain but considering the device is immediately marked offline and never actually has any data written to it, I can't see it being much of a problem?
 
usdmatt said:
@gkontos

I don't get your reply. He is using an offline md device as one of the disks in the new pool. This means the source for the copy is still a redundant mirror, and the destination can withstand one disk failure (on top of the offline disk) because it's a raidz2. Once the data is moved, the mirror is destroyed and the md device is replaced by using one of the original disks. I actually think this is a good idea.

Also, as soon as the disk is detached from the mirror it no longer contains a valid pool, so there's no reason to keep it safe. Regardless, it's detached because, in the original plan, it's needed as a disk in the new pool. The important disk is the remaining disk in the original pool, which has to stay online as it's needed to copy from.

@mefizto

I think there's a truncate command which makes it easy to create a sparse file for the offline disk.

Code:
# truncate -s 1t /tmp/disk.img

No, you can't build a raidz vdev from a single disk by attaching more disks. You must have the exact number of disks present when you create the vdev. Also, the old contents of the disks that make up the vdev are destroyed at creation. After creation of the raidz vdev it can not be modified in any way.
 
kpa said:
No, you can't build a raidz vdev from a single disk by attaching more disks. You must have the exact number of disks present when you create the vdev. After creation of the raidz vdev it can not be modified in any way.

Am I the only person who actually understands what he intends to do?

The new six-disk raidz2 will be made with the five new disks and an md device.
The md device is immediately marked offline so no data is written to it; then the data is moved from the mirror.
Afterwards, the mirror is destroyed and one of its disks is used to replace the md device in the raidz2.

This procedure has been done before.
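
Putting the pieces from the earlier posts together, the whole sequence would be roughly the following (untested as written; adjust device names to your setup):
Code:
# zpool create -m /mnt newpool raidz2 disk{2,3,4,5,6} md2
# zpool offline newpool md2
# zfs snapshot -r oldpool@now
# zfs send -R oldpool@now | zfs recv -dF newpool
# zpool destroy oldpool
# zpool replace newpool md2 disk0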
 
@usdmatt,

I now finally got it ;)

Yes, that would work too, but to avoid performance issues it is better if he offlines the device immediately.
 
Greetings all,

thank you all for your answers, which I appreciate very much. Just so we are on the same page, let me restate what I understand your answers to be.

gkontos,

from your last post I understand that you now believe that the procedure will work. Please note that I am offlining the md2 once I have created the zpool:
# zpool create -m /mnt newpool raidz2 disk{2,3,4,5,6} md2
# zpool offline newpool md2

usdmatt,

thank you for the confirmation and the suggested:
# truncate -s 1t /tmp/disk.img

I am not familiar with the command; do I use it instead of my proposed:
# dd if=/dev/zero of=/tmp/disk.img bs=1024k seek=149k count=1
and
# mdconfig -a -t vnode -f /tmp/disk.img -u 2

wblock@,

I am confused by your assertion:
In theory, sparse files for md(4) should work. In reality, problems are reported often enough that it should be avoided.

The md2 is used merely as a placeholder for creating the zpool. Since the md2 is offlined right after the zpool is created, it will never be written to. Essentially, as I understand it, ZFS then simply thinks that the zpool is degraded.

Or, did I misread your concern?

kpa,

as with wblock@, I may be missing the point you made:
No, you can't build a raidz vdev from a single disk by attaching more disks. You must have the exact number of disks present when you create the vdev.

I am building the raidz2 from five physical disks and md2. Consequently, I have 6 disks.

Also, the old contents of the disks that make up the vdev are destroyed at creation.

There is no content on the devices. This is the reason I modified Sebulon's procedure, which used one of the original mirror's devices. In my case, the mirror remains intact, and the degraded zpool still has one disk of redundancy. Even if something goes wrong after I break the mirror and use one of its disks in the zpool, I still have the data on the second mirrored disk.

After creation of the raidz vdev it can not be modified in any way.

I do not believe that offlining is a modification: ZFS thinks that the zpool contains the correct number of devices, yet because one device is offline, the zpool is degraded.

Please do not hesitate to correct me if I have drawn a false conclusion; I would hate to make an irreversible mistake even with the mirror still intact.

Kindest regards,

M
 
Maybe a sparse file will work. I'm pretty sure at least ZFS metadata will be written to it. Since you have no data on those drives yet, it should be safe to try.
 
Hi wblock@,

Maybe a sparse file will work. I'm pretty sure at least ZFS metadata will be written to it. Since you have no data on those drives yet, it should be safe to try.

I do not want to be argumentative, but my understanding is that if one offlines a drive that is part of a zpool, the zpool has no access to the drive; therefore, nothing can be written to it. So it would stand to reason that if one offlines a sparse file, the same argument should apply.

Kindest regards,

M
 
mefizto said:
Hi Sebulon,

thank you for the reply. What do you think about modifying your proposal in the following way, which would avoid the risk of data loss that comes from initially using one of disks 0 or 1 in the new pool:

Create a sparse file with dd(1):
# dd if=/dev/zero of=/tmp/disk.img bs=1024k seek=149k count=1

Then create a memory disk from the file:
# mdconfig -a -t vnode -f /tmp/disk.img -u 2

Create newpool using disks 2-6 + the md2:
# zpool create -m /mnt newpool raidz2 disk{2,3,4,5,6} md2

Offline the md2 so it is not written to; now I have a degraded pool:
# zpool offline newpool md2

Then continue the way you suggested and finally:
# zpool replace newpool md2 disk0

Any comments would be appreciated.

Kindest regards,

M

Yes, that also works and provides you with better safety during the transfer of data. I would suggest this procedure when creating the placeholder:
Code:
# diskinfo -v /dev/gpt/disk0 | grep bytes | awk '{print $1}'
2000397864960
# echo "2000397864960 / 1024000 - 1" | bc
1953512
# dd if=/dev/zero of=/tmp/disk.img bs=1024000 seek=1953512 count=1
# mdconfig -a -t vnode -f /tmp/disk.img
md0
# diskinfo -v md0 | grep bytes | awk '{print $1}'
2000397312000

This is what I do when performing these procedures, to get the placeholder as close in size as possible to the original HDDs. A 2TB drive from BrandX can differ enough in size from one from BrandY to make the procedure fail.
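
If you prefer, the same calculation can be scripted so the numbers do not have to be copied around by hand, something like (untested here, same device path assumed):
Code:
# SIZE=$(diskinfo -v /dev/gpt/disk0 | grep bytes | awk '{print $1}')
# SEEK=$(echo "${SIZE} / 1024000 - 1" | bc)
# dd if=/dev/zero of=/tmp/disk.img bs=1024000 seek=${SEEK} count=1
# mdconfig -a -t vnode -f /tmp/disk.img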

/Sebulon
 
mefizto said:
I do not want to be argumentative, but my understanding is that if one offlines a drive that is part of a zpool, the zpool has no access to the drive; therefore, nothing can be written to it. So it would stand to reason that if one offlines a sparse file, the same argument should apply.

Metadata will be written when you create the pool. If there was really nothing written, then you could use gnop(8).
 
Solved

Hi Sebulon,

thank you for the confirmation and the procedure for creating the placeholder.

Hi wblock@,

thank you for the explanation, it makes sense.

Kindest regards,

M
 