Using gstripe to help expand raidz pools

This isn't really a question, just something I discovered today and wanted to share and get your feedback.

I just had a 3 TB disk fail in a 6x 3 TB RAID-Z2 pool. I didn't have any 3 TB spares, so I tried creating a fake 3 TB disk from two old 1.5 TB disks using gstripe with gstripe label -v st0 /dev/ad9 /dev/ad10. Then I did zpool replace tank 238497239478249 /dev/stripe/st0. It worked!
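In case anyone wants to repeat this, the whole sequence was roughly the following (pool name, device names, and the GUID are from my own setup; the GUID of the failed disk comes from zpool status, and geom_stripe has to be loaded first if it isn't already):

Code:
# load the stripe GEOM class if it is not already present
kldload geom_stripe

# build a ~3 TB provider out of the two old 1.5 TB drives
gstripe label -v st0 /dev/ad9 /dev/ad10

# replace the failed disk (identified by its GUID from zpool status) with the stripe
zpool replace tank 238497239478249 /dev/stripe/st0

# watch the resilver progress
zpool status tank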

So this is a nice stopgap solution for getting a RAID-Z back to full redundancy while waiting for replacement disks. But then I wondered: does this idea have any further applications?

For example, the conventional wisdom I've always heard is that the only way to expand a RAID-Z/RAID-Z2/RAID-Z3 "in-place" is to buy larger disks and replace them one at a time, so that the pool expands automatically once the final disk has been replaced [1]. However, with this gstripe method it seems one could buy half as many new double-size disks and stripe pairs of the existing disks to make up the difference.

For example, if you started with:

Code:
raidz2
  1.5TB raw disk
  1.5TB raw disk
  1.5TB raw disk
  1.5TB raw disk
  1.5TB raw disk
  1.5TB raw disk

you could migrate to this setup:

Code:
raidz2
  3TB raw disk
  3TB raw disk
  3TB raw disk
  1.5TB + 1.5TB gstripe
  1.5TB + 1.5TB gstripe
  1.5TB + 1.5TB gstripe

Here you would double your zpool's capacity by buying only three new drives, with no need to add additional vdevs to the pool. All of the replacements except the last can be done without any loss of redundancy (for the last one you'd have to pull an active 1.5 TB drive out of the pool to build the final 1.5 TB + 1.5 TB gstripe, so the pool would temporarily run with only single-parity protection).
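The migration itself would just be a series of ordinary replace-and-resilver rounds. A minimal sketch of two such rounds, using hypothetical device names (ada0-ada5 for the old 1.5 TB drives, ada6 for one of the new 3 TB drives) and waiting for each resilver to finish before starting the next:

Code:
# round 1: swap one old 1.5 TB drive for a new 3 TB drive, then wait for the resilver
zpool replace tank /dev/ada0 /dev/ada6
zpool status tank        # repeat for the other two new 3 TB drives

# later rounds: pair two freed 1.5 TB drives into a stripe and use it as the "new" disk
gstripe label -v st1 /dev/ada0 /dev/ada1
zpool replace tank /dev/ada2 /dev/stripe/st1

# once every member is at least 3 TB, let the pool grow into the new space
# (older ZFS versions may need an export/import or zpool online -e instead)
zpool set autoexpand=on tank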

I would guess running a pool like that would decrease your Mean Time To Data Loss, simply because there are more physical disks that can fail.

Anyway, I apologize if this is old news to you, but I hadn't considered using gstripe with a RAID-Z2 before. It seems like a versatile tool for keeping costs down.

[1] I already did that once, which is why I have six 3 TB drives in my RAID-Z2 and six spare 1.5 TB disks lying around.
 
For the emergency situation, where you have no appropriately sized spare and stripe across two drives to replace one: sure, in a pinch it's better than leaving the pool without a replacement drive at all (though the failure rate of that makeshift disk will, in theory, be doubled).

As a permanent migration plan... I wouldn't go there myself, due to the added complexity and the non-portability of the pool (e.g. Linux and Solaris won't have gstripe support). If you need to boot from some other recovery media, are you sure you'll get your stripes set back up correctly? Will the boot media try to import the pool, pull in the first half of each of your gstripe pairs, and fail them all?

Could you do that at 4am? :)
 
Ah, thank you for pointing out these important issues. Maybe it is not such a good idea after all.

Although gstripe label does store metadata on disk for automatic re-assembly, it seems that geom_stripe is not loaded by default, so one would have to kldload geom_stripe before the pool is imported, or risk scary behavior.
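If someone did decide to run a pool like this permanently, the module could at least be loaded before ZFS starts at boot. A minimal sketch of the relevant /boot/loader.conf lines (assuming a stock FreeBSD install that loads ZFS the same way):

Code:
# /boot/loader.conf
geom_stripe_load="YES"    # assemble the gstripe providers early, before the pool import
zfs_load="YES"            # load ZFS as usual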

It would be pretty horrible if ZFS saw its disk header on the first disk of the stripe, tried to import it alone, and then ended up ruining it!
 