ZFS on partitioned disks / reserved space / short stroking

Hi. Does anybody know of a good solution to set up a guaranteed allocation of physical drive space for a ZFS pool?

Ideally I would like my database pool to allocate-from and use only the first third (for example) of the physical sectors of all my drives so that all data lives in the "fast part". I've never thought to do this before but it seems to be a somewhat common practice with dedicated DBs and SANs.

The typical method seems to be to only use a small part of your total capacity and leave a huge part of the drive empty.

But I would like to somehow utilize the upper / "slow part" of each drive for cold storage of infrequently accessed static assets (e.g. I have a large number of 64GB BLOBS that only need to be accessed say once a month as a batch operation).



The first thing I tried was ZFS's "reservation" and "refreservation" properties, but they behave more like pre-assigned quotas: looking at the metaslab metadata, data written outside the reserved filesystem doesn't appear to be treated any differently in terms of where it is physically allocated.
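For reference, this is roughly what I tried (pool and dataset names are just placeholders):

```shell
# Reserve space for the database dataset (hypothetical names).
zfs create tank/db
zfs set reservation=1T tank/db
zfs set refreservation=1T tank/db

# The reservation shows up in the space accounting...
zfs get reservation,refreservation,usedbyrefreservation tank/db

# ...but it only guarantees HOW MUCH space tank/db can use,
# not WHERE on the disks its blocks will be allocated.
```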



That leaves me with partitioning the drives - which is less flexible and I'm concerned about hampering the performance of the "fast" partition as the smaller filesystem fills up.

I've read a lot of posts saying that you shouldn't fill ZFS past, say, 50% capacity to keep it happy. BUT from what I understand of how the space maps are implemented, the allocator only switches over to a sub-optimal algorithm on a per-metaslab basis, and only when that metaslab has less than 4% free space. So it seems like a ZFS pool shouldn't lose any performance (to allocation degradation) up to, say, 90% used capacity.
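For anyone who wants to check this on a live pool, zdb can dump the per-metaslab space maps (pool name is a placeholder; zdb output varies between versions):

```shell
# Per-metaslab statistics, including free space per metaslab; the
# allocator degrades per metaslab, not pool-wide.
zdb -mm tank

# Pool-wide fragmentation summary for a quick sanity check.
zpool list -o name,size,allocated,free,fragmentation tank
```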

Can anybody confirm this? Or maybe suggest some other reasons why partitioning is a bad idea?



The other big thing is the obvious resource conflict between the two zpools operating on each section of the drive. But if one of the pools is hardly ever used I can't see why this would be a big problem. And it seems like you'd have very similar resource conflicts during concurrent access whether the two separate datasets were in the same zpool or not. (Although probably with better scheduling.)

Does anybody have practical experience with this?


And does anybody have any guesses about whether using a different pooling strategy for the different partitions would cause issues? (i.e. using a single mirrored pool across the lower partitions and using multiple RAIDz pools across sets of the upper partitions).

The FreeBSD Handbook says that there's no loss of performance when ZFS is given a partition instead of a whole disk - but I assume this isn't the scenario they are referring to. The conventional wisdom seems to be to always prefer whole disks over partitions.
 
So I went ahead and tried this anyway. If anybody is interested then yes, it can be done. But is it a good idea? Probably not, but maybe.
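For anyone who wants to reproduce it, the setup was along these lines (FreeBSD gpart syntax; the disk names, partition sizes, and pool names are examples only, and you repeat the gpart steps for each disk):

```shell
# Carve each 4 TB disk into a "fast" outer partition and a "slow"
# inner one. Low LBAs sit on the outer (fastest) edge of the platters,
# so the first partition gets the high-throughput sectors.
gpart create -s gpt ada0
gpart add -t freebsd-zfs -l fast0 -s 1300G ada0   # roughly the first third
gpart add -t freebsd-zfs -l slow0 ada0            # the rest of the disk

# Fast pool: a mirror across the outer partitions of two disks.
zpool create fastdb mirror gpt/fast0 gpt/fast1

# Cold pool: RAIDZ across the inner partitions of four disks.
zpool create coldstore raidz gpt/slow0 gpt/slow1 gpt/slow2 gpt/slow3
```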


As expected, performance during concurrent access to both pools is pretty awful, but not unusable. Expect to lose at least 20-25%, and I imagine the more clients you have, the worse this gets. There's just no way for the two zpools to schedule their IO efficiently (each one presumably thinks it has exclusive access to the underlying hardware). There may be other issues too, like not being able to take full advantage of the HDD's cache.
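If you want to put numbers on the penalty for your own hardware, a crude fio comparison looks something like this (fio isn't in the base system, the mountpoints are mine, and the job parameters are arbitrary):

```shell
# Baseline: sequential read from the fast pool while the cold pool is idle.
fio --name=fast-alone --directory=/fastdb/bench \
    --rw=read --bs=1M --size=4G --numjobs=1

# Contended: the same job while a bulk read runs against the cold pool.
fio --name=cold-bg --directory=/coldstore/bench \
    --rw=read --bs=1M --size=4G --numjobs=1 &
fio --name=fast-contended --directory=/fastdb/bench \
    --rw=read --bs=1M --size=4G --numjobs=1
wait
```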


But if the other partition is genuinely just used as cold storage and you are accessing either one OR the other and only infrequently both then this might be a viable scheme to consider. There doesn't seem to be any difference to the pools when they are operating in isolation.

Some care needs to be taken deciding exactly how to split apart the partitions. You want to make sure you know how to recover during a failure and plan for future expansion etc.
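As a concrete example of what I mean by planning for failure: replacing a dead disk now means re-creating the partition table before either pool can resilver (device names are hypothetical, and the labels copied by gpart backup/restore may need fixing up):

```shell
# Clone the partition layout from a surviving disk onto the replacement.
gpart backup ada1 | gpart restore -F ada4
gpart modify -i 1 -l fast4 ada4   # relabel so the GPT labels stay unique
gpart modify -i 2 -l slow4 ada4

# Resilver each pool onto its own partition of the new disk.
zpool replace fastdb gpt/fast1 gpt/fast4
zpool replace coldstore gpt/slow1 gpt/slow4
```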


And for hobby users who are only streaming media over gigabit ethernet, even concurrent pool access is going to saturate your connection well and truly. So it might be an option for, say, mixing smaller triple-redundant mirrors with a larger RAIDZ, or whatever your heart is set on.
 