ZFS: location of data in VDEVs and when to add L2ARC/ZIL?

I'm setting up a desktop with a mirror of two 480GB SSDs for my root and home directories on FreeBSD and Linux, plus a mirror of two 2TB HDDs in another vdev. I was hoping to get the speed of the SSDs for my system while still having some extra storage on the hard drives.

From what I have read, it doesn't always make sense to add an L2ARC or ZIL SSD, and sometimes you can actually slow down your computer unless you have enough memory. I'm afraid of going down that route, as I don't know enough to judge whether or not it would help. How do you know when to add one? I have 16 gigabytes of memory, so I assume that's not enough to warrant an L2ARC. Am I right, or would that be a better way to speed my computer up than using the SSD as a root disk? If that's the case, I do have a 200 gigabyte SSD I could potentially partition and use as an L2ARC or ZIL.

The other thing I was wondering is if I do end up having two VDEVs where my system files are on the SSD in one VDEV, and other data is in a VDEV on the hard drive, data won't be spread across both the hard drive and the SSD, right? I should get faster speed from the files in the SSD VDEV, right? They won't be spread across both unless I were to explicitly define that, for example by using another mirror?
 
The reason you add L2ARC or ZIL is to speed up read and write operations: L2ARC caches reads, while a separate ZIL (log) device only helps synchronous writes. It does not make sense to have a Zpool with SSDs and add a slow HD as L2ARC or ZIL. That's not how it's meant to work. If you have relatively slow HDs in your Zpool, you can use SSDs as L2ARC and/or ZIL to speed things up a bit. That's how it's meant to be used.
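To make that concrete, here is a rough sketch of attaching an SSD-backed L2ARC and log device to an HDD pool. The pool name `tank` and the partition names `ada4p1` and `ada4p2` are made up for illustration, and the script only prints the commands (dry run) so nothing is changed until you adapt it.

```shell
# Dry run: run() only prints each command. Replace its body with "$@"
# (and run as root) once the names below match your system.
run() { printf '%s\n' "$*"; }

# Hypothetical names: an HDD pool called "tank" and two partitions
# on a spare SSD.
POOL=tank
CACHE_PART=ada4p1
LOG_PART=ada4p2

add_cache_and_log() {
    # L2ARC: second-level read cache behind the in-RAM ARC.
    run zpool add "$POOL" cache "$CACHE_PART"
    # Separate log device: only helps synchronous writes.
    run zpool add "$POOL" log "$LOG_PART"
    # Show the resulting pool layout.
    run zpool status "$POOL"
}

add_cache_and_log
```

Whether either device actually helps depends on the workload, so it is worth measuring before and after.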

There are several layers of caching with ZFS. ARC is the first and fastest, as it uses RAM; L2ARC is the next best thing, preferably using fast SSDs. Either way, caching really only works well if you have small files that are randomly accessed. Big files (movies, for example) shouldn't be cached, as a single file could fill up the entire cache. In that case you simply do not get the benefit of caching.

If you have two separate Zpools, one with SSDs and one with HDs, the SSD Zpool would indeed be faster. However, if you have one Zpool with two VDEVs (one SSD and one HD), everything will slow down to the speed of the slowest drive because the data is spread across the two VDEVs.
 
The other thing I was wondering is if I do end up having two VDEVs where my system files are on the SSD in one VDEV, and other data is in a VDEV on the hard drive,

As kpa just replied, if you add more than one vdev to a pool, say two mirrors, all data will be striped across those mirrors and you have no control over that. A pool is made up of one or more vdevs, and ZFS puts the data wherever it wants.

If you want to have the system on the SSDs, and your normal data on the spinning disks, you will need to create two separate pools.
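Roughly, the two-pool layout could look like the sketch below. The pool names (`ssdpool`, `hddpool`) and the device names are assumptions for illustration only; in practice the OS installer normally creates the root pool for you. The script just prints the commands (dry run).

```shell
# Dry run: run() only prints each command. Replace its body with "$@"
# (and run as root) once the names below match your system.
run() { printf '%s\n' "$*"; }

create_pools() {
    # One pool made of a single mirror vdev on the two SSDs
    # (hypothetical partitions ada0p3 and ada1p3).
    run zpool create ssdpool mirror ada0p3 ada1p3
    # A second, independent pool on the two 2TB hard drives
    # (hypothetical whole disks ada2 and ada3).
    run zpool create hddpool mirror ada2 ada3
    # List both pools to confirm they are separate.
    run zpool list
}

create_pools
```

Because these are two pools rather than two vdevs in one pool, nothing written to `ssdpool` ever lands on the hard drives.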

Regarding the FreeBSD/Linux thing, I'll be highly surprised if you can get both systems to boot off the same ZFS pool. Pools are tied to the system they are created on (they contain the host ID of the system in the pool metadata). To access a pool on two systems you would ideally need to export on the one system, then import on the other, every single time you want to swap between them. Booting is slightly different but I would be surprised if the system doesn't complain that it's trying to boot from a pool that is imported into another system.

I think the closest you will get is two pools on the SSDs (basically partition them in half), with each OS on one of those pools. I don't know if you'll be able to get grub to ZFS boot both Linux & FreeBSD though; I have no experience with that. Then use the 2TB disks for everything else. Although, as mentioned, you'll need to export and re-import the data pool every time you switch OS.
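The export/import dance for the shared data pool would look something like this. The pool name `hddpool` is an assumption, and the script only prints the commands (dry run).

```shell
# Dry run: run() only prints each command. Replace its body with "$@"
# (and run as root) once the pool name matches your system.
run() { printf '%s\n' "$*"; }

# Hypothetical name for the shared data pool on the 2TB disks.
DATA_POOL=hddpool

before_shutdown() {
    # Release the pool on the OS you are leaving.
    run zpool export "$DATA_POOL"
}

after_boot() {
    # Pick the pool up on the OS you just booted.
    run zpool import "$DATA_POOL"
}

before_shutdown
after_boot
```

If the pool was not exported first, the other OS will normally refuse the import unless you force it with `zpool import -f`.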
 
Whoops, I mistyped, it's a 200GB SSD ofc.

Interesting. It would still make sense to mirror that SSD with another SSD though, right? Say for example make four partitions, two on SSD1 and two on SSD2 and then mirror each partition with the other SSD so that I'm getting proper checking for data corruption out of ZFS. You need to have your mirror on a different drive for it to work properly right?
 
Say for example make four partitions, two on SSD1 and two on SSD2 and then mirror each partition with the other SSD so that I'm getting proper checking for data corruption out of ZFS. You need to have your mirror on a different drive for it to work properly right?

A mirror is just what it sounds like: the two (or more) volumes are completely identical, so that if one disk dies the data remains intact and accessible on the others. You can do this without ZFS (gmirror(8) on FreeBSD, mdadm on Linux, dynamic volumes on Windows, etc.), and the ability of ZFS to create mirrors isn't exactly one of its key features (though it's definitely a nice thing to have). ZFS adds some protection through checksums and multiple copies of metadata, and while multi-device vdev types activate its self-healing ability, you can also get that from a one-disk pool by increasing the "copies" property on it.

If you wanted to run two different operating systems with ZFS, installing each one to a separate pool on each SSD after running zfs set copies=2 on each pool would be the simplest, most manageable, and most convenient way to do it. You'd get nearly all of the benefits of ZFS available to each OS, without a convoluted boot scheme or worrying about either OS fouling up or arbitrarily limiting the other.

EDIT: Only data added to a pool after the "copies" property has been modified will be affected by that property. I updated my message above to clarify that.
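As a rough sketch, setting and checking the property could look like this. The pool name `ssdpool` is made up, and the script only prints the commands (dry run).

```shell
# Dry run: run() only prints each command. Replace its body with "$@"
# (and run as root) once the pool name matches your system.
run() { printf '%s\n' "$*"; }

# Hypothetical single-disk pool holding one of the operating systems.
DATASET=ssdpool

set_copies() {
    # Keep two copies of every block written from now on;
    # data already in the pool is not rewritten.
    run zfs set copies=2 "$DATASET"
    # Confirm the property value.
    run zfs get copies "$DATASET"
}

set_copies
```

Doing this right after creating the pool, before installing anything, means all of the data gets the extra copy.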
 