Server layout: rootfs on USB flash drive or NVMe?

aiui you want your special vdevs to have the exact same redundancy as your main pool, which would complicate adding one of those to our 8-way raidz3
 
If it were my system, I would get a pair of quality NVMe SSDs as large as I could afford. Consider heatsinks at the design stage.
I bought 2 poor-to-medium quality NVMe drives, used, 256 GB. The board has PCIe 3 x4 lanes IIRC, and NVMe 1.3, so plugging in hyper-dyper-throw-your-money-at-me NVMe drives just doesn't make sense. These are already much faster than the rotating SATA HDDs, which is all I want; and since reads can be done in parallel, even the cheap "slow" ones will saturate the PCIe link. I have to keep an eye on performance per money.
Heat sinks: yes, these are on my list. The NVMe drives will get one, but the RAM will not. This is the 1st time in my life that I do what "modders" do to pimp up their hardware... ;) They have blinkin' lights on their RAM DIMMs and NVMe and so on, crazy!!!
Mirror them using ZFS. But see the caveat below about copy-on-write file systems -- you may wish to use a GEOM mirror for the whole disk, and have multiple types of file systems.
Why should I want a GEOM mirror? I would use that for swap, but ATM I think I don't need to mirror swap, because that box is not "mission critical": if it crashes, it crashes and I lose some fresh data, let's say the last 15 minutes. OK, so be it, I don't care.
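For reference, the two flavors of mirroring look roughly like this on FreeBSD; the device names (nda0/nda1) and partition indices are assumptions, not from this thread:

```shell
# GEOM mirror -- block-level, file-system-agnostic; handy for swap
# if you ever decide swap should survive a disk failure:
gmirror load
gmirror label -v swap /dev/nda0p2 /dev/nda1p2
swapon /dev/mirror/swap

# ZFS mirror -- the usual choice for everything ZFS manages itself:
zpool create fast mirror /dev/nda0p3 /dev/nda1p3
```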
I would also install swap here, but keep watch and move it to a SATA mirror if the swap gets too intensive.
If the swap gets too intensive I can double the RAM from 2 x 8 to 2 x 16 GB. For the time being, I think 16 GB is more than I need, with sufficient headroom.
[...] A SLOG never needs to be larger than main memory.
THIS is the kind of information I need. I didn't find this anywhere.
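A quick sanity check based on that rule. The upper bound (SLOG never larger than RAM) is from the quote above; the "a few seconds of sync-write throughput" heuristic and the 1 GB/s figure are my assumptions, not from this thread:

```shell
#!/bin/sh
ram_gb=16       # installed RAM; hard upper bound for SLOG per the rule above
ingest_gb_s=1   # assumed peak sync-write ingest in GB/s (hypothetical figure)
# Heuristic: hold roughly 5 seconds of sync writes between transaction groups.
slog_gb=$((ingest_gb_s * 5))
echo "SLOG around ${slog_gb} GB is plenty; never more than ${ram_gb} GB"
```

In other words, even a small slice of the NVMe drives is more SLOG than this box can use.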
I would always consider putting a special VDEV on the SSDs. This is essentially a cache for the file system metadata on the SATA disks. They are a challenge to size correctly.
Why do I have to worry about the correct sizing? Can't I use ZVOLs for these (zpool cache, log, special and dedup vdev)?
So put it adjacent to the swap space, and move the swap to SATA if more space is required. Plan for this when you partition the disks.
The NVMe's are big enough to hold 2 x 32 GB swap, i.e. 2 x max RAM. I seldom needed more than this, and that was when a program went completely crazy; more swap would only increase the time it takes to arrive at the crash. I'm 100% NOT going to swap to the SATA disks.
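Partitioning ahead of time, as suggested above, might look like this on one of the two NVMe drives (device name, labels and sizes are assumptions; repeat with adjusted labels on the second drive):

```shell
gpart create -s gpt nda0
gpart add -t efi -s 260m -l efi0 nda0
gpart add -t freebsd-swap -s 32g -l swap0 nda0   # 32 GB = planned max RAM
gpart add -t freebsd-zfs -l zfs0 nda0            # the rest for ZFS
```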
I would try to place my important VMs on the SSDs. But beware, you don't want both the hypervisor and the VM client using copy-on-write file systems. And you don't want VMs swapping madly to any underlying SSD.
OK, thanks, I'll keep that in mind and read up on why this is so.
I would never build a NAS without 100% redundancy. I would mirror the SATA disks... and migrate storage between SSD and SATA as needs dictate.
But I don't need redundancy for all the data. Some is "scratch" data, so why should I mirror it? If it's gone, it's gone; I can download or create it again.
 
L2ARC requires memory for its allocation tables; roughly 100 MB of RAM per 1 GB stored in L2ARC as a rule of thumb. So on a low-end system like this, L2ARC is always the wrong tool against memory pressure, as it will worsen the problem.
OK. So my naive plan to have 2 x 64 = 128 GB of L2ARC cache (50/50 for the mirror and the scratch) is nonsense, because this would need 2 x 6.4 ≈ 13 GB of RAM, and I have only 16 GB total. So I either cut down the size of the cache drastically or skip it completely.
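The arithmetic from the rule of thumb above, spelled out (the 128 GB figure is the plan just mentioned):

```shell
#!/bin/sh
l2arc_gb=128                 # 2 x 64 GB as originally planned
ram_mb=$((l2arc_gb * 100))   # ~100 MB RAM per 1 GB of L2ARC (rule of thumb above)
echo "${l2arc_gb} GB L2ARC needs about ${ram_mb} MB of RAM for its tables"
```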

This also means I gain a lot of space on the NVMe SSDs, and the 1st candidate for a reasonable use is the base OS. Then the decision is clear: no OS on the internal USB thumb drive; instead, install to the NVMe SSDs. Mirror most of it, and put what does not need redundancy on the striped "scratch" zpool: /var/cache, /var/crash, /var/obj (historically /usr/obj) and so on.
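One way such a layout could look; the pool and dataset names and the partition numbers are assumptions:

```shell
# Mirrored pool for the OS and anything worth keeping:
zpool create data mirror /dev/nda0p4 /dev/nda1p4
# Striped pool for reproducible scratch data -- no redundancy on purpose:
zpool create scratch /dev/nda0p5 /dev/nda1p5
zfs create -o mountpoint=/var/cache scratch/cache
zfs create -o mountpoint=/var/crash scratch/crash
zfs create -o mountpoint=/var/obj   scratch/obj
```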
 
Please post the output of "zpool list -v". We are in the swamp here now.
Alan, I'm still waiting for the 1st 6 TB HDD to arrive, and ATM I have the 1st NVMe; the 2nd comes tomorrow, or Thursday or Friday. OK? And I still have to "shoot" a 2nd 6 TB SATA HDD, which is not so easy because some nerds are paying extraordinarily high prices for used hardware. Can you believe that they even pay 5 €/TB for a DEFECTIVE HDD?!!! Yes, they do! That's CRAZY!!! I have time. I bid, and when others bid more, so be it, I'll bid in the next auction. And no SMR please, and no such crap like Barracuda or "Green". ATM I have two old 256 GB SATA SSDs in the box, just for testing. zpool list -v would show something like: 158 GB DATA, 158 GB SCRATCH. The box is switched off, so I can't produce real command output, OK? Cheers.
 
Have a nice day. See you later. I'm now in the process of moving the zpool special device from a single NVMe (a single point of failure) onto a mirrored special device on SSD.
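If the special device is already in the pool as a single disk, it can be turned into a mirror in place; the pool and device names below are assumptions:

```shell
# Attach a second device to the existing special device; ZFS resilvers
# and the special vdev becomes a two-way mirror:
zpool attach tank /dev/nda0p4 /dev/ada2p1
zpool status tank   # watch the resilver finish, then the SPOF is gone
```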
 
Why should I want a GEOM mirror? I would use that for swap, but ATM I think I don't need to mirror swap, because that box is not "mission critical": if it crashes, it crashes and I lose some fresh data, let's say the last 15 minutes. OK, so be it, I don't care.
You might want to provision storage to a VM which is not ZFS on the server side, e.g. using a mirrored UFS file system. See my comments above re COW client file systems provisioned on COW server file systems.
Why do I have to worry about the correct sizing [of a special VDEV]? Can't I use ZVOLs for these (zpool cache, log, special and dedup vdev)?
ZFS terminology can get intense. I don't think that you mean ZVOL, which is a specialized ZFS dataset that presents as a raw block device (virtual disk) rather than a file system. As an example, I use ZVOLs to provision iSCSI storage from my ZFS server to my KVM server for running Windows VMs.
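For illustration, creating such a ZVOL; the pool/dataset name, size and block size are assumptions:

```shell
# A 40 GB ZVOL: a raw block device backed by ZFS, no file system on top.
zfs create -V 40g -o volblocksize=16k tank/vm/win10
# It shows up under /dev/zvol/ and can be exported via iSCSI (e.g. ctld)
# or handed to a hypervisor as a virtual disk:
ls /dev/zvol/tank/vm/win10
```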

VDEVs may be either for storage or support. The support VDEV classes include the special VDEV. Expanding VDEVs is a lot easier these days than it used to be, but you still need to size them, and special VDEVs are tricky to size. You don't really know how big they need to be until you have instantiated them. So you need a contingency plan to grow, if required. Placing the metadata from a slow disk on an SSD special VDEV can deliver significant benefits in some circumstances (e.g. metadata-intensive applications like find(1)).
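One way to get a rough lower bound before committing to a size is to ask zdb for the pool's block statistics (the pool name is an assumption):

```shell
# Per-category block statistics; the metadata rows approximate how much
# a special vdev would have to hold today:
zdb -bb tank
# Note: if special_small_blocks is set, small data blocks also land on
# the special vdev, so add generous headroom on top of the metadata size.
```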
 