ZFS using 'advanced format drives' with FreeBSD (8.2-RC3)

Firstly, some praise: thanks for this informative post. Previously, the only thing I had read about these non-512B-sector disks was to avoid them for ZFS, but your numbers look great.

And an addition: my system setup has 48 GB of RAM, and I found that ZFS was only using around 600 MB until I raised the kmem size and other ZFS tunables. You should definitely look into that. Here is basically what I used as a template: ZFSguru advanced tuning. Before tuning, my numbers were 60-100% higher than yours (with more expensive hardware and 16 disks) on tests like yours (simple dd or cp, though with something other than /dev/zero), after verifying that nothing was cached. But when reading and writing at the same time, it went horribly slow... maybe 50-100 MB/s, which is slower than my RAID0 fake-RAID desktop. After tuning the memory, simultaneous reading and writing adds up to around 500 MB/s, which is faster than a 22-disk SAS RAID6 XFS system we have.
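(If you want to check how much memory ZFS is actually using on your own box, I believe the current ARC size is reported by this sysctl:)
Code:
# sysctl kstat.zfs.misc.arcstats.size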

And the main reason for my post... a question.
Does anyone know what sort of ashift/sector size an SSD should use? I am thinking it might affect my ZIL performance; I have my ZIL on an SSD and it seems far too slow. I checked now, and it says ashift=9.
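(The ashift of each vdev shows up in the zdb output; this is the same check used further down the thread:)
Code:
# zdb | grep ashift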
 
I use 4K on SSD just like I use on HDD. Not using it as ZIL, though.

How did you figure out what settings you need with a 48 GB RAM system? Out of curiosity, what settings did you use, or did you just copy them from ZFSguru?
 
danpritts said:
I'm about to rebuild my system, which, as it turns out, has 512-byte sector drives.

Is there any downside, other than a slight loss of capacity with small files, to building an array assuming the 4K sector size? I'm imagining replacing a disk later might be easier this way.

Now's about the time to start building all ZFS pools using ashift=12, regardless of the sector size used on the drives themselves. There's a negligible performance difference when using 512B disks in a 4K pool, but it future-proofs the pool: you can add 512B and 4K disks to an ashift=12 pool, but you cannot add a 4K drive to an ashift=9 pool.
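On these FreeBSD releases, the usual way to force ashift=12 at pool creation is the gnop trick shown further down in this thread. A rough sketch, with made-up device names:
Code:
# gnop create -S 4096 da0
# zpool create data mirror da0.nop da1
# zpool export data
# gnop destroy /dev/da0.nop
# zpool import data
If I remember correctly, one .nop provider per vdev is enough, since the vdev takes the largest sector size of its members, and the ashift sticks after the export/import.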
 
peetaur said:
And an addition: my system setup has 48 GB of RAM, and I found that ZFS was only using around 600 MB until I raised the kmem size and other ZFS tunables.

That should no longer be needed on 8-STABLE/9.x systems, as the default kmem_size has been expanded to 64 GB or thereabouts on amd64 installs.

And you should not need to tune the arc_max setting anymore, unless you need to limit it (to leave more RAM for other things), as that is now auto-tuned as well.
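You can check what the auto-tuned values came out to on a given system with:
Code:
# sysctl vm.kmem_size vfs.zfs.arc_max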
 
bbzz said:
I use 4K on SSD just like I use on HDD. Not using it as ZIL, though.

How did you figure out what settings you need with a 48 GB RAM system? Out of curiosity, what settings did you use, or did you just copy them from ZFSguru?

I just took the template and filled in my maximum memory minus 4 GB (leaving 1 GB as the guide suggested for the system, plus 3 GB more for no particular reason), and then set the other numbers proportionally high based on my maximum memory.

Here are my settings right now (these go in /boot/loader.conf).
Code:
vm.kmem_size="44g"
vm.kmem_size_max="44g"
vfs.zfs.arc_min="80m"
vfs.zfs.arc_max="42g"
vfs.zfs.arc_meta_limit="24g"
vfs.zfs.vdev.cache.size="32m"
vfs.zfs.vdev.cache.max="256m"
vfs.zfs.vdev.min_pending="4"
vfs.zfs.vdev.max_pending="32"
kern.maxfiles="950000"

arc_meta_limit may be low, but my system isn't full enough to matter (16 disks filled to 30%, with 24 empty bays for the future). None of the above memory limits are ever hit. When I have more data, I expect that something will hit a limit and I will tweak further.

I check to see if I hit my limits with
# zfs-stats -a
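(If zfs-stats isn't installed, I believe it's in ports as sysutils/zfs-stats:)
Code:
# cd /usr/ports/sysutils/zfs-stats && make install clean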
 
How would we go about replacing drives in a mirrored or raidzN vdev previously created with gnop (all vdevs show ashift=12)? Do we need to create a gnop for the new drive and use the gnop device in the zpool replace command?
 
Nope. The ashift is set permanently on the vdev when the vdev is created. Just replace the drives normally, and the ashift won't change.
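So replacing a disk in such a vdev is just the normal command, no .nop needed; for example (device names made up):
Code:
# zpool replace data da4 da5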
 
phoenix said:
Nope. The ashift is set permanently on the vdev when the vdev is created. Just replace the drives normally, and the ashift won't change.

Got it. Thanks.

What if the mirrored vdev (the sample below is for the ZIL) currently has ashift=9 and I want to change it to ashift=12? Are my steps below perfectly safe? Notice the resilver process kicking in after zpool attach. Do I need the extra detach/attach steps to remove the "da2.nop" device from the vdev and use "da2" directly? Is there any difference with having da2.nop show up in zpool status?

[CMD=""]zpool status[/CMD]
Code:
  pool: data
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        data        ONLINE       0     0     0
          da0       ONLINE       0     0     0
          da1       ONLINE       0     0     0
        logs
          da2       ONLINE       0     0     0
          da3       ONLINE       0     0     0

errors: No known data errors

[CMD=""]# zdb|grep ashift[/CMD]
Code:
            ashift: 9
            ashift: 9
            ashift: 9
            ashift: 9

[CMD=""]# zpool remove data da2 da3[/CMD]
[CMD=""]# zpool status[/CMD]
Code:
  pool: data
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        data        ONLINE       0     0     0
          da0       ONLINE       0     0     0
          da1       ONLINE       0     0     0

errors: No known data errors
[CMD=""]# gnop create -S 4096 da2[/CMD]
[CMD=""]# zpool add data log mirror da2.nop da3[/CMD]
[CMD=""]# zpool status[/CMD]
Code:
  pool: data
 state: ONLINE
config:

        NAME         STATE     READ WRITE CKSUM
        data         ONLINE       0     0     0
          da0        ONLINE       0     0     0
          da1        ONLINE       0     0     0
        logs
          mirror-2   ONLINE       0     0     0
            da2.nop  ONLINE       0     0     0
            da3      ONLINE       0     0     0

errors: No known data errors
[CMD=""]# zdb|grep ashift[/CMD]
Code:
            ashift: 9
            ashift: 9
            ashift: 12
[CMD=""]# zpool detach data da2.nop[/CMD]
[CMD=""]# zpool status[/CMD]
Code:
  pool: data
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        data        ONLINE       0     0     0
          da0       ONLINE       0     0     0
          da1       ONLINE       0     0     0
        logs
          da3       ONLINE       0     0     0

errors: No known data errors
[CMD=""]# zdb|grep ashift[/CMD]
Code:
            ashift: 9
            ashift: 9
            ashift: 12
[CMD=""]# gnop destroy /dev/da2.nop[/CMD]
[CMD=""]# zpool attach data da3 da2[/CMD]
[CMD=""]# zpool status[/CMD]
Code:
  pool: data
 state: ONLINE
 scan: resilvered 0 in 0h0m with 0 errors on Fri Feb 15 11:03:08 2013
config:

        NAME        STATE     READ WRITE CKSUM
        data        ONLINE       0     0     0
          da0       ONLINE       0     0     0
          da1       ONLINE       0     0     0
        logs
          mirror-2  ONLINE       0     0     0
            da3     ONLINE       0     0     0
            da2     ONLINE       0     0     0

errors: No known data errors
[CMD=""]# zdb|grep ashift[/CMD]
Code:
            ashift: 9
            ashift: 9
            ashift: 12
 