I am currently replacing a lot of old backup patchwork, made up of AMANDA jobs and lots of shell scripts, with ZFS-based backups.
One usage scenario for ZFS replication is our smartOS zones, where the zone datasets (mostly zvols for KVM VMs) are snapshotted and send|received to a storage server. With these datasets I see _much_ higher disk usage on FreeBSD compared to the original datasets on illumos/smartOS.
Both pools use the same ashift (=12 for 4k drive alignment) and the datasets are set to use LZ4 compression. However, the actual used space (USED & REFER) for the replicated datasets on FreeBSD is nearly twice as high as on illumos.
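For reference, this is roughly how both settings can be cross-checked (just a sketch - zdb's output formatting differs a bit between the two platforms):
Code:
# ashift is reported per vdev in the cached pool configuration
smartOS # zdb -C zones | grep ashift
FBSD # zdb -C stor1 | grep ashift
# compression settings of the source and the replicated dataset
smartOS # zfs get compression,compressratio zones/c86060a8-15b3-c641-d3e9-9cb03a1d6878-disk0
FBSD # zfs get compression,compressratio stor1/backups/zones/winsrv1/c86060a8-15b3-c641-d3e9-9cb03a1d6878-disk0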
Code:
smartOS # uname -a
SunOS vhost1 5.11 joyent_20170706T001501Z i86pc i386 i86pc
smartOS # zfs list -o name,used,lused,refer,usedbysnapshots,compress,compressratio,dedup zones/c86060a8-15b3-c641-d3e9-9cb03a1d6878-disk0
NAME USED LUSED REFER USEDSNAP COMPRESS RATIO DEDUP
zones/c86060a8-15b3-c641-d3e9-9cb03a1d6878-disk0 57.0G 94.9G 45.7G 11.3G lz4 1.68x off
Code:
FBSD # uname -a
FreeBSD stor1 11.0-RELEASE-p10 FreeBSD 11.0-RELEASE-p10 #5 r309898M: Fri May 5 12:14:20 CEST 2017 root@stor1:/usr/obj/usr/src/sys/NETGRAPH_VIMAGE amd64
FBSD # zfs list -o name,used,lused,refer,usedbysnapshots,compress,compressratio,dedup -r stor1/backups/zones/winsrv1/c86060a8-15b3-c641-d3e9-9cb03a1d6878-disk0
NAME USED LUSED REFER USEDSNAP COMPRESS RATIO DEDUP
stor1/backups/zones/winsrv1/c86060a8-15b3-c641-d3e9-9cb03a1d6878-disk0 97,4G 95,0G 78,3G 19,1G lz4 1.31x off
This output is from a freshly replicated dataset (ssh zfs send -peDR | zfs recv -ue). I can see similar behavior with other replicated datasets, where FreeBSD is using ~70-100% more disk space than the original dataset on smartOS.
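Spelled out, that pipeline looks roughly like this (the snapshot name and receive target here are placeholders, not the exact values I used):
Code:
# replication stream from smartOS to the FreeBSD storage host
# @backup-snap is a placeholder snapshot name
FBSD # ssh root@10.10.2.100 zfs send -peDR zones/c86060a8-15b3-c641-d3e9-9cb03a1d6878-disk0@backup-snap | zfs recv -ue stor1/backups/zones/winsrv1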
To take the replication-stream variables out of the equation, I transferred just the current state of the dataset to the FreeBSD host:
Code:
FBSD # ssh root@10.10.2.100 zfs send -e zones/c86060a8-15b3-c641-d3e9-9cb03a1d6878-disk0@--head-- | zfs recv -ue stor1/test
FBSD # zfs list -o name,used,lused,refer,compress,compressratio -r stor1/test/c86060a8-15b3-c641-d3e9-9cb03a1d6878-disk0
NAME USED LUSED REFER COMPRESS RATIO
stor1/test/c86060a8-15b3-c641-d3e9-9cb03a1d6878-disk0 76,1G 75,9G 76,1G lz4 1.34x
Also note how LUSED is lower than USED and/or REFER - shouldn't it always be higher when compression is in use? At least that's what I see everywhere else, both with "native" datasets and with datasets replicated from other FreeBSD systems, except for very small datasets (just a few kB), where more metadata is written/aggregated than actual data...
Let's see what happens when sending this dataset back to smartOS:
Code:
FBSD # zfs send -e stor1/test/c86060a8-15b3-c641-d3e9-9cb03a1d6878-disk0 | ssh root@10.10.2.100 zfs recv -ue zones/test
Code:
smartOS # zfs list -ro name,used,lused,refer,compress,compressratio zones/test/c86060a8-15b3-c641-d3e9-9cb03a1d6878-disk0
NAME USED LUSED REFER COMPRESS RATIO
zones/test/c86060a8-15b3-c641-d3e9-9cb03a1d6878-disk0 43.3G 75.7G 43.3G lz4 1.76x
smartOS # zfs list -ro name,used,lused,refer,compress,compressratio zones/c86060a8-15b3-c641-d3e9-9cb03a1d6878-disk0@--head--
NAME USED LUSED REFER COMPRESS RATIO
zones/c86060a8-15b3-c641-d3e9-9cb03a1d6878-disk0@--head-- 1.20G - 43.3G - 1.76x
REFER is back to its original size and COMPRESSRATIO is back up at 1.76x, just like the original source.
FreeBSD used nearly 33GB more space (+75% !!) for the same data...
For one, the LZ4 implementation in FreeBSD seems to be much less efficient than the one illumos is using (assuming the COMPRESSRATIO values are correct). But even that wouldn't account for all of the additional space used on FreeBSD, so there may be another problem on top of it (metadata?).
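If anyone wants to dig in, I guess comparing logical vs. physical sizes and the data/metadata breakdown on both pools would be a starting point - roughly like this (a sketch using the datasets from above; zdb -bb walks the whole pool and takes a while):
Code:
# exact (parseable) logical vs. on-disk numbers for the zvol on both sides
smartOS # zfs get -p used,logicalused,referenced,logicalreferenced,volblocksize,compressratio zones/c86060a8-15b3-c641-d3e9-9cb03a1d6878-disk0
FBSD # zfs get -p used,logicalused,referenced,logicalreferenced,volblocksize,compressratio stor1/test/c86060a8-15b3-c641-d3e9-9cb03a1d6878-disk0
# per-object-type breakdown (LSIZE/PSIZE/ASIZE) - shows whether the extra space goes to data or metadata
FBSD # zdb -bb stor1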
Can anyone try ZFS send|receive between FreeBSD and illumos machines and confirm (or refute) these findings?
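A minimal test could look something like this (dataset, pool and host names are just examples):
Code:
# create an LZ4-compressed test zvol on illumos
smartOS # zfs create -V 10G -o compression=lz4 zones/lz4test
# (write some compressible test data to /dev/zvol/rdsk/zones/lz4test, e.g. from an existing VM image)
smartOS # zfs snapshot zones/lz4test@xfer
smartOS # zfs send -e zones/lz4test@xfer | ssh root@freebsd-host zfs recv -ue tank/test
# then compare USED/LUSED/REFER/compressratio on each host:
smartOS # zfs list -o name,used,lused,refer,compressratio zones/lz4test
FBSD # zfs list -o name,used,lused,refer,compressratio tank/test/lz4test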
I've never really looked behind the curtains of ZFS (or any filesystem, for that matter...), so if any dev or filesystem wizard could give me some insight on where to start figuring out what is going wrong here, I'd be happy to try to shed some light on this (possible) issue.