ZFS: where has the free space gone?

We have a four-disk raidz2 pool made up of 4 x 3 TB drives.
Code:
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
bootpool  1.98G   284M  1.71G        -         -    15%    13%  1.00x  ONLINE  -
zroot     10.6T  8.61T  2.02T        -         -    57%    81%  1.00x  ONLINE  -

According to this we are at 81% utilisation. However, I cannot seem to find where the space has gone:

Code:
zfs list -o space -r zroot | sort -hk3
NAME                     AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
. . .
zroot/vm/inet16/disk0     836G   284G      218G   65.5G              0          0
zroot/vm/inet14           836G   475G      645K    151K              0       475G
zroot/vm/inet14/disk0     836G   475G      374G    101G              0          0
zroot/vm/inet19           836G   539G     1.26M    169K              0       539G
zroot/vm/inet19/disk0     836G   539G      328G    210G              0          0
zroot/vm/inet18           836G   612G      732K    157K              0       612G
zroot/vm/inet18/disk0     836G   612G      499G    112G              0          0
zroot/vm/inet13           836G   647G     1.24M    169K              0       647G
zroot/vm/inet13/disk0     836G   647G      461G    185G              0          0
zroot/vm/inet17           836G  1.36T     1.47M    169K              0      1.36T
zroot/vm/inet17/disk0     836G  1.36T     1006G    391G              0          0
zroot/vm                  836G  4.12T      738K   10.7G              0      4.10T
zroot                     836G  4.17T         0    140K              0      4.17T

If I understand this correctly then the total disk space used should be 4.17T. When I check the size of our snapshots then I see this:

Code:
zfs list -p -t snapshot -o space -r zroot | awk '{ total += $3 }; END { print total }'
1088709139584

Which is just over one Tb. That plus the free space amounts to 3.1 Tb. Where is the other Tb?
 
I don't really understand your math. Can you elaborate?

With raidz the figures are confusing: in zpool list the redundancy is included in "ALLOC" and not subtracted from "FREE"; in zfs list the redundancy is not included in "USED" and is already subtracted from "AVAIL".
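A quick way to see the two accountings side by side is to compare the two commands directly (just a sketch; the column sets vary a little between ZFS versions):
Code:
# Raw (physical) figures, parity included -- what zpool reports:
zpool list -p zroot

# Post-parity figures -- what zfs reports; for a four-disk raidz2 the root
# dataset's USED+AVAIL comes out at roughly half the pool's raw SIZE, minus reservations:
zfs list -p -o name,used,avail zroot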
 
PMc hit the nail on the head:

zpool list reports physical bytes on disk; this includes the bytes used for redundancy (or, for the free space, the bytes that will need to be used for it).

zfs list shows pre-redundancy numbers (and for logicalreferenced, the value is also before the impact of compression or copies).

There is also some reserved space that further reduces the free space reported by zfs compared with what zpool reports, beyond just the redundancy overhead.
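If it helps, the size of that reserve is governed by the spa slop shift tunable (1/2^n of the pool, 1/32 by default), and the logical-versus-physical numbers can be queried per dataset. A small sketch, assuming a FreeBSD box where the tunable is exposed under vfs.zfs:
Code:
# Hidden "slop" reserve: the pool keeps 1/2^spa_slop_shift of its space back (default 5 -> 1/32)
sysctl vfs.zfs.spa_slop_shift

# Logical vs. on-disk usage plus compression ratio for the pool's root dataset:
zfs get used,logicalused,compressratio zroot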
 
I readily admit that I am confused about this. I started off with 4 x 3Tb disks. I created a raidz2 and ended up with ~ 8.8Tb usable. I presently see this in zpool list:
Code:
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
bootpool  1.98G   257M  1.73G        -         -    12%    12%  1.00x  ONLINE  -
zroot     10.6T  8.86T  1.76T        -         -    57%    83%  1.00x  ONLINE  -

I need to recover enough of the space given over to snapshots to get utilisation down to around 70%.

Code:
zfs list -p -o name,avail,used,usedsnap,usedds -r zroot 
NAME                            AVAIL           USED       USEDSNAP        USEDDS
zroot                    759407069440  4722379343616              0        142848
zroot/ROOT               759407069440    32881407360              0        142848
zroot/ROOT/default       759407069440    32881264512    15515643840   17365620672
zroot/tmp                759407069440       14695488       14475264        220224
zroot/usr                759407069440      944576448              0        142848
zroot/usr/home           759407069440         142848              0        142848
zroot/usr/ports          759407069440      944147904              0     944147904
zroot/usr/src            759407069440         142848              0        142848
zroot/var                759407069440      264256896              0        142848
zroot/var/audit          759407069440         142848              0        142848
zroot/var/crash          759407069440         142848              0        142848
zroot/var/log            759407069440      113082048      102856512      10225536
zroot/var/mail           759407069440        2654592        2476032        178560
zroot/var/tmp            759407069440      148091712         821376     147270336
zroot/vm                 759407069440  4662441185856         767808   11516590272
zroot/vm/inet09          759407069440         160704              0        160704
zroot/vm/inet13          759407069440   710384841984        1529664        172608
zroot/vm/inet13/disk0    759407069440   710383139712   510991621824  199391517888
zroot/vm/inet14          759407069440   526271233152         857088        160704
zroot/vm/inet14/disk0    759407069440   526270215360   417189327936  109080887424
zroot/vm/inet16          759407069440   315232764288        1422528        172608
zroot/vm/inet16/disk0    759407069440   315231169152   244743888960   70487280192
zroot/vm/inet17          759407069440  1535427506112        1886784        172608
zroot/vm/inet17/disk0    759407069440  1535425446720  1114848299904  420577146816
zroot/vm/inet18          759407069440   702397847232         952320        160704
zroot/vm/inet18/disk0    759407069440   702396734208   581470331136  120926403072
zroot/vm/inet19          759407069440   588604541184        1547520        172608
zroot/vm/inet19/disk0    759407069440   588602821056   362369866368  226232954688
zroot/vm/samba-01        759407069440   219476155008         708288        172608
zroot/vm/samba-01/disk0  759407069440   219475274112   130993723008   88481551104
zroot/vm/samba-02        759407069440    53128778112        1124928        172608
zroot/vm/samba-02/disk0  759407069440    53127480576    25373935488   27753545088

The majority of the space taken up by snapshots seems to be in these datasets:
Code:
NAME                            AVAIL           USED       USEDSNAP        USEDDS
zroot/vm/inet13/disk0    759407069440   710383139712   510991621824  199391517888
zroot/vm/inet17/disk0    759407069440  1535425446720  1114848299904  420577146816
zroot/vm/inet18/disk0    759407069440   702396734208   581470331136  120926403072

If I look at `zroot/vm/inet17/disk0` in detail I see this:
Code:
zfs list -rt all zroot/vm/inet17 | sort -hk2
. . .
zroot/vm/inet17/disk0@2019-04-26_02.00.00--6w   7.24G      -   389G  -
zroot/vm/inet17/disk0@2019-04-27_02.00.00--6w   7.31G      -   389G  -
zroot/vm/inet17/disk0@2019-04-30_02.00.00--6w   8.06G      -   389G  -
zroot/vm/inet17/disk0@2019-04-29_02.00.00--6w   9.50G      -   389G  -
zroot/vm/inet17/disk0@2018-08-01_00.00.00--2y   12.1G      -   399G  -
zroot/vm/inet17/disk0@2019-02-24_01.00.00--3m   12.3G      -   382G  -
zroot/vm/inet17/disk0@2018-10-01_00.00.00--2y   20.9G      -   367G  -
zroot/vm/inet17/disk0@2019-02-01_00.00.00--2y   25.9G      -   380G  -
zroot/vm/inet17/disk0@2018-11-01_00.00.00--2y   26.6G      -   367G  -
zroot/vm/inet17/disk0@2019-01-01_00.00.00--2y   40.2G      -   376G  -
zroot/vm/inet17                                 1.40T   706G   169K  /zroot/vm/inet17
zroot/vm/inet17/disk0                           1.40T   706G   392G  -

It seems to me that if I remove snapshots up to and including zroot/vm/inet17/disk0@2018-08-01_00.00.00--2y then I should recover approximately 400Gb. If so then that should put utilisation back to about 75%.

Is that a reasonable inference?
 
I started off with 4 x 3Tb disks. I created a raidz2 and ended up with ~ 8.8Tb usable.
That doesn't add up. It does when it's RAID-Z, not RAID-Z2. With 4 x 3TB in RAID-Z you get around 8.16 TB of usable space. With RAID-Z2 that would be around 5.44 TB.

Also keep in mind the difference between metric and binary prefixes. Disks are sold using metric prefixes, not binary prefixes. So a 3 TB drive is 3,000,000,000,000 bytes, not 3298534883328 bytes. The OS (for both UFS and ZFS) uses binary prefixes, so a 3 TB disk is actually about 2.72 TiB (and that's excluding the obvious overhead of formats and disk layouts).
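As a quick check of that conversion with nothing more than sh and bc(1):
Code:
# 3 TB (metric) expressed in TiB (binary); prints 2.72
echo "scale=2; 3 * 10^12 / 2^40" | bc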

Also note the convention for 'b' and 'B': 'b' is generally meant to indicate bits, while 'B' is used to indicate bytes.
 
Another point in addition to SirDice's explanation: the "SIZE" and "ALLOC" columns in zpool list's output are raw figures that include the redundancy overhead. With a four-disk RAID-Z2 only 50% of the raw space is usable, so ALLOC effectively shows your data doubled.
Also keep in mind that when using compression (depending on the content), you can sometimes fit much more into the storage than the displayed figures suggest.
Another point: snapshots also take space. If a lot of content changes and snapshots keep the old versions, it adds up quickly. Check the "USEDSNAP" column (the usedbysnapshots property) to see how much space is held by snapshots alone.
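For a single dataset those figures can be pulled straight from the properties, e.g. for the busiest dataset in this thread:
Code:
zfs get used,referenced,usedbysnapshots,compressratio zroot/vm/inet17/disk0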
 
I readily admit that I am confused about this. I started off with 4 x 3Tb disks. I created a raidz2 and ended up with ~ 8.8Tb usable.

No. 4 x 3 TB = 10.91 TiB of raw space.
Then raidz2 leaves 50% of that, minus a striping tax of some 3.2%, minus some other fees and taxes; you might get 5.24 TB.
Minus the hidden reserve (spa slop shift) of another 3.2%, that gives 5.08 TB = 5582220534218 bytes.
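Roughly the same back-of-the-envelope calculation in sh/bc; the two 3.2% factors are approximations, so it will not land exactly on the figure above:
Code:
# raw TiB -> halve for four-disk raidz2 -> ~3.2% striping tax -> ~3.2% slop reserve
# prints 10.913, 5.456 and finally 5.112 -- in the same ballpark as the ~5.08 TB above
echo "scale=3; raw = 4 * 3 * 10^12 / 2^40; raw; half = raw / 2; half; half * 0.968 * 0.968" | bc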

If I look at `zroot/vm/inet17/disk0` in detail I see this:
Code:
zfs list -rt all zroot/vm/inet17 | sort -hk2
. . .
zroot/vm/inet17/disk0@2019-04-26_02.00.00--6w   7.24G      -   389G  -
zroot/vm/inet17/disk0@2019-04-27_02.00.00--6w   7.31G      -   389G  -
zroot/vm/inet17/disk0@2019-04-30_02.00.00--6w   8.06G      -   389G  -
zroot/vm/inet17/disk0@2019-04-29_02.00.00--6w   9.50G      -   389G  -
zroot/vm/inet17/disk0@2018-08-01_00.00.00--2y   12.1G      -   399G  -
zroot/vm/inet17/disk0@2019-02-24_01.00.00--3m   12.3G      -   382G  -
zroot/vm/inet17/disk0@2018-10-01_00.00.00--2y   20.9G      -   367G  -
zroot/vm/inet17/disk0@2019-02-01_00.00.00--2y   25.9G      -   380G  -
zroot/vm/inet17/disk0@2018-11-01_00.00.00--2y   26.6G      -   367G  -
zroot/vm/inet17/disk0@2019-01-01_00.00.00--2y   40.2G      -   376G  -
zroot/vm/inet17                                 1.40T   706G   169K  /zroot/vm/inet17
zroot/vm/inet17/disk0                           1.40T   706G   392G  -

It seems to me that if I remove snapshots up to and including zroot/vm/inet17/disk0@2018-08-01_00.00.00--2y then I should recover approximately 400Gb.

That may or may not be the case. While the "REFER" column shows the size this specific archived filesystem would occupy if it stood on its own, the "USED" column shows, well, something else:
Files that were created, snapshotted and then deleted again are certainly contained in the respective snapshot(s) and do consume space, but they are not visible in the per-snapshot figures, only in the overall "usedbysnapshots". The space only gets freed once one deletes all of the snapshots taken during the lifetime of such a file - and there seems to be no per-snapshot figure to predict in advance how much that might be.
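One way to get at least an estimate without committing is a dry-run destroy over the snapshot range in question; with the zfsnap-style names from this thread that would look something like:
Code:
# -n = dry run, -v = verbose: prints the snapshots that would be destroyed and a
# "would reclaim ..." total without actually removing anything
zfs destroy -nv zroot/vm/inet17/disk0@2018-07-01_00.00.00--2y%2018-08-01_00.00.00--2y
The reclaim total does account for blocks shared only between the snapshots in the range, which the per-snapshot USED column cannot show.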
 
Thank you for the explanation. We (I) use zfsnap to automatically create snapshots and in my naivete I adopted a rather aggressive snapshot schedule which has turned around and bitten me. We were creating monthly snapshots with a two year expiry date. I have changed that to one year and have destroyed snapshots older than one year. This has gotten us back to 80% utilisation so far.

Our problem appears to be our IMAP server, which is a bhyve VM (inet17). This seems to be consuming the vast majority of the snapshot space, provided that I am reading the report correctly:
Code:
zfs list -t snapshot -o name,used,avail,refer,creation,usedds,usedsnap,origin,compression,compressratio,refcompressratio,mounted,atime,lused | sort -rhk4 | more

NAME                                               USED  AVAIL  REFER  CREATION               USEDDS  USEDSNAP  ORIGIN  COMPRESS  RATIO  REFRATIO  MOUNTED  ATIME  LUSED

zroot/vm/inet17/disk0@2018-07-01_00.00.00--2y     72.2G      -   409G  Sun Jul  1  0:00 2018       -         -  -              -  1.06x     1.06x        -      -      -
zroot/vm/inet17/disk0@2018-08-01_00.00.00--2y     12.1G      -   399G  Wed Aug  1  0:00 2018       -         -  -              -  1.09x     1.09x        -      -      -
zroot/vm/inet17/disk0@2019-05-21_15.10.00--14d     131M      -   392G  Tue May 21 15:10 2019       -         -  -              -  1.12x     1.12x        -      -      -
zroot/vm/inet17/disk0@2019-05-21_14.10.00--14d     130M      -   392G  Tue May 21 14:10 2019       -         -  -              -  1.12x     1.12x        -      -      -
zroot/vm/inet17/disk0@2019-05-21_13.10.00--14d     256M      -   392G  Tue May 21 13:10 2019       -         -  -              -  1.12x     1.12x        -      -      -
zroot/vm/inet17/disk0@2019-05-21_12.10.00--14d     365M      -   392G  Tue May 21 12:10 2019       -         -  -              -  1.12x     1.12x        -      -      -
zroot/vm/inet17/disk0@2019-05-21_11.10.00--14d     573M      -   392G  Tue May 21 11:10 2019       -         -  -              -  1.12x     1.12x        -      -      -
zroot/vm/inet17/disk0@2019-05-21_10.10.00--14d     558M      -   392G  Tue May 21 10:10 2019       -         -  -              -  1.12x     1.12x        -      -      -
zroot/vm/inet17/disk0@2019-05-21_09.10.00--14d     343M      -   392G  Tue May 21  9:10 2019       -         -  -              -  1.12x     1.12x        -      -      -
zroot/vm/inet17/disk0@2019-05-21_08.10.00--14d    79.0M      -   392G  Tue May 21  8:10 2019       -         -  -              -  1.12x     1.12x        -      -      -
zroot/vm/inet17/disk0@2019-05-21_07.10.00--14d    35.8M      -   392G  Tue May 21  7:10 2019       -         -  -              -  1.12x     1.12x        -      -      -
zroot/vm/inet17/disk0@2019-05-21_06.10.00--14d    31.3M      -   392G  Tue May 21  6:10 2019       -         -  -              -  1.12x     1.12x        -      -      -
. . .
zpool list
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
bootpool  1.98G   257M  1.73G        -         -    12%    12%  1.00x  ONLINE  -
zroot     10.6T  8.53T  2.10T        -         -    56%    80%  1.00x  ONLINE  -

Adjusting for raidz2 I take it that this is telling me that I have ~5.3Tb of which 4.26 Tb is allocated and 1.05 Tb is available. Am I close?
 
Our problem appears to be our IMAP server, which is a bhyve VM (inet17). This seems to be consuming the vast majority of the snapshot space, provided that I am reading the report correctly:

Well, that would not surprise me - that is a mail service, so your snapshots probably keep lots of spam. ;)
In the middle term I would consider separating that application's installation and configuration from its payload (i.e. the mails), and keeping snapshots of the latter only for as long as needed.
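For a bhyve guest one way to do that would be a second zvol dedicated to the mail store, with its own (much shorter) zfsnap schedule; a hypothetical sketch, with the dataset name and size invented for illustration:
Code:
# Extra virtual disk for the mail store only (name and size are just examples)
zfs create -V 200G zroot/vm/inet17/disk1

# Attach disk1 to the guest and mount it inside the VM for the mail directories,
# then keep the long-retention zfsnap schedule only on disk0 (the OS disk).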

Adjusting for raidz2 I take it that this is telling me that I have ~5.3Tb of which 4.26 Tb is allocated and 1.05 Tb is available. Am I close?

Then it seems there are still ~220 GB unaccounted for, but that may indeed go with the "fees and taxes"...
 
Adjusting for raidz2 I take it that this is telling me that I have ~5.3Tb of which 4.26 Tb is allocated and 1.05 Tb is available. Am I close?
Yes, that's how I would read it, as far as the zpool figures go.
I don't think your inet17 snapshots take up most of the space, though. The snapshots all seem to refer to about 392G (it's the same value; you don't have to add up the REFER column!), so it looks like your ZVOL has relatively few changes over time and the consumption does not grow significantly.

Can you maybe try this command? It should show you the space consumption per dataset:
zfs list -t all -o space -r zroot
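If the goal is specifically to see which datasets hold the most snapshot data, a variant (not from the posts above) that sorts on the usedbysnapshots property directly may be handier:
Code:
zfs list -p -o name,used,usedbysnapshots -s usedbysnapshots -r zroot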
 
Also, using this command I was able to identify the most space-consuming datasets on my system:
zfs list -t all -o space -p -r zroot | sort -k3
 
Code:
# zfs list -t all -o space -p -r zroot | sort -k3
NAME                                                      AVAIL           USED       USEDSNAP        USEDDS  USEDREFRESERV      USEDCHILD
. . .
zroot/vm/samba-01/disk0                           1031345872576   191718622080   103199251968   88519370112              0              0               
zroot/vm/samba-01                                 1031345872576   191719270848         476160        172608              0   191718622080               
zroot/vm/inet16/disk0                             1031341729984   283339740096   212667459840   70672280256              0              0               
zroot/vm/inet16                                   1031341729984   283340805504         892800        172608              0   283339740096               
zroot/vm/inet14/disk0                             1031320034944   527743037760   418526153088  109216884672              0              0               
zroot/vm/inet14                                   1031320034944   527743966272         767808        160704              0   527743037760               
zroot/vm/inet19/disk0                             1031320719424   545188516416   318650051520  226538464896              0              0               
zroot/vm/inet19                                   1031320719424   545189593728         904704        172608              0   545188516416               
zroot/vm/inet13/disk0                             1031327308288   669427392000   469806942336  199620449664              0              0               
zroot/vm/inet13                                   1031327308288   669428463360         898752        172608              0   669427392000               
zroot/vm/inet18/disk0                             1031340527680   703011492480   582480296256  120531196224              0              0               
zroot/vm/inet18                                   1031340527680   703012426944         773760        160704              0   703011492480               
zroot/vm/inet17/disk0                             1031333272192  1425865810176  1005242815104  420622995072              0              0               
zroot/vm/inet17                                   1031333272192  1425867339840        1357056        172608              0  1425865810176               
zroot/vm                                          1031345872576  4402638522624         422592   11516590272              0  4391121509760               
zroot                                             1031345872576  4450440486912              0        142848              0  4450440344064

It seems to me that INET17 is the largest consumer by a fair margin.
 