Out of space

Hi all,

Could you advise me on how to resolve the following situation:
Code:
# rm /usr/ports/distfiles/wget-1.14.tar.xz
rm: /usr/ports/distfiles/wget-1.14.tar.xz: No space left on device

# zfs destroy zroot@2013.11.07_0959
cannot destroy snapshot zroot@2013.11.07_0959: dataset is busy

# zfs holds zroot@2013.11.07_0959
NAME                   TAG            TIMESTAMP
zroot@2013.11.07_0959  .send-46676-1  Thu Nov  7  9:59 2013

# zfs release .send-46676-1 zroot@2013.11.07_0959
cannot release: out of space

# uname -a
FreeBSD server 9.2-RELEASE FreeBSD 9.2-RELEASE #4: Sun Oct 20 11:58:09 EEST 2013     root@server:/usr/obj/usr/src/sys/GENERIC  amd6
 
What does zpool status tell you? And could you also give us the output of zfs list?
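
For reference, the output I'm after would come from something like this ( zfs list -o space is just a convenient shortcut that adds the per-dataset space-accounting columns):
Code:
# zpool status
# zpool list
# zfs list -o space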

This looks pretty bizarre. I've never been in a situation where any of my ZFS filesystems ended up filled beyond capacity, but as far as I know ZFS should have some form of failsafe to prevent issues like these.

Is there any chance of using a rescue CD to access this system?
 
If your pool is 100% full, and zfs list shows 0 bytes available, you just might be screwed. You need to find a filesystem that has 0 snapshots on it. And you need to delete some files from that filesystem, in order to free up some space in the pool. Then, and only then, can you delete files from filesystems with snapshots, or delete snapshots themselves.

Why? Because deleting a file from a filesystem with snapshots doesn't free up space :) It "moves" the blocks from that file into the previous snapshot, and writes out metadata showing it's deleted. If you have 0 bytes free, that metadata can't get written, thus the file can't be deleted.

And you can't delete snapshots either, as blocks that are common between snapshots get "moved" into the previous snapshot, and metadata needs to be written out to do so.
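
To make that concrete, the recovery sequence looks roughly like this (the rm path is just a made-up example; pick a dataset that shows 0 in the USEDSNAP column, free some files there, and only then go after the held and snapshotted data, as in the OP's case):
Code:
# zfs list -r -o name,used,usedbysnapshots zroot
# rm /var/log/archive/old.log
# zfs release .send-46676-1 zroot@2013.11.07_0959
# zfs destroy zroot@2013.11.07_0959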

Just ran into this exact problem this morning. Thankfully, I don't snapshot my archived logs filesystem, so I was able to remove some year-old logs and free up enough space to start deleting 2-year-old snapshots. :)

There was also a thread about this issue on the zfs-discuss mailing list this month. One of the recommendations there was to create a "do-not-delete" filesystem and set a 1 GB reservation on it. Don't use it, don't mount it, and don't snapshot it. Then, if you ever hit 100% usage again, you just decrease the reservation to free up space in the pool. :)

Code:
# zfs create -o reservation=1G -o mountpoint=none -o compress=off -o dedup=off <poolname>/do-not-delete
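
Then, if the pool ever does hit 100% again, freeing space is just a matter of lowering or dropping that reservation (and setting it back to 1G once you've cleaned up):
Code:
# zfs set reservation=none <poolname>/do-not-delete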
 
Interesting.

You'd think they would have seen this coming and used the same solution that's been used on other file systems for years to stop users getting themselves into trouble: reserve a few percent purely for internal use.

In fact, they do reserve 1.6%, which is quite a lot on most pools. The comment even mentions making this space usable for rm(1), although it specifically relates to freeing allocations, and I'm not sure that helps if snapshots mean the file's blocks aren't actually freed.

Code:
	/*
	 * Reserve about 1.6% (1/64), or at least 32MB, for allocation
	 * efficiency.
	 * XXX The intent log is not accounted for, so it must fit
	 * within this slop.
	 *
	 * If we're trying to assess whether it's OK to do a free,
	 * cut the reservation in half to allow forward progress
	 * (e.g. make it possible to rm(1) files from a full pool).
	 */
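
To put a rough number on that 1/64: on a ~460 GB pool (the size below is just an illustrative figure) it works out to a bit over 7 GB held back:
Code:
# echo $((495879790592 / 64))
7748121728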
 
Thanks for the advice.

The problem was solved by growing the partition and expanding the pool with zpool online -e <pool> <partition>.
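
For anyone else who ends up going that route, the steps are roughly as follows; the disk name and partition index here are made up, so check gpart show for your own layout first:
Code:
# gpart show ada0
# gpart resize -i 3 ada0
# zpool online -e zroot ada0p3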

Maybe the reservation was added in a later release? The pool was created under 9.1 and later upgraded.
 
phoenix said:
Just ran into this exact problem this morning. Thankfully, I don't snapshot my archived logs filesystem, so I was able to remove some year-old logs and free up enough space to start deleting 2-year-old snapshots. :)

Just out of curiosity, how does the pool behave at such capacity?

I have noticed that anything above 80% degrades performance dramatically.
 
Surprisingly well.

Previous pools on earlier versions of FreeBSD (and ZFS) would slow to a crawl once they got above 80-90% full. A single rsync of a remote server would not complete, let alone a full backup run.

Now the pools are only slightly slower, but they still complete full backup runs every night and full zfs send syncs during the day. FreeBSD 9-STABLE (pre-9.2) and ZFS v5000.

One server (betadrive) has
Code:
dedup=off
compress=lz4
The other server (omegadrive) has
Code:
dedup=sha256
compress=lzjb
The former hit 100% full, and is now running at 95%. The latter was at 97% full, and is now running at 91%.

Neither of them are saturating a gigabit link anymore. But they aren't so slow that we can't use them anymore. :)

Definitely wouldn't want to use a pool over 85% full if throughput is important. :)
 
Might be relevant here to point out that when a zpool is almost full, the allocation algorithm changes from first fit to best fit, causing additional write performance loss. I think the cutoff is 96% full in recent FreeBSDs. I believe it used to be 70% or 80%.

It was probably changed to improve write performance below the cutoff, but at the cost of creating additional fragmentation. IIRC the ZFS authors recommended not ever filling zpools more than 80% full to maintain performance and prevent fragmentation. That's important since there's no in-place defrag for ZFS.

Maybe the smartest policy is to set quotas (zfs set quota=xxx) or use the trick phoenix mentioned to keep your zpool from filling beyond 80%. That way you maintain good write performance, don't have to worry about fragmentation, and can easily lift the restrictions temporarily if a dataset fills up.
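
For example, either cap the top-level dataset with a quota or keep a chunk of space reserved in a dataset you never use (pool and dataset names below are placeholders, and the sizes assume a 1 TB pool):
Code:
# zfs set quota=800G tank
# zfs set reservation=200G tank/do-not-delete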
 