ZFS Deleting a directory with an unknown number of files

CanOfBees · Sep 16, 2025

Hi all -

I have a directory containing an unknown number of files - at least 35K at one point. I would like to delete the directory, but every method I've tried runs indefinitely (> ~12 hours), hangs, or runs out of memory. The directory is on a ZFS filesystem, but it isn't a separate ZFS dataset (lesson learned for next time).

So far I have tried the following:
1. # rm -rf ocd-data/ - this hangs with a [zio->io_cv] in ctrl+t
2. Different invocations of rsync trying --delete and --delete-after; # rsync -a --delete-after empty_dir/ ocd-data/ just hang... and never return.
3. I tried a perl solution suggested, # perl -e 'for(<*>){((stat)[9]<(unlink))}', but I don't really know perl... and this, too, never returns.

Does anyone have any suggestions for dropping this directory? I would be very appreciative - thank you in advance for your time and help!

Mods, if it would be better to post this in a different forum please let me know and I'll repost it.

covacat · Sep 16, 2025

does
find ocd-data
finish ?

ralphbsz · Sep 16, 2025

And what does that find command output? The speed of the find command should be at least many hundreds, if not thousands of file names per second.

How big is the file system? How big is the directory, in the sense: If you stat it (for example look at the output from ls -ld on it), how many bytes is it? That should be the number of entries in the directory.

Are there processes that are continuously changing the directory? Is there other activity (such as resilvering) that keeps the file system, the disks, or the machine busy?

For a directory with "just" 35K files, taking 12 hours for any operation seems way out of line. A few minutes seems believable; 12 hours probably means something is broken.

Anecdote: I actually used to ask job candidates the following interview question, for design of (parallel and clustered) file systems: A customer has complained that deleting all files in a directory takes too long, about 3 days. The directory contains a billion files. They attempt the deletion by using 1000 nodes in parallel, each node deleting 1/1000th of the files (taking care that each file is deleted exactly once). Change either how the delete is performed, or the internal design of the file system to make the delete operation faster.

The interview problem is due to a real-world customer complaint, and describes the actual situation that happened at a customer.

Maturin · Sep 16, 2025

I once also had the problem a directory contained too many files they cannot be removed all with one simple
rm anymore.
(I'm not quite sure anymore - it happened six or seven years ago, and I forgot almost all the details: how I got to this directory [something I messed up {of course}]...as far as I remember I examined why rm refused to handle that amount of files was somehow in the rm's code the max number of files is defined [edit: I think I remember it wasn't in rm's code, but the max. definition for wildcards]; however when this 'internally capacity' is overflown, rm refuses to work/does not work properly. But don't nail me on that. As I said it was over six years ago, I didn't documented it, and this certain case may have its cause in other reasons.)

However,
My solution was to remove the files step by step partially:
rm a*
rm b*
rm c*
...
This may not actually suit your current situation, but you get the idea.
Depending on the way the file's are named it may be useful to put that into a small shellscript.

Another idea you could try was to get the dir's content (partially) into a textfile:
ls > mydirscontent.txt
then write a small script:

sh:

#!/bin/sh

while read filename; do
        rm ${filename}
done < mydirscontent.txt

exit 0

USerID · Sep 16, 2025

https://www.cyberciti.biz/faq/howto-linux-unix-delete-remove-file/

SirDice · Sep 16, 2025

find .... | xargs rm is typically a lot more efficient than trying to delete individual files one by one.

Also note that if the pool is full, or almost full, performance is horrendous, even for deleting files.

covacat · Sep 16, 2025

find | xargs is usually better but in some cases wont work (when the absolute / relative to start point path is too long)
i tried this

sh:

i=0;while mkdir dirname$i;do cd dirname$i;touch filename$i;i=$(($i+1));echo -n $i.;done

DO NOT TRY THAT ON YOUR FS (i did on a md specially crafted for this)
and created a 3k component path / like 30k long.
rm -rf worked (uses chdir) but find|rm wont work for the leaves (filename too long)

Charlie_ · Sep 16, 2025

This may not be related to this issue, but as a general rule, deleting a large number of files on a copy-on-write filesystem when there is little free space can be very slow.
This is because metadata needs to be copied when deleting, but it is not possible to secure contiguous free space for this purpose.
I haven't actually experienced it though.

covacat · Sep 16, 2025

also if the large tree has snapshots it is even more ...

cracauer@ · Sep 16, 2025

You can try reducing the number of files in the dir incrementally by
rm a*
rm b*
[...]
rm z*

mro · Sep 16, 2025

Code:

$ find /<dir>/ -delete

find(1)

man.freebsd.org

CanOfBees · Sep 16, 2025

covacat said:
does
find ocd-data
finish ?

hi covacat - no, it doesn't.

RussellASC · Sep 16, 2025

mro said:
Code:

$ find /<dir>/ -delete

find(1)

man.freebsd.org

This should be the best one. Consider "find /dir/ -print -delete" so you can watch the output as it goes. Just one access of the file metadata, and doesn't have to launch extra processes for every file.

Find to xargs rm means passing thousands of filenames and may require a second access of the file metadata (ie: lookup the name in rm after find already did). Xargs will have to spawn many rm processes for further overhead.

"rm a*" is a poor choice, because shell globbing is fragile in case of overwhelming numbers of files. It will likely also require multiple file metadata accesses (shell globbing then rm). It may also iterate the whole directory metadata in order to filter out the files with "A". Every rm will start from scratch.

Typically when deleting/copying/listing directories with huge numbers of files, the bulk of the time is spent in iterating over the filenames and metadata for each file. File allocation tables are often slow and single threaded, depending on the filesystem type. Anything you can do to minimize the number of times you iterate over the file list should help.

CanOfBees · Sep 16, 2025

ralphbsz said:
And what does that find command output? The speed of the find command should be at least many hundreds, if not thousands of file names per second.

How big is the file system? How big is the directory, in the sense: If you stat it (for example look at the output from ls -ld on it), how many bytes is it? That should be the number of entries in the directory.

Are there processes that are continuously changing the directory? Is there other activity (such as resilvering) that keeps the file system, the disks, or the machine busy?

For a directory with "just" 35K files, taking 12 hours for any operation seems way out of line. A few minutes seems believable; 12 hours probably means something is broken.

Anecdote: I actually used to ask job candidates the following interview question, for design of (parallel and clustered) file systems: A customer has complained that deleting all files in a directory takes too long, about 3 days. The directory contains a billion files. They attempt the deletion by using 1000 nodes in parallel, each node deleting 1/1000th of the files (taking care that each file is deleted exactly once). Change either how the delete is performed, or the internal design of the file system to make the delete operation faster.

The interview problem is due to a real-world customer complaint, and describes the actual situation that happened at a customer.

hi ralphbsz - thanks for your response!

Code:

) ls -ld ocd-data
drwxr-xr-x  2 bridger bridger 31408165 Feb  2  2025 ocd-data/

Are there processes that are continuously changing the directory? Is there other activity (such as resilvering) that keeps the file system, the disks, or the machine busy?

For a directory with "just" 35K files, taking 12 hours for any operation seems way out of line. A few minutes seems believable; 12 hours probably means something is broken.

No processes are changing the directory, and, to the best of my knowledge, no reslivering. It definitely seems like something is broken - hopefully someone can help me figure out the "what"!

CanOfBees · Sep 16, 2025

Maturin said:
I once also had the problem a directory contained too many files they cannot be removed all with one simple
rm anymore.
(I'm not quite sure anymore - it happened six or seven years ago, and I forgot almost all the details: how I got to this directory [something I messed up {of course}]...as far as I remember I examined why rm refused to handle that amount of files was somehow in the rm's code the max number of files is defined [edit: I think I remember it wasn't in rm's code, but the max. definition for wildcards]; however when this 'internally capacity' is overflown, rm refuses to work/does not work properly. But don't nail me on that. As I said it was over six years ago, I didn't documented it, and this certain case may have its cause in other reasons.)

However,
My solution was to remove the files step by step partially:
rm a*
rm b*
rm c*
...
This may not actually suit your current situation, but you get the idea.
Depending on the way the file's are named it may be useful to put that into a small shellscript.

Another idea you could try was to get the dir's content (partially) into a textfile:
ls > mydirscontent.txt
then write a small script:

sh:

#!/bin/sh while read filename; do rm ${filename} done < mydirscontent.txt exit 0

Hi Maturin - I think that may be part of what happened here: too many args (or files) for the rm command. I don't remember the syntax of the file names in the directory, which prevents your helpful suggestion, and ls > dir_contents.txt won't ever complete.

Thanks for the response!

CanOfBees · Sep 16, 2025

USerID said:
https://www.cyberciti.biz/faq/howto-linux-unix-delete-remove-file/

thanks USerID - that's a helpful link, but sadly doesn't really provide any extra information or suggestion for the problem at hand.

CanOfBees · Sep 16, 2025

SirDice said:
find .... | xargs rm is typically a lot more efficient than trying to delete individual files one by one.

Also note that if the pool is full, or almost full, performance is horrendous, even for deleting files.

hi SirDice - thanks for the response.

mro said:
Code:

$ find /<dir>/ -delete

find(1)

man.freebsd.org

thanks, mro - I appreciate the response.

Edit: RussellASC, thank you, too, for the suggestion and thought behind the command choice. I'll post back when an update shortly.

Globbing the find... suggestions together. I'll give the find /<dir>/ -delete a try, but since a basic find ocd-data/ never returns, I don't have high hopes for this approach (but will soon see!).

Code:

root@dustbin:/astral/errata # find ocd-data/
ocd-data/
load: 0.19  cmd: find 73678 [zio->io_cv] 9037.96r 1.09u 7.40s 0% 871892k
load: 0.14  cmd: find 73678 [zio->io_cv] 10543.53r 1.34u 8.47s 0% 1038768k
load: 0.17  cmd: find 73678 [zio->io_cv] 11129.11r 1.39u 8.80s 0% 1102472k

eternal_noob · Sep 16, 2025

Maybe do it the other way round. Move all the stuff you want to keep elsewhere and then format the filesystem.

fjdlr · Sep 16, 2025

Maybe try with a live cd, I had a similar situation, but with UFS, and starting with Nomad boot I managed to do a nice rm -rf

Maturin · Sep 16, 2025

CanOfBees said:
ls > dir_contents.txt won't ever complete.

f4##! You really have a problem.

CanOfBees said:
drwxr-xr-x 2 bridger bridger 31408165 Feb 2 2025 ocd-data/

that's not 35k files, that's more kind of 31M - that's in deed a lot more I had dealt with.

Since this dir was created on Feb 2nd, a quick calculation tells me that's (in average) 138974 files per day, which was 96 new files every minute - are you sure the "production of new files" to that directory is stopped?

Can you do a df -h on the disk/partition the filesystem in question is on?
And maybe an zfs list, too?

Anyway I would at least try something like a rm /path/to/dir/111* rm /path/to/dir/112* on it - partially, but try to kill what can be killed, just to reduce the number of files to get to a state where actual work can be done again.

Do you have any idea how the filenames may look like?
At this magnitude I guess they are somehow numerated.

CanOfBees · Sep 16, 2025

eternal_noob said:
Maybe do it the other way round. Move all the stuff you want to keep elsewhere and then format the filesystem.

hey eternal_noob - that may be a future step! Thanks for the suggestion!

bakul · Sep 16, 2025

Use the -f option with ls to prevent sorting.

You can try "rm -rvx ocd-data" to see what it prints -- you should see some progress or error messages. The -x option to avoid crossing mounts. You can ^C rm any time so this is just to see what it does. Given 31M files, you may want to point its stderr and stdout to a file.

Erichans · Sep 16, 2025

CanOfBees said:
or runs out of memory.

I take it that you're running bare metal?
How much memory does your system have?

CanOfBees · Sep 16, 2025

Maturin said:
f4##! You really have a problem.

that's not 35k files, that's more kind of 31M - that's in deed a lot more I had dealt with.

I probably should have said "at least 35K".

Maturin said:
Since this dir was created on Feb 2nd, a quick calculation tells me that's (in average) 138974 files per day, which was 96 new files every minute - are you sure the "production of new files" to that directory is stopped?

Yes - no new files are being written to the directory.

Maturin said:
Can you do a df -h on the disk/partition the filesystem in question is on?
And maybe an zfs list, too?

 

root@dustbin:/astral # df -h astral

Filesystem    Size    Used   Avail Capacity  Mounted on

astral        1.7T    239G    1.5T    14%    /astral

Maturin said:
Anyway I would at least try something like a rm /path/to/dir/111* rm /path/to/dir/112* on it - partially, but try to kill what can be killed, just to reduce the number of files to get to a state where actual work can be done again.

Do you have any idea how the filenames may look like?
At this magnitude I guess they are somehow numerated.

Unfortunately, the files were generated with a random string - that's making small batch deletion tricky.

CanOfBees · Sep 16, 2025

Erichans said:
I take it that you're running bare metal?
How much memory does your system have?

hi Erichans - only 12GB, and yes, bare metal.

ZFS Deleting a directory with an unknown number of files

Administrator