Typically the problem with "rm *" from the shell is the following: the shell first expands the glob, meaning it turns the command line internally from "rm *" into "rm aaa aab aac ...". But the shell has a limit on the length of the command line (ARG_MAX, which used to be about 10K bytes), and the expanded argument list simply doesn't fit. That's why the trick with "rm a*", "rm b*" ... may work, if each rm command individually fits into the limit.

I once also had the problem that a directory contained too many files for them all to be removed with one simple rm. (I'm not quite sure anymore - it happened six or seven years ago, and I forgot almost all the details: ...
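A quick sketch of the usual way around that limit (the directory path is just a placeholder): have find hand the names to rm in batches, or let find unlink them itself, so no single argument list ever has to hold all of the names.

Code:
# Batch the names through xargs so no single rm invocation exceeds the argument-list limit.
# -print0 / -0 keep file names containing spaces or newlines intact.
find /path/to/bigdir -type f -print0 | xargs -0 rm
# Or skip rm entirely and let find do the unlinking:
find /path/to/bigdir -type f -delete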
As said above, that's 31 million files. Your file system has been busy. I hope it is not currently as busy. To make sure your deletion has a chance, I would put the system into single user mode, and carefully check that no other processes are running. You said that nothing is being written right now ... but given that you have been creating about two files per second for the last ~8 months, I'm not sure we can trust that statement.

Code:
) ls -ld ocd-data
drwxr-xr-x  2 bridger  bridger  31408165 Feb  2  2025 ocd-data/
In that case, I strongly suspect that your ZFS file system has become corrupted. Even with 31 million files, just reading the directory once (which is what find does) should finish, and it should make progress. Unfortunately, that leads to the following solution:

hi covacat - no, it doesn't.
Since you are not interested in debugging how ZFS got broken, this is likely the easiest answer. Requires an extra set of disks though. Once you have the new file system formatted, it's easy to do: rsync with an --exclude option.

Maybe do it the other way round. Move all the stuff you want to keep elsewhere and then format the filesystem.
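A minimal sketch of the rsync approach with --exclude (the source and destination paths are placeholders for the old and freshly created file systems): copy everything except the runaway directory.

Code:
# Copy everything except the huge directory onto the new file system.
# -a preserves permissions/ownership/times, -H keeps hard links,
# --exclude is matched relative to the source root.
rsync -aH --exclude 'ocd-data/' /old-fs/ /new-fs/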
Okay. That cleared a lot of things:

Unfortunately, the files were generated with random strings as names - that makes small-batch deletion tricky.
Yeah, right! That was it. I remember now.

The shell first expands the glob, meaning it turns the command line internally from "rm *" into "rm aaa aab aac ...". But the shell has a limit on the length of the command line (ARG_MAX, which used to be about 10K bytes), and the expanded argument list simply doesn't fit.
Why would rm use swapspace?

Do you have swapspace and does it fill up when you do a `rm -rf` on the dir?
Why is this better than just doing "rm -rf", unless you only want to remove a specific subset? If "rm -rv" shows progress, eventually it will finish.

it is better to do "find ... -print | xargs rm"
Good point, "rm -rf ocd-data" is better than "find ... | xargs ... rm".Why is this better than just doing "rm -rf", unless you only want to remove a specific subset? If "rm -rv" shows progress, eventually it will finish.
No new files have been written to this directory for a long while.

As said above, that's 31 million files. Your file system has been busy. I hope it is not currently as busy. To make sure your deletion has a chance, I would put the system into single user mode, and carefully check that no other processes are running. You said that nothing is being written right now ... but given that you have been creating about two files per second for the last ~8 months, I'm not sure we can trust that statement.
In that case, I strongly suspect that your ZFS file system has become corrupted. Even with 31 million files, just reading the directory once (which is what find does) should finish, and it should make progress. Unfortunately, that leads to the following solution:
I am interested in how/why ZFS is broken, but I'll readily confess that I may lack the skills to debug it completely.

Since you are not interested in debugging how ZFS got broken, this is likely the easiest answer. Requires an extra set of disks though. Once you have the new file system formatted, it's easy to do: rsync with an --exclude option.
This was a data processing project - I would definitely reconsider serializing files like this again; definitely not without using a ZFS dataset, and I would strongly consider using a pear tree for handling on-disk file storage.

In the future, may I ask you to reconsider the architecture of your system? You're creating ~2 files per second, sustained. Do you actually need all of them? If the files are just a write-only cache of recent results (commonly used for checkpointing processes), they are mostly unneeded once the process finishes. Could you implement a cleaning mechanism that limits the number of files? Or maybe have a small number of files (for example just two), and overwrite them in place with a ping-pong mechanism? If your files are created and deleted at random times, and they are individually small, perhaps a database is a better mechanism than a file system.
Another question is the following: How a directory is stored on disk depends on the file system implementation. I do not know how ZFS does it (never having studied its internals), but I know that some file systems really struggle with large directories, because they internally rely on linear searches of the directory for all operations. You might be much better off structuring your workload to create many smaller subdirectories. For example, if your data is storing recently acquired data, you could do ocd-data/September/Week3/Tuesday/09am/40:37.123.data for the file created today at 09:40:37.123, and the first few layers of directories would only have a handful of entries (the last one, however, would still have about 7200). This might overall run much faster, and make deletion and cleanup much easier.
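A rough sh sketch of that kind of time-based sharding (the exact layout and file names here are illustrative, not taken from the thread's project):

Code:
# Shard by month / day / hour so no single directory grows past a few thousand entries.
# (A real script would capture the timestamp once; this is only an illustration.)
dir="ocd-data/$(date +%Y-%m)/$(date +%d)/$(date +%H)"
mkdir -p "$dir"
mv result.data "$dir/$(date +%M.%S).$$.data"   # $$ (the PID) as a cheap uniqueness suffix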
zfs behavior is far too complex for that to work! Also recall that zfs is copy on write, so even changing one byte will result in multiple writes all the way up to the zfs superblock (or whatever they call it). It may be that deletes are indeed write-through (in some higher-level sense), which is why they are slow. But if you want to see what is going on at the disk level, maybe use ktrace on bhyve (for a VM that is using zfs).

I have a question that may sound unrelated, but isn't: Is there a way to configure a zpool or zfs file system so all disk IOs have to be synchronous?
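For what it's worth, ZFS does expose a per-dataset sync property that forces every write to be committed to stable storage before the call returns (it governs writes via the ZIL, not reads). A sketch with a placeholder pool/dataset name:

Code:
# 'standard' is the default; 'always' forces synchronous semantics for all writes.
zfs set sync=always yourpool/yourdataset
zfs get sync yourpool/yourdataset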
rm -rf ocd-data/

, but I'm not quite sure how to proceed with ktrace. Would

# ktrace -i rm -rf ocd-data/

work? Based on the ktrace man page that seems like a good start, but if there's a better/more appropriate invocation I'd like to know it.
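For reference, a hedged sketch of the two usual ktrace workflows (the PID below is a placeholder): trace a fresh command, or attach to one that is already running, then decode ktrace.out with kdump.

Code:
# Trace a new invocation; -i makes child processes inherit tracing. Output goes to ktrace.out.
ktrace -i rm -rf ocd-data/
kdump | less

# Or attach to an rm that is already running, let it collect for a while, then stop tracing.
ktrace -p 12345      # 12345 is a placeholder PID (find the real one with pgrep rm)
ktrace -C            # stop/clear all tracing you started
kdump -f ktrace.out | tail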
bakul - yes,

You can also attach ktrace to a running process.
But you don't need that if you do "rm -rvx ocd-data". What does it show? It should show filenames as it deletes them. Does it continue showing more names or does it stop?
PS: an interesting choice of directory name!
Indeed. Two old jokes: I don't suffer from OCD, I enjoy it. And my neighbors call it "obsessive cycling disease", they spend several hours per day bicycling around the hills.
Seriously: I like the suggestion of rm -rvx; just by looking at the output on the console, one can see roughly how fast it is going.
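One low-tech way to put a number on "roughly how fast" (a sketch; the log file name is arbitrary): redirect the -v output to a file and count lines every so often.

Code:
# Each deleted file prints one line, so the line count is the number of files removed so far.
rm -rvx ocd-data/ > rm-progress.log 2>&1 &
# ...later, repeat as often as you like:
wc -l rm-progress.log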
# truss -o ocd-data-delete-2.txt rm -rvx ocd-data/

is rolling right along, it seems. While the output file has grown too long to tail -f, lc shows the file is growing.

Code:
) lc -l ocd-data-delete-2.txt
-rw-r--r--  1 root  bridger   271M Sep 16 16:23 ocd-data-delete-2.txt
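To get an actual progress figure out of that growing trace file (a sketch; it assumes each deleted file shows up as one unlink/unlinkat call in the truss output):

Code:
# Count deletion syscalls logged so far; rerun periodically to watch the number climb.
grep -c unlink ocd-data-delete-2.txt
# Peek at the most recent trace lines without paging through the whole file:
tail -n 5 ocd-data-delete-2.txt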
Just checking:
Have you changed any ZFS parameters/sysctls, especially anything related to ZFS' ARC? If so, which ones and how?
Are you perhaps using dedup? If so, is it still enabled?
What version of FreeBSD are you running; when, approximately, was the pool created, mainly under what OS version?

The pool was created in 2017:

Code:
2017-08-02.20:49:11 zpool create astral /dev/ada2
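A sketch of how to pull those answers straight from the system (the pool name astral comes from the history line above; the sysctl name can differ between ZFS versions):

Code:
zpool history astral | head -n 2   # when and how the pool was created
zfs get dedup astral               # whether deduplication is enabled
sysctl vfs.zfs.arc_max             # ARC cap, if it has been tuned (vfs.zfs.arc.max on newer OpenZFS)
freebsd-version -ku                # kernel and userland versions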
# truss -o ocd-delete.txt rm -rvx ocd-data/
So, I'm instead moving the important work off of this filesystem and I'll just nuke it from orbit and re-create the filesystem.
Noooo
Don't give in. For all we know the delete would work if you execute it on a server with, say, 256 GB RAM.
If you don't have a spare computer with that much RAM, there is an easy way to do it. Rent an AWS EC2 machine with 256 GB, export the disks the pool is on via iSCSI, mount that on the EC2 server, and there you go. Easy peasy.
It’s truss that is appending to your ocd-delete.txt file as it traces what syscalls rm makes. You should read the truss and rm manpages carefully. Note that rm -v will output names as it deletes them. See my last reply. Sigh.

# truss -o ocd-delete.txt rm -rvx ocd-data/
but I've just noticed (after too many hours) that this process doesn't delete files as it appends them to the `ocd-delete.txt` file - it waits until all files are accumulated(?)/touched by the `rm` process, and then they are `unlinked`. My mistake.
Hi bakul - don't sigh.

It’s truss that is appending to your ocd-delete.txt file as it traces what syscalls rm makes. You should read the truss and rm manpages carefully. Note that rm -v will output names as it deletes them. See my last reply. Sigh.
rm -rvx ocd-data/

would sit and sit and sit and never print a thing: I don't know if the problem was trying to buffer the list, or some other reason. rm never showed excessive RAM usage in top / htop, but also never gave any appearance that it was actually deleting the files. Sadly, I think in this particular case of "mistake", the right approach was to move the important files off the drive, delete the dataset, and recreate it. The original delete-all-the-files approach would have taken me another week - the ZFS fix took about 30 minutes, counting the time it took to move files around with rsync.
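For completeness, a hedged sketch of that "move the good data, destroy the dataset, recreate it" sequence (the pool/dataset names and mountpoints are placeholders, not the thread's actual layout):

Code:
# 1. Park the data worth keeping on a scratch dataset.
zfs create astral/keep
rsync -aH --exclude 'ocd-data/' /astral/data/ /astral/keep/
# 2. Destroy the dataset holding the 31-million-file directory and start fresh.
zfs destroy -r astral/data
zfs create astral/data
# 3. Move the kept data back.
rsync -aH /astral/keep/ /astral/data/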
It's impossible, reading forum posts from the outside, to tell how long you waited before deciding a process will "never end."

sit and sit and sit and never print a thing

Whether ls can even finish on such a large amount of files is another question - you are way beyond those "normal default" 10k ralphbsz mentioned. But that doesn't mean the OS couldn't handle it at all.