Deleting certain files hangs filesystem

SirDice

Administrator
I've got a nice problem on my fileserver x(

I've been trying to delete a directory on it, but something is messed up. If I just do rm -rf, I can see the drive light blink for a while and then stop. When that happens, the rm command can't be stopped with ctrl-c. Any other filesystem action (like unmounting a different filesystem) also hangs. Rebooting seems to stop all services but hangs at the end. The only way to recover is to reset the box with the reset button. This is a pain because fsck'ing 1.2TB takes a long time.

Somewhere, several files seem to be corrupted in such a way that they hang the whole filesystem. I've tried doing a find to delete files to see which one, but that also hangs after a while. Different files seem to cause this.

How do I get rid of that crap?

This is on 8.0-STABLE, btw.
 
Boot into single-user mode.
Run fsck and it will fix any problem.
Then try deleting the files with the find command. If you have too many files, delete them in batches by first letter: a, b, c, d, and so on (see your local find man page for the exact options).
Code:
find /path/to/dir -iname "a*" -delete
 
vivek said:
Boot into single-user mode.
Run fsck and it will fix any problem.
Unfortunately it doesn't. I've rebooted to single user mode several times already.

Then try deleting the files with the find command. If you have too many files, delete them in batches by first letter: a, b, c, d, and so on (see your local find man page for the exact options).
Code:
find /path/to/dir -iname "a*" -delete
I managed to delete some using find and the inode suggestion made by Business_Woman:
Code:
find . -type f -exec ls -i {} \; | awk '{print $1;}' | xargs -J % rm -i %

However some files couldn't be deleted and I think this is the cause of the problem. Files that couldn't be deleted gave a message like:
Code:
rm <inode>: No such file or directory
 
That's weird... A # ls -li <inode> or # rm -i <inode> can't find the file. However, a # find . -inum <inode> -exec rm -i {} \; does.

So I'm now deleting about 52000 files using:
Code:
find . -type f -exec ls -i {} \; | awk '{print $1}' > /tmp/inodes.txt
foreach f ( `cat /tmp/inodes.txt` )
yes | find . -inum $f -exec rm -i {} \;
end
This is going rather slow but at least I'm getting rid of it :e
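
For comparison, the loop above walks the whole tree once per inode. A minimal single-pass sketch in sh (not csh) is shown here: it appends each path to a log before removing it, so the last logged line hints at which file was being handled if things hang. The scratch directory and log file are hypothetical stand-ins, and whether a single pass avoids the hang described in this thread is not known. The log would also need to live on a different filesystem and actually reach the disk before a reset.

```sh
#!/bin/sh
# Hypothetical scratch tree and log file; substitute the real directory
# and a log path on a DIFFERENT filesystem when adapting this.
target=$(mktemp -d)
log=$(mktemp)
mkdir -p "$target/a/b"
touch "$target/a/one" "$target/a/b/two"

# One traversal: for every regular file, append its path to the log,
# then remove it. The log is opened and closed per file, so nothing
# sits in find's own output buffer.
find "$target" -type f -exec sh -c 'echo "$1" >> "$2"; rm -f "$1"' sh {} "$log" \;

# The last line is the most recent file that was handled.
tail -n 1 "$log"
```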
 
You're deleting 52,000 files? How many more do you have? A newfs and full restore from backup looks easier and less messy to me hehe...
 
SirDice said:
That's weird... A # ls -li <inode> or # rm -i <inode> can't find the file. However, a # find . -inum <inode> -exec rm -i {} \; does.

The -i in rm -i means interactive, not inode.
AFAIK there is no way for rm(1) to accept inodes.
Also, while ls -i displays inodes, it does not accept an inode as input either.
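
To illustrate the point, here is a minimal sh sketch of removing a file by inode number: ls -i is used only to read the number, and find -inum does the lookup. The scratch directory and the file names victim and keeper are made up for the example.

```sh
#!/bin/sh
# Hypothetical scratch directory, so the sketch touches no real data.
dir=$(mktemp -d)
touch "$dir/victim" "$dir/keeper"

# ls -i prints "<inode> <path>"; keep only the inode number.
inode=$(ls -i "$dir/victim" | awk '{print $1}')

# rm cannot take an inode number, but find can match on one.
# -xdev keeps find on a single filesystem, because inode numbers
# are only unique within one filesystem.
find "$dir" -xdev -inum "$inode" -exec rm -f {} \;

# Show what is left in the directory.
ls "$dir"
```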

Anyway, to me it seems that this is a bug. Either there should be an explicit limit on the files in a directory, and a verbose message when exceeding the limit, or the situation should be handled gracefully.
The system simply going stale is not normal.
Did you write to the -stable mailing list?
 
Wildcard expansion is limited by the kernel's maximum argument size:
Code:
 sysctl kern.argmax
Output:
Code:
kern.argmax: 262144
Also, the following gives you the number of bytes left for arguments once the environment is subtracted (backticks work in both sh and csh):
Code:
expr `getconf ARG_MAX` - `env | wc -c`
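
One way to stay under that limit without counting bytes by hand is to let xargs build the command lines. This is a sketch in sh under assumptions: the scratch directory, the file count of 2000, and the file* names are invented for the example.

```sh
#!/bin/sh
# Hypothetical scratch directory filled with 2000 empty files.
dir=$(mktemp -d)
i=0
while [ "$i" -lt 2000 ]; do
    : > "$dir/file$i"
    i=$((i + 1))
done

# find emits the names NUL-separated, so odd characters are safe.
# xargs splits them over as many rm invocations as needed to keep
# each command line below the system's argument-size limit.
find "$dir" -type f -name 'file*' -print0 | xargs -0 rm -f

# Count what is left; tr strips the padding some wc versions add.
remaining=$(find "$dir" -type f | wc -l | tr -d ' ')
echo "files left: $remaining"
```

Because the shell never expands a wildcard here, the number of matching files does not matter.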
 
Beastie said:
You're deleting 52,000 files? How many more do you have? A newfs and full restore from backup looks easier and less messy to me hehe...

I've got loads more, it's a 1.3TB raid5 volume. Newfs is not an option ;)
 
achix said:
The -i in rm -i means interactive, not inode.
AFAIK there is no way for rm(1) to accept inodes.
Also, while ls -i displays inodes, it does not accept an inode as input either.
You're absolutely correct. I mistakenly assumed the -i to accept inodes.

Anyway, to me it seems that this is a bug. Either there should be an explicit limit on the files in a directory, and a verbose message when exceeding the limit, or the situation should be handled gracefully.
The system simply going stale is not normal.
# rm -rf <directory> still hangs the whole filesystem x(
Even going into that directory, a few levels deep, and trying it there also hangs. The only way to remove it seems to be using find and deleting the files one by one. Not sure if it's a bug, but IMO it should either fail or just continue. Not hang.


Did you write to the -stable mailing list?
Not yet as I haven't figured out what the cause is.
 