Unable to delete some files

Hi guys,

Recently I've discovered that there are some files that I'm unable to delete. See the screenshot below from mc (in /root directory).

20220203-screenshot_mc.png


A simple ls does show the file:

20220203-screenshot03.png


ls with other switches, e.g. ls -la does not show the file.

It's difficult for me to say what caused this, perhaps a power outage. What I do see is that there are additional files which are shown in mc as starting with the question mark (if it is indeed a question mark and not some other random symbol being shown as a question mark).

Another, more worrying example of the same is:

20220203-screenshot02_mc.png


This leads to cron not being able to start. The same is true for wlan0 and some other files and daemons.

My system is
Code:
 FreeBSD XXXX 13.0-RELEASE-p6 FreeBSD 13.0-RELEASE-p6 #0 releng/13.0-b0c8bc5d9: Wed Jan 26 13:24:54 CET 2022     xxxx@xxxx:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64

Filesystem is ZFS on SSD and zfs scrub did not find any errors.

I did try the following suggestions from #freebsd on Libera.chat but without success:
  • Regular rm \?.viminfo - No such file or directory
  • Moving everything out of the /root directory and then deleting the directory with rm -rf /root - Directory not empty
  • Using rm -fv -- \?.viminfo
  • Using ls -alo - the file does not appear in the output
  • Using printf "%s" *.viminfo gives "no matches found *.viminfo"
  • Tried to find inode of the file in order to delete it that way - wasn't able to find the inode.
So now I'm stuck and hope you guys are able to help me.
 

In addition to the output from "file", also do an ls -l on them. It is a little worrisome that the date (actually mtime) of the file is 1970-01-01. That seems to indicate that something has wiped out the mtime attribute of the file. What other attributes did it wipe out? How about the ownership and permissions? How about the ownership and permissions of the enclosing directory? One reason the file might not be deletable: you don't own it, or you don't have permission to modify it, or you don't own or don't have permission for the directory it is in.

It is also very bizarre that a normal ls (without the -a switch) shows a file name that starts with a dot (it should not), while the ls with the -a switch does not show it (it should).

I wonder whether there is something funny going on with the file name. Perhaps there is a space or invisible character before or after the file name? Perhaps the leading dot is not really a dot, but some other unicode character which happens to look like a dot? You can try "ls -1 | hexdump -C", then we'll see the binary representation of the file name.
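If the hexdump is hard to read, the same check can be done per file name. Below is a minimal sketch in C, not a finished tool: the helper names escape_name and list_escaped are made up for this example, and "printable" is assumed to mean plain ASCII 0x20 to 0x7e. It prints every name in a directory with all other bytes shown as \xNN, so a stray control character or non-ASCII look-alike dot becomes visible.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Copy `name` into `out`, replacing every byte outside printable ASCII
 * (assumed here to be 0x20..0x7e) with a \xNN escape.
 * Returns the number of bytes that had to be escaped. */
static int escape_name(const char *name, char *out, size_t outsz)
{
    int escaped = 0;
    size_t used = 0;

    if (outsz == 0)
        return 0;
    for (const unsigned char *p = (const unsigned char *)name; *p != '\0'; p++) {
        int printable = (*p >= 0x20 && *p < 0x7f);
        size_t need = printable ? 1 : 4;      /* "c" or "\xNN" */

        if (used + need + 1 > outsz)
            break;                            /* out of room: truncate */
        if (printable) {
            out[used++] = (char)*p;
        } else {
            snprintf(out + used, outsz - used, "\\x%02x", *p);
            used += 4;
            escaped++;
        }
    }
    out[used] = '\0';
    return escaped;
}

/* Print every entry of `path`, one per line, with hidden bytes escaped.
 * Returns 0 on success, -1 if the directory cannot be opened. */
static int list_escaped(const char *path)
{
    DIR *dir = opendir(path);
    struct dirent *dp;
    char buf[4 * 256 + 1];   /* worst case: every byte of the name escaped */

    if (dir == NULL)
        return -1;
    while ((dp = readdir(dir)) != NULL) {
        escape_name(dp->d_name, buf, sizeof buf);
        puts(buf);
    }
    closedir(dir);
    return 0;
}
```

Calling list_escaped("/root") from a small main would print a clean ".viminfo" if the name is ordinary, or something like "\x01.viminfo" if a hidden byte is in there.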
 
Thanks, so I reckon, the name of the file somehow gained a non-printing character (or something like a non-printing character).

Here, after intentionally adding a non-printable character:

Code:
% bfs ~ -name "*viminfo*" -print
/home/grahamperrin/
                   .viminfo
^C
% pwd
/usr/home/grahamperrin
% ls -hl *viminfo
-rw-------  1 grahamperrin  grahamperrin   933B  4 Feb 00:19 ?.viminfo
%

… I'm in no way an expert on FreeBSD (just a regular user of it for the past 20 years or so, but was not too adventurous), …

If the terminal is too adventurous in this case, you can probably use an application such as Dolphin to show hidden files, then correct the file name. (Dolphin works for me.)

I say probably because, as pointed out by ralphbsz, other things are unusual in your case.
 
Code:
root@XXXX:~ # file ./.viminfo 
./.viminfo: cannot open `./.viminfo' (No such file or directory), ASCII text
That makes no sense (which doesn't mean that it isn't what you're seeing).

The output from file tells me two things. First, that it cannot open .viminfo, because such a file does not exist. Second, that it can read some file (we don't know which file), which contains ASCII text. But the name of the second file is not visible on the command line. Also, the output format of file is broken. Here is an example: In my directory, I have a file named bar which contains ASCII text, and there is no file foo:
Code:
> file foo bar
foo: cannot open `foo' (No such file or directory)
bar: ASCII text
See the difference: If you give the file command two file names as arguments, it prints two lines of output. Your example looks like you gave file two arguments, but it printed only one line, and that line is contradictory. Also note that you used file name completion in the shell, which should not put two arguments on the command line.

I think I need to see the output of "ls -1 | hexdump -C" in that directory. I'm beginning to suspect that we have multiple files here, some of which are invisible, or may contain special characters that screw up output formatting. For example, it is possible to create a file whose name is a single space, and that will make ls and command line completion (with tab) look really funny. It is even possible to create a file whose name is a bunch of backspace characters ... but that wouldn't backspace across the newline.

EDIT: covacat's suggested command is also very good; it accomplishes the same as my ls command.

Very weird.
 
ralphbsz, I tried ls -1; the output is:
Code:
root@XXXX:~ # ls -1 .viminfo 
ls: .viminfo: No such file or directory

Your following suggestion, ls -l | hexdump -C, gives (partial output):
Code:
00000080  0a 2e 73 75 62 76 65 72  73 69 6f 6e 0a 2e 76 69  |..subversion..vi|
00000090  6d 0a 2e 76 69 6d 69 6e  66 6f 0a 6d 79 2e 63 6e  |m..viminfo.my.cn|
000000a0  66 0a 53 79 6e 63 0a                              |f.Sync.|
000000a7

I cannot see ownership attributes of the file, but can see that the ownership and flags of the /root directory are normal and haven't changed.

Erichans, I tried your suggestion, but .viminfo does not appear in the list of files to be deleted.
 
try
find /root -print0 |xargs -0 stat -f "%Op %SN"

covacat, I tried:
Code:
root@XXXX:~ # find /root -print0 | xargs -0 stat -f "%Op %SN" | grep vim
40755 /root/.vim
100644 /root/.vim/.netrwhist
stat: /root/.viminfo: stat: No such file or directory
root@XXXX:~ #
 
ralphbsz, I tried ls -1; the output is:
Code:
root@XXXX:~ # ls -1 .viminfo 
ls: .viminfo: No such file or directory
I presume you typed the ".viminfo" on that command line by hand? In that case, this is one good piece of information. It tells us that there is no file whose name is ".viminfo", with the usual characters that come from typing it by hand.

But now, it gets totally bizarre:
Your following suggestion, ls -l | hexdump -C, gives (partial output):
Code:
00000080  0a 2e 73 75 62 76 65 72  73 69 6f 6e 0a 2e 76 69  |..subversion..vi|
00000090  6d 0a 2e 76 69 6d 69 6e  66 6f 0a 6d 79 2e 63 6e  |m..viminfo.my.cn|
000000a0  66 0a 53 79 6e 63 0a                              |f.Sync.|
000000a7
This is bizarre. You did a "ls -l" here (the option character is ell, the letter after i and before m). I expect to see roughly this format in the output:
Code:
> ls -l bar blatz
-rw-r--r--  1 ralph  ralph  37 Feb  3 23:35 bar
-rw-r--r--  1 ralph  ralph   0 Feb  4 00:01 blatz
But instead, what you got was the output format from a normal "ls -1" command (the option character is the digit one). And that output tells us that either a file named ".viminfo" exists (look for the string ".viminfo" followed by 0a = newline in the output), or perhaps that a file named ".viminfo\nmy.cnf" exists (that's the name with a newline in it). That would be legal but weird. And having file names with newlines in them leads to amusing (bizarre? confusing?) output from standard commands. But: A single file with a newline in the name does not explain all the "no such file..." error messages you are getting.

Now let's combine that with the output you posted from the find command:
Code:
# find /root -print0 | xargs -0 stat -f "%Op %SN" | grep vim
...
stat: /root/.viminfo: stat: No such file or directory
This is where things go bad. This tells me that find found a file named /root/.viminfo (with nothing behind the name!) and passed it to stat via xargs. Given that you used -print0 and -0 on find and xargs, we can assume that the filename was passed to stat unmolested. And then stat told us that such a file does not exist.

I think I'm starting to have a bad thought: I fear your file system is damaged. If the find utility can find a file named .viminfo that exists and is in the directory, but then stat can not open the file, that's an inconsistency. That sort of matches the suspicion I had above, which is that the 1-Jan-1970 date suspiciously looks like file attributes were damaged.
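The inconsistency can be checked for a whole directory at once. Here is a minimal sketch in C, assuming nothing beyond opendir, readdir and lstat; the function name count_dangling is made up for this example. It reports every name that readdir returns but lstat then fails on with ENOENT, which is exactly the find-then-stat mismatch described above.

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>

/* Count entries that readdir() returns but lstat() cannot find.
 * Returns -1 if the directory itself cannot be opened. */
static int count_dangling(const char *path)
{
    DIR *dir = opendir(path);
    struct dirent *dp;
    struct stat sb;
    char full[4096];
    int dangling = 0;

    if (dir == NULL)
        return -1;
    while ((dp = readdir(dir)) != NULL) {
        int n = snprintf(full, sizeof full, "%s/%s", path, dp->d_name);

        if (n < 0 || (size_t)n >= sizeof full)
            continue;   /* path too long for this sketch: skip it */
        if (lstat(full, &sb) == -1 && errno == ENOENT) {
            printf("dangling entry: %s\n", full);
            dangling++;
        }
    }
    closedir(dir);
    return dangling;
}
```

On a healthy directory this should return 0. Running it against /root, /etc and wherever the cron and wlan0 files live would show how many entries are affected.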

Time to (a) look in dmesg and /var/log/messages for error messages, and (b) go into single user mode and run fsck. Oh wait, that won't work. You wrote above "Filesystem is ZFS on SSD and zfs scrub did not find any errors". On ZFS, there is nothing like fsck to fix problems like an inconsistency in the file system ... scrub should have done what it can, and above and beyond that, there are only specialized tools such as zdb that are not automated.

I'm out of good ideas for how to help you. My only suggestion is: cd to /root, and post the output of "ls -l" without any arguments. Also, while you are in the area, do a "ls -l -d", perhaps there is something strange about the directory itself (although I don't know of any strangeness that could cause the symptoms you are seeing). Perhaps something on other files will give us a clue.
 
Code:
#include <dirent.h>
#include <err.h>
#include <stdio.h>

/* Dump the raw struct dirent of every entry in a directory to stdout. */
int main(int argc, char **argv)
{
    DIR *dir;
    struct dirent *dp;

    if (argc != 2) {
        errx(1, "usage: %s <directory>", argv[0]);
    }
    dir = opendir(argv[1]);
    if (!dir) {
        err(1, "Error opening %s", argv[1]);
    }

    /* Write the whole fixed-size struct for each entry, not just d_reclen bytes. */
    while ((dp = readdir(dir)) != NULL) {
        fwrite(dp, sizeof(struct dirent), 1, stdout);
        fflush(stdout);
    }
    closedir(dir);
    //printf("%lu\n",sizeof(struct dirent));
    return 0;
}
compile this with cc a.c -o /tmp/ad, then run
/tmp/ad /root/.vim |hexdump -Cv
/tmp/ad /root/.vim > /tmp/out.bin
and post/attach the binary output
 
covacat:
Code:
root@XXXX:~ # cc a.c -o /tmp/ad
root@XXXX:~ # /tmp/ad /root/.vim | hexdump -Cv
00000000  15 90 0d 00 00 00 00 00  01 00 00 00 00 00 00 00  |................|
00000010  20 00 04 00 01 00 00 00  2e 00 00 00 00 00 00 00  | ...............|
00000020  92 00 00 00 00 00 00 00  02 00 00 00 00 00 00 00  |................|
00000030  20 00 04 00 02 00 00 00  2e 2e 00 00 00 00 00 00  | ...............|
00000040  16 90 0d 00 00 00 00 00  09 0b 16 17 00 00 00 00  |................|
00000050  28 00 08 00 0a 00 00 00  2e 6e 65 74 72 77 68 69  |(........netrwhi|
00000060  73 74 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |st..............|
00000070  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000080  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000090  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000b0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000c0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000100  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000110  00 00 00 00 00 00 00 00  92 00 00 00 00 00 00 00  |................|
00000120  02 00 00 00 00 00 00 00  20 00 04 00 02 00 00 00  |........ .......|
00000130  2e 2e 00 00 00 00 00 00  16 90 0d 00 00 00 00 00  |................|
00000140  09 0b 16 17 00 00 00 00  28 00 08 00 0a 00 00 00  |........(.......|
00000150  2e 6e 65 74 72 77 68 69  73 74 00 00 00 00 00 00  |.netrwhist......|
00000160  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000170  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000180  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000190  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001b0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001c0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000210  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000220  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000230  16 90 0d 00 00 00 00 00  09 0b 16 17 00 00 00 00  |................|
00000240  28 00 08 00 0a 00 00 00  2e 6e 65 74 72 77 68 69  |(........netrwhi|
00000250  73 74 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |st..............|
00000260  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000270  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000280  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000290  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000002a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000002b0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000002c0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000002d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000002e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000002f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000300  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000310  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000320  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000330  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000340  00 00 00 00 00 00 00 00                           |........|
00000348
 
ralphbsz thanks for your suggestions. I tried:
Code:
root@XXXX:~ # ls -l
ls: .viminfo: No such file or directory
total 74
drwx------  5 root  wheel      5 Jan  6 22:18 .cache
drwxr-xr-x  3 root  wheel      7 Jan  2 18:35 .composer
drwx------  5 root  wheel      5 Jan  2 23:25 .config
-rw-r--r--  1 root  wheel   1023 Apr  9  2021 .cshrc
drwx------  3 root  wheel      3 Jan  2 23:25 .dbus
-rw-------  1 root  wheel  26162 Feb  3 22:47 .history
-rw-r--r--  1 root  wheel     80 Apr  9  2021 .k5login
-rw-------  1 root  wheel     59 Jan 29 18:11 .lesshst
drwx------  3 root  wheel      3 Dec 22 00:32 .local
-rw-r--r--  1 root  wheel    328 Apr  9  2021 .login
-rw-------  1 root  wheel   2420 Jan  1 23:39 .mysql_history
-rw-r--r--  1 root  wheel    507 Apr  9  2021 .profile
-rw-r--r--  1 root  wheel    865 Apr  9  2021 .shrc
drwx------  2 root  wheel      3 Dec 22 00:33 .ssh
drwxr-xr-x  3 root  wheel      6 Jan  1 21:57 .subversion
drwxr-xr-x  2 root  wheel      3 Jan 10 18:18 .vim
-rw-r--r--  1 root  wheel    437 Feb  4 09:22 a.c
-rw-r--r--  1 root  wheel    981 Aug  6  2019 my.cnf
drwxr-xr-x  3 root  wheel      3 Jan  2 23:25 Sync
root@XXXX:~ # ls -l -d
drwxr-xr-x  12 root  wheel  23 Feb  4 09:22 .
root@XXXX:~ #
 
Thank you covacat for posting the binary! This will show us exactly what is in the directory.

Potzilov: Please run the same command again, but this time with the argument being just /root (so "/tmp/ad /root |hexdump -Cv"). What the previous run showed us: A directory named /root/.vim exists. It is readable without errors. I'm still decoding what its content is, but I think it shows three files.
 
Code:
root@XXXX:~ # ls -l
ls: .viminfo: No such file or directory
total 74
...
Your file system is definitely screwed up. The directory contains a directory entry named .viminfo, but there is no such file. I fear you have found some sort of bug, which has caused some sort of file system corruption. We'll know more about what sort of corruption when we see the hexdump from the ad program on the /root directory.
 
Thank you covacat for posting the binary! This will show us exactly what is in the directory.

Potzilov: Please run the same command again, but this time with the argument being just /root (so "/tmp/ad /root |hexdump -Cv"). What the previous run showed us: A directory named /root/.vim exists. It is readable without errors. I'm still decoding what its content is, but I think it shows three files.
Attached.
 

Attachment: out.zip (670 bytes)
Each directory entry is 280 bytes = 0x118 bytes long. Remember, we're talking about the directory /root/.vim here. Do a "man dirent", and you'll see how to decode a directory entry.

Code:
root@XXXX:~ # /tmp/ad /root/.vim | hexdump -Cv
00000000  15 90 0d 00 00 00 00 00  01 00 00 00 00 00 00 00  |................|
00000010  20 00 04 00 01 00 00 00  2e 00 00 00 00 00 00 00  | ...............|
00000020  92 00 00 00 00 00 00 00  02 00 00 00 00 00 00 00  |................|
00000030  20 00 04 00 02 00 00 00  2e 2e 00 00 00 00 00 00  | ...............|
00000040  16 90 0d 00 00 00 00 00  09 0b 16 17 00 00 00 00  |................|
00000050  28 00 08 00 0a 00 00 00  2e 6e 65 74 72 77 68 69  |(........netrwhi|
00000060  73 74 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |st..............|
00000070  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000080  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000090  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000b0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000c0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000100  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000110  00 00 00 00 00 00 00 00  92 00 00 00 00 00 00 00  |................|
At offset 0, there exists a directory entry with inode number 0x0d9015. The next entry will be at offset 1 in the directory. This directory entry is 0x0020 bytes long (and I have no idea what this means, because it is really 0x118 bytes long). This entry is of type 0x04 = DT_DIR, which means it's a directory. The name is 1 byte long. The name starts at offset 0x18, and is the string ".", zero terminated. The remainder of the 256-byte long name buffer contains some gibberish, but that's OK.

Code:
00000110  00 00 00 00 00 00 00 00  92 00 00 00 00 00 00 00  |................|
00000120  02 00 00 00 00 00 00 00  20 00 04 00 02 00 00 00  |........ .......|
00000130  2e 2e 00 00 00 00 00 00  16 90 0d 00 00 00 00 00  |................|
00000140  09 0b 16 17 00 00 00 00  28 00 08 00 0a 00 00 00  |........(.......|
00000150  2e 6e 65 74 72 77 68 69  73 74 00 00 00 00 00 00  |.netrwhist......|
00000160  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000170  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000180  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000190  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001b0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001c0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000210  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000220  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
At offset 1 (0x118 in bytes), there exists a directory entry with inode number 0x92. The next entry will be at offset 2 in the directory. This directory entry is 0x0020 bytes long (still don't understand). This entry is also of type 0x04 = DT_DIR, which means it's another directory. The name is 2 bytes long. The name starts at offset 0x130, and is the string "..", zero terminated. The remainder of the 256-byte long name buffer contains some gibberish, but that's also OK.

So far, everything is great: This directory contains . and .., as expected.

Code:
00000230  16 90 0d 00 00 00 00 00  09 0b 16 17 00 00 00 00  |................|
00000240  28 00 08 00 0a 00 00 00  2e 6e 65 74 72 77 68 69  |(........netrwhi|
00000250  73 74 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |st..............|
00000260  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000270  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000280  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000290  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000002a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000002b0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000002c0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000002d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000002e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000002f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000300  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000310  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000320  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000330  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000340  00 00 00 00 00 00 00 00                           |........|
00000348
At offset 2 (0x230 in bytes), there exists a directory entry with inode number 0x0d9016. The next entry will be at offset 0x17160b09 in the directory; this is irrelevant, as this is the last one. This directory entry is 0x0028 bytes long (that is very bizarre, why not 0x20?). This entry is of type 0x08 = DT_REG, which means it's a regular file. The name is 10 bytes long. The name starts at offset 0x248, and is the string ".netrwhist", zero terminated, which is indeed 10 bytes. The remainder of the 256-byte long name buffer contains some gibberish, but that's OK.

So this just shows us that we know how to decode directory entries (mostly, the length in there is a bit of a mystery to me).
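For anyone who wants to decode the longer /root dump without counting bytes by hand, here is a minimal sketch in C of the recipe above. It is not taken from any FreeBSD source: the struct decoded, the helper le and the function decode_entry are names made up for this example, and the field offsets are an assumption read off the dumps in this thread (little-endian amd64, 24 bytes of header before the name, 280 bytes per record as written by the ad program).

```c
#include <stdint.h>
#include <string.h>

/* Fields of one dumped record. The offsets are an assumption based on
 * the hexdumps in this thread (FreeBSD 13, amd64, little-endian). */
struct decoded {
    uint64_t fileno;     /* bytes 0..7:   inode number */
    uint64_t off;        /* bytes 8..15:  position of the next entry */
    uint16_t reclen;     /* bytes 16..17: record length */
    uint8_t  type;       /* byte 18:      4 = DT_DIR, 8 = DT_REG */
    uint16_t namlen;     /* bytes 20..21: length of the name */
    char     name[256];  /* bytes 24..:   zero-terminated name */
};

/* Read an n-byte little-endian unsigned integer. */
static uint64_t le(const uint8_t *p, int n)
{
    uint64_t v = 0;

    for (int i = n - 1; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Decode one 280-byte record as written by the ad program. */
static void decode_entry(const uint8_t *rec, struct decoded *d)
{
    d->fileno = le(rec, 8);
    d->off    = le(rec + 8, 8);
    d->reclen = (uint16_t)le(rec + 16, 2);
    d->type   = rec[18];
    d->namlen = (uint16_t)le(rec + 20, 2);
    memcpy(d->name, rec + 24, 255);
    d->name[255] = '\0';   /* stop at 255 bytes even if the dump is damaged */
}
```

Feeding it the record at 0x230 from the dump above should give inode 0x0d9016, record length 0x28, type 8 and the 10-byte name ".netrwhist". Stepping through /tmp/out.bin in 280-byte chunks would decode every entry the same way.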
 
To see if it is in fact a filesystem hiccup, you might also want to investigate the dataset with zdb(8).

zdb can only work on whole datasets or pools, so you might want to try to create a new dataset and 'cp -a' the /root directory to that dataset (and hope the error gets carried over).

zdb -dd zroot/copy-of-root would give you the listing of all objects with their various sizes, type and the object number. The file(s) in question might show some implausible values for their size. Examine those objects further with zdb -ddddd zroot/copy-of-root <objectnumber>
 
The long output one in the zip file: Tomorrow. It's after midnight here. Maybe you can try to decode it yourself? The recipe is above.
 
the data seems okish
i speculate that the inode number for the borked files is wrong
rm(1) will always lstat the file before trying anything so if the inode is problematic will fail
no idea what will happen if you unlink them directly without a stat before
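a minimal sketch of that experiment in C, for whoever wants to try it. The function name unlink_direct is made up for this example. It calls unlink(2) on the path with no lstat before it. Whether that gets any further than rm on the damaged entry is unknown; the kernel still has to look the name up, so it may well fail with the same error.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Call unlink(2) on `path` without any lstat() beforehand.
 * Returns 0 on success, or the errno value on failure. */
static int unlink_direct(const char *path)
{
    if (unlink(path) == -1) {
        int saved = errno;   /* keep errno before fprintf can change it */

        fprintf(stderr, "unlink %s: %s\n", path, strerror(saved));
        return saved;
    }
    return 0;
}
```

called as unlink_direct("/root/.viminfo") from a small main, the interesting part is which errno comes back.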
 
any filename of 8 bytes or longer will get 0x28, 16 or longer probably 0x30 and so on
struct data before filename is 24 bytes long (0x18)
Ah, you're exactly right! The entry length is the 0x18-byte header plus the file name and its terminating zero, rounded up to the next multiple of 8 bytes, which is why the entries for file names of length 1 and 2 were 0x20 instead of the 0x18 that I expected, and the entry for the 10-byte filename is 0x28. Thank you, now I understand.
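The rounding can be written down in a couple of lines. This is a minimal sketch in C, assuming the 24-byte header seen in the dumps above; the names DIRENT_HEADER and dirent_reclen are made up for this example and are not the macro the system headers use.

```c
#include <stddef.h>

/* Bytes before the name in each entry, as seen in the dumps (assumption). */
#define DIRENT_HEADER 24

/* Header + name + terminating zero, rounded up to a multiple of 8. */
static size_t dirent_reclen(size_t namlen)
{
    return (DIRENT_HEADER + namlen + 1 + 7) & ~(size_t)7;
}
```

This reproduces the values observed in this thread: names of length 1, 2 and 4 give 0x20, and names of length 8 and 10 give 0x28. A 16-byte name would come out as 0x30.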
 
the data seems okish

Allow me to confirm that, with details. I will use two examples, the directory called .vim which we looked at last night, and the problematic file .viminfo.
Code:
000000b0  15 90 0d 00 00 00 00 00  bc 04 98 12 00 00 00 00  |................|
000000c0  20 00 04 00 04 00 00 00  2e 76 69 6d 00 00 00 00  | ........vim....|
Inode number is 0x0d9015; that makes sense, and was the inode number for directory entry "." inside the .vim directory. The next entry will be at offset 0x129804bc; that is completely crazy, but may be irrelevant (I don't have time right now to think through how readdir, seekdir and telldir interact; I suspect the next entry number is only relevant when reading multiple entries at once, which we are not doing here). The length of the entry is 0x20 bytes, which also makes sense (see message above): 0x18, plus a name that's rounded up to 8 bytes. The type of the entry is 4, which means directory, correct. The length of the name is 4 bytes, and the name is ".vim", correctly 4 bytes.

Finding the entry for .viminfo is a little harder, because for some bizarre reason the unused part of the dirent-length buffer is filled with irrelevant copies of other directory entries. Here is the real one (I blanked out irrelevant parts of the previous one with X):
Code:
000014c0  XX XX XX XX XX XX XX XX  0f 02 0d 00 00 00 00 00  |XXXXXXXX........|
000014d0  10 e7 f2 19 00 00 00 00  28 00 08 00 08 00 00 00  |........(.......|
000014e0  2e 76 69 6d 69 6e 66 6f  00 00 00 00 00 00 00 00  |.viminfo........|
This tells us that there is an entry with inode number 0x0d020f = 852495. The next one is number 0x19f2e710, which is again irrelevant. The length of this entry is 0x28, which is sensible. The type of this entry is 8 = regular file, and the name length is 8 characters (correct). The name is ".viminfo", padded with a zero at the end. Because the zero pad is at the byte at offset 0x20, the length of the entry had to be padded out to 0x28 bytes.

So this is a perfectly sensible and valid directory entry for a file named .viminfo. As covacat already said above, the likely cause is that there is no file system object with inode number 852495. How can you verify that? This is a little difficult. POSIX = Unix does not have a documented/published/official system call (or API) for finding or opening files by inode number. Most file systems have internal (undocumented/unpublished/inaccessible) APIs that can do that, but I don't have time right now to dig into ZFS internals to find it. Can someone find a ZDB command for that, please?

EDITed to add: If it is really true that the directory /root has a directory entry for a file named .viminfo with inode number 852495, but such an inode number does not exist, then (a) the file system is now corrupted, and it will require deep knowledge of ZFS internals to fix, and (b) you have found a bug, although we do not know where the bug is (at least, not yet).
 