Solved newfs: wtfs: Invalid argument

So, I decided to upgrade my home server and add some disks. Previously I used only graid3 for the backup disk; now I have decided to use graid3, graid5, and gconcat together:

Code:
# graid5 status
       Name                  Status  Components
raid5/parem  COMPLETE CALM (safeop)  da0
                                     da1
                                     da2
                                     da3
                                     da4
                                     da5
# graid3 status
       Name    Status  Components
raid3/vasak  COMPLETE  da8 (ACTIVE)
                       da9 (ACTIVE)
                       da12 (ACTIVE)
                       da13 (ACTIVE)
                       da14 (ACTIVE)
# gconcat status
       Name  Status  Components
concat/suur      UP  raid5/parem
                     raid3/vasak
But when I tried to create a filesystem, I got an error:

Code:
# gpart create -s GPT /dev/concat/suur
# gpart add -t freebsd-ufs /dev/concat/suur
# gpart show /dev/concat/suur
=>         3  1648274878  concat/suur  GPT  (25T)
           3  1648274878            1  freebsd-ufs  (25T)
# newfs -U -b 65536 -e 8192 -f 8192 -i 131072 -k 0 -m 2 -o space /dev/concat/suurp1
        using 7491 cylinder groups of 3438.44MB, 55015 blks, 27648 inodes.
        with soft updates
super-block backups (for fsck_ffs -b #) at:
256, 7042176, 14084096, 21126016, 28167936, 35209856, 42251776, 49293696,
56335616, 63377536, 70419456, 77461376, 84503296, 91545216, 98587136,
... many numbers...
52666519936, 52673561856, 52680603776, 52687645696, 52694687616, 52701729536,
52708771456, 52715813376, 52722855296, 52729897216, 52736939136, 52743981056
newfs: wtfs: 8192 bytes at sector 14592: Invalid argument
The resulting filesystem is unusable. How can I fix that?

The goal is a filesystem holding a small number of large files (a backup disk).

Filesystem size should not be a problem: I have a 33TB UFS disk in another machine created with the same newfs parameters.

I have used gmirror, gstripe, gconcat, and graid3 for years without major problems. Only graid5 is new to me; I found it in ports.
 
Looks like a bug. Run newfs with tracing enabled and see which system call (at the end) is returning "invalid argument" = EINVAL = errno 22.

I don't think you're hitting the size limit of UFS; that is supposed to be not dozens of TB, but dozens of ZB.
 
I don't know how to do tracing; I did not find anything about it in the newfs man page or on the internet.

I also get an error:
Code:
GEOM: raid3/vasak: corrupt or invalid GPT detected.
GEOM: raid3/vasak: GPT rejected -- may not be recoverable.
I have been getting GPT errors for years and mostly just ignore them. But maybe they are relevant this time.

Should I recreate the graid3?

I have also been thinking about creating graid3 arrays of 3 + 3 + 5 disks and then gconcat them together.

I could create one large graid5 array, but because those are old disks I was hesitant. One disk has already failed...
 
I don't know how to do tracing; I did not find anything about it in the newfs man page or on the internet.
All you need is "truss newfs ..."
It will show you the arguments to and the result of every system call. There will be thousands or millions of system calls as newfs initializes the file system. Towards the end, one returns errno 22; we need to see why.
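Since the trace will be very long, it may help to save it to a file and pull out only the failing calls. The following is a minimal sketch, not a definitive recipe: the helper name failed_calls and the log path /tmp/newfs.truss are made up for illustration, and it relies on truss marking failures with "ERR#<errno>", as the traces in this thread show.

```sh
# Print only the failed system calls from a saved truss log.
# truss marks failures as "ERR#<errno>"; EINVAL is errno 22.
failed_calls() {   # usage: failed_calls LOGFILE [ERRNO]
    if [ -n "$2" ]; then
        # The trailing space keeps ERR#2 from matching ERR#22.
        grep "ERR#$2 " "$1"
    else
        grep "ERR#" "$1"
    fi
}

# On the FreeBSD machine (hypothetical log path, not run here):
#   truss -o /tmp/newfs.truss newfs ... /dev/concat/suurp1
#   failed_calls /tmp/newfs.truss 22
```

Calling the helper without an errno lists every failed call, which also shows harmless failures such as lookups of missing locale files.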

I also get an error: ... I have been getting GPT errors for years and mostly just ignore them. But maybe they are relevant this time.
If your GPT has errors, then it is likely that something doesn't know how big the partition really is. That could easily cause "invalid argument" when newfs tries to write outside the partition.

Should I recreate the graid3?
Whatever it takes to make it go without errors.

I have also been thinking about creating graid3 arrays of 3 + 3 + 5 disks and then gconcat them together.

I could create one large graid5 array, but because those are old disks I was hesitant. One disk has already failed...
This is a long and complicated discussion. Not now, maybe later today I'll have time to type it in.
 
OK, I ran the truss newfs command. I will try to work around this problem some other way:


Code:
write(1,"\n",1)                                  = 1 (0x1)
 52708771456,write(1," 52708771456,",13)                         = 13 (0xd)
pwrite(4,"\0\0\0\0\0\0\0\0\^P\0\0\0\^X\0\0"...,16384,0x188c36f40000) = 16384 (0x4000)
pwrite(4,"\0\0\0\0U\^B\t\0\0\0\0\0>\^]\0\0"...,65536,0x188c36f50000) = 65536 (0x10000)
pwrite(4,"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,131072,0x188c36f60000) = 131072 (0x20000)
 52715813376,write(1," 52715813376,",13)                         = 13 (0xd)
pwrite(4,"\0\0\0\0\0\0\0\0\^P\0\0\0\^X\0\0"...,16384,0x188d0ddb0000) = 16384 (0x4000)
pwrite(4,"\0\0\0\0U\^B\t\0\0\0\0\0?\^]\0\0"...,65536,0x188d0ddc0000) = 65536 (0x10000)
pwrite(4,"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,131072,0x188d0ddd0000) = 131072 (0x20000)
 52722855296,write(1," 52722855296,",13)                         = 13 (0xd)
pwrite(4,"\0\0\0\0\0\0\0\0\^P\0\0\0\^X\0\0"...,16384,0x188de4c20000) = 16384 (0x4000)
pwrite(4,"\0\0\0\0U\^B\t\0\0\0\0\0@\^]\0\0"...,65536,0x188de4c30000) = 65536 (0x10000)
pwrite(4,"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,131072,0x188de4c40000) = 131072 (0x20000)
 52729897216,write(1," 52729897216,",13)                         = 13 (0xd)
pwrite(4,"\0\0\0\0\0\0\0\0\^P\0\0\0\^X\0\0"...,16384,0x188ebba90000) = 16384 (0x4000)
pwrite(4,"\0\0\0\0U\^B\t\0\0\0\0\0A\^]\0\0"...,65536,0x188ebbaa0000) = 65536 (0x10000)
pwrite(4,"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,131072,0x188ebbab0000) = 131072 (0x20000)
 52736939136,write(1," 52736939136,",13)                         = 13 (0xd)
pwrite(4,"\0\0\0\0\0\0\0\0\^P\0\0\0\^X\0\0"...,16384,0x188f92900000) = 16384 (0x4000)
pwrite(4,"\0\0\0\0U\^B\t\0\0\0\0\0B\^]\0\0"...,65536,0x188f92910000) = 65536 (0x10000)
pwrite(4,"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,131072,0x188f92920000) = 131072 (0x20000)
 52743981056write(1," 52743981056",12)                   = 12 (0xc)

write(1,"\n",1)                                  = 1 (0x1)
fstatat(AT_FDCWD,"/etc/nsswitch.conf",{ mode=-rw-r--r-- ,inode=3221038,size=345,blksize=32768 },0x0) = 0 (0x0)
open("/etc/nsswitch.conf",O_RDONLY|O_CLOEXEC,0666) = 3 (0x3)
mmap(0x0,20480,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34366910464 (0x8006d7000)
mmap(0x0,20480,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34366930944 (0x8006dc000)
fstat(3,{ mode=-rw-r--r-- ,inode=3221038,size=345,blksize=32768 }) = 0 (0x0)
mmap(0x0,36864,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34366951424 (0x8006e1000)
read(3,"#\n# nsswitch.conf(5) - name ser"...,32768) = 345 (0x159)
read(3,0x8006e1c80,32768)                        = 0 (0x0)
mmap(0x0,4096,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34366988288 (0x8006ea000)
mmap(0x0,20480,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34366992384 (0x8006eb000)
mmap(0x0,12288,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34367012864 (0x8006f0000)
mmap(0x0,4096,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34367025152 (0x8006f3000)
mmap(0x0,12288,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34367029248 (0x8006f4000)
close(3)                                         = 0 (0x0)
sigprocmask(SIG_BLOCK,{ SIGHUP|SIGINT|SIGQUIT|SIGKILL|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0)
sigprocmask(SIG_SETMASK,{ },0x0)                 = 0 (0x0)
sigprocmask(SIG_BLOCK,{ SIGHUP|SIGINT|SIGQUIT|SIGKILL|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0)
sigprocmask(SIG_SETMASK,{ },0x0)                 = 0 (0x0)
open("/etc/group",O_RDONLY|O_CLOEXEC,0666)       = 3 (0x3)
fstat(3,{ mode=-rw-r--r-- ,inode=3220987,size=785,blksize=32768 }) = 0 (0x0)
read(3,"# $FreeBSD: releng/12.2/etc/grou"...,32768) = 785 (0x311)
close(3)                                         = 0 (0x0)
mmap(0x0,69632,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34367041536 (0x8006f7000)
pread(4,"\0\0\0\0U\^B\t\0\0\0\0\0\0\0\0\0"...,65536,0x30000) = 65536 (0x10000)
pwrite(4,"\0\0\0\0U\^B\t\0\0\0\0\0\0\0\0\0"...,65536,0x30000) = 65536 (0x10000)
pwrite(4,"\^B\0\0\0\f\0\^D\^A.\0\0\0\^B\0"...,8192,0x720000) ERR#22 'Invalid argument'
newfs: write(2,"newfs: ",7)                              = 7 (0x7)
wtfs: 8192 bytes at sector 14592write(2,"wtfs: 8192 bytes at sector 14592",32)  = 32 (0x20)
: write(2,": ",2)                                        = 2 (0x2)
fstatat(AT_FDCWD,"/usr/share/nls/C/libc.cat",0x7fffffffb400,0x0) ERR#2 'No such file or directory'
fstatat(AT_FDCWD,"/usr/share/nls/libc/C",0x7fffffffb400,0x0) ERR#2 'No such file or directory'
fstatat(AT_FDCWD,"/usr/local/share/nls/C/libc.cat",0x7fffffffb400,0x0) ERR#2 'No such file or directory'
fstatat(AT_FDCWD,"/usr/local/share/nls/libc/C",0x7fffffffb400,0x0) ERR#2 'No such file or directory'
Invalid argument
write(2,"Invalid argument\n",17)                 = 17 (0x11)
sigprocmask(SIG_BLOCK,{ SIGHUP|SIGINT|SIGQUIT|SIGKILL|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0)
sigprocmask(SIG_SETMASK,{ },0x0)                 = 0 (0x0)
sigprocmask(SIG_BLOCK,{ SIGHUP|SIGINT|SIGQUIT|SIGKILL|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0)
sigprocmask(SIG_SETMASK,{ },0x0)                 = 0 (0x0)
sigprocmask(SIG_BLOCK,{ SIGHUP|SIGINT|SIGQUIT|SIGKILL|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0)
sigprocmask(SIG_SETMASK,{ },0x0)                 = 0 (0x0)
sigprocmask(SIG_BLOCK,{ SIGHUP|SIGINT|SIGQUIT|SIGKILL|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0)
sigprocmask(SIG_SETMASK,{ },0x0)                 = 0 (0x0)
sigprocmask(SIG_BLOCK,{ SIGHUP|SIGINT|SIGQUIT|SIGKILL|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0)
sigprocmask(SIG_SETMASK,{ },0x0)                 = 0 (0x0)
exit(0x24)
process exit, rval = 36
 
One thing is clear, the problem is in graid3, not gconcat:


Code:
# newfs -U -b 65536 -e 8192 -f 8192 -i 131072 -k 0 -m 2 -o space /dev/raid3/vasakp1
/dev/raid3/vasakp1: 11446354.0MB (23442132480 sectors) block size 65536, fragment size 8192
        using 3329 cylinder groups of 3438.44MB, 55015 blks, 27648 inodes.
        with soft updates
super-block backups (for fsck_ffs -b #) at:
 256, 7042176, 14084096, 21126016, 28167936, 35209856, 42251776, 49293696,
 56335616, 63377536, 70419456, 77461376, 84503296, 91545216, 98587136,
...
  23386216576, 23393258496, 23400300416, 23407342336, 23414384256, 23421426176,
 23428468096, 23435510016
newfs: wtfs: 8192 bytes at sector 14464: Invalid argument
 
The trace shows that newfs tries to write 8192 bytes at offset 0x720000 and gets EINVAL. It is writing to file descriptor 4, and I don't know what file that is. Most likely it is the raw device being initialized; a longer trace would show the corresponding open call. But the parameters are valid: 8192 bytes is a small size, and offset 0x720000 is only about 7 MB into the device. That offset is also exactly sector 14592 in 512-byte units (0x720000 / 512), the sector named in the error message. So this has to work.
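The arithmetic in that paragraph can be checked in the shell. The sketch below converts the failing byte offset into a sector number and tests whether a request is aligned to a provider's sector size. The 16384-byte sector size in the example is an assumption and is not stated anywhere in this thread. GEOM rejects requests whose offset or length is not a multiple of the provider's sector size, so comparing the 8192-byte write against the real value (diskinfo -v on the device reports it) might be worth a look.

```sh
# Convert a byte offset to a sector number in 512-byte units,
# which is what the trace implies (0x720000 / 14592 = 512).
offset_to_sector() {
    echo $(( $1 / 512 ))
}

# Succeed if both offset and length are multiples of the sector size.
is_aligned() {   # usage: is_aligned OFFSET LENGTH SECTORSIZE
    [ $(( $1 % $3 )) -eq 0 ] && [ $(( $2 % $3 )) -eq 0 ]
}

offset_to_sector $(( 0x720000 ))   # failing pwrite from the trace; prints 14592

# ASSUMPTION: 16384 is a hypothetical sector size, not taken from the thread.
if is_aligned $(( 0x720000 )) 8192 16384; then
    echo "aligned"
else
    echo "not aligned"
fi
```

Under that assumed sector size the offset is aligned but the 8192-byte length is not, while the 16384, 65536, and 131072-byte writes that succeeded earlier in the trace would all pass the same check.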

This means that some layer underneath newfs wrongly returns EINVAL, so the finger indeed needs to be pointed at graid or gconcat. And those are in the kernel, and much harder to debug than userspace stuff like newfs.

Question: Why don't you just use ZFS? It has the RAID layer built in. I think it would be easier to get to work.
 
I have looked at some ZFS examples and it seems completely different from the classic RAID levels. It is beyond my ability.

As I said before, I have used raid3 for years without major problems. But I don't know what was wrong, so this time I went with RAID5:
Code:
# graid5 status
       Name                  Status  Components
raid5/parem  COMPLETE CALM (safeop)  da0
                                     da1
                                     da2
                                     da3
                                     da4
                                     da5
raid5/vasak  COMPLETE CALM (safeop)  da8
                                     da9
                                     da12
                                     da13
                                     da14

# gconcat status
       Name  Status  Components
concat/suur      UP  raid5/vasak
                     raid5/parem



# gpart show /dev/concat/suur
=>         40  52744794544  concat/suur  GPT  (25T)
           40  52744794544            1  freebsd-ufs  (25T)

This time newfs worked and I have started transferring over 10TB of data. It will take about a week (over the internet). Let's see if I get any further problems.

I got something in the logs that I don't understand, so I will just ignore it and hope it works anyway:
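The two gpart show outputs in this thread report the same 25T with very different sector counts, which hints at different sector sizes. The sketch below is an inference and nothing here confirms it: it assumes the second layout uses 512-byte sectors and that both layouts have about the same capacity, then computes the sector size the first layout would need.

```sh
# Sector counts taken from the two "gpart show" outputs in this thread.
old_sectors=1648274878     # graid3 + graid5 layout, reported as 25T
new_sectors=52744794544    # graid5 + graid5 layout, reported as 25T

# ASSUMPTION: the second layout uses 512-byte sectors.
new_bytes=$(( new_sectors * 512 ))

# Sector size the first layout would need for the same capacity,
# rounded to the nearest integer.
implied=$(( (new_bytes + old_sectors / 2) / old_sectors ))
echo "implied sector size of the first layout: $implied bytes"
```

If the assumptions hold, the result is 16384 bytes, which is larger than the 8192-byte fragment size (-f 8192) passed to newfs. That may be why the 8192-byte write was rejected on the graid3-based layout and not on this one.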

Code:
GEOM_CONCAT: Device suur created (id=2191242193).
GEOM_CONCAT: Disk raid5/vasak attached to suur.
GEOM_CONCAT: Disk raid5/parem attached to suur.
GEOM_CONCAT: Device concat/suur activated.
GEOM_CONCAT: Cannot add disk ufsid/5ff1276f99018c6c to suur (error=17).
GEOM: raid5/vasak: corrupt or invalid GPT detected.
GEOM: raid5/vasak: GPT rejected -- may not be recoverable.
 
I have looked at some ZFS examples and it seems completely different from the classic RAID levels. It is beyond my ability.
ZFS' RAID-Z is conceptually the same as RAID-5, and certainly easier to set up and operate. I'd also assume better "support" (forums, mailing lists etc.) because many more people use it these days. So, getting a simple HOWTO and learning the ZFS basics might pay off pretty soon.
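For comparison, creating a RAID-Z pool takes a single command. A minimal sketch, assuming a hypothetical pool named backup built from the six disks that currently form raid5/parem. The helper raidz_create_cmd is made up for illustration and only prints the command, because actually running it would overwrite those disks.

```sh
# Build (but do not run) a zpool command for a single RAID-Z vdev.
raidz_create_cmd() {   # usage: raidz_create_cmd POOL DISK...
    pool=$1
    shift
    echo "zpool create $pool raidz $*"
}

# ASSUMPTION: "backup" is a hypothetical pool name; the disks are
# the components of raid5/parem listed earlier in the thread.
raidz_create_cmd backup da0 da1 da2 da3 da4 da5
```

Once the printed device list has been checked by hand, the command can be run as root. A single RAID-Z vdev tolerates the loss of one disk, like RAID-5.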
 