Fsck and softupdate errors

Hi,

I've been a happy FreeBSD user since the days of 1.1.5.1, but lately I'm having trouble with my 11.0-RELEASE-p2 server. It keeps giving fsck and softupdates errors, even to the point that it prevents a proper reboot (tricky for a server at a remote location).

Of course I manually ran fsck from single user mode, and it says it fixed the problem. However after a reboot the problem immediately returns:
Code:
# fsck /usr
** /dev/gpt/usrfs (NO WRITE)
** Last Mounted on /usr
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
UNALLOCATED  I=6156805  OWNER=smokeping MODE=100644
SIZE=499 MTIME=Feb 13 09:12 2017
FILE=/local/var/smokeping/__sortercache/data.Curl.storable

UNEXPECTED SOFT UPDATE INCONSISTENCY

REMOVE? no

** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
UNREF FILE  I=6156843  OWNER=smokeping MODE=100644
SIZE=0 MTIME=Feb 13 09:12 2017
RECONNECT? no


CLEAR? no

** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? no

SUMMARY INFORMATION BAD
SALVAGE? no

BLK(S) MISSING IN BIT MAPS
SALVAGE? no

1068163 files, 3897382 used, 21903328 free (201736 frags, 2712699 blocks, 0.8% fragmentation)
------------------
The result of smartctl is:
Code:
 smartctl -l selftest /dev/ada0
smartctl 6.5 2016-05-07 r4318 [FreeBSD 11.0-RELEASE-p2 amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     47681         -
# 2  Extended offline    Interrupted (host reset)      70%     47675         -
# 3  Short offline       Completed without error       00%      7066         -

Please help
 
Yes, it looks different; the "SOFT UPDATES" related messages led me to think it was similar ....

It is not clear to me whether the error is actually cleared after running fsck, or whether the filesystem gets corrupted again on the next reboot (or whether it is simply a side effect of running fsck on a mounted filesystem).
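
One way to look into the mounted-filesystem possibility: fsck prints "(NO WRITE)" next to the device name when it has opened the filesystem read-only, and in that mode it answers every prompt with "no" by itself. A minimal sketch, assuming the fsck output has been captured as text; the function names are made up for illustration:

```shell
# Read fsck output on stdin; succeed if the run was read-only ("NO WRITE").
fsck_was_readonly() {
    grep -q '^\*\* .*(NO WRITE)'
}

# Read fsck output on stdin; print how many prompts were answered "no".
fsck_declined_count() {
    grep -c '? no$'
}
```

For example, `fsck /usr 2>&1 | fsck_was_readonly && echo "read-only run"` would flag a check that could not have repaired anything. Findings from such a run on a filesystem that is mounted read-write should be treated with caution.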
 
The error is cleared after running fsck. I go into single user mode, umount the filesystem, run fsck -y and get a 'clean' result. I then reboot, run fsck on the affected filesystem, and get the result quoted above.
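
The sequence described here can be sketched as a small script, with a second, read-only fsck pass added before the reboot so that its result can be compared with the one seen afterwards. This is only a sketch: the function name and the RUN variable are hypothetical, and the mount point and device are taken from the output earlier in the thread. It defaults to a dry run that only prints the commands.

```shell
# Dry run by default: RUN=echo only prints the commands.
# Set RUN= (empty) to execute them, as root, from single user mode.
RUN=${RUN-echo}

recheck_fs() {
    mountpoint=$1   # e.g. /usr
    device=$2       # e.g. /dev/gpt/usrfs
    $RUN umount "$mountpoint" || return 1
    $RUN fsck -y "$device"    # repair pass, answers "yes" to every prompt
    $RUN fsck -n "$device"    # verification pass, read-only
}
```

Calling `recheck_fs /usr /dev/gpt/usrfs` in dry-run mode prints the three commands in order. If the verification pass is clean on the unmounted filesystem but errors show up again after a reboot, the repair itself is probably not what is failing.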
 
You might be seeing what I experienced, which led me to disable soft update journalling. With journalling enabled, the filesystem always seemed to have errors and required an fsck. After I disabled it, the problem went away. Various other people complained about the same thing on the mailing lists a while ago, but the complaints seemed to just get ignored. It's the tunefs -j option; tunefs -p /dev/<partition> shows whether it's enabled or not.
 
Thanks, but the filesystem has journaling disabled. It has softupdates enabled.
Code:
tunefs -p /usr
tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 enabled
tunefs: soft update journaling: (-j)                       disabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         disabled
tunefs: maximum blocks per file in a cylinder group: (-e)  4096
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             8%
tunefs: space to hold for metadata blocks: (-k)            0
tunefs: optimization preference: (-o)                      time
tunefs: volume label: (-L)
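
For checking several filesystems at once, a single setting can be pulled out of a report like the one above. A rough sketch, assuming the line layout shown here ("tunefs: name: (-flag) value"); the function name is made up for illustration:

```shell
# Read `tunefs -p` style output on stdin and print the value of one setting.
# $1 is the setting name exactly as printed, e.g. "soft update journaling".
tunefs_setting() {
    awk -F': ' -v key="$1" '
        $2 == key {
            # $3 looks like "(-j)      disabled"; the value is the last word.
            n = split($3, parts, " ")
            if (n >= 2) print parts[n]
        }
    '
}
```

Something like `tunefs -p /usr 2>&1 | tunefs_setting "soft update journaling"` should then print `disabled` for this filesystem. The `2>&1` is there on the assumption that tunefs writes this report to standard error. A setting with no value, such as the empty volume label, prints nothing.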
 
I haven't had the chance to go to the console for single user mode yet. However, the weird thing is that /var (which sees a lot more activity than /usr) has no errors:

Code:
fsck /var
** /dev/gpt/varfs (NO WRITE)
** Last Mounted on /var
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
228618 files, 1835136 used, 11063719 free (5007 frags, 1382339 blocks, 0.0% fragmentation)
 