Solved ZFS Pool ok again?

  • Thread starter Deleted member 43773

Hi all,

Short question about my zpool (long story, short question).

My main desktop PC uses two HDDs in a ZFS mirror.

Yesterday my system crashed completely. Something went wrong while installing the Guest Additions in VirtualBox (Win7); I don't really know why yet.
However: total system crash. Black screens, no reaction to anything, dead... brute force: hardware reset button (Yep, ouch!)
After that: a loooooong BIOS hardware check (I was really worried my hardware was gone... if you see nothing but a blinking cursor on a black screen for endless minutes and no BIOS starts...),
then again a loooong time for the BIOS to detect all drives (it had trouble identifying one HDD, the one concerned).
Life has taught me that sometimes it is best just to be patient, wait and trust.
After a while: voilà, FreeBSD bootloader (phew!), long boot, system up again. Big relief!

zpool status said "degraded": one of my two HDDs was shown as blank, unformatted, not present (a long number instead of the partition, "formerly ada1p3", remove from pool, ... replace!).
After a couple of minutes (checking that at least all arms and legs were still attached where they belong) I rebooted - the regular way, not with the hammer by pulling the plug :-D
zpool status still said "degraded" and "replace disk", but the disk (partition) was back in the pool as ada1p3, shown as before the crash, except for 36 checksum errors.
So I scrubbed the pool; nothing changed, except that the checksum count was now 60.

Knowing I would have to replace the disk, first thing this morning I started my computer... checked some emails... let's repair the pool...
Of course, the very first thing I do before I start anything with my pools is
# zpool status

...and everything is shown as if nothing had happened at all:

Code:
# zpool status
  pool: zroot
 state: ONLINE
  scan: scrub repaired 96K in 0 days 00:35:33 with 0 errors on Mon Feb  1 19:02:58 2021
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0p3  ONLINE       0     0     0
            ada1p3  ONLINE       0     0     0

errors: No known data errors

gpart also shows that everything is normal again.

Code:
=>        40  1953525088  ada0  GPT  (932G)
          40        1024     1  freebsd-boot  (512K)
        1064         984        - free -  (492K)
        2048     4194304     2  freebsd-swap  (2.0G)
     4196352  1949327360     3  freebsd-zfs  (930G)
  1953523712        1416        - free -  (708K)

=>        40  1953525088  ada1  GPT  (932G)
          40        1024     1  freebsd-boot  (512K)
        1064         984        - free -  (492K)
        2048     4194304     2  freebsd-swap  (2.0G)
     4196352  1949327360     3  freebsd-zfs  (930G)
  1953523712        1416        - free -  (708K)

So, no damage as far as the system tools can tell.
Right?
Is everything really all right again now?
Can I trust this?
 
Not sure, but looks like a drive cable issue.
 
Yes, something like that: some kind of temporary disconnection of one of the drives.
The rest of the story is then normal:
ZFS takes the drive offline, marks the pool degraded, and keeps writing to the other disk.
Then, when the disk reappears, there are obviously differences between the two disks. These appear as checksum errors; some immediately, some during the next scrub.
Then during reboot the error count on the pool may be reset to 0.
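
As a minimal sketch of how to watch those counters instead of reading the table by eye: the hypothetical helper below (`nonzero_errors` is a made-up name, not a ZFS tool) reads `zpool status` text on stdin and prints only the devices whose READ, WRITE or CKSUM column is not 0. It assumes the usual table layout shown earlier in this thread.

```sh
# Hypothetical helper (not part of ZFS): filter `zpool status` output and
# print "device READ WRITE CKSUM" for every row with a non-zero counter.
nonzero_errors() {
    awk '
        # The header row marks the start of the device table.
        $1 == "NAME" && $2 == "STATE" { in_table = 1; next }
        # A blank line or the trailing "errors:" line ends the table.
        NF == 0 || /^errors:/ { in_table = 0 }
        # Compare as strings so abbreviated counts such as 1.2K also match.
        in_table && NF >= 5 && ($3 != "0" || $4 != "0" || $5 != "0") {
            print $1, $3, $4, $5
        }
    '
}
```

Typical use would be `zpool status zroot | nonzero_errors` after a scrub; no output means every counter is 0. Once the cause is understood and a scrub has finished cleanly, `zpool clear zroot` resets the counters without waiting for a reboot.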
 
Sorry for answering late, but I did not receive any e-mail notifying me of answers (I thought I had activated that...), so I only checked by coincidence.

1. No, it was not a cable issue; that was of course the very first thing I checked.

2. The system has been running with unchanged status information since then ... well, what I see just now is

Code:
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
maybe I could do this :cool: ....

... I am an upgrade denier, since I have experienced way too often that updates, upgrades and patches get randomly messed up, and that afterwards too many things look and feel different or don't even work anymore... all for "security" and "improvement in the user's interest" only. Yeah, maybe, but please keep your sticky fingers off my configurations and leave the UI alone, damn it! I want to use this software, not learn its usage all over again and again, especially not if the GUI becomes worse with every upgrade... *usingextremelybadcursewords*

One or two days later, having cooled off after the first shock, I analyzed what had probably really happened:
I was installing some upgrades for the Win 7 Pro 64 I had previously installed in VirtualBox.
I went afk - doing something useful instead of staring stupidly at the screen watching progress bars.
When I came back, all my screens were black - Xorg's energy saver had kicked in (Where can I configure this anyway? [I'll find out myself, but would be thankful for any quick hint that spares me filtering endless quoting-idiocy in forums]).
At the same time, keyboard and mouse were captured by VirtualBox.
So I thought the machine had crashed and, on a running system, hit the reset bu... - classic PICNIC or IBM error *cough*

As a consequence, however, I can now also log in to my main computer from one of my other machines, so I at least have a shell to see whether the machine has really, totally crashed (which, in all the years I have been learning FreeBSD, has happened exactly once, when I mistakenly mounted an HDD containing another FreeBSD on / ... you have permission to laugh).

The point is, and this is my question in this thread:
Can I trust this information that my system is free again of the damage caused by my brute-force action,
or might there still be filesystem issues, and if so, how do I check for, find and repair them?

At the moment I am looking for the offender that uses up nearly all of my 8 GB, so that my system starts to swap!
For several days my system has been slowing down noticeably. top shows that less than 300M of my 8G is left free, and swapinfo reports up to 10% of my swap in use.
This may be off-topic, but could this be a ZFS issue, some kind of late result of my reset-button action?
How can I figure this out?

(Now I have to check the e-mail notification setting, so I receive your answers in time.)

Thanks.
 
[assuming HW is ok] ZFS is CoW: it's always in a consistent state, even if your system crashed in the most horrible way possible (unless you tweaked developer knobs that you must not touch; I strongly assume you didn't). There is no fsck_zfs(8), because there's no need for it... If you're sure you don't want to run any previous kernel version of FreeBSD on that ZFS pool (or directly access the pool from one), then you can safely upgrade the ZFS pool. Since that only affects ZFS internals, it cannot affect your data & thus your precious applications' settings, no way, 100% guaranteed. Else I owe you a drink of your choice; just send me the paypal of your favorite pub.
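
A hedged sketch of what the upgrade could look like on this machine. One point worth checking first: when the system boots from the pool through a freebsd-boot partition (as the gpart output earlier in the thread suggests), the boot code on every disk of the mirror generally needs refreshing after `zpool upgrade`, so that the loader understands the newly enabled features. The helper below is hypothetical (`upgrade_plan` is a made-up name) and deliberately only prints the commands for review; the partition index 1 and the disk names are assumptions taken from that gpart listing.

```sh
# Hypothetical dry-run helper: prints the commands, runs nothing.
# Assumes GPT disks whose freebsd-boot partition has index 1;
# adjust -i if your layout differs.
upgrade_plan() {
    pool=$1
    shift
    echo "zpool upgrade $pool"
    for disk in "$@"; do
        echo "gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 $disk"
    done
}
```

Calling `upgrade_plan zroot ada0 ada1` lists one `zpool upgrade` line followed by one `gpart bootcode` line per disk. Running plain `zpool upgrade` without arguments changes nothing; it only lists pools that do not have all supported features enabled.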

IMHO swap usage of 10% is usual for an average desktop workload, even on systems with much more than 8 GB RAM. The VM system swaps out unused stuff to make room for an app or kernel task that needs it more urgently; that's good & normal. OTOH, you should investigate the slowdown; i.e. first make sure it is real & not the psychological effect of being in a state of uncertainty because that machine crashed a few days ago, so that you don't trust it anymore & suspect "enemy bits" all around.
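
One way to start that investigation, as a rough sketch: on a ZFS system a small "Free" figure in top is often just the ARC (the ZFS read cache) holding memory that it hands back under pressure, so it helps to see how big the ARC currently is. The two helper names below are made up for illustration; the sysctl name is the one FreeBSD exposes for the current ARC size, and the function falls back to a message if it cannot be read.

```sh
# Convert a byte count to whole MiB (integer division).
to_mib() {
    echo $(( $1 / 1048576 ))
}

# Print the current ARC size in MiB, or a note if the sysctl is not
# available (for example on a system without ZFS loaded).
arc_mib() {
    if bytes=$(sysctl -n kstat.zfs.misc.arcstats.size 2>/dev/null) && [ -n "$bytes" ]; then
        to_mib "$bytes"
    else
        echo "ARC size unavailable"
    fi
}
```

Comparing the `arc_mib` figure with the "Free" value in top, and sorting processes by resident size with `top -o res`, should show whether a single application or the cache accounts for the memory.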
 
Thanks. I already scrubbed the pool (did it again; doesn't hurt [I let cron scrub the pool weekly anyway]).
So after all you confirmed what I knew about ZFS. Thanks, that was the whole point of this thread: to get confirmation from someone who really knows ZFS.
So no damage is left from the crash. If there was any, it has been corrected by now.
Thanks a lot.
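
For readers who want a weekly scrub without writing their own cron entry, here is a sketch that assumes FreeBSD's stock periodic(8) scripts; it is an alternative to, not a description of, the cron job mentioned above. The lines go into /etc/periodic.conf, and the pool name is taken from this thread.

```sh
# /etc/periodic.conf (fragment)
# Let the daily periodic run start a scrub when the last one is older
# than the threshold, given in days.
daily_scrub_zfs_enable="YES"
daily_scrub_zfs_pools="zroot"            # empty string would select all pools
daily_scrub_zfs_default_threshold="7"    # roughly once a week
```

The result shows up in the `scan:` line of `zpool status`, like the one quoted at the top of the thread.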
 