System panic

Perfect, almost there. Assuming you have enough free space in /var, run this from single-user mode: /etc/rc.d/savecore start, then check /var/crash afterwards.
Edit: actually, in single-user mode your fileset(s) may be mounted read-only. Do you know how to remount them read-write?
 
Ok, so I set readonly=off and ran savecore.
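
For anyone following along, the two steps above might look roughly like the sketch below. The dataset name zroot/ROOT/default is an assumption (it is the root dataset of a stock install; with boot environments the active one will be named differently, so check zfs list first), and the function name is made up for illustration.

```sh
#!/bin/sh
# Sketch: make a read-only ZFS root writable in single-user mode, then save the dump.
# ROOT_DS is an assumption; check `zfs list` for the dataset mounted on /.
ROOT_DS=${ROOT_DS:-zroot/ROOT/default}

save_dump_single_user() {
    zfs set readonly=off "$ROOT_DS" || return 1   # allow writes to the root dataset
    zfs mount -a || return 1                      # mount the remaining datasets, including /var
    df -h /var                                    # the dump needs enough free space here
    service savecore start                        # same as /etc/rc.d/savecore start
    ls -l /var/crash                              # vmcore.N, info.N and core.txt.N should appear
}
```

Run save_dump_single_user as root from the single-user shell; nothing happens until the function is called.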

I am finally able to see /var/crash/core.txt.0 !!! Alongside vmcore.last plus a couple of other new files

However, less on that file doesn't work. What do I do with it, and how do I access it from, say, an Ubuntu stick?
 
What do you mean it doesn't work? That's the text summary of the crash, so it should be readable. Copy the whole contents of /var/crash to that USB stick and share it from there.
 
I've never done this myself, as there was no need, but you can save some time and avoid copying files manually. After a crash, once you are in single-user mode, mount the USB key on, let's say, /a and run savecore manually: savecore /a /dev/ada0p2. It will save the dump directly to that directory, so it will be on the USB key right away.
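
A minimal sketch of that idea, with assumptions stated up front: the stick is taken to be FAT-formatted (hence mount_msdosfs), the device /dev/da0s1 in the example is a guess, and the function name is hypothetical. Only /a and /dev/ada0p2 come from the post above.

```sh
#!/bin/sh
# Sketch: write the crash dump straight to a USB stick instead of /var/crash.
# Arguments to adapt: the stick's FAT partition, a mount point, the dump device.
save_dump_to_usb() {
    usb_part=$1 mnt=$2 dumpdev=$3
    mkdir -p "$mnt" || return 1
    mount_msdosfs "$usb_part" "$mnt" || return 1
    if savecore -C "$dumpdev"; then        # -C only checks whether a dump is present
        savecore -v "$mnt" "$dumpdev"      # writes vmcore.N and info.N into $mnt
    else
        echo "no dump found on $dumpdev" >&2
    fi
    umount "$mnt"
}
# Example (device names are assumptions): save_dump_to_usb /dev/da0s1 /a /dev/ada0p2
```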
 
When I tried less on one of the files, it said there was no debugger or something like that. Now trying to copy the files; one of them is 450+ MB as well. Will post the files soon.
 
Ok, you don't have gdb installed. Not a problem. Along with that please can you do cksum /boot/kernel/kernel and post what version of FreeBSD you're running exactly?
 
Oops, logged out now. I really need to focus on recovering data and getting my system back now.

Here are the files:
"bounds" contains only
core.txt.0 contains only
Unable to find a kernel debugger.
Please install the devel/gdb port or gdb package.
info.0 contains only
Dump header from device: /dev/ada0p2
Architecture: amd64
Architecture Version: 2
Dump Length: 477515776
Blocksize: 512
Compression: none
Dumptime: 2022-12-12 12:21:49 +0400
Hostname: toaster
Magic: FreeBSD Kernel Dump
Version String: FreeBSD 13.1-RELEASE-p3 GENERIC
Panic String: VERIFY3(0 == zap_add_int(zfsvfs->z_os, zfsvfs->z_unlinkedobj, zp->z_id, tx)) failed (0 == 97)

Dump Parity: 3470720052
Bounds: 0
Dump Status: good
info.last contains only
Dump header from device: /dev/ada0p2
Architecture: amd64
Architecture Version: 2
Dump Length: 477515776
Blocksize: 512
Compression: none
Dumptime: 2022-12-12 12:21:49 +0400
Hostname: toaster
Magic: FreeBSD Kernel Dump
Version String: FreeBSD 13.1-RELEASE-p3 GENERIC
Panic String: VERIFY3(0 == zap_add_int(zfsvfs->z_os, zfsvfs->z_unlinkedobj, zp->z_id, tx)) failed (0 == 97)

Dump Parity: 3470720052
Bounds: 0
Dump Status: good
and then there are the binary vmcore files, I think, which are 450+ MB

Let me know if this helps fix the system please?
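
In case it helps others reading the dump header, here is a small sketch that pulls the interesting fields out of an info.N file. The function name is made up, and the reading of "(0 == N)" as "the wrapped call returned N" is my understanding of how VERIFY3 reports its two operands, not something stated in the file itself.

```sh
#!/bin/sh
# Sketch: summarise a savecore info file (for example /var/crash/info.0).
dump_summary() {
    info=$1
    # Keep only the fields that matter for a bug report.
    grep -E '^[[:space:]]*(Dumptime|Version String|Panic String|Dump Status):' "$info"
    # For a "VERIFY3(0 == f(...)) failed (0 == N)" panic, N is the value f returned.
    code=$(sed -n 's/.*Panic String:.*failed (0 == \([0-9][0-9]*\)).*/\1/p' "$info")
    [ -n "$code" ] && echo "Failing call returned: $code"
    return 0
}
# Example: dump_summary /var/crash/info.0
```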
 
The text alone is not enough; a full stack trace should be provided. But this is very important:

Panic String: VERIFY3(0 == zap_add_int(zfsvfs->z_os, zfsvfs->z_unlinkedobj, zp->z_id, tx)) failed (0 == 97)

The issue you are having is related to ZFS (the panic is raised from ZFS code), and you need somebody who knows ZFS internals to tell you more (hence the PR).

freebsd-version -kru
HW details - at least some description.
+stack backtrace and we have all info needed
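
The items listed above could be gathered into one file with something like the sketch below. The function names and the default report path are made up; the sysctl names are the standard FreeBSD hardware ones. The stack backtrace is not covered here: it needs the gdb package that core.txt.0 complains about, and after installing it, rerunning crashinfo(8) should regenerate core.txt.0 with the trace.

```sh
#!/bin/sh
# Sketch: collect version, kernel checksum and basic hardware details into one file.
section() {
    # Run a command if it is installed; note it as skipped otherwise.
    printf '== %s ==\n' "$*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1
    else
        echo "(skipped: $1 not found)"
    fi
    echo
}

collect_report() {
    report=${1:-/tmp/panic-report.txt}   # default path is an assumption
    {
        section freebsd-version -kru
        section cksum /boot/kernel/kernel
        section sysctl -n hw.model hw.ncpu hw.physmem
        section cat /var/crash/info.last
    } > "$report"
    echo "wrote $report"
}
# Example: collect_report /a/panic-report.txt
```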
 
Please check image
 

Attachments

  • IMG20221212183901.jpg
So anyone still following this thread - seems like the culprit IS zfs, as many suspected, see this message: https://forums.freebsd.org/threads/system-panic.87387/post-591355

Now trying to recover data:
I'm trying to boot into a later BE (p4) and activated it via beadm activate - however it seems to be chrooting me into p3 only (as shown by uname -a, even after reboot). What am I doing wrong? 🤔 (Later BE to recover latest data)

See image for reference
 

Attachments

  • IMG20221212184549.jpg
Strange. The BE is set to p4, but in single-user mode (which is the only thing I can do rn) uname still reports p3 🤔

Please see image below for this

Why is this happening?
 

Attachments

  • IMG20221212190057.jpg
p4 didn't involve the kernel, it only had some userland updates. P5 is also just a couple of userland updates. So a p3 kernel is perfectly normal.
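
To see that split for yourself, freebsd-version can report the installed kernel (-k), the running kernel (-r) and the userland (-u) separately. A small sketch, with a made-up function name:

```sh
#!/bin/sh
# Sketch: show kernel and userland patch levels side by side.
show_versions() {
    kern=$(freebsd-version -k)   # kernel installed under /boot/kernel
    run=$(freebsd-version -r)    # kernel currently running (what uname reports)
    user=$(freebsd-version -u)   # userland
    echo "installed kernel: $kern"
    echo "running kernel:   $run"
    echo "userland:         $user"
    if [ "$kern" != "$user" ]; then
        echo "kernel and userland differ; expected when a patch only touched userland"
    fi
}
```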
 
So p3 is a month old, and I'd like to back up from p4/p5 .... How can I make that happen?

Should I try running freebsd-update fetch/install?

Edit: Sorry I'm a bit confused about this. I guess the files/data don't really depend upon p3/4/5 , or do they?
Thank you! Please let me know if there's anything else you need from me or if there's a solution to my issue!
 
Ok _martin's advice finally cracked it!!

Basically now zroot/tmp is not mounted.

Next what should I do? Is there a way to fix this zroot/tmp issue for good or do I need to still go ahead and backup coz this might blow up soon?

I see the system is panicking during /tmp cleanup. The idea is to either disable this fileset or create a new one. The thing is, I don't want to touch ZFS too much, as we don't know what state it is in. Disabling it, however, should be ok.
In single-user mode run zfs set mountpoint=none zroot/tmp and reboot. The dataset will then not be mounted, and the /tmp directory in / will be used instead. This could be the workaround you need to get into the full system and do a backup from there.
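
A sketch of that step which also records the old settings first, so it can be undone later with zfs set mountpoint=/tmp zroot/tmp. The function name is made up, and /tmp as the previous mountpoint is an assumption based on the default layout.

```sh
#!/bin/sh
# Sketch: stop mounting the suspect dataset without destroying or renaming it.
disable_tmp_dataset() {
    ds=${1:-zroot/tmp}
    zfs get -H -o property,value mountpoint,canmount "$ds"   # note the old values
    zfs set mountpoint=none "$ds" || return 1                # dataset is no longer mounted
    zfs get -H -o property,value mountpoint,mounted "$ds"    # confirm the change
}
# Example, from single-user mode: disable_tmp_dataset zroot/tmp && reboot
```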
 
Can you rename it? Or would that blow up ZFS? zfs rename zroot/tmp zroot/tmp.broken If that works I would create a new tmp; zfs create -o mountpoint=/tmp zroot/tmp
After it's been mounted make sure to chmod 1777 /tmp as it needs the sticky(7) bit there.

Or you can just leave it as-is. It just means /tmp ends up in zroot/ROOT/default
 
I told him to touch ZFS as little as possible. Maybe there are more issues there anyway, but this way he's in a full environment and can do a backup. Setting the mountpoint to none is the least invasive approach.

He doesn't need to do anything to /tmp directory. FreeBSD "self-healing" /etc/rc.d/tmp does take care of it.
 
The thing is - we are all guessing. We don't know what's happening.
Yup. I've gleaned from reading too many of the OP's posts that it's an older system prone to overheating. My stab in the dark is that some aging component has started to fail, but only shows symptoms when the system overheats. Not too many paths forward besides new hardware.

I admire the time and effort you and others have spent trying to save the OP's data, though.
 
Somehow I'm not able to mount the other disk that I need to backup to, after doing
geli attach /dev/da0p3
Enter passphrase:
sudo mount /dev/da0p3.eli /mnt
mount: /dev/da0p3.eli: No such file or directory


I see the eli active though.

Also the data seems to have taken a hit ☹️, Firefox won't start without asking me to create a new profile when I had multiple windows running. And Chrome won't even start. That's where I had some of my important stuff.

OP's posts that it's an older system prone to overheating.
What specifically gives it away that it has overheating issues?
I admire the time and effort you and others have spent trying to save the OP's data, though.
Definitely. All of them are rockstars for having gone out of their way to help me 🙏
Even though my data seems to have taken a hit
 
covacat: Sounds interesting, even more so that the KB is not that old. Sadly I don't have valid MOS either.

It's up to you how you do the backup; there's more than one way to skin a cat. I would opt for a filesystem-level backup using rsync and would not use zfs send. Since you do have a corrupted pool, the issue is there one way or the other. That's just my personal preference, though.
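
A rough sketch of what the rsync route could look like. The paths in the example, the log location and the function name are all assumptions; the point is only that rsync carries on past files it cannot read and reports them, and that exit codes 23 and 24 mean a partial transfer rather than a total failure.

```sh
#!/bin/sh
# Sketch: file-level backup that records which files could not be read.
backup_tree() {
    src=$1 dst=$2 log=${3:-/var/tmp/rsync-backup.log}
    # -a keeps permissions, owners and times; -H keeps hard links.
    rsync -aH --log-file="$log" "$src"/ "$dst"/
    rc=$?
    case $rc in
        0)     echo "backup complete" ;;
        23|24) echo "partial transfer (rsync exit $rc); see $log for the files that failed" ;;
        *)     echo "rsync failed with exit $rc" ;;
    esac
    return $rc
}
# Example (paths are assumptions): backup_tree /usr/home /mnt/backup/home
```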

In a private chat you mentioned that the disk you're using is more or less a backup of the original one. Make sure you don't end up with pools of the same name on both disks.
If da0p3.eli doesn't exist after you entered the passphrase, then you didn't enter the correct one. Syslog (/var/log/messages) might give you more information about that.
 
Actually, I do have MOS support. I won't blindly copy-paste the contents of the link here though.
The suggested solutions were actually already mentioned here (scrub). If that fails, a restore is needed.

I went through the pictures you shared here again. The one where you show zpool status -v (those 3 errors) is important. That picture is what led me to suggest disabling the zroot/tmp dataset in the first place. I suggest you attempt to clean it up this way.

a) chromium: I'm not sure how much data you have there (bookmarks, saved passwords, etc.), but I'd rather have chromium recreate everything from scratch. Do this as root, with chromium not running; it is purposely split into two commands:
Code:
cp -rp /usr/home/c1utt4r/.config/chromium /var/crash
rm -rf /usr/home/c1utt4r/.config/chromium

b) zroot/tmp: As mentioned before, you could probably remove zroot/tmp and recreate it. But this is something I'd rather do _after_ you have the backup done.
Interesting point: if you can't fix the metadata on a dataset, you should restore the whole pool, i.e. don't trust the pool at all.

You have reported issues only with a) and b), so I'd say the rest of your data is still safe. And as you don't have any other backup, this is the only option for you.
 
looks a lot like https://support.oracle.com/knowledge/Sun Microsystems/2421977_1.html
just i can't remember my larry support account
This is the link that the error output points to: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A/ .... seems like metadata-level corruption .... not sure how to get rid of it .... but first I guess I need to salvage whatever data remains
If da0p3.eli doesn't exist after you entered passphrase you didn't enter a proper one then. Syslog (/var/log/messages) might give you more information about that.
The passphrase is correct. The device attaches but it doesn't mount. Here is the output:

sudo zdb -l /dev/da0p3.eli
------------------------------------
LABEL 0
------------------------------------
version: 5000
name: 'zroot'
state: 0
txg: 3557598
pool_guid: 10535025700179738651
hostid: 2647270205
hostname: ''
top_guid: 1525963299974165836
guid: 1525963299974165836
vdev_children: 1
vdev_tree:
type: 'disk'
id: 0
guid: 1525963299974165836
path: '/dev/ada0p3.eli'
phys_path: 'id1,enc@n3061686369656d30/type@0/slot@3/elmdesc@Slot_02/p3/eli'
whole_disk: 1
metaslab_array: 67
metaslab_shift: 31
ashift: 12
asize: 311476617216
is_log: 0
DTL: 284
create_txg: 4
features_for_read:
com.delphix:hole_birth
com.delphix:embedded_data
labels = 0 1 2 3



It's up to you how you decide to do a backup, there are more ways to skin a cat. I would opt for filesystem backup using rsync and would not do zfs send. I mean as you do have corrupted pool issue is there one way or the other. It would be my personal preference though.
I was hoping to use zfs so that file permissions etc. stay the same, and because it might be easier. If the data is corrupted (as it seems), maybe zfs is a better option than rsync?
 
If that eli device holds a ZFS pool you need to import it; you can't mount it like a regular FS (you are actually mounting it as FFS, as that's the default fs for FreeBSD).
Also, as I had suspected, that pool is also named zroot. If you run zpool import you should see the pool.

I'm not particularly proud of editing my posts but I noticed this:
Code:
zdb -l /dev/da0p3.eli 

path: '/dev/ada0p3.eli'
You didn't explain how you got to that disk but pay attention. It seems those are clones of some sort -- you can make a mess if you try to import it.
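
Before importing anything, it may be worth comparing the pool GUID in that label with the GUID of the pool you are currently booted from. A minimal sketch, assuming the live pool is the imported zroot; the function name is made up. Running zpool import with no arguments only lists importable pools and does not import them.

```sh
#!/bin/sh
# Sketch: check whether the label on a device carries the same pool GUID
# as an already imported pool, which would suggest a block-level clone.
same_pool_guid() {
    dev=$1 pool=$2
    label_guid=$(zdb -l "$dev" | awk '$1 == "pool_guid:" { print $2; exit }')
    live_guid=$(zpool get -H -o value guid "$pool")
    echo "label on $dev: $label_guid"
    echo "imported $pool: $live_guid"
    # Succeeds only when both GUIDs are present and identical.
    [ -n "$label_guid" ] && [ "$label_guid" = "$live_guid" ]
}
# Example: same_pool_guid /dev/da0p3.eli zroot && echo "same GUID, do not import blindly"
```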
 