System panic

elgrande · Dec 10, 2022

I'd comment out ~~fuse_load and~~ kld_list and see if it helps.
Since there is no more hal, you can remove hald_enable anyhow.

Edit: It most probably is the graphics driver, so the kld_list line should do it.
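
A minimal sketch of the "comment it out" step, assuming the settings are plain name="value" lines as in /etc/rc.conf. The helper name comment_out_var and the scratch file are illustrative and not from the thread; on a real system you would back up /etc/rc.conf first and then point the helper at it.

```shell
#!/bin/sh
# comment_out_var FILE NAME: prefix every active NAME= line in FILE with '#'.
# Hypothetical helper. It avoids `sed -i`, whose syntax differs between BSD and GNU sed.
comment_out_var() {
    file=$1
    name=$2
    tmp=$(mktemp) || return 1
    # Lines that are already commented start with '#', so they do not match.
    sed "s/^[[:space:]]*${name}=/#&/" "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Demo on a scratch copy rather than the live /etc/rc.conf.
demo=$(mktemp)
cat > "$demo" <<'EOF'
hald_enable="YES"
kld_list="/boot/modules/i915kms.ko"
zfs_enable="YES"
EOF
comment_out_var "$demo" kld_list
comment_out_var "$demo" hald_enable
cat "$demo"
```

Only the two named variables are touched; everything else in the file is left as it was, which makes it easy to undo one line at a time.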

Tracker · Dec 10, 2022

elgrande said:
I'd comment out ~~fuse_load and~~ kld_list and see if it helps.
Since there is no more hal, you can remove hald_enable anyhow.

Edit: It most probably is the graphics driver, so the kld_list line should do it.

Thanks. Tried commenting out all 3. Still crashing. Maybe I should try to reinstall drm-kmod module with these still commented out.

Update: uncommenting/commenting fuse_load had no effect.

larshenrikoern · Dec 10, 2022

Hi

I would suggest to replace kld_list="/boot/modules/i915kms.ko" with kld_list="i915kms"

I think this is the common way of loading kernel modules

Tracker · Dec 10, 2022

Tried installing drm-kmod via pkg - didn't work. Restarting leads to crash.

Tried only kld_list="i915kms" - didn't work. Restarting leads to crash.

What next?

PS: Fwiw this is quite an old machine that has been working fine with FreeBSD for a couple of years. I suspect the Chrome/machine crashes are due to some recent update, but I can't say for sure and can't figure out how.


elgrande · Dec 10, 2022

Just to be sure:

Code:

i915kms_load="YES"

is still commented out in loader.conf?

larshenrikoern · Dec 10, 2022

Hi again

What version of FreeBSD are you using ?? And what graphics (or in this case cpu) are you using ??

Tracker · Dec 10, 2022

elgrande said:
Just to be sure:

Code:

i915kms_load="YES"

is still commented out in loader.conf?

It's always been commented out.

In the dmesg | less output after a crash I'm still seeing something to do with !drm_modeset_is_locked... This is the error I was referring to earlier.

larshenrikoern said:
Hi again

What version of FreeBSD are you using ?? And what graphics (or in this case cpu) are you using ??

I'm a little unsure how to answer this. uname -a gives 13.1-RELEASE-p3 after I chroot from single-user mode and run:
zfs set readonly=off latest-zfs-snapshot-i-see-with-zfs-list

I somewhat remember it being upgraded to the p4 release. I've tried upgrading from single-user mode as well, but it still seems to show 13.1-RELEASE-p3. Will try again.

Regarding graphics it's an Intel cpu with no graphics card I think. It's been working flawlessly for a couple of years except the minor hiccups.

Also BE is set to latest snapshot that was taken when trying to perform upgrade in single user mode. Maybe I should set that back? But wont that affect new data not being shown/recovered?

Tracker · Dec 10, 2022

Ok, weird. After installing drm-510-kmod through pkg, I thought I should try to compile it from ports (/usr/ports/graphics/drm-510-kmod).

I tried to do make install clean and the system rebooted again! Instead of throwing any errors or warning.

I would highly suspect this pkg/port is the culprit. Am I correct in thinking so? I'm still a bit lost about what to do about it

larshenrikoern · Dec 10, 2022

Tracker said:
It's always been commented out.

In the dmesg | less output after a crash I'm still seeing something to do with !drm_modeset_is_locked... This is the error I was referring to earlier.

I'm a little unsure how to answer this. uname -a gives 13.1-RELEASE-p3 after I chroot from single-user mode and run:
zfs set readonly=off latest-zfs-snapshot-i-see-with-zfs-list

I somewhat remember it being upgraded to the p4 release. I've tried upgrading from single-user mode as well, but it still seems to show 13.1-RELEASE-p3. Will try again.

Regarding graphics it's an Intel cpu with no graphics card I think. It's been working flawlessly for a couple of years except the minor hiccups.

Also BE is set to latest snapshot that was taken when trying to perform upgrade in single user mode. Maybe I should set that back? But wont that affect new data not being shown/recovered?

I wanted to be sure you were using a supported FreeBSD version. If not, that could be your problem. But you are on the latest release.

As far as I know, if your old snapshot boots, your data is still there and you should be able to find it. But as I am on UFS I will not claim to be a ZFS guru.

Tracker · Dec 10, 2022

larshenrikoern said:
I wanted to be sure you were using a supported FreeBSD version. If not, that could be your problem. But you are on the latest release.

As far as I know, if your old snapshot boots, your data is still there and you should be able to find it. But as I am on UFS I will not claim to be a ZFS guru.

This is definitely not something exotic. I've always upgraded through the standard freebsd-update fetch/install.

Yes seems like it's booting into the p3 release snapshot, even though I see a couple more snapshots below after trying to update/upgrade. I tried setting the BE to the latest snapshot versions. Maybe I should set BE to the older versions? (just not sure if the latest data will be lost or not)

Tracker · Dec 10, 2022

Changing BE to a month old BE also DOESN'T work and still crashes.

Does anyone else think it's the drm-510-kmod issue only?

_martin · Dec 10, 2022

Tracker said:
I tried to do make install clean and the system rebooted again! Instead of throwing any errors or warning.

Rebooted while it was compiling? If so, then again, this could be faulty RAM. Creating a memtest USB stick takes a few minutes, and you can verify that the modules are OK.

Can you share what is the virtual address you had the page fault on ?

Tracker · Dec 10, 2022

_martin said:
Rebooted while it was compiling? If so, then again, this could be faulty RAM. Creating a memtest USB stick takes a few minutes, and you can verify that the modules are OK.

Can you share what is the virtual address you had the page fault on ?

Yes, it rebooted while compiling. I tried to compile some other game just to see if it was specific to drm-510-kmod, and the same thing happened. Rebooted again! Seems like it could be RAM (which might also make sense because my Chrome instance was a memory hog). Thanks

I'll try to use a Ubuntu stick I have for memtest, not sure how to do it with freebsd.

Regarding virtual address, this was the output

 

Code:

iwn0: iwn_read_firmware: ucode rev=0x89dd8401
wlan0: link state changed to UP
WARNING !drm_modeset_is_locked(&crtc->mutex) failed at /wrkdirs/usr/ports/graphics/drm-510-kmod/work/drm-kmod-drm_v5.10.113_8/drivers/gpu/drm/drm_atomic_helper.c:619
WARNING !drm_modeset_is_locked(&crtc->mutex) failed at /wrkdirs/usr/ports/graphics/drm-510-kmod/work/drm-kmod-drm_v5.10.113_8/drivers/gpu/drm/drm_atomic_helper.c:619
WARNING !drm_modeset_is_locked(&dev->mode_config.connection_mutex) failed at /wrkdirs/usr/ports/graphics/drm-510-kmod/work/drm-kmod-drm_v5.10.113_8/drivers/gpu/drm/drm_atomic_helper.c:669
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid 8: apic id = 89
fault virtual address = 8x1d8858fdb85

_martin · Dec 10, 2022

The best way is to boot the memtest itself, OS independent. You can download it (free version is ok) from here.
That virtual address is bogus (expected with kernel panic), not sure if you made some mistakes during manual "copy-paste". But even 0x1d8858fdb85 would be a bogus one.
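
As a rough illustration of why such an address looks wild: on amd64 the kernel lives in the upper canonical half of the address space, so a kernel-mode fault on a large lower-half address is neither a near-NULL dereference nor a plausible kernel pointer. The sketch below assumes 48-bit canonical addressing; classify_addr is a made-up helper, and the 0x10000 "near-null" cutoff is an arbitrary choice for illustration.

```shell
#!/bin/sh
# classify_addr HEX: rough bucket for a 64-bit virtual address given as hex.
# Hypothetical helper; works on the hex string so it does not depend on
# the shell's integer width.
classify_addr() {
    hex=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    hex=${hex#0x}
    # Left-pad to 16 hex digits.
    while [ "${#hex}" -lt 16 ]; do hex="0$hex"; done
    case "$hex" in
        000000000000*) echo "near-null" ;;      # below 0x10000: typical NULL + small offset
        0000[0-7]*)    echo "lower-half" ;;     # canonical, user side of the address space
        ffff[89a-f]*)  echo "upper-half" ;;     # canonical, where kernel addresses live
        *)             echo "non-canonical" ;;  # not a valid 48-bit address at all
    esac
}

classify_addr 0x1d8858fdb85
```

For the address quoted above this lands in the lower-half bucket, which fits the "jumping all over" reading better than a simple NULL-pointer bug.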

Alain De Vos · Dec 10, 2022

Try to comment out the line

Code:

kld_list="/boot/modules/i915kms.ko"

Then

Code:

pkg update -f
pkg install -f gpu-firmware-kmod

Tracker · Dec 10, 2022

_martin said:
The best way is to boot the memtest itself, OS independent. You can download it (free version is ok) from here.
That virtual address is bogus (expected with kernel panic), not sure if you made some mistakes during manual "copy-paste". But even 0x1d8858fdb85 would be a bogus one.

I am running memtest86 v4.10 currently

After 1 pass it hasn't shown any errors so far. Will leave it on for some time to complete and report back.

Yes the address is what you wrote - I made a mistake with copy/paste

Alain De Vos said:
Try to comment out the line

Code:

kld_list="/boot/modules/i915kms.ko"

Then

Code:

pkg update -f
pkg install -f gpu-firmware-kmod

Will try this after the memtest86 cycle is complete. Will report back.

Hope to make some progress

_martin · Dec 10, 2022

OK, let it run through all the tests. I'm assuming you mean one pass of the test sequence completed, not the entire run. It does show a fancy "PASS" text all over the screen once it's done.

The virtual address is totally bogus; it doesn't show signs of a small buffer overrun, etc. As you had a GPF and now a page fault, we can assume it was jumping "all over". It would help to see whether the issue is always in the same function, i.e. a stack trace. Maybe you could find out how to reduce the size of the picture and post it that way.

It was mentioned here above, you should comment practically everything out from rc.conf to see if it boots up.

But .. you said you are having troubles compiling stuff. That usually is what I mentioned above - HW issue. That sudden reboot is a sign of triple fault.

elgrande · Dec 10, 2022

Can you boot a fresh 13.1-RELEASE stick into multi user mode?
That would be an enlightening test.

Tracker · Dec 10, 2022

_martin said:
OK, let it run through all the tests. I'm assuming you mean one pass of the test sequence completed, not the entire run. It does show a fancy "PASS" text all over the screen once it's done.

The virtual address is totally bogus; it doesn't show signs of a small buffer overrun, etc. As you had a GPF and now a page fault, we can assume it was jumping "all over". It would help to see whether the issue is always in the same function, i.e. a stack trace. Maybe you could find out how to reduce the size of the picture and post it that way.

It was mentioned here above, you should comment practically everything out from rc.conf to see if it boots up.

But .. you said you are having troubles compiling stuff. That usually is what I mentioned above - HW issue. That sudden reboot is a sign of triple fault.

memtest86 has been running for a couple of hours now, 3 passes done. No errors so far. Should I let it continue running? See image below.

Regarding rc.conf (after this memtest86 is done) - what about trying to put in the default rc.conf instead - which I read online is at /boot/defaults/loader.conf ?

For stacktrace please see last image

Tracker · Dec 10, 2022

elgrande said:
Can you boot a fresh 13.1-RELEASE stick into multi user mode?
That would be an enlightening test.

I have a 12.x stick available - would that be fine? Right now just running memtest86 since 3-4 hrs now. Not sure how much longer I should let it run.

elgrande · Dec 10, 2022

Tracker said:
I have a 12.x stick available - would that be fine? Right now just running memtest86 since 3-4 hrs now. Not sure how much longer I should let it run.

I guess 12.x is a good first test.
If it boots fine this indicates it might be a borked installation and not a hardware error.
If it boots fine I'd create a 13.1 stick to double check.
Memtest afaik can run reaaaaally long like 24h long, imho you can do the boot test first, but up to you

_martin · Dec 10, 2022

Tracker said:
I have a 12.x stick available - would that be fine? Right now just running memtest86 since 3-4 hrs now

Memory modules are OK, memtest does pass as shown in the picture.

Observe the crashes and check whether it always fails at the same place, intel_bw_cal_min_cdclk+0x89. You could open a PR (bug report) for this too.
Testing with other releases is not a bad idea, but if this started happening only a few days ago then something else is still going on.
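
One way to check for "always the same place" is to save the text of each panic and count the function+offset tokens across them. This is only a sketch: top_fault_sites is a made-up helper, the file names are hypothetical, and the sample lines are placeholders (the first token is spelled as it appears in this thread, the second is invented for contrast).

```shell
#!/bin/sh
# top_fault_sites FILE...: count "function+0xoffset" tokens across saved panic texts,
# most frequent first. Hypothetical helper; it counts every such token in a file,
# not only the faulting frame.
top_fault_sites() {
    grep -ohE '[A-Za-z_][A-Za-z0-9_]*\+0x[0-9a-f]+' "$@" | sort | uniq -c | sort -rn
}

# Demo with placeholder data, one file per crash.
dir=$(mktemp -d)
printf '%s\n' 'intel_bw_cal_min_cdclk+0x89' > "$dir/panic1.txt"
printf '%s\n' 'intel_bw_cal_min_cdclk+0x89' > "$dir/panic2.txt"
printf '%s\n' 'some_other_function+0x10'    > "$dir/panic3.txt"
top_fault_sites "$dir"/panic*.txt
```

If one location dominates the list, that points at a specific code path; a scattered list would fit flaky hardware better.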

Tracker said:
Regarding rc.conf (after this memtest86 is done) - what about trying to put in the default rc.conf instead

On FreeBSD 13/ZFS I think you'd be ok even with empty rc.conf (zfs_enable="YES" is not needed for boot to succeed I think). This way you can test the boot without the intel driver.
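
A minimal sketch of that test, assuming you are working on the mounted root from single-user mode. The helper names stash_rc_conf and restore_rc_conf are made up for illustration, and the demo runs on a scratch file; on the real system the argument would be /etc/rc.conf.

```shell
#!/bin/sh
# stash_rc_conf FILE: keep FILE as FILE.backup and leave an empty FILE in place.
stash_rc_conf() {
    rc=$1
    [ -f "$rc" ] || { echo "no such file: $rc" >&2; return 1; }
    cp -p "$rc" "$rc.backup" && : > "$rc"
}

# restore_rc_conf FILE: put the backup back once the boot test is done.
restore_rc_conf() {
    rc=$1
    [ -f "$rc.backup" ] && cp -p "$rc.backup" "$rc"
}

# Demo on a scratch file instead of the live /etc/rc.conf.
demo=$(mktemp)
printf '%s\n' 'kld_list="i915kms"' 'zfs_enable="YES"' > "$demo"
stash_rc_conf "$demo"
wc -c < "$demo"          # byte count of the now-empty file
restore_rc_conf "$demo"
```

Keeping the backup next to the original means the old configuration can be restored with a single copy if the empty file does not help.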

Tracker · Dec 10, 2022

4:27 hrs and running this memtest86 :/
Pass 3 .... 85% complete apparently says line at the top.... Still no errors though.... Was hoping to find something

Tracker · Dec 10, 2022

elgrande said:
I guess 12.x is a good first test.
If it boots fine this indicates it might be a borked installation and not a hardware error.
If it boots fine I'd create a 13.1 stick to double check.

I'll try with the current one 12.x iirc. Last time I tried it took me to the installation screen then I turned it off not sure if there was a multiuser option there - will check again.

_martin said:
On FreeBSD 13/ZFS I think you'd be ok even with empty rc.conf (zfs_enable="YES" is not needed for boot to succeed I think). This way you can test the boot without the intel driver

Ok. I'll try backing up the current rc.conf as rc.conf.backup and using an empty rc.conf (or one with only the zfs enable option).

Will report back. This bug report process might be too lengthy, and I'm not sure I will be able to do it anytime soon. I have important work/data on the machine that I need to get running. Seems like I might have to buy another machine :/

elgrande · Dec 10, 2022

Before buying a new machine I would at least try a reinstall of 13.1 - you never know.
