System panic

I'd comment out fuse_load and kld_list and see if it helps.
Since there is no more hal, you can remove hald_enable anyhow.

Edit: It most probably is the graphics driver, so the kld_list line should do it.
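For anyone following along, commenting out a single entry can be scripted roughly as below. This is only a sketch: comment_out is a hypothetical helper (not a FreeBSD tool), and it assumes the entries sit one per line in an rc.conf-style file. Run it against a copy first.

```shell
#!/bin/sh
# comment_out FILE KEY
# Hypothetical helper: prefix every line that assigns KEY in an
# rc.conf-style FILE with '#'. Lines that are already commented out
# do not match and are left untouched.
comment_out() {
    file=$1
    key=$2
    tmp=$(mktemp) || return 1
    # Keep any leading whitespace, then insert '#' before KEY=.
    sed "s/^\([[:space:]]*\)\(${key}=\)/\1#\2/" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example (not executed here):
#   cp /etc/rc.conf /etc/rc.conf.bak
#   comment_out /etc/rc.conf kld_list
```

Writing through a temporary file avoids relying on sed -i, whose syntax differs between BSD and GNU sed.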
 
Thanks. Tried commenting out all 3. Still crashing. Maybe I should try to reinstall drm-kmod module with these still commented out.

Update: uncommented and re-commented fuse_load, to no effect.
 
Hi

I would suggest replacing kld_list="/boot/modules/i915kms.ko" with kld_list="i915kms"

I think this is the usual way of loading kernel modules.
 
Tried installing drm-kmod via pkg - didn't work. Restarting leads to a crash.

Tried only kld_list="i915kms" - didn't work. Restarting leads to a crash.

What next?

PS: FWIW this is quite an old machine that has been working fine with FreeBSD for a couple of years. I suspect the Chrome/machine crashing issue is due to some recent update, but I can't say for sure and can't figure out how. :(((
 
Just to be sure:
Code:
i915kms_load="YES"
is still commented out in loader.conf?
It's always been commented out.

In the dmesg | less output after the crash I'm still seeing something to do with !drm_modeset_is_locked..., the error I was referring to earlier.


Hi again

What version of FreeBSD are you using? And what graphics (or in this case CPU) are you using?
I'm a little unsure how to answer this. uname -a reports 13.1-RELEASE-p3 after I chroot in single user mode and run:
zfs set readonly=off latest-zfs-snapshot-i-see-with-zfs-list

I vaguely remember it being upgraded to the p4 release. I've tried upgrading from single user mode as well, but it still seems to show 13.1-RELEASE-p3. Will try again.

Regarding graphics, it's an Intel CPU with no separate graphics card, I think. It's been working flawlessly for a couple of years apart from minor hiccups.

Also, the BE is set to the latest snapshot, which was taken when I tried to perform the upgrade in single user mode. Maybe I should set that back? But won't that mean newer data is not shown/recovered?
 
OK, weird. After installing drm-510-kmod through pkg, I thought I should try compiling it from ports (/usr/ports/graphics/drm-510-kmod).

I ran make install clean and the system rebooted again, without printing any errors or warnings!

I strongly suspect this pkg/port is the culprit. Am I right? I'm still a bit lost about what to do about it.
 
I would make sure you are using a supported FreeBSD version. If not, that could be your problem. But you are on the latest release ;)

As far as I know, if your old snapshot boots, your data is still there and you should be able to find it. But as I am on UFS I won't claim to be a ZFS guru.
 
This is definitely not something exotic. I have always upgraded through the standard freebsd-update fetch/install.

Yes, it seems to be booting into the p3 release snapshot, even though I see a couple more snapshots below it after trying to update/upgrade. I tried setting the BE to the latest snapshot versions. Maybe I should set the BE to the older versions? (I'm just not sure whether the latest data will be lost.)
 
Changing BE to a month old BE also DOESN'T work and still crashes.

Does anyone else think it's the drm-510-kmod issue only?
 
I tried to do make install clean and the system rebooted again! Instead of throwing any errors or warning.
Rebooted while it was compiling? If so, again, this could be faulty RAM. Creating a memtest USB stick takes a few minutes, and then you can verify that the modules are OK.

Can you share the virtual address you had the page fault on?
 
Yes, it rebooted while compiling. I tried compiling some other game just to see whether it was specific to drm-510-kmod, and the same thing happened. Rebooted again! Seems like it could be RAM (which might also make sense because my Chrome instance was a memory hog). Thanks 👍

I'll try to use an Ubuntu stick I have for memtest; not sure how to do it with FreeBSD.

Regarding virtual address, this was the output

iwn0: iwn_read_firmware: ucode rev=0x89dd8401
wlan0: link state changed to UP
WARNING !drm_modeset_is_locked(&crtc->mutex) failed at /wrkdirs/usr/ports/graphics/drm-510-kmod/work/drm-kmod-drm_v5.10.113_8/drivers/gpu/drm/drm_atomic_helper.c:619
WARNING !drm_modeset_is_locked(&crtc->mutex) failed at /wrkdirs/usr/ports/graphics/drm-510-kmod/work/drm-kmod-drm_v5.10.113_8/drivers/gpu/drm/drm_atomic_helper.c:619
WARNING !drm_modeset_is_locked(&dev->mode_config.connection_mutex) failed at /wrkdirs/usr/ports/graphics/drm-510-kmod/work/drm-kmod-drm_v5.10.113_8/drivers/gpu/drm/drm_atomic_helper.c:669
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid 8: apic id = 89
fault virtual address = 0x1d8858fdb85
 
The best way is to boot the memtest itself, OS independent. You can download it (free version is ok) from here.
That virtual address is bogus (expected with kernel panic), not sure if you made some mistakes during manual "copy-paste". But even 0x1d8858fdb85 would be a bogus one.
 
I am running memtest86 v4.10 currently

After 1 pass it hasn't shown any errors so far. Will leave it on for some time to complete and report back.

Yes the address is what you wrote - I made a mistake with copy/paste


Try to comment out the line
Code:
kld_list="/boot/modules/i915kms.ko"
Then
Code:
pkg update -f
pkg install -f gpu-firmware-kmod
Will try this after the memtest86 cycle is complete. Will report back.

Hope to make some progress🤞
 
OK, let it run through all the tests. I'm assuming you mean one test within the pass has finished, not the entire run. It does show a fancy "PASS" text all over the screen once it's done.

The virtual address is totally bogus; it doesn't show signs of a small buffer overrun or similar. As you had a GPF and now a page fault, we can assume it was jumping "all over". It would help to see whether the issue is always in the same function, i.e. the stack trace. Maybe you could find out how to reduce the size of the picture and post it that way.

As was mentioned above, you should comment practically everything out of rc.conf to see if it boots up.

But... you said you are having trouble compiling stuff. That usually points to what I mentioned above: a HW issue. A sudden reboot like that is a sign of a triple fault.
 
memtest86 has been running for a couple of hours now, 3 passes done. No errors so far. Should I let it continue running? See image below.

Regarding rc.conf (after this memtest86 is done) - what about trying to put in the default rc.conf instead - which I read online is at /boot/defaults/loader.conf ?

For stacktrace please see last image
 

Attachments: 1670685897454.jpg, 1670686865648.jpg, 1670687096524.jpg
Can you boot a fresh 13.1-RELEASE stick into multi user mode?
That would be an enlightening test.
I have a 12.x stick available - would that be fine? Right now just running memtest86 since 3-4 hrs now. Not sure how much longer I should let it run.
 
I guess 12.x is a good first test.
If it boots fine this indicates it might be a borked installation and not a hardware error.
If it boots fine I‘d create a 13.1 stick to double check.
Memtest afaik can run reaaaaally long like 24h long, imho you can do the boot test first, but up to you ;)
 
I have a 12.x stick available - would that be fine? Right now just running memtest86 since 3-4 hrs now
Memory modules are OK, memtest does pass as shown in the picture.

Observe the crashes and check whether it always fails at the same place, intel_bw_calc_min_cdclk+0x89. You could open a PR (bug report) for this too.
Testing with other releases is not a bad idea, but if this started happening only a few days ago then something else is still going on.
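One rough way to compare crashes, if the panic text from each one is saved to a plain-text file: tally the function+0xoffset frames across the files. This is a sketch only. count_frames is a hypothetical helper, and it assumes the usual backtrace line shape "name() at name+0xNN/frame 0x...". Where the files come from (for example core.txt.N under /var/crash when crash dumps are enabled) is also an assumption.

```shell
#!/bin/sh
# count_frames FILE...
# Hypothetical helper: pull every "function+0xoffset" token that follows
# " at " in the given panic/backtrace text files and count how often each
# one occurs, most frequent first.
count_frames() {
    sed -n 's/.* at \([A-Za-z_][A-Za-z0-9_]*+0x[0-9a-f]*\).*/\1/p' "$@" |
        sort | uniq -c | sort -rn
}

# Example (not executed here):
#   count_frames /var/crash/core.txt.*
```

Frames that show up in every panic regardless of cause (the generic trap and backtrace functions) will be counted too, so the interesting entries are the driver frames that keep repeating.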

Regarding rc.conf (after this memtest86 is done) - what about trying to put in the default rc.conf instead
On FreeBSD 13/ZFS I think you'd be ok even with empty rc.conf (zfs_enable="YES" is not needed for boot to succeed I think). This way you can test the boot without the intel driver.
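A minimal sketch of that empty-rc.conf test, so the original file can be put back afterwards. stash_config and restore_config are hypothetical helper names, and the .backup suffix is just a convention chosen here.

```shell
#!/bin/sh
# stash_config FILE
# Hypothetical helper: keep a copy of FILE as FILE.backup, then empty FILE.
# Refuses to run if a backup already exists, so an earlier copy is not lost.
stash_config() {
    conf=$1
    if [ -e "$conf.backup" ]; then
        echo "refusing: $conf.backup already exists" >&2
        return 1
    fi
    cp -p "$conf" "$conf.backup" && : > "$conf"
}

# restore_config FILE
# Hypothetical helper: move FILE.backup back over FILE.
restore_config() {
    conf=$1
    if [ ! -f "$conf.backup" ]; then
        echo "no backup found for $conf" >&2
        return 1
    fi
    mv "$conf.backup" "$conf"
}

# Example (not executed here):
#   stash_config /etc/rc.conf      # reboot and test with an empty rc.conf
#   restore_config /etc/rc.conf    # put the original settings back
```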
 
4:27 hrs and running this memtest86 :/
Pass 3 .... 85% complete apparently says line at the top.... Still no errors though.... Was hoping to find something
 
I guess 12.x is a good first test.
If it boots fine this indicates it might be a borked installation and not a hardware error.
If it boots fine I‘d create a 13.1 stick to double check.
I'll try with the current one 12.x iirc. Last time I tried it took me to the installation screen then I turned it off not sure if there was a multiuser option there - will check again.
On FreeBSD 13/ZFS I think you'd be ok even with empty rc.conf (zfs_enable="YES" is not needed for boot to succeed I think). This way you can test the boot without the intel driver
OK. I'll try backing up the current rc.conf as rc.conf.backup and using an empty rc.conf (or one with just the zfs enable option).

Will report back. The bug report process might be too lengthy; I'm not sure I'll be able to do it anytime soon, as I have important work/data on the machine that I need to get running. Seems like I might have to buy another machine :/
 