I'm running into an issue where FreeBSD crashes about once a week. The system runs as a KVM guest with 6 passed-through hard drives (zroot mirror, one ZFS raidz1 pool with a cache drive), a passed-through Realtek gigabit LAN card, and 64 GB of RAM. The host uses an AMD Ryzen 7 2800 8-core, 16-thread CPU, with 12 threads allocated to this VM.
The crashes seem almost random. The only thing that has reduced them is unloading the re_ko_mod driver for the Ethernet card. Each crash also seems to point to a different cause:
Code:
Feb 9 07:51:41 fileserver kernel:
Feb 9 07:51:41 fileserver syslogd: last message repeated 1 times
Feb 9 07:51:41 fileserver kernel: Fatal trap 12: page fault while in kernel mode
Feb 9 07:51:41 fileserver kernel: cpuid = 10; apic id = 0a
Feb 9 07:51:41 fileserver kernel: fault virtual address = 0x7b21b
Feb 9 07:51:41 fileserver kernel: fault code = supervisor read data, page not present
Feb 9 07:51:41 fileserver kernel: instruction pointer = 0x20:0xffffffff8220c1e6
Feb 9 07:51:41 fileserver kernel: stack pointer = 0x0:0xfffffe019d5f3cf0
Feb 9 07:51:41 fileserver kernel: frame pointer = 0x0:0xfffffe019d5f3d50
Feb 9 07:51:41 fileserver kernel: code segment = base 0x0, limit 0xfffff, type 0x1b
Feb 9 07:51:41 fileserver kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
Feb 9 07:51:41 fileserver kernel: processor eflags = interrupt enabled, resume, IOPL = 0
Feb 9 07:51:41 fileserver kernel: current process = 5 (dp_sync_taskq_1)
Feb 9 07:51:41 fileserver kernel: trap number = 12
Feb 9 07:51:41 fileserver kernel: panic: page fault
Feb 9 07:51:41 fileserver kernel: cpuid = 10
Feb 9 07:51:41 fileserver kernel: time = 1675936880
Feb 9 07:51:41 fileserver kernel: KDB: stack backtrace:
Feb 9 07:51:41 fileserver kernel: #0 0xffffffff80c694c5 at kdb_backtrace+0x65
Feb 9 07:51:41 fileserver kernel: #1 0xffffffff80c1bb7f at vpanic+0x17f
Feb 9 07:51:41 fileserver kernel: #2 0xffffffff80c1b9f3 at panic+0x43
Feb 9 07:51:41 fileserver kernel: #3 0xffffffff810afdf5 at trap_fatal+0x385
Feb 9 07:51:41 fileserver kernel: #4 0xffffffff810afe4f at trap_pfault+0x4f
Feb 9 07:51:41 fileserver kernel: #5 0xffffffff810875d8 at calltrap+0x8
Feb 9 07:51:41 fileserver kernel: #6 0xffffffff82224770 at dnode_sync+0x110
Feb 9 07:51:41 fileserver kernel: #7 0xffffffff8220b5c9 at sync_dnodes_task+0x89
Feb 9 07:51:41 fileserver kernel: #8 0xffffffff821a29ef at taskq_run+0x1f
Feb 9 07:51:41 fileserver kernel: #9 0xffffffff80c7daa1 at taskqueue_run_locked+0x181
Feb 9 07:51:41 fileserver kernel: #10 0xffffffff80c7edb2 at taskqueue_thread_loop+0xc2
Feb 9 07:51:41 fileserver kernel: #11 0xffffffff80bd8abe at fork_exit+0x7e
Feb 9 07:51:41 fileserver kernel: #12 0xffffffff8108864e at fork_trampoline+0xe
Feb 9 07:51:41 fileserver kernel: Uptime: 14h23m50s
Feb 9 07:51:41 fileserver kernel: (ada2:ahcich2:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
Feb 9 07:51:41 fileserver kernel: (ada2:ahcich2:0:0:0): CAM status: Command timeout
Feb 9 07:51:41 fileserver kernel: (ada2:ahcich2:0:0:0): Error 5, Retries exhausted
Feb 9 07:51:41 fileserver kernel: (ada2:ahcich2:0:0:0): Synchronize cache failed
Feb 9 07:51:41 fileserver kernel: (ada5:ahcich6:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
Feb 9 07:51:41 fileserver kernel: (ada5:ahcich6:0:0:0): CAM status: Command timeout
Feb 9 07:51:41 fileserver kernel: (ada5:ahcich6:0:0:0): Error 5, Retries exhausted
Feb 9 07:51:41 fileserver kernel: (ada5:ahcich6:0:0:0): Synchronize cache failed
Feb 9 07:51:41 fileserver kernel: Automatic reboot in 15 seconds - press a key on the console to abort
Feb 9 07:51:41 fileserver kernel: Rebooting...
Code:
Feb 1 07:40:55 fileserver kernel: panic: bad pte va 8004d2000 pte 1a0527404
Feb 1 07:40:55 fileserver kernel: cpuid = 11
Feb 1 07:40:55 fileserver kernel: time = 1675214102
Feb 1 07:40:55 fileserver kernel: KDB: stack backtrace:
Feb 1 07:40:55 fileserver kernel: #0 0xffffffff80c694a5 at kdb_backtrace+0x65
Feb 1 07:40:55 fileserver kernel: #1 0xffffffff80c1bb5f at vpanic+0x17f
Feb 1 07:40:55 fileserver kernel: #2 0xffffffff80c1b9d3 at panic+0x43
Feb 1 07:40:55 fileserver kernel: #3 0xffffffff810a0a6f at pmap_remove_pages+0x92f
Feb 1 07:40:55 fileserver kernel: #4 0xffffffff80bd0523 at exec_new_vmspace+0x223
Feb 1 07:40:55 fileserver kernel: #5 0xffffffff80ba2d46 at exec_elf64_imgact+0xb16
Feb 1 07:40:55 fileserver kernel: #6 0xffffffff80bcee2d at kern_execve+0x77d
Feb 1 07:40:55 fileserver kernel: #7 0xffffffff80bce35a at sys_execve+0x5a
Feb 1 07:40:55 fileserver kernel: #8 0xffffffff810b06ec at amd64_syscall+0x10c
Feb 1 07:40:55 fileserver kernel: #9 0xffffffff81087ecb at fast_syscall_common+0xf8
Feb 1 07:40:55 fileserver kernel: Uptime: 5d8h19m50s
Feb 1 07:40:55 fileserver kernel: Automatic reboot in 15 seconds - press a key on the console to abort
Feb 1 07:40:55 fileserver kernel: Rebooting...
Code:
Jan 22 06:04:28 fileserver kernel: Fatal trap 12: page fault while in kernel mode
Jan 22 06:04:28 fileserver kernel: cpuid = 9; apic id = 09
Jan 22 06:04:28 fileserver kernel: fault virtual address = 0x440
Jan 22 06:04:28 fileserver kernel: fault code = supervisor read data, page not present
Jan 22 06:04:28 fileserver kernel: instruction pointer = 0x20:0xffffffff80c269ce
Jan 22 06:04:28 fileserver kernel: stack pointer = 0x28:0xfffffe0114fa1d20
Jan 22 06:04:28 fileserver kernel: frame pointer = 0x28:0xfffffe0114fa1dc0
Jan 22 06:04:28 fileserver kernel: code segment = base 0x0, limit 0xfffff, type 0x1b
Jan 22 06:04:28 fileserver kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
Jan 22 06:04:28 fileserver kernel: processor eflags = interrupt enabled, resume, IOPL = 0
Jan 22 06:04:28 fileserver kernel: current process = 5 (dbu_evict)
Jan 22 06:04:28 fileserver kernel: trap number = 12
Jan 22 06:04:28 fileserver kernel: panic: page fault
Jan 22 06:04:28 fileserver kernel: cpuid = 9
Jan 22 06:04:28 fileserver kernel: time = 1674381882
Jan 22 06:04:28 fileserver kernel: KDB: stack backtrace:
Jan 22 06:04:28 fileserver kernel: #0 0xffffffff80c694a5 at kdb_backtrace+0x65
Jan 22 06:04:28 fileserver kernel: #1 0xffffffff80c1bb5f at vpanic+0x17f
Jan 22 06:04:28 fileserver kernel: #2 0xffffffff80c1b9d3 at panic+0x43
Jan 22 06:04:28 fileserver kernel: #3 0xffffffff810afdf5 at trap_fatal+0x385
Jan 22 06:04:28 fileserver kernel: #4 0xffffffff810afe4f at trap_pfault+0x4f
Jan 22 06:04:28 fileserver kernel: #5 0xffffffff810875b8 at calltrap+0x8
Jan 22 06:04:28 fileserver kernel: #6 0xffffffff821f4b56 at dnode_destroy+0x256
Jan 22 06:04:28 fileserver kernel: #7 0xffffffff821f5a32 at dnode_buf_evict_async+0x92
Jan 22 06:04:28 fileserver kernel: #8 0xffffffff80c7da81 at taskqueue_run_locked+0x181
Jan 22 06:04:28 fileserver kernel: #9 0xffffffff80c7ed92 at taskqueue_thread_loop+0xc2
Jan 22 06:04:28 fileserver kernel: #10 0xffffffff80bd8a9e at fork_exit+0x7e
Jan 22 06:04:28 fileserver kernel: #11 0xffffffff8108862e at fork_trampoline+0xe
Jan 22 06:04:28 fileserver kernel: Uptime: 1d12h7m45s
Jan 22 06:04:28 fileserver kernel: Automatic reboot in 15 seconds - press a key on the console to abort
Jan 22 06:04:28 fileserver kernel: Rebooting...
Many of them appear to signal a page fault, but the RAM is fairly new and returns 0 errors when tested. Any help would be appreciated.
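To compare the crashes side by side, something like the following could pull the key fields out of the log. This is only a rough sketch: it assumes the syslog line format shown in the excerpts above, and the function name `summarize_panics` and the list of "generic" frames to skip are my own choices, not anything standard.

```python
import re

# Frames that show up in nearly every panic backtrace and say little about the cause.
# (This list is an assumption based on the three backtraces above.)
GENERIC_FRAMES = {
    "kdb_backtrace", "vpanic", "panic",
    "trap_fatal", "trap_pfault", "calltrap",
}

PANIC_RE = re.compile(r"kernel: panic: (?P<reason>.+)$")
PROC_RE = re.compile(r"kernel: current process\s*=\s*\d+ \((?P<name>[^)]+)\)")
FRAME_RE = re.compile(r"kernel: #\d+ 0x[0-9a-f]+ at (?P<func>\w+)\+0x[0-9a-f]+")
UPTIME_RE = re.compile(r"kernel: Uptime: (?P<uptime>\S+)")


def summarize_panics(lines):
    """Return one dict per panic: reason, process, first non-generic frame, uptime."""
    panics = []
    current = None
    pending_process = None  # "current process" is logged before the "panic:" line

    for line in lines:
        line = line.rstrip()

        m = PROC_RE.search(line)
        if m:
            pending_process = m.group("name")
            continue

        m = PANIC_RE.search(line)
        if m:
            current = {
                "reason": m.group("reason"),
                "process": pending_process,
                "frame": None,
                "uptime": None,
            }
            panics.append(current)
            pending_process = None
            continue

        if current is None:
            continue

        m = FRAME_RE.search(line)
        if m:
            # Record only the first frame that is not part of the generic panic path.
            if current["frame"] is None and m.group("func") not in GENERIC_FRAMES:
                current["frame"] = m.group("func")
            continue

        m = UPTIME_RE.search(line)
        if m:
            current["uptime"] = m.group("uptime")
            current = None  # the Uptime line closes out this panic record

    return panics
```

It can be fed any iterable of log lines, for example an open file handle on /var/log/messages, and returns a list that is easy to print or sort by frame.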