Shutdown / reboot problem - "NMI ISA 30, EISA FF" & "NMI ... going to debugger"

Hey guys

Short edit: I've read in the rules that problems with derivatives should be posted in their respective forums (which I did). Since this seems to be a problem with FreeBSD itself, I hope I'm allowed to post it here as well.
Thank you!

I'm new to FreeBSD / FreeNAS. I already posted my problem as a FreeNAS bug but it's probably a FreeBSD problem so someone here might be able to help me.

First, my system specs, which might be useful:
  • ASRock Rack C236 WSI
  • Intel i3-6100
  • 2x Crucial CT16G4WFD8213 (16GB DDR4-2133, ECC, unbuffered)
  • 5x WD Red 4000 GB in RaidZ-2
  • Mach Xtreme Technology ES SLC Pen Drive (32GB) as boot drive
  • OS FreeNAS 9.10 (stable)


My problem occurs during reboot or shutdown:
sometimes, while shutting down local daemons / syncing disks, instead of:

Code:
Syncing disks, vnodes remaining... 0 0 0 0 0

the following shows

Code:
Syncing disks, vnodes remaining... 0 NMI ISA 30, EISA FF" & "NMI ... going to debugger

After that, the output goes crazy with a

Code:
Tracing command kernel pid 0 tid XYZ

where XYZ is an increasing integer (pictures of both screens attached).

At this point it just hangs there until I manually power off the PC.

To be honest, I have no idea what the problem could be ... I was not able to figure out a pattern for when it occurs ... sometimes the system starts and shuts down/reboots without a problem, sometimes not ...

I tried some things already (keep in mind I work mostly with the FreeNAS WebGUI, although I know how to use the shell in case more information is needed):
  • I disabled all jails
  • I disabled all plugins
  • I removed all static routes
  • I changed several BIOS options (hyper threading, different power states, dedicated memory for graphics)
  • I switched my LAN connections on the board (Intel i210 and i219)
  • I changed FreeNAS trains (from 9.10 stable to nightlies and back)
  • I checked RAM with memtest86 for errors (didn't find any)
  • I scrubbed the boot drive and the installed drives

Maybe one more note: as long as I don't shut it down, the system seems to run fine.

Anyway, I've only found one similar post so far (on a pfSense forum https://forum.pfsense.org/index.php?PHPSESSID=ddve1nin6gal13t04rdupsed90&topic=102216.15). I asked the originator whether the problem was ever solved but haven't gotten an answer yet.


Sooo...any help at all would be really appreciated!

Thanks and cheers, Silvan

PS: link to my bug report and FreeNAS forum entry:
https://bugs.freenas.org/issues/15089
https://forums.freenas.org/index.php?threads/shutdown-reboot-problem-nmi-going-to-debugger.43135/
 

Attachments

  • first.jpg
  • second.jpg
Well, I think I finally found something:
the watchdogd(8) service seems to be giving me a hard time! When I stop watchdogd(8) with the following command:
# service watchdogd stop
the NMI debugger sometimes starts ... so I guess that's the problem.
So 2 questions now:

1. Do I need watchdog?
2. How can I disable it if not needed?

Thanks for the help and best regards
 
Well, I can confirm that after 3 days of running, rebooting and shutting down, no further problems occurred as long as watchdogd(8) was disabled.

I'd really appreciate it if anyone could tell me whether problems could occur if I disable watchdogd(8). As far as I understand it, watchdog tries to ensure the continuous readiness of the 'server' ... but is there a catch if it's not running?

thanks and best regards
 
OK, so the machine has been running straight since the 16th of May without any issues.

Still, I wonder whether a problem could occur at some point if watchdog is not running, so if anyone knows a bit more about that, I'd appreciate it very much.

Thanks and best regards
 
Code:
The watchdogd utility interfaces with the kernel's watchdog facility to
     ensure that the system is in a working state.  If watchdogd is unable to
     interface with the kernel over a specific timeout, the kernel will take
     actions to assist in debugging or restarting the computer.
That's from watchdogd(8), so it seems it tries to detect when the system/hardware fails and then properly shuts down or reboots the system. I guess disabling it causes no issues at all, but if it was complaining about something before, a problem might occur soon, and I don't know how the system would react with watchdog disabled.
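As for how to disable it: here is a rough sketch of what I would try on a stock FreeBSD install, not something I have tested on FreeNAS. It stops the daemon and sets watchdogd_enable to NO in /etc/rc.conf via sysrc(8). The function names and the DRY_RUN switch are made up for this example; DRY_RUN=1 only prints the commands. Two caveats: going by the posts above, stopping watchdogd is exactly what sometimes triggered the NMI on this board, so one last hang would not surprise me. And I am assuming FreeNAS may manage rc.conf itself, in which case the setting would have to be made through its own configuration instead.

```shell
#!/bin/sh
# Sketch for stock FreeBSD: stop watchdogd now and keep it from starting at boot.
# DRY_RUN is a hypothetical switch added for this example: when set to 1,
# commands are printed instead of executed.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

disable_watchdogd() {
    # Stop the running daemon; "onestop" also works when watchdogd_enable is not YES.
    run service watchdogd onestop
    # Persist the setting in /etc/rc.conf so the daemon is not started at next boot.
    run sysrc watchdogd_enable=NO
}
```

Running `DRY_RUN=1 disable_watchdogd` first shows what would be executed; running `disable_watchdogd` as root actually does it.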
 
Good to hear that it should not cause problems ...
and since it was watchdog itself that caused the problem during shutdown, I think it's safe to assume that nothing else should be 'defective'.

Thanks for your response man, really appreciated!
 