What caused the reboot?

One of my FreeBSD servers rebooted earlier today. There's nothing in the messages or dmesg.today files in /var/log to indicate why, only the timestamp when it started booting up again. This server is located where people have bumped against it in the past - unfortunate, but really the only place available for the machine. Other than these two files, is there any way to tell why a machine rebooted? (Obviously if somebody's butt pressed against the reset button, there won't be a log entry - I just want to be sure that there's nothing I'm overlooking that might indicate a more serious problem.)
 
Did you check # dmesg -a for a "reboot" message? Those won't show up with plain # dmesg. If you see a message with a username, then that user initiated the reboot; otherwise you should just see a shutdown sequence, which would mean a button press or keyboard reboot. If you don't see any shutdown sequence, that probably means a power outage. You could also check /var/log/messages for sudo calls near the time of the reboot.
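
As a rough sketch, the three cases above can be turned into a small sh helper. The name classify_reboot and its output wording are made up for illustration, and the patterns are only a heuristic: it matches the word "reboot" and the "Waiting (max N seconds) for system process" line, and it judges whatever text it is given, so feed it only the lines just before the boot messages in question.

```shell
#!/bin/sh
# classify_reboot: hypothetical helper, not part of FreeBSD.
# Reads log text on stdin and prints a guess based on the rules of thumb above.
classify_reboot() {
    log=$(cat)
    if printf '%s\n' "$log" | grep -qi 'reboot'; then
        # A "reboot" message (usually naming a user) was logged.
        echo "reboot message found: likely a user-initiated reboot"
    elif printf '%s\n' "$log" | grep -q 'Waiting (max [0-9]* seconds) for system process'; then
        # A clean shutdown sequence, but nobody asked for it by command.
        echo "shutdown sequence only: likely a button press or keyboard reboot"
    else
        # The log goes straight from normal operation to boot messages.
        echo "no shutdown sequence: likely power loss, kernel panic, or hardware fault"
    fi
}

# Example use on the affected machine:
#   dmesg -a | classify_reboot
```

Since # dmesg -a can hold output from more than one boot, an old "reboot" line further up can fool the first check, so treat the result as a hint rather than a verdict.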

Kevin Barry
 
You can do some checks to rule out possible causes. In /etc/rc.conf, enable dumpdev="AUTO" so that a kernel panic leaves a crash dump, and run # kgdb on that dump after a reboot.
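
A minimal sketch of that setup, assuming the stock defaults: dumps are saved under /var/crash, the first one is named vmcore.0, and the running kernel is /boot/kernel/kernel. Adjust the paths if your system differs.

```shell
# /etc/rc.conf fragment (sh-style assignments)
dumpdev="AUTO"    # let the system choose a dump device (normally swap)

# The dump directory defaults to /var/crash; set dumpdir to change it.

# After an unexpected reboot, as root (paths are assumptions, see above):
#   ls /var/crash
#   kgdb /boot/kernel/kernel /var/crash/vmcore.0
```

If /var/crash stays empty after the next unexplained reboot, the kernel most likely did not panic, which points back at power or hardware.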

More options:

* Check power cables.
* Check the capacitors and make sure they're in good condition.
* Run sysutils/memtest or sysutils/memtest86.

Show your # dmesg, # vmstat -i and anything else you consider relevant.
 
Thanks for the tips guys.

I'm the only one with root access to the box, so I know it wasn't somebody executing a command to reboot it. I checked the logs and there's still nothing - goes from running fine to boot messages. It's got a UPS on it, but that doesn't mean something internally couldn't have disrupted the power; given this is the first time, I think I'll leave it and monitor what happens. If there's a piece of hardware starting to fail, it should recur.

Probably somebody's butt rebooted it... really wish there were a better spot for me to keep this server.
 
I fully understand your frustration. This is a problem when your server doesn't have iLO or some other console which can catch these events/messages.

If there's neither a log entry nor a crash dump, it's very probable that the PSU or the power supply to the machine is the issue.

This reminds me of a story about a low-budget firm whose main server always rebooted on Sundays. Nobody knew why. They found out during one maintenance weekend: each time the cleaning lady came into the room, she needed a socket for her vacuum cleaner. As there was no free socket available, she pulled the first plug she got her hands on. And yeah, it was the prod server :)
 
Ruler2112 said:
Probably somebody's butt rebooted it... really wish there were a better spot for me to keep this server.
You should almost certainly have a shutdown sequence in dmesg in that case, e.g. "Waiting (max 60 seconds) for system process ...", right before the boot messages in question. If not, that sounds like a kernel panic or a problem with hardware. Are you overclocking your RAM or anything like that?

Kevin Barry
 