Hi
I’m having a weird issue where our FreeBSD systems occasionally become unresponsive. I've seen it so far on the following versions:
- 13.3
- 14.2
- 14.3
When the issue occurs, it locks out new ssh sessions, but existing ones may still work for a while. Sometimes the filesystem becomes inaccessible (but not always). Eventually the console locks up and we can no longer access the server at all.
All of the affected servers seem to be our largest, busiest ones, with lots of network activity. The issue feels like some kind of resource starvation, but we're not sure which resource it could be. Eventually the box needs to be hard reset, and the logs are always clean after a reboot. We can't seem to find any errors in the logs, e.g. dmesg or /var/log/messages. It also appears to be occurring more frequently since we upgraded servers to 14.3.
Can anyone provide some ideas on what we can do or where we can look to track this one down? Any suggestions of what the problem might be? We haven't been able to reliably reproduce it anywhere; it just seems to occur randomly.
Thanks for any help anyone can provide.