Hello everyone, I have a server running FreeBSD 13.0 that randomly freezes after running for anywhere from a few days to a month.
Here are the symptoms:
1. I cannot connect over ssh. The ssh command gets no response at all.
2. Many services become unreachable. A simple service such as a static nginx page can still be opened for a short time, but after a few refreshes the page hangs with no response. Some other services behave the same way.
3. Another server keeps a permanently logged-in ssh session to the FreeBSD server with top running in it. When the FreeBSD server freezes, that session still works and top keeps refreshing and showing system status: memory, CPU usage, ZFS ARC, swap, and the clock all look normal. Every top hotkey still works, but if I press q to quit top and type another command, such as "systat -ifstat", the command hangs with no output, and Ctrl+Z and Ctrl+C get no response.
4. Ping to the server always works.
5. The redis-server on the FreeBSD server is unaffected; it keeps responding quickly.
6. I cannot log in from the console. After I type the username and password and press Enter, there is no output.
Environment:
FreeBSD 13.0, Intel Xeon (4 cores), 16GB memory, two 2T disks in a ZFS mirror, root on ZFS. It's a new machine, bought less than half a year ago.
The host system only runs sshguard + ipfw. It mounts an NFS share and passes it into a jail with nullfs. The jail's file system is a clone of a ZFS dataset, and all services run in this jail.
The server has two bge network interfaces, one for LAN and one for WAN, and the services are network-heavy.
The jail runs nginx, php-fpm, the PHP CLI server, mysql, and redis-server. PHP does a lot of NFS reads and writes.
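To make the mount layout concrete, the host fstab entries might look roughly like this. It is only a sketch: the NFS server name, export path, and jail path are hypothetical placeholders, not the real ones.

```
# /etc/fstab on the host (all names and paths below are hypothetical)
# NFS share mounted on the host
nfsserver:/export/data   /mnt/nfs             nfs     rw        0  0
# The same directory passed into the jail with nullfs;
# "late" makes it mount after the NFS share is available
/mnt/nfs                 /jails/web/mnt/nfs   nullfs  rw,late   0  0
```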
What I have tried:
At first I suspected a ZFS ARC problem, so I set the ARC maximum to 2G, but ARC looks completely normal in top.
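For reference, an ARC cap like this is typically set as a loader tunable. The fragment below is a minimal sketch that assumes the limit goes in /boot/loader.conf; the value is 2 GiB expressed in bytes.

```
# /boot/loader.conf (assumed location of the setting)
# Cap the ZFS ARC at 2 GiB (2 * 1024^3 bytes)
vfs.zfs.arc_max="2147483648"
```

On FreeBSD 13 the same limit should also be visible at runtime as the sysctl vfs.zfs.arc.max.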
When I look at dmesg or any system or service log, every log stops recording at the moment the system freezes, so no abnormal entries are logged. Services that do not need to read or write files seem to keep working normally.
I have no way to probe the system while it is frozen, because no new login can be created and no new command can be executed.
What should I do now?