There are some interesting thoughts in this blog post:
https://blog.tyk.nu/blog/fun-with-freebsd-listen-queue-overflow/
The author walks through tracking down which process is generating kernel messages like this:
Code:
sonewconn: pcb 0xfffff80036a761d0: Listen queue overflow: 76 already in queue awaiting acceptance (30 occurrences)
sonewconn: pcb 0xfffff80036a761d0: Listen queue overflow: 76 already in queue awaiting acceptance (10 occurrences)
sonewconn: pcb 0xfffff80339c7f910: Listen queue overflow: 151 already in queue awaiting acceptance (28 occurrences)
sonewconn: pcb 0xfffff80339c7f910: Listen queue overflow: 151 already in queue awaiting acceptance (172 occurrences)
sonewconn: pcb 0xfffff80036a761d0: Listen queue overflow: 76 already in queue awaiting acceptance (6 occurrences)
sonewconn: pcb 0xfffff80339c7f910: Listen queue overflow: 151 already in queue awaiting acceptance (49 occurrences)
sonewconn: pcb 0xfffff80339c7f910: Listen queue overflow: 151 already in queue awaiting acceptance (77 occurrences)
The "TL;DR" version of the blog post is that if you only notice this after the process generating the errors has exited, you're basically out of luck:
This would have been a lot easier if FreeBSD logged the pid or socket info along with the error when it happens. I was fortunate that this was a long-running process so the pcb stayed the same. If this had been a short-lived process it would have been considerably more difficult to find it. Process accounting combined with logging the pid with the error would be preferable. Alternatively one could jimmy up something to keep an eye on /var/log/messages and run netstat to find the pcb immediately after the error happens.
Any thoughts on a good way to deal with this situation? Both places where I run into this involve short-running processes, so it's a mystery who the culprit is.
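
For reference, here is a rough Python sketch of the "watch /var/log/messages and run netstat" idea from the quote above. It is not something from the blog post. The function names (parse_overflow, sockets_for_pcb, follow, watch) are made up for illustration. It assumes the messages land in /var/log/messages in the format shown above, and that the pcb address appears (without the 0x prefix) in `netstat -Aan` output, as it did for the blog author. A very short-lived process could still exit before netstat runs, so this narrows the window rather than closing it.

```python
import re
import subprocess
import time

# Matches the kernel message format shown in the log excerpt above.
OVERFLOW_RE = re.compile(
    r"sonewconn: pcb 0x([0-9a-f]+): Listen queue overflow: "
    r"(\d+) already in queue awaiting acceptance"
    r"(?: \((\d+) occurrences\))?"
)


def parse_overflow(line):
    """Return (pcb_hex, queued, occurrences), or None if the line does not match."""
    m = OVERFLOW_RE.search(line)
    if m is None:
        return None
    pcb, queued, occurrences = m.groups()
    # A message without an "(N occurrences)" suffix is counted as one occurrence.
    return pcb, int(queued), (int(occurrences) if occurrences else 1)


def sockets_for_pcb(pcb, netstat_output):
    """Return the lines of netstat output that mention the given pcb address."""
    return [l for l in netstat_output.splitlines() if pcb in l]


def follow(path, poll_seconds=0.5):
    """Yield lines appended to path, like a simplified tail -f (no log rotation handling)."""
    with open(path, "r", errors="replace") as f:
        f.seek(0, 2)  # start at the end of the file
        while True:
            line = f.readline()
            if line:
                yield line
            else:
                time.sleep(poll_seconds)


def watch(path="/var/log/messages"):
    """On each overflow message, run netstat right away and print matching sockets."""
    for line in follow(path):
        hit = parse_overflow(line)
        if hit is None:
            continue
        pcb = hit[0]
        # Assumption: `netstat -Aan` lists the pcb address in its first column.
        out = subprocess.run(
            ["netstat", "-Aan"], capture_output=True, text=True
        ).stdout
        for match in sockets_for_pcb(pcb, out):
            print(match)
```

Calling watch() in a long-running session would print the matching netstat line, which gives the local address and port of the overflowing listener. Mapping that to a process is a separate step this sketch leaves out.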