Solved NTP Servers Down?

Is there a known outage with ntp servers (e.g., pool.ntp.org, 129.6.15.28 (time-a-g.nist.gov), 132.163.97.3 (time.nist.gov), tick.mit.edu, tick.usno.navy.mil, et al.)?

Suddenly (although nothing's changed at my end), for the past two weeks, INTERMITTENTLY, I'm seeing a lot of these popping up in /var/log/messages:
ntpd: sendto (192.5.41.40): Network is unreachable

Clues...
1) 192.5.41.40 is the first server listed in /etc/ntp.conf.
2) After reordering the server list in ntp.conf, the "Network is unreachable" message indicates the IP of the server I just moved to the top of ntp.conf.
3) I AM able to SUCCESSFULLY ping any other server by name or IP (e.g., yahoo, google, my own server, etc.) from my server.
4) I CANNOT ping any of the ntp servers, by name or IP address, from the server.
5) I also CANNOT ping any of the ntp servers from my WINDOWS server, which is on a separate network/connection.
6) Stopping/restarting ntpd does NOT seem to help.
7) Rebooting the server does NOT seem to help.
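
Since ping uses ICMP while ntpd talks UDP on port 123, a failed ping does not by itself show that NTP traffic is blocked. The following is a minimal sketch of a direct NTP probe in Python. The function names (`build_request`, `transmit_time`, `probe`) are made up for illustration and are not part of ntpd. It assumes that the "Network is unreachable" text in the log corresponds to the ENETUNREACH error reported by the local system when the packet is sent.

```python
import errno
import socket
import struct

NTP_PORT = 123
NTP_UNIX_DELTA = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def build_request() -> bytes:
    """Return a 48-byte client request: LI=0, version 3, mode 3 (client)."""
    return b"\x1b" + 47 * b"\x00"


def transmit_time(reply: bytes) -> float:
    """Extract the server transmit timestamp (bytes 40..47) as Unix time."""
    if len(reply) < 48:
        raise ValueError("short NTP reply")
    seconds, fraction = struct.unpack("!II", reply[40:48])
    return seconds - NTP_UNIX_DELTA + fraction / 2**32


def probe(host: str, timeout: float = 3.0) -> str:
    """Send one request to host over UDP and describe what happened."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_request(), (host, NTP_PORT))
            reply, _ = sock.recvfrom(512)
        except socket.timeout:
            # The packet left this machine, but nothing came back in time.
            return "no reply (timed out)"
        except OSError as exc:
            if exc.errno == errno.ENETUNREACH:
                # Same condition ntpd logs: the local system had no route.
                return "local error: network is unreachable"
            return f"error: {exc}"
    return f"reply received, server time {transmit_time(reply):.3f}"
```

Calling `probe("0.freebsd.pool.ntp.org")` right after a warning appears would show whether the failure is local (no route at that moment) or whether the request goes out and simply gets no answer.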

Thanks in advance.
 
OP
JLAIP
UPDATE:

After reconfiguring /etc/ntp.conf with a single server, 0.freebsd.pool.ntp.org, and adding:
driftfile /var/lib/ntp/drift
...things appeared to go back to normal (i.e., no "Network is unreachable" warnings). However, a coupla days ago, the warnings returned with a vengeance. Now, they pop up roughly every 15 to 30 minutes, regardless of the currently selected ntp server.

Curiously, I'm also noticing that if I try to ping the "unreachable" ntp server right after the warning pops up, sometimes it's not pingable, but sometimes it IS. Is that just freak timing or has anyone else experienced this?

Finally, every once in a while, right after an "unreachable" warning appears, I get a "/var is out of space" warning, and /var looks something like this (I'm just making the numbers up to demonstrate the point, so don't worry if they don't add up):
/dev/ar0s1d 763470 22342 -23325 0% /var

This is a relatively lightly used server and, normally, /var looks like this:
/dev/ar0s1d 1883470 662242 1070552 38% /var

Every time I get that "out of space" warning and the Avail space goes negative, I go searching through every directory on /var, looking for a large temp file or group of unusual files, but never find anything. So I just reboot the server and it's back to normal... until it happens again, days later (right after a seemingly random "unreachable" warning appears).
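
One way to check whether the missing space belongs to files that no longer show up in any directory is to compare what the filesystem reports as used with what a directory walk can account for. The sketch below does that in Python. The function names are hypothetical, it counts allocated 512-byte blocks as reported by `st_blocks`, and it has to run as root to see everything under /var (unreadable directories are silently skipped).

```python
import os


def used_by_filesystem(mount_point: str) -> int:
    """Bytes the filesystem reports as used (total blocks minus free blocks)."""
    st = os.statvfs(mount_point)
    return (st.f_blocks - st.f_bfree) * st.f_frsize


def used_by_visible_files(mount_point: str) -> int:
    """Bytes allocated to everything reachable by walking mount_point."""
    root = os.lstat(mount_point)
    seen = {root.st_ino}             # inodes already counted (hard links)
    total = root.st_blocks * 512
    for dirpath, dirnames, filenames in os.walk(mount_point):
        keep = []
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue             # vanished or unreadable entry
            if st.st_dev != root.st_dev:
                continue             # belongs to another mounted filesystem
            if name in dirnames:
                keep.append(name)
            if st.st_ino not in seen:
                seen.add(st.st_ino)
                total += st.st_blocks * 512
        dirnames[:] = keep           # do not descend into other filesystems
    return total


def unaccounted_bytes(mount_point: str) -> int:
    """Used space that no visible file or directory explains."""
    return used_by_filesystem(mount_point) - used_by_visible_files(mount_point)
```

Filesystem metadata always leaves some gap, so only a large value from `unaccounted_bytes("/var")` during an "out of space" episode would point toward space held by files that have been deleted but are still open.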

Since my first post, I've had the ISP check our modem and connection, both good and strong (when they were here) and I've replaced the server's NIC. But no change.

All of these issues began around mid-July and've been ongoing since then. Since others here've reported similar problems accessing ntp servers, has anyone else experienced anything like what I'm describing...or have any recommendations?
 
OP
JLAIP
Can’t answer on ntp but your var space issue sounds like what happens if you delete a log file that a long-running process is using thinking you will free up the space - but you won’t be because the file will still exist and be written to; only when you reboot will it clear up.
e.g.
Yes, that was one of the items I found during my research (prior to the post here). I've been restarting the system to clear the space issue, but it just keeps recurring, intermittently, but always right after one of the "unreachable" warnings appears. Just an educated guess, but I reckon it's a tmp file just chock full o' "Network is unreachable" messages.
Thanks for the input though!
 

SirDice

Administrator
Staff member
Moderator
Can’t answer on ntp but your var space issue sounds like what happens if you delete a log file that a long-running process is using thinking you will free up the space - but you won’t be because the file will still exist and be written to; only when you reboot will it clear up.
The application will keep the file descriptor open, so the file is never completely deleted, and the application continues to write to it. No need to reboot, just restart the application that kept the file descriptor open. Some processes will 'refresh' their file descriptors when you send them a SIGUSR1 signal, others might require a SIGHUP. Stopping and starting is a surefire way of doing this but that may disrupt your services.
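
The behaviour described above can be reproduced with a short, self-contained Python sketch. It is only a demonstration on a temporary file (the function name `demo_unlinked_file` is made up for illustration) and does not touch any real log.

```python
import os
import tempfile


def demo_unlinked_file(chunk: bytes = b"x" * 4096) -> dict:
    """Delete a file while it is still open, then keep writing to it."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, chunk)
        os.unlink(path)                  # remove the only directory entry
        visible = os.path.exists(path)   # False: the name is gone
        os.write(fd, chunk)              # writes through the descriptor still work
        st = os.fstat(fd)
        return {
            "visible": visible,
            "links": st.st_nlink,        # normally 0: no name refers to the file
            "size": st.st_size,          # both chunks are still stored on disk
        }
    finally:
        os.close(fd)                     # the space is released only here


if __name__ == "__main__":
    print(demo_unlinked_file())
```

The file cannot be found by searching directories, yet it keeps growing until the descriptor is closed, which is why restarting the writing process frees the space.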
 
OP
JLAIP
Is anyone else seeing these ntp "Network is unreachable" messages?
I just replaced the server's NIC, cabling and modem, and the ISP's tech checked the line for problems (none: great signal, no noise, all ok), but no change. So the problem's clearly not with my hardware.

I've tried a number of different ntp servers, but since mid-July they've all been returning these "unreachable" errors. Most of the time, I'm able to successfully ping the freebsd pool ntp server right after one of the messages appears. Does anyone see anything in the attached screenshots that sticks out as a clue?
 

Attachments

  • unreachable 001.jpg (183.9 KB)
  • unreachable 002.jpg (129.8 KB)
  • unreachable 003.jpg (115.2 KB)
  • unreachable 004.jpg (88.3 KB)
OP
JLAIP
Update....and possible fix??

Shortly after my last post, my research led me to a Linux forum in which I found similar ntp errors. The fix was to append "tos maxdist 30" to the end of /etc/ntp.conf. So I did.
It's been ~ four hours and not a single "Network is unreachable" message AND every ntpq -pn now results in a proper output.

So, for anyone suffering similarly "unreachable" ntp updates who may find this post in the future, this....tentatively, for now.....may be the solution.
I'll report back tomorrow with a more definitive answer.
 

cy@

Developer
JLAIP,

tos maxdist sets the maximum distance to the NTP server. For every packet received, the factor is cut in half. For example, a max dist of 16 will result in four packets received before it settles on an NTP server. The default is 1.5.
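
To make the halving concrete, here is a rough Python sketch. It assumes, as one reading of the example above, that the distance estimate starts at 16 and is halved with each packet received, and it counts the packets needed before the estimate drops below a given maxdist. This is a deliberately simplified model with a made-up function name, not ntpd's actual algorithm.

```python
def packets_needed(maxdist: float, start: float = 16.0) -> int:
    """Count halvings of `start` until the value is below `maxdist`.

    Simplified model: the starting value of 16 and the pure halving
    per packet are assumptions made for illustration only.
    """
    if maxdist <= 0:
        raise ValueError("maxdist must be positive")
    distance = start
    packets = 0
    while distance >= maxdist:
        distance /= 2
        packets += 1
    return packets


if __name__ == "__main__":
    print(packets_needed(1.5))   # 4 under this model
    print(packets_needed(30))    # 0 under this model
```

Under these assumptions the default of 1.5 needs four good packets in a row, while a maxdist of 30 accepts a server straight away, which would fit the change in behaviour reported in this thread.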

What is happening is that there is a wide root dispersion. By increasing maxdist you are telling ntpd to accept questionable NTP servers.

An NTP server may be questionable due to distance, a network problem, or because it is a Windows NTP server. ntpd (and chrony) have difficulty with Windows NTP servers. (Chrony users would use the maxdistance statement.)

If you want to get into the weeds, RFC 5905 would make some good bedtime reading.
 
OP
JLAIP
UPDATE:

It's been ~ 24 hours since I added the tos line to /etc/ntp.conf and all's well. No "unreachable" messages, no "out of space" messages, and ntpq -pn is once again returning the expected results: an asterisk next to a responding ntp server and no excessive jitter.
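
For anyone who wants to automate that check, the sketch below picks out the peer that ntpq marks with an asterisk, along with its jitter. The function name is made up, and the sample output is hypothetical (the addresses come from documentation ranges); it is not taken from this server.

```python
def selected_peers(ntpq_output: str) -> list:
    """Return (remote, jitter_ms) for each line ntpq marks with '*'."""
    peers = []
    for line in ntpq_output.splitlines():
        if not line.startswith("*"):
            continue                     # header, separator, or unselected peer
        fields = line[1:].split()
        if len(fields) >= 2:
            # First column is the remote address, last column is the jitter.
            peers.append((fields[0], float(fields[-1])))
    return peers


# Hypothetical `ntpq -pn` output, for illustration only.
SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*192.0.2.10      .GPS.            1 u   33   64  377   21.512    0.431   0.208
+198.51.100.7    192.0.2.10       2 u   12   64  377   35.004   -1.102   0.517
 203.0.113.5     .INIT.          16 u    -   64    0    0.000    0.000   0.000
"""

if __name__ == "__main__":
    print(selected_peers(SAMPLE))
```

An empty result means ntpd has not settled on any server, which matches the state the server was in before the tos line was added.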
So I'm gonna call this SOLVED.
 