Terrible clock skew

Hi everyone,

I've been experiencing some terrible clock skew and just can't figure it out. By terrible I mean I'm losing 30 minutes a day. The loss only occurs when I bring the system under heavy load. The load is multiple rsync backups to a ZFS pool (with gzip compression) backed by a 16-disk RAID 50. I'm using a hardware RAID controller for battery-backed write caching.

Mobo: Supermicro X8DTL
CPU: Dual Intel 5620 quad cores
Raid: Areca 1620

I have ntp enabled, but the skew happens too fast and it stops trying. I've tried a bunch of stuff from the various lists. I tried all my clock sources: TSC(-100), HPET(900), ACPI-fast(1000), i8254(0). I tried lowering kern.hz. I also tried disabling the Enhanced SpeedStep feature of this chip. Nothing works.
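
For scale, here is a quick sketch of the arithmetic behind "it stops trying". It assumes the 30 minutes are lost evenly over the whole day (the loss actually only happens under load, so the peak rate is higher) and uses ntpd's frequency correction limit of 500 ppm. The function name is made up for illustration.

```python
# Rough drift arithmetic; assumes the loss is spread evenly over 24 hours.
SECONDS_PER_DAY = 24 * 60 * 60
NTPD_MAX_SLEW_PPM = 500  # most frequency error ntpd can correct by slewing


def drift_ppm(seconds_lost_per_day: float) -> float:
    """Convert a daily clock loss into parts per million."""
    return seconds_lost_per_day / SECONDS_PER_DAY * 1_000_000


lost = 30 * 60  # 30 minutes a day, as reported above
ppm = drift_ppm(lost)
print(f"drift: {ppm:.0f} ppm")
print(f"that is {ppm / NTPD_MAX_SLEW_PPM:.1f}x what ntpd can slew out")
```

A loss this size is far beyond what slewing can absorb, which fits ntpd giving up rather than slowly converging.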

Currently I'm on defaults except for the following settings:

EIST Disabled in BIOS
kern.hz="100"
kern.timecounter.hardware=i8254
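
For reference, the numbers in parentheses after each clock source are the quality values the kernel reports in kern.timecounter.choice. A small sketch of reading that list follows; the sample string is just the four sources quoted above, and the helper name is hypothetical. As far as I understand it, the kernel picks the highest-quality source by default and never auto-selects one with a negative quality.

```python
import re

# Sample text in the style of `sysctl kern.timecounter.choice`,
# using only the names and quality values quoted in the post.
CHOICE = "TSC(-100) HPET(900) ACPI-fast(1000) i8254(0)"


def parse_choices(text: str) -> dict[str, int]:
    """Return {timecounter name: quality} parsed from a choice string."""
    pairs = re.findall(r"([\w-]+)\((-?\d+)\)", text)
    return {name: int(quality) for name, quality in pairs}


choices = parse_choices(CHOICE)
best = max(choices, key=choices.get)  # source with the highest quality
print(choices)
print("highest quality:", best)
```

With the override above, the system is running on i8254, the lowest-rated of the non-negative sources, rather than the one the kernel would choose on its own.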

Is there anything else I could do to debug this? I can't really blame the hardware because I have these same boards/chips running in Linux with no clock issues.

Thanks!
Dave
 
syshackmin said:
I've been experiencing some terrible clock skew and just can't figure it out. By terrible I mean I'm losing 30 minutes a day. The loss only occurs when I bring the system under heavy load. The load is multiple rsync backups to a ZFS pool (with gzip compression) backed by a 16-disk RAID 50. I'm using a hardware RAID controller for battery-backed write caching.

Mobo: Supermicro X8DTL
CPU: Dual Intel 5620 quad cores
Raid: Areca 1620

FWIW, I have a bunch of X8DTH-iF boards here (a very close relation to yours) with E5520 CPUs and 16-port 3Ware 9650SE-16ML controllers, and they stay NTP-synchronized with no complaints regardless of load. We're running different RAID controllers, though.

The CMOS clock on my X8DTHs drifts slightly when not disciplined: a powered-off system will come up and sync with an NTP offset of a second or so if it has been off for a couple of hours. That's nowhere near as bad as what you're seeing, though.
 
Clocks have been known to change their drift due to differences in temperature. So maybe you have a cooling problem?

However, this sounds like fairly extreme drift for heating alone to be the issue. I would make sure you sync with an NTP server on startup and run ntpd to keep the clock in sync. Also be aware that certain securelevels (if you're using them at all) restrict clock adjustments to <= 1 second, which can limit ntpd's ability to sync when you have a really bad clock.
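
To see why a one-second adjustment limit matters at this drift rate, here is a rough sketch. It assumes the reported 30 minutes a day is lost at a steady rate, which understates the rate during the heavy-load periods; the function name is only illustrative.

```python
SECONDS_PER_DAY = 86_400


def seconds_until_error(limit_s: float, lost_per_day_s: float) -> float:
    """Wall-clock seconds until the accumulated loss reaches limit_s."""
    return limit_s * SECONDS_PER_DAY / lost_per_day_s


# How long until the clock is a full second behind at 30 minutes/day?
print(seconds_until_error(1.0, 30 * 60))
```

At that rate the clock falls a full second behind in under a minute, so corrections capped at one second would have to happen very frequently just to keep up.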
 
What version of FreeBSD do you have?
I'm running -CURRENT as of r216223 and I have the same problem under heavy load, i.e. my clock cannot be kept in sync.
 
I'm running the release version.

8.1-RELEASE FreeBSD 8.1-RELEASE #0: Mon Jul 19 02:36:49 UTC 2010

I may be experiencing heat problems so I'm going to look into that more. I had ordered a new fan shroud for the case and will be installing that this weekend hopefully.
 
Figured this out. I got a new heat sink and a fan shroud for my case (it's a giant 24-drive chassis) and the skew is gone. The CPU must have been cooking under load and reducing its clock.
 