FreeBSD 9 randomly freezes

Hi All,

I'm having an issue with my FreeBSD 9-RELEASE installation (I use the ZFSguru system image). Everything works fine for a few days or weeks, but then the system randomly freezes. As far as I can tell there's no pattern to these freezes. Once it froze during an SSH session, right after I typed su. Other times I wake up in the morning and can't access my ZFS shares.

I have a boot disk (SSD) with the FreeBSD/ZFSguru installation and two pools. One is a three-disk RAID-Z1 and the other is a six-disk RAID-Z2. I have VirtualBox 4.0.16 installed with a Windows Server 2008 R2 VM for recording six IP cameras. The VM has three cores assigned with a 90% execution cap. Audio and USB 2.0 support are disabled. Guest additions are installed.

I have no clue what is freezing my system, as I also tested it with no VMs running. The system still freezes from time to time, usually after 1-2 weeks. Maybe there's no or bad support for the Z68 chipset with a Core i5 2500K? Or should I run MemTest for longer than two complete cycles?

To be honest, I don't know what to do. I hope one of you recognizes this problem and can help me out.

P.S.: I'm new to FreeBSD, discovered it via ZFSguru by reading about ZFS on a Dutch forum. Below you can find my system specs.

9x Samsung EcoGreen F4EG HD204UI, 2TB
1x Intel 320 120GB
IBM ServeRAID M1015 SAS/SATA Controller

Intel Core i5 2500K Boxed
Gigabyte GA-Z68X-UD3H-B3
Corsair Vengeance CMZ32GX3M4A1600C10 (4x 8GB DDR3 1600MHz)
be quiet! Dark Power Pro P9 550W
 
Hello,

flup_bel said:
Or should I run MemTest for longer than two complete cycles?

I have had a machine with a damaged non-ECC memory latch; Memtest identified the damage only after pass 17. Since then I always leave Memtest running for 24-48 hours and do not stop after pass 1.

Best regards
Shakky4711
 
flup_bel said:
Maybe there's no or bad support for the Z68 chipset with a Core i5 2500K?

Working here.

Or should I run MemTest for longer than two complete cycles?

It wouldn't hurt. Some problems are rare or due to factors that might not be encountered in a short test, like a nearby welder being powered up.
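
To put a rough number on why more passes help, here is a minimal sketch. It assumes, purely for illustration, that an intermittent fault shows up independently in each pass with some fixed probability; the 5% figure is hypothetical and not measured on any machine in this thread.

```python
# Illustrative only: assumes each MemTest pass independently exposes an
# intermittent fault with the same probability. Real faults need not
# behave this way (heat, timing, interference).

def detection_probability(p_per_pass, passes):
    """Chance that at least one of `passes` passes exposes the fault."""
    return 1.0 - (1.0 - p_per_pass) ** passes


if __name__ == "__main__":
    p = 0.05  # hypothetical per-pass detection chance
    for n in (2, 17, 50):
        print(f"{n:3d} passes: {detection_probability(p, n):.1%} chance of seeing the fault")
```

Under that assumption, two clean passes say very little about a rare fault, which fits the experience above of an error only showing up at pass 17.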
 
I'm running MemTest86+ v4.20 at the moment. It has been running 'troublefree' for almost 2 hours now.
 
flup_bel said:
I'm running MemTest86+ v4.20 at the moment. It has been running 'troublefree' for almost 2 hours now.

Sadly, when I woke up this morning the blue MemTest86+ screen was gone. Instead I got these "dancing pixels". I tried to capture them with my camera (see attachment). Are these "dancing pixels" caused by the RAM, or could it also be the mainboard/CPU?

Anyway, I already removed two of the four RAM modules and re-launched MemTest86+. At noon I'll go home to have a look.
 

Attachments

  • DansendePixels.jpg
Definitely something not right with the hardware. Check CPU and graphics card fans. Make sure they are attached firmly and not clogged with dust or lint.
 
wblock@ said:
Definitely something not right with the hardware. Check CPU and graphics card fans. Make sure they are attached firmly and not clogged with dust or lint.

The CPU is cooled by a large Scythe cooler with 12cm fan. I have two 12cm intake fans and also a 12cm exhaust fan. The four CPU cores are hitting 52-56 degrees Celsius with 60-75% load (all three cores in the W2008R2 VM are almost at 100%), so I guess these temperatures are just fine?

At the moment MemTest86+ is testing one 8GB module. Currently it's at its fourth pass. No errors so far :).
 
Some results:
  • One DDR3 module in slot 1: 5 passes, all ok
  • Two DDR3 modules in slot 1 & 2 (Dual Channel): 4 passes, all ok
  • Four DDR3 modules in all slots (Dual Channel): 0 passes, no errors but a crash
I actually managed to film this crash with a webcam and motion detection software. You can see it on YouTube: http://youtu.be/jMGta55Egyk?hd=1. Is anyone familiar with this kind of behavior?

I took out the two modules from slots 1 and 2, which passed four times, because I think they're OK.
MemTest86+ is now running the modules from slot 3 and 4. The "suspects" :f.
 
Final results:

Individual RAM modules, each in its 'own' slot: 9-10-11 passes without errors or crashes.
Two or more RAM modules: after 0 to 4 passes I get a system crash (as seen in the YouTube video). Although the system crashed, MemTest86+ reported no errors up to that point.

What do you guys think, a faulty memory controller in the i5 CPU? Should I RMA my CPU or also my mainboard?
 
On my PC at home I had problems with sudden freezes and reboots due to the motherboard (BIOS) not setting the right RAM speed automatically. Maybe you could check if that's the case for you too.
 
I think I want to hit myself for this! I was browsing the i5-2500K specs on the Intel site to make sure my CPU was running at the right Vcore. Then I noticed that the i5-2500K only supports 1066/1333MHz RAM, while my RAM had been running at 1600MHz the whole time. I changed the memory multiplier to 13.33 and my problem was solved. No more strange crashes/freezes. I'm a happy FreeBSD user now :).
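
For anyone checking their own settings, the arithmetic behind that multiplier can be sketched as follows. This is a minimal sketch, assuming the stock 100 MHz base clock (BCLK) typical for this platform; the supported speeds are the ones read off the Intel spec page as described above, and the function names are made up for illustration.

```python
# Assumption: stock 100 MHz base clock (BCLK). If BCLK has been changed
# in the BIOS, pass the actual value instead.
BCLK_MHZ = 100.0

# Memory speeds listed as supported for the i5-2500K, as quoted above.
SUPPORTED_MHZ = (1066, 1333)


def memory_speed_mhz(multiplier, bclk_mhz=BCLK_MHZ):
    """Effective DDR3 speed for a given BIOS memory multiplier."""
    return round(multiplier * bclk_mhz)


def within_spec(multiplier, bclk_mhz=BCLK_MHZ):
    """True if the resulting speed does not exceed the fastest supported one."""
    return memory_speed_mhz(multiplier, bclk_mhz) <= max(SUPPORTED_MHZ)


if __name__ == "__main__":
    for m in (10.66, 13.33, 16.00):
        status = "within spec" if within_spec(m) else "above spec"
        print(f"multiplier {m:5.2f} -> DDR3-{memory_speed_mhz(m)} ({status})")
```

Running above the listed speed is not guaranteed to fail; it only means stability is no longer covered by the spec.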

Anyway thanks to all for helping me out!
 
I think that's the same motherboard as mine, and my memory multiplier is 16.00 with no problems. But it is 2x4G, and the timing probably becomes more critical with more slots used. In fact, that's what the memory test you did seemed to indicate.
 