RAM Testing for FreeBSD?

This might be a stupid question, and it won't be my last, but I am curious if there is any way to run a comprehensive RAM test (check for physical RAM issues) from within a running FreeBSD operating system? Currently I run MemTest86+ for three full passes and then I feel good about my RAM, but this requires me to take my computer offline and it takes overnight to complete. I was thinking that there might be an application which runs in the background and does a confidence test. I'm really asking this because I don't have ECC RAM, and the cost of upgrading to ECC RAM is very high because it includes not only the RAM but the main board and CPU.

Thanks,
Joe
 
As a possible solution, you can use memtester, which stress-tests the system to find memory subsystem faults.

The only downside is that memtester runs in user space. In other words, it will not test the entire memory space (as memtest86 does) because part of the memory is in use by the OS and other processes.
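To illustrate the limitation, here is a minimal sketch of what a user-space pattern test does. This is not memtester's actual code, just a toy version of the idea: the function names `find_mismatches` and `pattern_test` and the chosen patterns are made up for illustration. The process can only write to and read back memory the kernel hands it, so anything held by the kernel or other processes is never touched, and the offsets reported are offsets into the process's own buffer, not physical addresses.

```python
def find_mismatches(buf, pattern):
    """Return (offset, expected, actual) for every byte that differs from pattern."""
    return [(i, pattern, b) for i, b in enumerate(buf) if b != pattern]


def pattern_test(size_bytes, patterns=(0x00, 0xFF, 0xAA, 0x55)):
    """Fill a buffer with each pattern in turn and verify it on read-back.

    Only the memory the OS gives this process is exercised. Offsets are
    relative to the buffer, not physical RAM addresses.
    """
    buf = bytearray(size_bytes)
    errors = []
    for p in patterns:
        # Write the pattern across the whole buffer.
        buf[:] = bytes([p]) * size_bytes
        # Fast check first; only walk the buffer if something differs.
        if buf.count(p) != size_bytes:
            errors.extend(find_mismatches(buf, p))
    return errors


if __name__ == "__main__":
    # 64 MiB is an arbitrary illustrative size.
    bad = pattern_test(64 * 1024 * 1024)
    print("mismatches found:", len(bad))
```

A real tester also locks its pages in RAM and uses far more elaborate patterns. With a small buffer like this, reads may well be served from CPU cache rather than DRAM, so treat it purely as an illustration of the approach.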
 
Any memory tester that can be used on a live system has a quite serious weakness: it can never test as much memory as testers like Memtest can. The memory occupied by the kernel cannot be tested, and that leaves a considerable amount of total memory untested.
 
Thanks for all the feedback. I didn't think I'd find a complete tool to do what I really need. I was hoping that there might be a memory tester that would relocate the software in RAM and then test the old block of RAM. In the good 'ole days we could do that: allocate a block of memory and move the data over, then use the old block for whatever we wanted. Those were the days of assembler language, almost machine level code (done a lot of that too back in the day).

The reason I want to use a background memory tester is when you do a scrub (if your file system is ZFS), bad RAM will likely equal data corruption. I wish I knew about that before I purchased all my hardware a few years ago. I do take precautions like backing up my important data to DVD-R but being able to avert the corruption in the first place would be nice. I have set a scrub to occur every 120 days. I doubt my data will become corrupt due to my RAM, I have tested it quite a bit but failure will happen.
 
If you are concerned about the data integrity risks on ZFS, this thread will encourage you to upgrade to ECC RAM as soon as possible.
 
Interesting read, well really scary to be truthful. I had no idea the reasons a bit error could occur and thought it was merely a failing RAM chip. I still have the problem with money, well unless my wife hits the lottery. I do want to upgrade to a system with ECC RAM but if I do that then I'm buying a new MB, CPU and RAM. I don't need a particularly speedy CPU nor RAM but ECC RAM isn't cheap. At one time I recall CPUs being the most expensive part of a computer, well it's not any more.

So buying a new MB, well if I have to spend that money I was thinking about buying one that has some kind of remote management capability since I place my server in the basement where it's nice and cool. But I'm not sure what this is called because I'm looking for basically a way to control the computer as if it were right in front of me when booting it up. I guess an Ethernet based KVM, assuming it could piggyback on my wired network. I don't know, so much to think about but if I buy the MB, the CPU and RAM will follow about a month later.
 
For a home server, a KVM is probably not necessary. Go to the machine for the few times direct keyboard input is needed. For the rest, use SSH. You can even use WOL (Wake-on-LAN) to turn it on remotely.
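For the curious, WOL works by sending a "magic packet" to the sleeping machine's network card: 6 bytes of 0xFF followed by the card's MAC address repeated 16 times. A rough sketch, assuming WOL is enabled in the BIOS and on the NIC; the function names, the MAC address in the usage comment, and the broadcast address are placeholders, and UDP port 9 is just a common convention:

```python
import socket


def build_magic_packet(mac):
    """Build a Wake-on-LAN magic packet for a MAC like '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("MAC address must contain 12 hex digits")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # 6 bytes of 0xFF, then the MAC repeated 16 times (102 bytes total).
    return b"\xff" * 6 + mac_bytes * 16


def send_wol(mac, broadcast="255.255.255.255", port=9):
    """Broadcast the magic packet over UDP on the local network."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(packet, (broadcast, port))


# Example (placeholder MAC, substitute the server's real one):
# send_wol("00:11:22:33:44:55")
```

Note that this only powers the box on. It gives you no console during boot, so it complements SSH rather than replacing a KVM.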
 
JoeSchmuck said:
Interesting read, well really scary to be truthful. I had no idea the reasons a bit error could occur and thought it was merely a failing RAM chip. I still have the problem with money, well unless my wife hits the lottery. I do want to upgrade to a system with ECC RAM but if I do that then I'm buying a new MB, CPU and RAM. I don't need a particularly speedy CPU nor RAM but ECC RAM isn't cheap. At one time I recall CPUs being the most expensive part of a computer, well it's not any more.
On many recent AMD systems (even desktop-class) you get ECC essentially "for free". I've been building / buying Intel-based server hardware and for that I believe I still need to make sure everything has ECC support.

So buying a new MB, well if I have to spend that money I was thinking about buying one that has some kind of remote management capability since I place my server in the basement where it's nice and cool. But I'm not sure what this is called because I'm looking for basically a way to control the computer as if it were right in front of me when booting it up. I guess an Ethernet based KVM, assuming it could piggyback on my wired network.
It goes by a number of different names. Dell calls it DRAC (Dell Remote Access Controller). HP calls it iLO (Integrated Lights-Out [management]), and so on.

There are generally two ways those devices will let you into the system: emulated KVM, usually in a web browser as you mentioned, and SOL (Serial Over LAN), which gives you some sort of console you access with a terminal emulator. KVM is nicer, as long as you have a supported browser and configuration. Those might be locked down by corporate IT if you're in a business setting (I know you're not, but mentioning it for completeness) to prevent using ActiveX controls, running Java applets, etc.

You usually get IPMI thrown in for free on systems with remote management hardware. This will let you measure temperatures, fan speed, voltages and so on. You can check the "Hardware monitoring" section of my RAIDzilla II project for a [static] example of what sort of data is provided.

Next is how those systems connect to your LAN. This could be a dedicated Ethernet port for remote management or a port shared between normal system use and remote management. The latter can be difficult, because each implementation is subtly different and the FreeBSD Ethernet device driver you're using needs to be aware that something else is sitting on the port. For example, while it might be useful to re-negotiate speed/duplex when FreeBSD brings the port up during startup, that will make the remote management unreachable during that interval.

All of this is probably overkill for the average home network - just have an old monitor and keyboard handy if you need to do any BIOS-level or single-user FreeBSD work on the box. All of my systems have remote management, but then again I have 10 or so systems and 140TB or thereabouts in my spare dining room...
 
Wow, that was a mouthful! Very informative too. I've been trying to read about it on the internet since last night and even the user manuals for specific motherboards are cryptic to say the least. I just thought it would be nice to have a KVM type setup but over Ethernet so I could completely control the server during reboots, software upgrades, you know, when you are rebooting the computer all the time working an issue, changing a BIOS setting, whatever may come your way. Right now I have to bring the server upstairs if I need to do anything that extensive, which beats dragging down a monitor, keyboard, mouse, desk, chair, etc... So if I can purchase a server with this type of feature for a little extra money, I'd leap at it.

I will look into the AMD systems as well. Do you have any recommendations off the top of your head? I'm not asking you to do my legwork but if you have something in mind I would appreciate the advice.
 
JoeSchmuck said:
Wow, that was a mouthful! Very informative too.
Thanks!

I will look into the AMD systems as well. Do you have any recommendations off the top of your head? I'm not asking you to do my legwork but if you have something in mind I would appreciate the advice.
Not really. I tend to pick a server-class motherboard and stick with it all the way through "last lifetime buy" (end of production). A number of people have posted here about motherboards they use. Try searching using Google with "site:forums.freebsd.org amd ecc" (without the quotes). The vBulletin software in use here doesn't think any 3-letter words are worth searching for, so a search for something like "ecc" here reports "Sorry - no matches. Please try some different terms."

An example of the kind of board I look for is the Supermicro X8DTH-iF. Monitoring of everything, remote KVM via dedicated Ethernet port, all slots are fully interchangeable (no mishmash of x1, x4, x8, x16), support for more memory than I'll ever need (192 GB, I'm currently running them with 48 GB). That particular board uses Intel E5500 / E5600 CPUs and is way outside of your likely price range (it's $400+ for the naked board, to which you need to add Xeon CPUs and registered memory).

[Note to mods - the "useless" italic on/off is to prevent "site:forums" from turning into "site:forums". Otherwise, even if I have "Disable smilies in text" checked, anyone who quotes me will have them come back.]
 
Oh yes, that is outside my needs, dual CPUs and all. I do understand I don't need to populate both CPU sockets. I did just receive two servers, each cost just over $3500. Wish one of them were mine but they belong to the military. Installing those tomorrow morning. I'm sure they have lots of features but I'm just installing new hardware, I don't play with it once I load the software.
 
The Linux version of SuperPi runs under FreeBSD.

This does a better job of testing for too tight RAM timings or other setup problems than memtest86, but it doesn't test all cells (which would be difficult to do with a kernel up).
 
Thanks for the inputs, not sure how SuperPi would be considered a RAM tester, it looks more like a CPU tester. I've broken down and ordered a new MB, CPU, and ECC RAM. I'd still like to see if there is a way to fully test Non-ECC RAM in particular just before a ZFS scrub. Unfortunately just because a RAM test passes before the scrub doesn't mean a bit error won't occur during the scrub which is why I made the purchase. I think a routine RAM test for Non-ECC RAM is smart for any home server regardless of ZFS or not and then if using ZFS, maybe disabling the automatic scrub process in favor of a manual selection.
 
SuperPi and the similar programs have multiple run modes and if you choose one that consumes more RAM they are quite good memory testers. However, you won't get anything else but an indication that something is wrong, the program will not tell you which physical address caused the error.
 
JoeSchmuck said:
Thanks for the inputs, not sure how SuperPi would be considered a RAM tester, it looks more like a CPU tester. I've broken down and ordered a new MB, CPU, and ECC RAM. I'd still like to see if there is a way to fully test Non-ECC RAM in particular just before a ZFS scrub. Unfortunately just because a RAM test passes before the scrub doesn't mean a bit error won't occur during the scrub which is why I made the purchase. I think a routine RAM test for Non-ECC RAM is smart for any home server regardless of ZFS or not and then if using ZFS, maybe disabling the automatic scrub process in favor of a manual selection.

We know it from practical use. It is a RAM tester. We know that because as you mess up your RAM timings and frequencies SuperPi will be the first to detect it. CPU instability on the other hand shows up pretty late in SuperPi, much later than in better CPU stability tests. SuperPi also picks this up much earlier than memtest86, in fact you can run memtest86 with many RAM timings and frequencies that are utterly broken and won't work under real world load. But as we said, you won't get a complete coverage of RAM cells that way.

Are you sure your new combo actually supports ECC?
 
cracauer@ said:
We know it from practical use. It is a RAM tester. We know that because as you mess up your RAM timings and frequencies SuperPi will be the first to detect it. CPU instability on the other hand shows up pretty late in SuperPi, much later than in better CPU stability tests. SuperPi also picks this up much earlier than memtest86, in fact you can run memtest86 with many RAM timings and frequencies that are utterly broken and won't work under real world load. But as we said, you won't get a complete coverage of RAM cells that way.

Are you sure your new combo actually supports ECC?

Thanks for the information. I know of someone who looks like they are having RAM issues but Memtest86 passes and yet I'd swear he is having RAM issues. I'll see if I can get him to try out SuperPi and I will give it a shot as well. I will use it to test out my new system first and once everything has been relocated I will test out my old system. And I just couldn't get my head wrapped around this being used as a RAM tester, well for timing issues as you described.

Again, thanks, it's good to learn something new.
 
With respect to SuperPi, is there a bootable version like Memtest86 or a DOS version? I have my new system and I don't want to load an OS just to test out the stability.

Thanks.
 
I currently use The Ultimate Boot CD (version 5.2.1). It's a good tool and I use it a lot for MemTest86 and GPart, but the CPU stress tests that I ran did not appear to affect the CPU in my system, meaning no fan speed increase and the heat sink was not warm to the touch. I'm running an AMD FX-4300 and I would like to stress out the CPU now. I have run MemTest86 for 4 complete passes so I'm fine with the RAM working as expected. So I'll try to find another CPU stress test on UBCD and if not, I guess I'll live with it as-is. I'm not overclocking anything yet but it would be nice to get a baseline.

But as far as this thread goes, I've solved my memory testing issue by just purchasing ECC RAM.

Cheers
 