How to find out if ECC is enabled

ajr

New Member


Messages: 2

Hi all,

Both the mainboard, an ASRock X470D4U (with recent firmware P3.10),
and the Ryzen 7 3700X CPU should work with ECC RAM (although this is not officially supported).
RAM is KSM26ED8/16ME DDR4-2666 ECC DIMM.

As far as I can tell, neither dmesg nor dmidecode indicates ECC (see below).
With ECC active, I would expect the data path to show additional ECC bits.

There is only one related BIOS setting that I'm aware of, and it is the only one I tried:
Advanced -> AMD CBS -> UMC -> DDR4 -> Common RAS -> ECC conf _DRAM ECC enable
Changing this from AUTO to ENABLED makes no difference.
Any BIOS-related hint is welcome!

On a Xeon E5 system, dmidecode shows Total Width: 72 bits and Data Width: 64 bits.
72 - 64 = 8; these are the 8 ECC check bits.
The Ryzen system shows a Total Width of 128 bits and a Data Width of 64 bits.
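
The subtraction above can be sketched in a few lines of Python. This is only an illustration, assuming the `Total Width` and `Data Width` lines look like the dmidecode output quoted in this post; the function name `extra_bits` is made up for the example. A result of 8 is what the Xeon system reports. Any other value, such as the 64 from the Ryzen board, only shows that the firmware's table differs and is not by itself conclusive about whether ECC is active.

```python
import re

# Matches dmidecode-style lines such as "Total Width: 72 bits".
WIDTH = re.compile(r"^\s*(Total|Data) Width:\s*(\d+)\s*bits", re.MULTILINE)

def extra_bits(dmi_text):
    """Return Total Width minus Data Width for each device reporting both."""
    results = []
    total = None
    for kind, bits in WIDTH.findall(dmi_text):
        if kind == "Total":
            total = int(bits)
        elif total is not None:
            # A Data Width line following a Total Width line closes one device.
            results.append(total - int(bits))
            total = None
    return results

xeon = "Total Width: 72 bits\nData Width: 64 bits\n"
ryzen = "    Total Width: 128 bits\n    Data Width: 64 bits\n"
print(extra_bits(xeon))   # [8]
print(extra_bits(ryzen))  # [64]
```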

Questions:
Is ECC enabled?
If not, how to enable it?
If yes, why does FreeBSD not recognize it?

Thanks, Ajr

PS: I'm aware that this is a release candidate of the OS, but it should be close to release (-;
PPS: parts from dmesg and dmidecode:
Code:
---<<BOOT>>---
Copyright (c) 1992-2019 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 12.1-RC1 r353428 GENERIC amd64
FreeBSD clang version 8.0.1 (tags/RELEASE_801/final 366581) (based on LLVM 8.0.1)
VT(efifb): resolution 800x600
CPU: AMD Ryzen 7 3700X 8-Core Processor              (3593.32-MHz K8-class CPU)
Origin="AuthenticAMD"  Id=0x870f10  Family=0x17  Model=0x71  Stepping=0
Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
Features2=0x7ed8320b<SSE3,PCLMULQDQ,MON,SSSE3,FMA,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
AMD Features2=0x75c237ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS,SKINIT,WDT,TCE,Topology,PCXC,PNXC,DBE,PL2I,MWAITX,<b30>>
Structured Extended Features=0x219c91a9<FSGSBASE,BMI1,AVX2,SMEP,BMI2,PQM,PQE,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,SHA>
Structured Extended Features2=0x400004<UMIP,RDPID>
XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
AMD Extended Feature Extensions ID EBX=0x10eb757<CLZERO,IRPerf,XSaveErPtr>
SVM: NP,NRIP,VClean,AFlush,DAssist,NAsids=32768
TSC: P-state invariant, performance statistics
real memory  = 68719476736 (65536 MB)
avail memory = 66818646016 (63723 MB)
. . .
Code:
root@b5:~ # dmidecode  -t memory
# dmidecode 3.2
# SMBIOS entry point at 0xed2b3000
Found SMBIOS entry point in EFI, reading table from /dev/mem.
SMBIOS 3.2 present.

Handle 0x000F, DMI type 16, 23 bytes
Physical Memory Array
    Location: System Board Or Motherboard
    Use: System Memory
    Error Correction Type: Multi-bit ECC
    Maximum Capacity: 128 GB
    Error Information Handle: 0x000E
    Number Of Devices: 4

Handle 0x0017, DMI type 17, 84 bytes
Memory Device
    Array Handle: 0x000F
    Error Information Handle: 0x0016
    Total Width: 128 bits
    Data Width: 64 bits
    Size: 16384 MB
    Form Factor: DIMM
    Set: None
    Locator: DIMM 0
    Bank Locator: P0 CHANNEL A
    Type: DDR4
    Type Detail: Synchronous Unbuffered (Unregistered)
    Speed: 2666 MT/s
    Manufacturer: Kingston
    Serial Number: E72F267D
    Asset Tag: Not Specified
    Part Number: 9965745-002.A00G
    Rank: 2
    Configured Memory Speed: 2400 MT/s
    Minimum Voltage: 1.2 V
    Maximum Voltage: 1.2 V
    Configured Voltage: 1.2 V
    Memory Technology: DRAM
    Memory Operating Mode Capability: Volatile memory
    Firmware Version: Unknown
    Module Manufacturer ID: Bank 2, Hex 0x98
    Module Product ID: Unknown
    Memory Subsystem Controller Manufacturer ID: Unknown
    Memory Subsystem Controller Product ID: Unknown
    Non-Volatile Size: None
    Volatile Size: 16 kB
    Cache Size: None
    Logical Size: None
. . .
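
For anyone comparing boards, the handful of fields that matter in the output above can be pulled out with a short script. This is a rough sketch, assuming dmidecode's usual "Key: value" layout as shown in this post; the function name `dmi_fields` and the choice of keys are just for illustration. Note that these values are whatever the firmware wrote into the SMBIOS table, so "Error Correction Type: Multi-bit ECC" is a hint rather than proof that correction is actually active.

```python
def dmi_fields(dmi_text, wanted=("Error Correction Type", "Speed",
                                 "Configured Memory Speed")):
    """Collect 'Key: value' pairs for the wanted keys from dmidecode-style text."""
    found = {}
    for line in dmi_text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if sep and key in wanted:
            # Several DIMMs may report the same key, so keep a list per key.
            found.setdefault(key, []).append(value)
    return found

# Lines copied from the dmidecode output in this post.
sample = """\
Physical Memory Array
    Error Correction Type: Multi-bit ECC
Memory Device
    Speed: 2666 MT/s
    Configured Memory Speed: 2400 MT/s
"""
print(dmi_fields(sample))
```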
 

gpw928

Well-Known Member

Reaction score: 119
Messages: 358

Hardware Canucks have an interesting article titled ECC MEMORY & AMD’S RYZEN – A DEEP DIVE.

They tested an ASRock X370 Taichi, and discuss the BIOS settings for ECC. Hopefully they are similar to your motherboard.

Spoiler: disappointment awaits -- they found single bit ECC errors being corrected, and double bit ECC errors being ignored...
 

ajr

New Member


Messages: 2

Unfortunately, the 3 BIOS settings mentioned in this article are already in place (auto).
The edac-util tool looks interesting, but it is available on Linux and not on FreeBSD.
So I'm still looking for a solution . . .

Thanks for answering,
ajr
 

Mastakilla

New Member


Messages: 4

I've been on the same mission as yourself to figure this out ;)

My most comprehensive (and FreeNAS / FreeBSD related) post about this, you can find here:
https://www.ixsystems.com/community/threads/freenas-build-with-10gbe-and-ryzen.77752/page-2

But I'm also working on this on the fora below:
https://hardwarecanucks.com/forum/t...ryzen-a-deep-dive-comment-thread.75041/page-4
https://forum.level1techs.com/t/asr...-server-boards-x470d4u-x470d4u2-2t/139490/846
https://forum.level1techs.com/t/asrock-rack-x470d4u2-2t/147588/56

I'm not 100% certain yet, but
  • It seems like the current implementation from Asrock Rack is flawed.
  • Also, I suspect that full Ryzen 3000 support hasn't been implemented in the FreeBSD kernel yet. For example, Linux has only had full Ryzen 3000 support for a couple of months, since kernel 5.4 (which isn't in any stable distro yet).

It would be nice if someone from FreeBSD itself could confirm this...
 

IngoZ

New Member


Messages: 1

Not yet for Ryzen 3000... PassMark told me they are still waiting for the datasheet from their AMD representative (more than 3 months already now)
I have the AsRock X470D4U paired with Kingston ECC ram and a Ryzen 3600 running FreeBSD 12.1.

When using Memtest86 it displays ECC enabled: Yes

I have run dmidecode -t memory like the original poster and I get similar output using mfsBSD bootstick.
 

Mastakilla

New Member


Messages: 4

When using Memtest86 it displays ECC enabled: Yes
Thanks for bringing this to my attention!

I've been testing with memtest86 v8.2 and asked Passmark themselves about the missing Ryzen 3000 support and they've responded:
Hello,

It appears that ECC detection support for Ryzen 3000 series chipset has not yet been implemented in MemTest86.

We are in contact with our AMD representative at the moment to obtain access to the necessary datasheets. Once acquired, we should be able to implement support for your chipset.

I should add that AMD haven't been very responsive recently.
We asked this question to AMD about 2 months back, but nothing yet.

Will keep you updated.

Kind regards,
Richard Ng
Using MemTest86 v8.3 it indeed seems to work better... Seems like they forgot to
  • Keep me updated
  • Add this to the changelog
I've just re-tested it and it indeed now reports:
[Attached screenshot: 1576855814197.png]
 

`Orum

Well-Known Member

Reaction score: 33
Messages: 256

Assuming it's enabled, it's odd that dmidecode shows a 64-bit width, though I suppose that could be a result of a false report from the BIOS. Please correct me if I'm wrong.

The ultimate test is to install some ECC RAM that you know has bad areas, boot into FreeBSD and use a ton of RAM (ZFS makes this easy), and look for MCA messages to show up on the console.
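
Watching the console for these messages by hand gets tedious, so a small filter can help. The sketch below is a minimal example, assuming that FreeBSD's machine-check messages start with `MCA:` and carry a `COR` or `UNCOR` token for corrected and uncorrected errors; both are assumptions about the kernel message format and should be checked against real output on your system. The function name `summarize_mca` is made up for the example.

```python
def summarize_mca(lines):
    """Count lines starting with 'MCA:' and split them by severity token."""
    counts = {"total": 0, "corrected": 0, "uncorrected": 0}
    for line in lines:
        text = line.strip()
        if not text.startswith("MCA:"):
            continue  # not a machine-check message (assumed prefix)
        counts["total"] += 1
        words = text.split()
        # "COR" / "UNCOR" are assumed tokens; lines without either are
        # counted in the total only.
        if "UNCOR" in words:
            counts["uncorrected"] += 1
        elif "COR" in words:
            counts["corrected"] += 1
    return counts
```

It accepts any iterable of lines, so something like `summarize_mca(open("/var/log/messages"))` or the captured output of dmesg would do. A growing corrected count with no uncorrected entries would suggest that ECC is doing its job.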
 

RobCrowston

New Member

Reaction score: 24
Messages: 19

The ultimate test is to install some ECC RAM that you know has bad areas, boot into FreeBSD and use a ton of RAM (ZFS makes this easy), and look for MCA messages to show up on the console.
It doesn't need to be physically bad. Just tighten the timings in the UEFI beyond what the RAM can cope with, and you'll be flooded with machine check errors.
 

`Orum

Well-Known Member

Reaction score: 33
Messages: 256

Just tighten the timings in the UEFI beyond what the RAM can cope with, you'll be flooded with machine check errors.
My only issue with this method is it's tough to get it to the point where it's bad, but not so bad that your system is completely unusable. If you have DIMMs in separate channels though, you can just mess with it on the higher-addressed slot and you are probably okay.
 

Mastakilla

New Member


Messages: 4

My only issue with this method is it's tough to get it to the point where it's bad, but not so bad that your system is completely unusable. If you have DIMMs in separate channels though, you can just mess with it on the higher-addressed slot and you are probably okay.
I also noticed that when using ECC DIMMs, there is a very fine line between being 100% stable and not booting at all. The trick is to configure the memory so that it only just boots, and then lower the voltage step by step...
 