It appears I have an issue with a RAM module or two. I see this in messages:
Code:
Feb 5 14:40:17 dfs12 kernel: MCA: Bank 7, Status 0x8c00004000010093
Feb 5 14:40:17 dfs12 kernel: MCA: Global Cap 0x0000000001000c17, Status 0x0000000000000000
Feb 5 14:40:17 dfs12 kernel: MCA: Vendor "GenuineIntel", ID 0x306e4, APIC ID 0
Feb 5 14:40:17 dfs12 kernel: MCA: CPU 0 COR (1) RD channel 3 memory error
Feb 5 14:40:17 dfs12 kernel: MCA: Address 0x5d419fdc0
Feb 5 14:40:17 dfs12 kernel: MCA: Misc 0x40185286
Feb 7 15:02:57 dfs12 kernel: MCA: Bank 7, Status 0x8c00004000010093
Feb 7 15:02:57 dfs12 kernel: MCA: Global Cap 0x0000000001000c17, Status 0x0000000000000000
Feb 7 15:02:57 dfs12 kernel: MCA: Vendor "GenuineIntel", ID 0x306e4, APIC ID 0
Feb 7 15:02:57 dfs12 kernel: MCA: CPU 0 COR (1) RD channel 3 memory error
Feb 7 15:02:57 dfs12 kernel: MCA: Address 0x5d419fdc0
Feb 7 15:02:57 dfs12 kernel: MCA: Misc 0x1421ad486
Feb 14 08:57:11 dfs12 kernel: MCA: Bank 12, Status 0x8c000041000800c3
Feb 14 08:57:11 dfs12 kernel: MCA: Global Cap 0x0000000001000c17, Status 0x0000000000000000
Feb 14 08:57:11 dfs12 kernel: MCA: Vendor "GenuineIntel", ID 0x306e4, APIC ID 0
Feb 14 08:57:11 dfs12 kernel: MCA: CPU 0 COR (1) MS channel 3 memory error
Feb 14 08:57:11 dfs12 kernel: MCA: Address 0x5d419fdc0
Feb 14 08:57:11 dfs12 kernel: MCA: Misc 0x90840800080028c
Feb 14 13:42:46 dfs12 kernel: MCA: Bank 7, Status 0x8c00004000010093
Feb 14 13:42:46 dfs12 kernel: MCA: Global Cap 0x0000000001000c17, Status 0x0000000000000000
Feb 14 13:42:46 dfs12 kernel: MCA: Vendor "GenuineIntel", ID 0x306e4, APIC ID 0
Feb 14 13:42:46 dfs12 kernel: MCA: CPU 0 COR (1) RD channel 3 memory error
Feb 14 13:42:46 dfs12 kernel: MCA: Address 0x5d419fdc0
Feb 14 13:42:46 dfs12 kernel: MCA: Misc 0x1421cae86
Feb 14 14:57:35 dfs12 kernel: MCA: Bank 7, Status 0x8c00004000010093
Feb 14 14:57:35 dfs12 kernel: MCA: Global Cap 0x0000000001000c17, Status 0x0000000000000000
Feb 14 14:57:35 dfs12 kernel: MCA: Vendor "GenuineIntel", ID 0x306e4, APIC ID 0
Feb 14 14:57:35 dfs12 kernel: MCA: CPU 0 COR (1) RD channel 3 memory error
Feb 14 14:57:35 dfs12 kernel: MCA: Address 0x5d419fdc0
Feb 14 14:57:35 dfs12 kernel: MCA: Misc 0x152180086
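For reference, the Status words in the log above can be unpacked by hand. The following is a minimal Python sketch, assuming the architectural IA32_MCi_STATUS layout described in the Intel SDM (flag bits 63..57, corrected error count in bits 52:38, MCA error code in bits 15:0, with memory controller errors encoded as 0000 0000 1MMM CCCC). The names `decode_status`, `FLAGS` and `MEM_OPS` are made up for illustration and are not part of any kernel or tool.

```python
# Sketch only: decode the architectural fields of an IA32_MCi_STATUS value.
# Bit positions are assumed from the Intel SDM; all names here are hypothetical.

FLAGS = {63: "VAL", 62: "OVER", 61: "UC", 60: "EN",
         59: "MISCV", 58: "ADDRV", 57: "PCC"}

# MMM field of a memory controller error code (0000 0000 1MMM CCCC)
MEM_OPS = {0: "generic", 1: "read", 2: "write",
           3: "address/command", 4: "scrub"}


def decode_status(status):
    """Return a dict with the flags, corrected count and error code fields."""
    code = status & 0xFFFF  # MCA error code, bits 15:0
    info = {
        "flags": [name for bit, name in FLAGS.items() if (status >> bit) & 1],
        "corrected_count": (status >> 38) & 0x7FFF,  # bits 52:38
        "mca_code": code,
    }
    # Memory controller errors have bit 7 set and bits 15:8 clear.
    if code & 0xFF80 == 0x0080:
        info["memory_op"] = MEM_OPS.get((code >> 4) & 0x7, "reserved")
        info["channel"] = code & 0xF
    return info


if __name__ == "__main__":
    for bank, status in ((7, 0x8C00004000010093), (12, 0x8C000041000800C3)):
        print(bank, decode_status(status))
```

Under those assumptions, the Bank 7 value has VAL, MISCV and ADDRV set and UC clear, with a corrected count of 1 and a read operation on channel 3, which lines up with the kernel's own "COR (1) RD channel 3" summary. The Bank 12 value decodes the same way except that the operation is a scrub ("MS").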
Correlating the address 0x5d419fdc0 with the dmidecode output, it falls within the mapped address range of this DIMM:
Code:
Handle 0x0043, DMI type 17, 34 bytes
Memory Device
Array Handle: 0x002F
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 16 GB
Form Factor: DIMM
Set: None
Locator: P1-DIMMD1
Bank Locator: P0_Node0_Channel3_Dimm0
Type: DDR3
Type Detail: Registered (Buffered)
Speed: 1333 MT/s
Manufacturer: Samsung
Serial Number: 13A2597E
Asset Tag: DimmD1_AssetTag
Part Number: M393B2G70BH0-YH9
Rank: 2
Configured Memory Speed: 1333 MT/s
Handle 0x0044, DMI type 20, 35 bytes
Memory Device Mapped Address
Starting Address: 0x00400000000
Ending Address: 0x007FFFFFFFF
Range Size: 16 GB
Physical Device Handle: 0x0043
Memory Array Mapped Address Handle: 0x0030
Partition Row Position: 1
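The match between the MCA address and the mapped range above is just a range comparison. A small Python sketch of that check follows, using only the values from the dmidecode output. The function name `dimm_for_address` and the `ranges` dictionary are hypothetical, and the check assumes the SMBIOS type 20 range reflects the real physical layout, which may not hold if the memory controller interleaves across channels.

```python
# Sketch only: find which DIMM's SMBIOS mapped range contains an address.
# Names are hypothetical; values are copied from the dmidecode output above.


def dimm_for_address(addr, ranges):
    """Return the locator whose inclusive [start, end] range contains addr."""
    for locator, (start, end) in ranges.items():
        if start <= addr <= end:
            return locator
    return None


# Locator -> (Starting Address, Ending Address) from the DMI type 20 record
ranges = {"P1-DIMMD1": (0x00400000000, 0x007FFFFFFFF)}

if __name__ == "__main__":
    print(dimm_for_address(0x5D419FDC0, ranges))
```

In this case the Bank Locator (P0_Node0_Channel3_Dimm0) also names channel 3, the same channel the MCA lines report, so the two sources at least agree with each other.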
What kind of error is that? Is it an actual error or just a warning? The memory is ECC, so I'm not sure what's happening. (Two different banks are mentioned, 7 and 12, but both with the same address.)
I am also seeing some CRC errors with ZFS, and I wonder whether this could be related.