[Image: ASRock AM4 Pro4 Motherboard]
Specifically, ASUS AM3 boards support ECC mode, and I have been using it for some years now. I have never seen an error reported during this time, but I am located near sea level, where cosmic ray flux is lower than at altitude.
There is also Machine Check Exception (MCE) support on these platforms, but motherboard support is hit and miss. MCEs are useful for tracking errors in CPU caches and other parts, which helps prevent data corruption and makes you aware of damaged hardware (mostly the PSU or board VRMs).
Some TLDR for CPUs:
- AMD Phenom I/II and Athlon 64 X2 chips support error reporting through the "edac_mce_amd" module. This module works without ECC and reports cache and other CPU-related errors.
- Athlon II also works.
- For AM4 Ryzen, APUs only support ECC if they are from the "Pro" line.
- ASUS AM3 M4A8xx motherboards officially support ECC, but you should not rely on it.
- ASUS AM4 boards with Ryzen report ECC errors through RASDaemon, but only with an up-to-date BIOS.
- Older ASUS AM4 BIOS versions report through kernel methods, but only uncorrectable errors (UE) are logged in '/sys' nodes.
- All ASRock AM4 boards seem to support ECC mode.
- Gigabyte AM4 B550 boards mention ECC mode support.
- Only ECC Unbuffered RAM is supported. (PC3/4-xxxxxE JEDEC specs)
On the kernel side:
- RAM ECC is supported through the "amd64_edac" module.
- CPU error reporting is handled by "edac_mce_amd".
- ASUS M4A87TD/USB3
- ASUS M4A88TD-V EVO/USB3
- ASUS A320M-K
- ASUS EX320M Gaming
- ASUS ROG Strix B450-F
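To check whether the two kernel modules named above ("amd64_edac" and "edac_mce_amd") are loaded, something like the following could work. This is a minimal sketch, assuming the drivers are built as loadable modules; drivers compiled into the kernel do not appear in '/proc/modules', so a "not loaded" result is not conclusive. The function names are my own, not part of any tool.

```python
from pathlib import Path

# Module names taken from the list above.
WANTED_MODULES = ("amd64_edac", "edac_mce_amd")


def loaded_modules(proc_modules_text: str) -> set[str]:
    """Return the module names found in text formatted like /proc/modules."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def edac_status(proc_modules_text: str) -> dict[str, bool]:
    """Map each wanted module name to whether it appears as loaded."""
    present = loaded_modules(proc_modules_text)
    return {name: name in present for name in WANTED_MODULES}


if __name__ == "__main__":
    path = Path("/proc/modules")
    # On systems without /proc/modules, report everything as not loaded.
    text = path.read_text() if path.exists() else ""
    for name, ok in edac_status(text).items():
        print(f"{name}: {'loaded' if ok else 'not loaded'}")
```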
Testing ECC Support
ASUS AM3 Motherboards
[Image: ASUS M4A88T]
The kernel itself only shows messages with no detail, no matter what options are passed to the 'mce' boot parameter:
[Hardware error] Machine Check Exception
From testing, these are corrected errors, but I don't know how the board handles uncorrectable errors, as those are harder to reproduce. There is some level of functionality here, but it seems the kernel will not be aware of memory corruption caused by uncorrectable errors.
The first problem is that there is no additional information on what exactly the error is, so the OS will not know if it needs to kill some process to prevent data corruption. There should be additional lines after the [Hardware error] entry, but the motherboard is not handling the error further.
Also, the '/sys' nodes for the 'mc*' entries of the edac module are not populated with error counts, so you can't really track errors over time without custom scripts that monitor the kernel log.
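To see whether the 'mc*' nodes are populated on a given board, the counters can be read directly. This is a minimal sketch, assuming the usual EDAC sysfs layout of '/sys/devices/system/edac/mc/mc*/ce_count' and 'ue_count'; the function name and the choice to skip unreadable nodes are my own.

```python
from pathlib import Path

# Assumed default location of the EDAC memory controller nodes.
EDAC_MC_ROOT = Path("/sys/devices/system/edac/mc")


def read_edac_counts(root: Path = EDAC_MC_ROOT) -> dict[str, dict[str, int]]:
    """Read corrected (ce) and uncorrected (ue) error counts per controller.

    Returns an empty dict when the EDAC directory does not exist,
    for example when no EDAC driver is loaded.
    """
    counts: dict[str, dict[str, int]] = {}
    if not root.is_dir():
        return counts
    for mc in sorted(root.glob("mc[0-9]*")):
        entry: dict[str, int] = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                # Node missing, unreadable, or not a number: skip it.
                continue
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    for mc, entry in read_edac_counts().items():
        print(mc, entry)
```

Counters that stay at zero while the kernel log shows [Hardware error] entries would match the behaviour described above.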
I don't consider ECC to be fully functional on these boards because of this, though some posts seemed to imply ECC was correctly supported.
These boards also don't report any CPU-related errors. I first became aware of this functionality when a damaged ASRock board started locking up and the errors were reported to the OS. On compatible boards, these show up in the kernel messages in the following format:
[Hardware error] Machine Check Exception logged
[Hardware error] ERROR DETAILS
These do not get recorded in the MCE log but are handled directly by the kernel ('edac_mce_amd' module). This is useful because, on uncorrected errors, the kernel can then discard buffers or kill the process with corrupted data.
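The difference between the two log formats in this section (a bare entry versus an entry followed by details) can be checked with a small script over saved kernel log text. This is a rough sketch, assuming the messages look like the examples above; real kernel output may differ in capitalisation and punctuation, so the match is case-insensitive and tolerates a trailing colon. All names here are hypothetical.

```python
import re

# Matches the "[Hardware error]" tag and captures the rest of the line.
HW_ERROR = re.compile(r"\[Hardware error\]:?\s*(.*)", re.IGNORECASE)
MCE_HEADER = "machine check exception"


def hardware_error_lines(log_text: str) -> list[str]:
    """Return the message part of every [Hardware error] line, in order."""
    found = []
    for line in log_text.splitlines():
        match = HW_ERROR.search(line)
        if match:
            found.append(match.group(1).strip())
    return found


def summarize_mce(log_text: str) -> tuple[int, int]:
    """Return (MCE entries seen, entries followed by a detail line)."""
    messages = hardware_error_lines(log_text)
    events = 0
    detailed = 0
    for i, msg in enumerate(messages):
        if not msg.lower().startswith(MCE_HEADER):
            continue
        events += 1
        nxt = messages[i + 1] if i + 1 < len(messages) else ""
        # A following non-header [Hardware error] line counts as detail.
        if nxt and not nxt.lower().startswith(MCE_HEADER):
            detailed += 1
    return events, detailed
```

Feeding it the saved output of 'dmesg' gives a quick count of how many exceptions arrived with no detail at all.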
Because ASUS does not enable this functionality, you may get some data corruption if the PSU or motherboard VRMs are damaged. I would not rely on this hardware without regularly testing CPU stability with something like Prime95.
ASUS AM4
[Image: ASUS EX320M Gaming]
CPU-related MCE/RAS may require some register tweaking, according to AMD's documentation for these CPUs. I have managed to reproduce CPU crashes by undervolting, with no errors reported in the kernel log or RASDaemon.