From: Hristo Venev <hristo@venev.name>
Date: Thu, 11 May 2023 20:45:07 +0300
Subject: EDAC/amd64: Add support for ECC on family 19h model 60h-7Fh
Git-commit: 6c79e42169fe10308d72051950c411b3524c7aa9
Patch-mainline: v6.5-rc1
References: jsc#PED-7615

The Ryzen 9 7950X uses model 61h. Treat it like EPYC 9004, but with 2
memory channels instead of 12.

With two 32GB dual-rank DIMMs the sizes appear to be reported correctly:

  EDAC MC0: Giving out device to module amd64_edac controller F19h_M60h: DEV 0000:00:18.3 (INTERRUPT)
  EDAC amd64: F19h_M60h detected (node 0).
  EDAC MC: UMC0 chip selects:
  EDAC amd64: MC: 0:     0MB 1:     0MB
  EDAC amd64: MC: 2: 16384MB 3: 16384MB
  EDAC MC: UMC1 chip selects:
  EDAC amd64: MC: 0:     0MB 1:     0MB
  EDAC amd64: MC: 2: 16384MB 3: 16384MB
  AMD64 EDAC driver v3.5.0

ECC errors can also be detected:

  mce: [Hardware Error]: Machine check events logged
  [Hardware Error]: Corrected error, no action required.
  [Hardware Error]: CPU:0 (19:61:2) MC21_STATUS[Over|CE|MiscV|AddrV|-|-|SyndV|CECC|-|-|-]: 0xdc2040000400011b
  [Hardware Error]: Error Addr: 0x00000007ff7e93c0
  [Hardware Error]: IPID: 0x0000009600050f00, Syndrome: 0x000100010a801203
  [Hardware Error]: Unified Memory Controller Ext. Error Code: 0, DRAM ECC error.
  EDAC MC0: 1 CE Cannot decode normalized address on mc#0csrow#3channel#0 (csrow:3 channel:0 page:0x0 offset:0x0 grain:64 syndrome:0x1)
  [Hardware Error]: cache level: L3/GEN, tx: GEN, mem-tx: RD
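
For readers who want to check the MC21_STATUS flags above by hand, here
is a minimal userspace sketch. It assumes the MCA_STATUS bit positions
used by the kernel's MCI_STATUS_* definitions (Val=63, Over=62, UC=61,
MiscV=59, AddrV=58, SyndV=53, CECC=46). The macro and function names
are hypothetical and are not part of this patch or of amd64_edac.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed MCA_STATUS bit positions; names are illustrative only. */
#define MCA_STATUS_VAL    (UINT64_C(1) << 63)
#define MCA_STATUS_OVER   (UINT64_C(1) << 62)
#define MCA_STATUS_UC     (UINT64_C(1) << 61)
#define MCA_STATUS_MISCV  (UINT64_C(1) << 59)
#define MCA_STATUS_ADDRV  (UINT64_C(1) << 58)
#define MCA_STATUS_SYNDV  (UINT64_C(1) << 53)
#define MCA_STATUS_CECC   (UINT64_C(1) << 46)

/* Returns 1 for a valid, corrected (not UC) error with the CECC bit set. */
static int mca_is_corrected_ecc(uint64_t status)
{
	return (status & MCA_STATUS_VAL) &&
	       !(status & MCA_STATUS_UC) &&
	       (status & MCA_STATUS_CECC);
}

/* Writes a short "Over|CE|MiscV|..." summary of the selected flags to buf. */
static int mca_format_flags(uint64_t status, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);

	return snprintf(buf, len, "%s|%s|%s|%s|%s|%s",
			(status & MCA_STATUS_OVER)  ? "Over"  : "-",
			(status & MCA_STATUS_UC)    ? "UE"    : "CE",
			(status & MCA_STATUS_MISCV) ? "MiscV" : "-",
			(status & MCA_STATUS_ADDRV) ? "AddrV" : "-",
			(status & MCA_STATUS_SYNDV) ? "SyndV" : "-",
			(status & MCA_STATUS_CECC)  ? "CECC"  : "-");
}
```

Feeding 0xdc2040000400011b from the log into these helpers should
report a corrected ECC error with Over, MiscV, AddrV, SyndV and CECC
set, which agrees with the decoded line printed by the kernel.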

According to Mario Limonciello, the same code should also work for
models 70h-7Fh (see the thread in Link).

  [ bp: Massage, the translation logic updates are pending. ]

Signed-off-by: Hristo Venev <hristo@venev.name>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20230425201239.324476-1-hristo@venev.name
Link: https://lore.kernel.org/r/20230511174506.875153-2-hristo@venev.name

Acked-by: Nikolay Borisov <nik.borisov@suse.com>
---
 drivers/edac/amd64_edac.c |    8 ++++++++
 1 file changed, 8 insertions(+)

--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -3818,6 +3818,14 @@ static int per_family_init(struct amd64_
 		case 0x20 ... 0x2f:
 			pvt->ctl_name			= "F19h_M20h";
 			break;
+		case 0x60 ... 0x6f:
+			pvt->ctl_name			= "F19h_M60h";
+			pvt->flags.zn_regs_v2		= 1;
+			break;
+		case 0x70 ... 0x7f:
+			pvt->ctl_name			= "F19h_M70h";
+			pvt->flags.zn_regs_v2		= 1;
+			break;
 		case 0xa0 ... 0xaf:
 			pvt->ctl_name			= "F19h_MA0h";
 			pvt->max_mcs			= 12;