Blob Blame History Raw
From 61c43829dfeb41f8a849108b23688d25cd9014c8 Mon Sep 17 00:00:00 2001
From: Borislav Petkov <bp@suse.de>
Date: Mon, 19 May 2014 20:38:03 +0200
Subject: [PATCH] MCE, AMD: Check for userspace agent before decoding
Patch-mainline: never, temp mcelog fix
References: bnc#871881

Now that mcelog supports AMD, we want to check whether it is running in
userspace, and, if so, to not decode to dmesg. The moment mcelog gets
unloaded or killed or b0rked, we fallback to decoding to dmesg.

This patch is not upstream because:

Basically, at the time, Thomas Renninger and I decided to have only one
RAS agent on SLE12 and that is mcelog. That's why we need a way to stop
EDAC from decoding an error.

So this small patch simply stops decoding if mcelog is running on the
system. If not, AMD MCEs still get decoded into dmesg by edac_mce_amd.

That's why, by the way, our mcelog can decode AMD MCEs too, which I've
added.

Looking at SLE13, we will move to rasdaemon, most likely and then we can
get rid of that patch.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/mce.h       |    1 +
 arch/x86/kernel/cpu/mcheck/mce.c |    6 ++++++
 drivers/edac/mce_amd.c           |    3 +++
 3 files changed, 10 insertions(+)

--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -329,6 +329,7 @@ struct cper_sec_mem_err;
 extern void apei_mce_report_mem_error(int corrected,
 				      struct cper_sec_mem_err *mem_err);
 
+int mce_get_userspace_consumers_count(void);
 /*
  * Enumerate new IP types and HWID values in AMD processors which support
  * Scalable MCA.
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1846,6 +1846,12 @@ static int mce_chrdev_release(struct ino
 	return 0;
 }
 
+int mce_get_userspace_consumers_count(void)
+{
+	return mce_chrdev_open_count;
+}
+EXPORT_SYMBOL_GPL(mce_get_userspace_consumers_count);
+
 static void collect_tscs(void *data)
 {
 	unsigned long *cpu_tsc = (unsigned long *)data;
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -1053,6 +1053,9 @@ int amd_decode_mce(struct notifier_block
 	struct cpuinfo_x86 *c = &cpu_data(m->extcpu);
 	int ecc;
 
+	if (mce_get_userspace_consumers_count())
+		return NOTIFY_STOP;
+
 	if (amd_filter_mce(m))
 		return NOTIFY_STOP;