RAS: Add a Corrected Errors Collector
Introduce a simple data structure for collecting correctable errors, along with accessors. A more detailed description is in the code itself.

The error decoding is now done with the decoding chain. mce_first_notifier() gets to see the error first, and the CEC decides whether to log it, in which case the rest of the chain doesn't hear about it (basically the main reason for the CE collector), or to continue running the notifiers.

When the CEC hits the action threshold, it will try to soft-offline the page containing the ECC error, and then the whole decoding chain gets to see the error.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170327093304.10683-5-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
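
The flow described in the commit message can be illustrated with a small userspace sketch. It is not the code in drivers/ras/cec.c: the names ce_collector, ce_entry, cec_add(), CEC_SLOTS and ACTION_THRESHOLD are hypothetical, the fixed-size array is a simplification, and the values 8 and 3 are arbitrary. The sketch only shows the decision the message describes: count correctable errors per page frame, absorb them while below the threshold, and pass the error on once the threshold is hit.

```c
#include <stddef.h>
#include <stdint.h>

#define CEC_SLOTS        8   /* hypothetical capacity, not from the patch */
#define ACTION_THRESHOLD 3   /* hypothetical threshold, not from the patch */

/* One tracked page frame and the number of correctable errors seen on it. */
struct ce_entry {
	uint64_t pfn;
	unsigned int count;
};

struct ce_collector {
	struct ce_entry slots[CEC_SLOTS];
	size_t n;                /* number of slots in use */
};

enum cec_verdict {
	CEC_ABSORBED,            /* collector kept the error; skip the rest of the chain */
	CEC_PASS_ON,             /* let the remaining notifiers see the error */
};

/* Record one correctable error on @pfn and decide what the chain should do. */
static enum cec_verdict cec_add(struct ce_collector *c, uint64_t pfn)
{
	for (size_t i = 0; i < c->n; i++) {
		if (c->slots[i].pfn != pfn)
			continue;

		if (++c->slots[i].count < ACTION_THRESHOLD)
			return CEC_ABSORBED;

		/*
		 * Threshold reached. This is where the real collector would
		 * try to soft-offline the page. The sketch just forgets the
		 * entry (swap in the last slot) and lets the chain run.
		 */
		c->slots[i] = c->slots[--c->n];
		return CEC_PASS_ON;
	}

	/* Assumption of this sketch: when full, do not absorb new pages. */
	if (c->n == CEC_SLOTS)
		return CEC_PASS_ON;

	c->slots[c->n++] = (struct ce_entry){ .pfn = pfn, .count = 1 };
	return CEC_ABSORBED;
}
```

With these placeholder values, the first two errors reported for a given page frame are absorbed and the third is passed on to the rest of the chain. A first notifier in the style of mce_first_notifier() would call something like cec_add() and stop the chain only when the error was absorbed.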
Showing
- Documentation/admin-guide/kernel-parameters.txt 6 additions, 0 deletions
- arch/x86/include/asm/mce.h 5 additions, 4 deletions
- arch/x86/kernel/cpu/mcheck/mce.c 115 additions, 76 deletions
- arch/x86/ras/Kconfig 14 additions, 0 deletions
- drivers/ras/Makefile 2 additions, 1 deletion
- drivers/ras/cec.c 532 additions, 0 deletions
- drivers/ras/debugfs.c 1 addition, 1 deletion
- drivers/ras/debugfs.h 8 additions, 0 deletions
- drivers/ras/ras.c 11 additions, 0 deletions
- include/linux/ras.h 12 additions, 1 deletion