On Wed, Oct 05, 2011 at 03:17:27PM +0530, K.Prasad wrote:
[..]
> Fine with me. I see that the various IA32_MCi_Status registers will hold
> information about the error and use that to classify MCEs.
>
> I think the best way to go about is to retain NT_NOCOREDUMP for non-DRAM
> errors also, but use the note-name field in the elf-note and distinguish the
> various types of errors...say, by using names such as "PANIC_MCE_DRAM",
> "PANIC_MCE_CACHE", etc (similar to the error codes described in the Intel
> manual). The upstream tools like 'makedumpfile' and 'crash' will have to
> be taught to parse the elf-note name and act accordingly.

I am assuming that basic MCE error messages are available in the kernel log.
Why can't user space simply scan the log for MCE errors and, in case of an
MCE, save just the log buffer? That way the user gets the MCE information
without trying to save the whole dump, and there is no need for an extra
ELF note.
Thanks
Vivek