On Wed, Oct 05, 2011 at 12:00:38PM -0400, Valdis.Kletnieks(a)vt.edu wrote:
On Wed, 05 Oct 2011 11:52:17 EDT, Vivek Goyal said:
> I am assuming that basic MCE error messages are available in kernel log.
They're not in the kernel log if it's an MCE that causes the kernel to declare
a panic. There are some MCEs where you can retry the operation and continue, and
some where you can get away with poisoning a page and killing the process, and
the rest of the system is OK. But for some you really need to roll over and die
because you can't guarantee kernel integrity anymore.
And at that point, those messages are never gonna make it to syslogd and onto
disk.
AFAICT, and for the sake of getting the MCE info, arguably one could
look for log_buf in the vmcore of the old kernel and try to find the
last lines in the kernel log ringbuffer. They should be the MCE error
information from the do_machine_check MCE handler.
This is all assuming, of course, that we've managed to dump the vmcore
successfully by sidestepping the landmines :-).
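To make the log_buf idea concrete, here is a rough sketch of the unwrapping
step only. It assumes the log buffer is a plain byte ring of text with a
monotonically increasing write index, and that the raw bytes and that index
have already been pulled out of the vmcore by some other means. The function
name and its parameters are hypothetical, not anything the kernel or a dump
tool provides.

```python
def tail_of_ring(buf, log_end, n_lines=20):
    """Return the last n_lines of text from a byte ring buffer.

    buf     -- raw ring contents (assumption: bytes read from log_buf)
    log_end -- total bytes ever written (assumption: monotonic write index)
    """
    size = len(buf)
    if size == 0:
        return []
    if log_end <= size:
        # The ring has not wrapped yet: valid data is buf[0:log_end].
        text = buf[:log_end]
    else:
        # The ring has wrapped: the oldest byte sits at the write position,
        # so rotate the buffer to put the data back in chronological order.
        pos = log_end % size
        text = buf[pos:] + buf[:pos]
    lines = text.decode("ascii", errors="replace").splitlines()
    return lines[-n_lines:]
```

After a wrap, the oldest line in the result may be cut off at the front, which
doesn't matter much here since the interesting part is the tail.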
--
Regards/Gruss,
Boris.