On Tuesday, February 3, 2015 12:53 PM, Dave Anderson <anderson(a)redhat.com> wrote:
I'll move the hardware error check to the bottom, and only use it if there
are no other relevant strings found, and then re-test that configuration.
How about matching the string "Machine Check Exception:" instead? Or can I use
pattern matching?

This is a memory failure at bank 12, which usually indicates that we need to
replace the memory in bank 12.

Internally we have a tool that depends on crash to analyze kernel crashes and
find their causes and solutions. I hope the line
"[Hardware Error]: CPU 14: Machine Check Exception: 5 Bank 12: fe00014b001000c3"
can be matched, instead of the less useful
"Kernel panic - not syncing: Fatal machine check on current CPU"
that is currently selected:

<0>[Hardware Error]: CPU 14: Machine Check Exception: 5 Bank 12: fe00014b001000c3
<0>[Hardware Error]: RIP !INEXACT! 10:<ffffffff810ace8a> {tick_check_idle+0xca/0xe0}
<0>[Hardware Error]: TSC 52e41ed579869d ADDR 53a92b000 MISC 908424000803e8c
<0>[Hardware Error]: PROCESSOR 0:306e4 TIME 1423045186 SOCKET 0 APIC 5
<0>[Hardware Error]: Some CPUs didn't answer in synchronization
<0>[Hardware Error]: Machine check: Invalid
<0>Kernel panic - not syncing: Fatal machine check on current CPU
<4>Pid: 0, comm: swapper Tainted: G M --------------- 2.6.32-431.23.3.el6.YAHOO.20140804.x86_64 #1
<4>Call Trace:
<4> <#MC> [<ffffffff8152866c>] ? panic+0xa7/0x16f
<4> [<ffffffff8102880f>] ? mce_panic+0x20f/0x230
<4> [<ffffffff81029c87>] ? do_machine_check+0x7a7/0xaf0
<4> [<ffffffff810ace8a>] ? tick_check_idle+0xca/0xe0
<4> [<ffffffff8152bc9c>] ? machine_check+0x1c/0x30
<4> [<ffffffff810ace8a>] ? tick_check_idle+0xca/0xe0
<4> <<EOE>> <IRQ> [<ffffffff8107a51c>] ? irq_enter+0x6c/0x80
<4> [<ffffffff8102b1d3>] ? smp_threshold_interrupt+0x13/0x40
<4> [<ffffffff8100bd13>] ? threshold_interrupt+0x13/0x20
<4> <EOI> [<ffffffff812e14ee>] ? intel_idle+0xde/0x170
<4> [<ffffffff812e14d1>] ? intel_idle+0xc1/0x170
<4> [<ffffffff814274a7>] ? cpuidle_idle_call+0xa7/0x140
<4> [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110
<4> [<ffffffff8152219c>] ? start_secondary+0x2ac/0x2ef
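
To make the preference concrete, here is a minimal Python sketch of the selection I have in mind. It is not crash's code, only an illustration: the names MCE_RE, PANIC_RE and pick_panic_line are hypothetical, and the pattern assumes the line layout shown in the log above.

```python
import re

# Hypothetical patterns, based only on the log lines quoted above.
# The "<0>"-style log-level prefix is treated as optional.
MCE_RE = re.compile(
    r"^(?:<\d>)?\[Hardware Error\]: CPU (?P<cpu>\d+): "
    r"Machine Check Exception:\s*(?P<code>[0-9a-fA-F]+) "
    r"Bank (?P<bank>\d+): (?P<status>[0-9a-fA-F]+)"
)
PANIC_RE = re.compile(r"^(?:<\d>)?Kernel panic - not syncing: ")


def pick_panic_line(log_lines):
    """Prefer the first MCE line; otherwise fall back to the first
    generic "Kernel panic" line; return None if neither is present."""
    fallback = None
    for line in log_lines:
        if MCE_RE.match(line):
            return line
        if fallback is None and PANIC_RE.match(line):
            fallback = line
    return fallback


if __name__ == "__main__":
    sample = [
        "<0>[Hardware Error]: CPU 14: Machine Check Exception: 5 Bank 12: fe00014b001000c3",
        "<0>Kernel panic - not syncing: Fatal machine check on current CPU",
    ]
    chosen = pick_panic_line(sample)
    print(chosen)
    print(MCE_RE.match(chosen).group("bank"))
```

With a pattern like this, our tool could also pull the CPU and bank numbers straight out of the matched line instead of re-parsing the log.
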
Thanks,
- Derek