[Crash-utility] Re: crash 4.0-8.9 w/ 2.6.30-rc6

Wednesday, 27 May 2009

On Wed, May 27 2009 at  9:23am -0400,
Dave Anderson <anderson(a)redhat.com> wrote:

...

 ----- "Mike Snitzer" <snitzer(a)redhat.com> wrote:

 > On Wed, May 27 2009 at  8:37am -0400,
 > Dave Anderson <anderson(a)redhat.com> wrote:
 > 
 > > 
 > > ----- "Mike Snitzer" <snitzer(a)redhat.com> wrote:
 > > 
 > > > Hi Dave,
 > > > 
 > > > crash is failing with the following when I try to throw a 2.6.30-rc6
 > > > vmcore at it:
 > > > 
 > > > crash: invalid structure size: x8664_pda
 > > >        FILE: x86_64.c  LINE: 584  FUNCTION: x86_64_cpu_pda_init()
 > > > 
 > > > [/usr/bin/crash] error trace: 449c7f => 4ce815 => 4d00cf => 50936d
 > > > 
 > > >   50936d: SIZE_verify+168
 > > >   4d00cf: (undetermined)
 > > >   4ce815: x86_64_init+3205
 > > >   449c7f: main_loop+152
 > > > 
 > > > I can dig deeper but your help would be very much appreciated.
 > > > 
 > > > Mike
 > > 
 > > The venerable "been-there-since-the-beginning-of-x86_64" x8664_pda
 > > data structure no longer exists.  It was a per-cpu array of a fundamental
 > > data structure that things like "current", the per-cpu magic number, the
 > > cpu number, the current kernel stack pointer, the per-cpu IRQ stack pointer,
 > > etc. all came from:
 > > 
 > > /* Per processor datastructure. %gs points to it while the kernel runs */
 > > struct x8664_pda {
 > >         struct task_struct *pcurrent;   /* Current process */
 > >         unsigned long data_offset;      /* Per cpu data offset from linker address */
 > >         unsigned long kernelstack;  /* top of kernel stack for current */
 > >         unsigned long oldrsp;       /* user rsp for system call */
 > > #if DEBUG_STKSZ > EXCEPTION_STKSZ
 > >         unsigned long debugstack;   /* #DB/#BP stack. */
 > > #endif
 > >         int irqcount;               /* Irq nesting counter. Starts with -1 */
 > >         int cpunumber;              /* Logical CPU number */
 > >         char *irqstackptr;      /* top of irqstack */
 > >         int nodenumber;             /* number of current node */
 > >         unsigned int __softirq_pending;
 > >         unsigned int __nmi_count;       /* number of NMI on this CPUs */
 > >         int mmu_state;
 > >         struct mm_struct *active_mm;
 > >         unsigned apic_timer_irqs;
 > > } ____cacheline_aligned_in_smp;
 > > 
 > > There have been upstream rumblings about replacing it with a more efficient
 > > per-cpu implementation for some time now, but I haven't studied how the new
 > > scheme works yet.  It will be a major re-work for the crash utility, so you're
 > > pretty much out of luck for now.  (Try "gdb vmlinux vmcore" for basic info)
 > 
 > Ah OK.  I was just looking to get a stack trace.  Unfortunately gdb
 > isn't playing nice either:
 > 
 > (gdb) bt
 > #0  kstat_irqs_cpu (irq=<value optimized out>, cpu=2) at kernel/irq/handle.c:555
 > Cannot access memory at address 0xffff88007e5e7d50

 Mike,

 Try the "--minimal" option that the IBM guys put into 4.0-7.1:

          - Implementation of a "--minimal" command line option, which brings
            up a crash session that is restricted to the "log", "dis", "rd",
            "sym", "eval" and "exit" commands.  This option may provide a way to
            extract some minimal/quick information from a corrupted or truncated
            dumpfile, or in situations where one of the several kernel subsystem
            initialization routines, which are not called, would abort the
            crash session.  (sharyath(a)in.ibm.com, sachinp(a)in.ibm.com)

 So just enter this:

  $ crash --minimal vmlinux vmcore

 And you should at least get the kernel trace info with the "log" command.

Very cool, thanks!

Mike
