On Wed, May 27 2009 at 9:23am -0400,
Dave Anderson <anderson(a)redhat.com> wrote:
----- "Mike Snitzer" <snitzer(a)redhat.com> wrote:
> On Wed, May 27 2009 at 8:37am -0400,
> Dave Anderson <anderson(a)redhat.com> wrote:
>
> >
> > ----- "Mike Snitzer" <snitzer(a)redhat.com> wrote:
> >
> > > Hi Dave,
> > >
> > > crash is failing with the following when I try to throw a 2.6.30-rc6
> > > vmcore at it:
> > >
> > > crash: invalid structure size: x8664_pda
> > > FILE: x86_64.c LINE: 584 FUNCTION: x86_64_cpu_pda_init()
> > >
> > > [/usr/bin/crash] error trace: 449c7f => 4ce815 => 4d00cf => 50936d
> > >
> > > 50936d: SIZE_verify+168
> > > 4d00cf: (undetermined)
> > > 4ce815: x86_64_init+3205
> > > 449c7f: main_loop+152
> > >
> > > I can dig deeper but your help would be very much appreciated.
> > >
> > > Mike
> >
> > The venerable "been-there-since-the-beginning-of-x86_64" x8664_pda
> > data structure no longer exists. It was a per-cpu array of a fundamental
> > data structure that things like "current", the per-cpu magic number, the
> > cpu number, the current kernel stack pointer, the per-cpu IRQ stack pointer,
> > etc. all came from:
> >
> > /* Per processor datastructure. %gs points to it while the kernel runs */
> > struct x8664_pda {
> >         struct task_struct *pcurrent;   /* Current process */
> >         unsigned long data_offset;      /* Per cpu data offset from linker address */
> >         unsigned long kernelstack;      /* top of kernel stack for current */
> >         unsigned long oldrsp;           /* user rsp for system call */
> > #if DEBUG_STKSZ > EXCEPTION_STKSZ
> >         unsigned long debugstack;       /* #DB/#BP stack. */
> > #endif
> >         int irqcount;                   /* Irq nesting counter. Starts with -1 */
> >         int cpunumber;                  /* Logical CPU number */
> >         char *irqstackptr;              /* top of irqstack */
> >         int nodenumber;                 /* number of current node */
> >         unsigned int __softirq_pending;
> >         unsigned int __nmi_count;       /* number of NMI on this CPUs */
> >         int mmu_state;
> >         struct mm_struct *active_mm;
> >         unsigned apic_timer_irqs;
> > } ____cacheline_aligned_in_smp;
> >
> > There have been upstream rumblings about replacing it with a more efficient
> > per-cpu implementation for some time now, but I haven't studied how the new
> > scheme works yet. It will be a major re-work for the crash utility, so you're
> > pretty much out of luck for now. (Try "gdb vmlinux vmcore" for basic info)
>
> Ah OK. I was just looking to get a stack trace. Unfortunately gdb
> isn't playing nice either:
>
> (gdb) bt
> #0 kstat_irqs_cpu (irq=<value optimized out>, cpu=2) at kernel/irq/handle.c:555
> Cannot access memory at address 0xffff88007e5e7d50
Mike,
Try the "--minimal" option that the IBM guys put into 4.0-7.1:
- Implementation of a "--minimal" command line option, which brings
  up a crash session that is restricted to the "log", "dis", "rd",
  "sym", "eval" and "exit" commands. This option may provide a way to
  extract some minimal/quick information from a corrupted or truncated
  dumpfile, or in situations where one of the several kernel subsystem
  initialization routines, which are not called, would abort the
  crash session. (sharyath(a)in.ibm.com, sachinp(a)in.ibm.com)
So just enter this:
$ crash --minimal vmlinux vmcore
And you should at least get the kernel trace info with the "log" command.
Very cool, thanks!
Mike