On Wed, May 27 2009 at 8:37am -0400,
Dave Anderson <anderson(a)redhat.com> wrote:
----- "Mike Snitzer" <snitzer(a)redhat.com> wrote:
> Hi Dave,
>
> crash is failing with the following when I try to throw a 2.6.30-rc6
> vmcore at it:
>
> crash: invalid structure size: x8664_pda
> FILE: x86_64.c LINE: 584 FUNCTION: x86_64_cpu_pda_init()
>
> [/usr/bin/crash] error trace: 449c7f => 4ce815 => 4d00cf => 50936d
>
> 50936d: SIZE_verify+168
> 4d00cf: (undetermined)
> 4ce815: x86_64_init+3205
> 449c7f: main_loop+152
>
> I can dig deeper but your help would be very much appreciated.
>
> Mike
The venerable "been-there-since-the-beginning-of-x86_64" x8664_pda
data structure no longer exists. It was a per-cpu array of a fundamental
data structure that things like "current", the per-cpu magic number, the
cpu number, the current kernel stack pointer, the per-cpu IRQ stack pointer,
etc. all came from:
/* Per processor datastructure. %gs points to it while the kernel runs */
struct x8664_pda {
	struct task_struct *pcurrent; /* Current process */
	unsigned long data_offset; /* Per cpu data offset from linker address */
	unsigned long kernelstack; /* top of kernel stack for current */
	unsigned long oldrsp; /* user rsp for system call */
#if DEBUG_STKSZ > EXCEPTION_STKSZ
	unsigned long debugstack; /* #DB/#BP stack. */
#endif
	int irqcount; /* Irq nesting counter. Starts with -1 */
	int cpunumber; /* Logical CPU number */
	char *irqstackptr; /* top of irqstack */
	int nodenumber; /* number of current node */
	unsigned int __softirq_pending;
	unsigned int __nmi_count; /* number of NMI on this CPUs */
	int mmu_state;
	struct mm_struct *active_mm;
	unsigned apic_timer_irqs;
} ____cacheline_aligned_in_smp;
There have been upstream rumblings about replacing it with a more efficient
per-cpu implementation for some time now, but I haven't studied how the new
scheme works yet. It will be a major re-work for the crash utility, so you're
pretty much out of luck for now. (Try "gdb vmlinux vmcore" for basic info)
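As a rough illustration of why so much depended on that structure, here is a
minimal userspace sketch in C of a per-cpu table from which the current task
and the cpu number are read. It is only a model under stated assumptions: the
names pda_sketch, task_sketch, pda_table, pda_of, pda_init, current_pid_on and
NR_CPUS_SKETCH are hypothetical and exist in neither the kernel nor crash, the
struct keeps only a few of the fields quoted above, and a plain array index
stands in for the %gs-based access the kernel comment describes. It does not
model the newer per-cpu scheme.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 4            /* hypothetical CPU count for this sketch */

struct task_sketch {                /* stand-in for struct task_struct */
    int pid;
};

/* Trimmed-down stand-in for struct x8664_pda (a few fields only). */
struct pda_sketch {
    struct task_sketch *pcurrent;   /* current task on this CPU */
    unsigned long kernelstack;      /* top of kernel stack for current */
    int cpunumber;                  /* logical CPU number */
    char *irqstackptr;              /* top of the per-cpu IRQ stack */
};

/* One entry per CPU; zero-initialized because it has static storage. */
struct pda_sketch pda_table[NR_CPUS_SKETCH];

/* Return the per-cpu entry for the given CPU. */
struct pda_sketch *pda_of(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
    return &pda_table[cpu];
}

/* Record which task runs on a CPU and where its kernel stack ends. */
void pda_init(int cpu, struct task_sketch *task, unsigned long stack_top)
{
    struct pda_sketch *pda = pda_of(cpu);

    pda->pcurrent = task;
    pda->kernelstack = stack_top;
    pda->cpunumber = cpu;
    pda->irqstackptr = NULL;        /* no IRQ stack modelled here */
}

/* PID of the task currently on a CPU, or -1 if none was recorded. */
int current_pid_on(int cpu)
{
    struct pda_sketch *pda = pda_of(cpu);

    return pda->pcurrent ? pda->pcurrent->pid : -1;
}
```

A tool that walks a dump in this style needs the structure's size and field
offsets to find each cpu's entry, which is consistent with the size check
failing once the structure is gone from the kernel.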
Ah OK. I was just looking to get a stack trace. Unfortunately gdb
isn't playing nice either:
(gdb) bt
#0 kstat_irqs_cpu (irq=<value optimized out>, cpu=2) at kernel/irq/handle.c:555
Cannot access memory at address 0xffff88007e5e7d50
In the meantime, can you give me a copy of your vmcore? (offline -- note that
I'm forwarding this to the crash-utility mailing list). And I'll start working
on it.
OK, will do.
Mike