This message was bounced due to the size of its attachment;
I've since bumped up the maximum allowable message size:
Re: [Crash-utility] crash 4.0-2.8 fails on 2.6.14-rc5 (EM64T)
Date: Wed, 26 Oct 2005 09:15:47 -0700
From: Badari Pulavarty <pbadari@us.ibm.com>
To: <crash-utility@redhat.com>
On Wed, 2005-10-26 at 11:51 -0400, Dave Anderson wrote:
> >
> > crash: read error: kernel virtual address: ffff8100050eb084  type: "tss_struct ist array"
> >
>
> I see that the 2.6.13 kernel defines its init_tss
> array like so:
>
> DEFINE_PER_CPU(struct tss_struct, init_tss)
> ____cacheline_maxaligned_in_smp;
>
> whereas, the earlier 2.6 kernels do it like this:
>
> DECLARE_PER_CPU(struct tss_struct,init_tss);
>
> If this change modifies the way that per-cpu variable addresses
> are laid out, then I can't tell you what to do without significant
> further investigation. But until proven otherwise, let's presume
> that the calculation of the per-cpu data is done the same way.
>
> There are two places where that error message comes from, both
> in x86_64_ist_init(), but given that the above per-cpu declarations
> are functionally equivalent, there would be the following
> kernel symbol in your vmlinux, verifiable like so:
>
> $ nm -Bn vmlinux | grep per_cpu__init_tss
> ffffffff80502100 D per_cpu__init_tss
> $
>
> If it's not there, crash is hosed, and significant work needs
> to be done to find it. But if the symbol is still intact in
> the 2.6.14 kernel, the failure should have come from an incorrect
> calculation of the vaddr of the init_tss below:
None of the above stuff changed, so we are fine.
> static void
> x86_64_ist_init(void)
> {
>         ...
>
>         } else if (symbol_exists("per_cpu__init_tss")) {
>                 for (c = 0; c < NR_CPUS; c++) {
>                         if ((kt->flags & SMP) && (kt->flags & PER_CPU_OFF)) {
>                                 if (kt->__per_cpu_offset[c] == 0)
>                                         break;
>                                 vaddr = symbol_value("per_cpu__init_tss") +
>                                         kt->__per_cpu_offset[c];
>                         } else
>                                 vaddr = symbol_value("per_cpu__init_tss");
>
>                         vaddr += OFFSET(tss_struct_ist);
>
>                         readmem(vaddr, KVADDR, &ms->stkinfo.ebase[c][0],
>                                 sizeof(ulong) * 7, "tss_struct ist array",
>                                 FAULT_ON_ERROR);
>
Yes. I realized that the problem is due to a messed-up
kt->__per_cpu_offset[c] value. These should be offsets into the array,
so they should be small values. I see huge numbers:

per-cpu offset: 84afdf60

I also realized that this gets set at the lines I touched earlier :(
I can't seem to find out what I screwed up. We are just reading a value
from the kernel structure and setting it.
>
>                         if (ms->stkinfo.ebase[c][0] == 0)
>                                 break;
>                 }
> }
>
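For reference, the address arithmetic in the quoted loop can be sketched in isolation. This is a stand-alone illustration, not crash code: the function name percpu_ist_vaddr() and its parameters are hypothetical stand-ins for symbol_value("per_cpu__init_tss"), kt->__per_cpu_offset[], the SMP/PER_CPU_OFF test, and OFFSET(tss_struct_ist).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper mirroring the arithmetic in x86_64_ist_init():
 * start from the symbol's address, add this CPU's per-cpu offset when
 * per-cpu offsets are in use, then add the offset of the ist member.
 */
static unsigned long
percpu_ist_vaddr(unsigned long symval, const unsigned long *percpu_offset,
                 size_t ncpus, size_t cpu, int use_percpu,
                 unsigned long ist_member_offset)
{
        unsigned long vaddr = symval;

        assert(cpu < ncpus);    /* stay inside the offset table */

        if (use_percpu)
                vaddr += percpu_offset[cpu];

        return vaddr + ist_member_offset;
}
```

With made-up inputs, a symbol value of 0x80502100, a per-cpu offset of 0x2000 and a member offset of 0x24 give 0x80504124. Because the per-cpu offset is added directly to the symbol value, a wrong offset for a CPU moves the result away from that CPU's copy of init_tss, which would be consistent with the read error reported above.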
> I'm also presuming your test kernel is SMP. But I'm wondering whether
> the SMP and PER_CPU_OFF flags are set?
Yes.
> The SMP flag should have been pre-set in kernel_init(), but the
> PER_CPU_OFF flag gets set in x86_64_cpu_pda_init(), which you
> have modified.
>
> You can display the kt->flags contents with a printk in
> x86_64_ist_init().
> If PER_CPU_OFF is not set, then that's probably the issue here.
>
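A flag check of the kind suggested above might look like the following sketch. The bit values EX_SMP and EX_PER_CPU_OFF and both function names are assumptions made for this illustration only; the real SMP and PER_CPU_OFF definitions live in crash's own headers and are not reproduced here.

```c
#include <assert.h>
#include <stdio.h>

/* Illustrative bit values only; NOT the definitions used by crash. */
#define EX_SMP          (0x1UL)
#define EX_PER_CPU_OFF  (0x2UL)

/* Returns 1 when both flags are set, i.e. when the per-cpu branch runs. */
static int
percpu_path_taken(unsigned long flags)
{
        return (flags & EX_SMP) && (flags & EX_PER_CPU_OFF);
}

/* Prints the raw flags word plus the state of the two bits of interest. */
static void
dump_flags(FILE *out, unsigned long flags)
{
        assert(out != NULL);

        fprintf(out, "flags: %lx  SMP: %s  PER_CPU_OFF: %s\n", flags,
                (flags & EX_SMP) ? "set" : "clear",
                (flags & EX_PER_CPU_OFF) ? "set" : "clear");
}
```

Calling dump_flags(stderr, flags) just before the loop would show which branch of the vaddr calculation is about to be used.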
> Can you show your new versions of x86_64_cpu_pda_init() and
> x86_64_get_smp_cpus()?
Here are the new versions of the x86_64 code for your review.
Thanks,
Badari