This message was bounced due to the size of its attachment;
I've since bumped up the maximum allowable message size:
Re: [Crash-utility] crash 4.0-2.8 fails on 2.6.14-rc5 (EM64T)
Date: Wed, 26 Oct 2005 09:15:47 -0700
From: Badari Pulavarty <pbadari(a)us.ibm.com>
To: <crash-utility(a)redhat.com>
On Wed, 2005-10-26 at 11:51 -0400, Dave Anderson wrote:
>
> crash: read error: kernel virtual address: ffff8100050eb084 type:
> "tss_struct ist array"
>
I see that the 2.6.13 kernel defines its init_tss
array like so:
DEFINE_PER_CPU(struct tss_struct, init_tss)
____cacheline_maxaligned_in_smp;
whereas, the earlier 2.6 kernels do it like this:
DECLARE_PER_CPU(struct tss_struct,init_tss);
If this change modifies the way that per-cpu variable addresses
are laid out, then I can't tell you what to do without significant
further investigation. But until proven otherwise, let's presume
that the calculation of the per-cpu data addresses is done the same way.
There are two places where that error message comes from, both
in x86_64_ist_init(), but given that the above per-cpu declarations
are functionally equivalent, there would be the following
kernel symbol in your vmlinux, verifiable like so:
$ nm -Bn vmlinux | grep per_cpu__init_tss
ffffffff80502100 D per_cpu__init_tss
$
If it's not there, crash is hosed, and significant work needs
to be done to find it. But if the symbol is still intact in
the 2.6.14 kernel, the failure should have come from an incorrect
calculation of the vaddr of the init_tss below:
None of the above stuff changed, so we are fine.
static void
x86_64_ist_init(void)
{
        ...
        } else if (symbol_exists("per_cpu__init_tss")) {
                for (c = 0; c < NR_CPUS; c++) {
                        if ((kt->flags & SMP) && (kt->flags & PER_CPU_OFF)) {
                                if (kt->__per_cpu_offset[c] == 0)
                                        break;
                                vaddr = symbol_value("per_cpu__init_tss") +
                                        kt->__per_cpu_offset[c];
                        } else
                                vaddr = symbol_value("per_cpu__init_tss");

                        vaddr += OFFSET(tss_struct_ist);

                        readmem(vaddr, KVADDR, &ms->stkinfo.ebase[c][0],
                                sizeof(ulong) * 7, "tss_struct ist array",
                                FAULT_ON_ERROR);
Yes. I realized that the problem is due to a messed-up
kt->__per_cpu_offset[c] value. These should be offsets into the array,
so they should be small values. Instead I see huge numbers:
per-cpu offset: 84afdf60
I also realized that this gets set at the lines I touched earlier :(
I can't seem to find out what I screwed up. We are just reading a value
from the kernel structure and setting it.
                        if (ms->stkinfo.ebase[c][0] == 0)
                                break;
                }
        }
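To make the arithmetic in the loop above concrete, here is a minimal
standalone sketch of how the address is put together: symbol value, plus
the per-cpu offset, plus the member offset. The helper name
percpu_vaddr() is hypothetical and is not part of crash; the offsets in
the example that follows are made up for illustration and are not taken
from any dumpfile.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of crash: mirrors the arithmetic done
 * in x86_64_ist_init() when both SMP and PER_CPU_OFF are set.
 *
 *   symbol_base   - value of the per_cpu__init_tss symbol
 *   cpu_offset    - per-cpu offset for the cpu of interest
 *   member_offset - offset of the ist array inside struct tss_struct
 */
uint64_t
percpu_vaddr(uint64_t symbol_base, uint64_t cpu_offset,
             uint64_t member_offset)
{
        return symbol_base + cpu_offset + member_offset;
}
```

For example, percpu_vaddr(0xffffffff80502100, 0x10000, 0x24) yields
0xffffffff80512124. Any error in the per-cpu offset is carried straight
into the result, so a bad kt->__per_cpu_offset[c] would hand readmem()
an address that is off by exactly that amount.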
I'm also presuming your test kernel is SMP. But I'm wondering whether
the SMP and PER_CPU_OFF flags are set?
Yes.
The SMP flag should have been pre-set in kernel_init(), but the
PER_CPU_OFF flag gets set in x86_64_cpu_pda_init(), which you
have modified.
You can display the kt->flags contents with a printk in
x86_64_ist_init().
If PER_CPU_OFF is not set, then that's probably the issue here.
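As a rough illustration of that check, the sketch below tests the two
bits and prints the flags word in hex. The DEMO_SMP and DEMO_PER_CPU_OFF
values are placeholders assumed for this example only; the real SMP and
PER_CPU_OFF definitions come from crash's own headers and may differ.
The function names are likewise hypothetical.

```c
#include <stdio.h>

/* Placeholder bit values, assumed for illustration only. */
#define DEMO_SMP         0x1UL
#define DEMO_PER_CPU_OFF 0x2UL

/* Returns 1 only when both bits are set, i.e. when the per-cpu offset
 * branch of the loop would be taken; returns 0 otherwise. */
int
percpu_path_enabled(unsigned long flags)
{
        return (flags & DEMO_SMP) && (flags & DEMO_PER_CPU_OFF);
}

/* Prints the raw flags word and the state of each bit of interest. */
void
dump_flags(FILE *fp, unsigned long flags)
{
        fprintf(fp, "flags: %lx (SMP: %s, PER_CPU_OFF: %s)\n", flags,
                (flags & DEMO_SMP) ? "set" : "clear",
                (flags & DEMO_PER_CPU_OFF) ? "set" : "clear");
}
```

If the second bit is clear, the loop falls back to the plain symbol
value for every cpu, which is the case being asked about here.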
Can you show your new versions of x86_64_cpu_pda_init() and
x86_64_get_smp_cpus()?
Here are the new versions of the x86_64 code for your review.
Thanks,
Badari