Dave Anderson wrote:

This message was bounced due to the size of its attachment;
I've since bumped up the maximum allowable message size:

         Re: [Crash-utility] crash 4.0-2.8 fails on 2.6.14-rc5 (EM64T)
       Date: Wed, 26 Oct 2005 09:15:47 -0700
       From: Badari Pulavarty <pbadari@us.ibm.com>
         To: <crash-utility@redhat.com>

On Wed, 2005-10-26 at 11:51 -0400, Dave Anderson wrote:
> >
> > crash: read error: kernel virtual address: ffff8100050eb084 type: "tss_struct ist array"
> >
>
> I see that the 2.6.13 kernel defines its init_tss
> array like so:
>
> DEFINE_PER_CPU(struct tss_struct, init_tss)
> ____cacheline_maxaligned_in_smp;
>
> whereas, the earlier 2.6 kernels do it like this:
>
> DECLARE_PER_CPU(struct tss_struct,init_tss);
>
> If this change modifies the way that per-cpu variable addresses
> are laid out, then I can't tell you what to do without significant
> further investigation. But until proven otherwise, let's presume
> that the calculation of per-cpu data addresses is done the same way.
>
> There are two places where that error message comes from, both
> in x86_64_ist_init(), but given that the above per-cpu declarations
> are functionally equivalent, there would be the following
> kernel symbol in your vmlinux, verifiable like so:
>
> $ nm -Bn vmlinux | grep per_cpu__init_tss
> ffffffff80502100 D per_cpu__init_tss
> $
>
> If it's not there, crash is hosed, and significant work needs
> to be done to find it. But if the symbol is still intact in
> the 2.6.14 kernel, the failure should have come from an incorrect
> calculation of the vaddr of the init_tss below:

None of the above stuff changed, so we are fine.

I'm still not convinced that the change from DECLARE_PER_CPU
to DEFINE_PER_CPU is not the culprit -- see below...

> static void
> x86_64_ist_init(void)
> {
>                ...
>
>         } else if (symbol_exists("per_cpu__init_tss")) {
>                 for (c = 0; c < NR_CPUS; c++) {
>                         if ((kt->flags & SMP) && (kt->flags & PER_CPU_OFF)) {
>                                 if (kt->__per_cpu_offset[c] == 0)
>                                         break;
>                                 vaddr = symbol_value("per_cpu__init_tss") +
>                                         kt->__per_cpu_offset[c];
>                         } else
>                                 vaddr = symbol_value("per_cpu__init_tss");
>
>                         vaddr += OFFSET(tss_struct_ist);
>
>                         readmem(vaddr, KVADDR, &ms->stkinfo.ebase[c][0],
>                                 sizeof(ulong) * 7, "tss_struct ist array",
>                                 FAULT_ON_ERROR);
>
Yes. I realized that the problem is due to a messed-up
kt->__per_cpu_offset[c] value. These should be offsets into the array,
so they should be small values. I see huge numbers:
per-cpu offset: 84afdf60

Actually, they should be rather large numbers. On my 4-cpu
x86_64 running RHEL4 (2.6.9-era), the offset values can be seen
below in the __per_cpu_offset[NR_CPUS] array:

crash> help -k
...
kernel_version: 2.6.9
   gcc_version: 3.4.4
runq_siblings: 0
__rq_idx[NR_CPUS]: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
__cpu_idx[NR_CPUS]: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
__per_cpu_offset[NR_CPUS]:
    0000010081909f60 0000010081911f60 0000010081919f60 0000010081921f60
    0000000000000000 0000000000000000 0000000000000000 0000000000000000
    0000000000000000 0000000000000000 0000000000000000 0000000000000000
    0000000000000000 0000000000000000 0000000000000000 0000000000000000
    0000000000000000 0000000000000000 0000000000000000 0000000000000000
    0000000000000000 0000000000000000 0000000000000000 0000000000000000
    0000000000000000 0000000000000000 0000000000000000 0000000000000000
    0000000000000000 0000000000000000 0000000000000000 0000000000000000
cpu_flags[NR_CPUS]:0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
crash>

The above can be verified by dumping the x8664_pda data structures
with "mach -c", and looking at the data_offset value:

crash> mach -c | grep data_offset
data_offset = 0x10081909f60,
data_offset = 0x10081911f60,
data_offset = 0x10081919f60,
data_offset = 0x10081921f60,
crash>

In order to access any of the kernel's per-cpu data structures, the
value of the relevant "per_cpu__xxxx" symbol ("per_cpu__init_tss"
in this case) must be added to the per-cpu "data_offset" value found
in the associated x8664_pda structure in order to determine the real
data structure's virtual address.
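
As a minimal sketch of that arithmetic (this is not crash's actual code,
and per_cpu_vaddr() is a hypothetical helper named here only for
illustration), the calculation is a plain unsigned 64-bit addition that
is allowed to wrap around:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not a crash function: compute the virtual
 * address of one cpu's copy of a per-cpu variable.
 *
 *   symbol_value: value of the "per_cpu__xxxx" symbol from the vmlinux
 *   data_offset:  that cpu's data_offset from its x8664_pda
 *
 * Unsigned 64-bit addition wraps modulo 2^64. The wrap is what lets a
 * symbol in the ffffffff8xxxxxxx range plus a large offset land on a
 * lower kernel virtual address.
 */
static uint64_t per_cpu_vaddr(uint64_t symbol_value, uint64_t data_offset)
{
        return symbol_value + data_offset;
}
```

For example, pairing the per_cpu__init_tss value from the nm output
(ffffffff80502100) with the first data_offset above (0x10081909f60)
gives 0x10001e0c060. Those two numbers may well come from different
kernels, so treat the result as an illustration of the arithmetic only.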

That's why I'm wondering whether something changed that's associated
with the change from DECLARE_PER_CPU to DEFINE_PER_CPU?

I also realized that this gets set at the lines I touched earlier :(
I can't seem to find out what I screwed up. We are just reading a value
from the kernel structure and setting it.
>                         if (ms->stkinfo.ebase[c][0] == 0)
>                                 break;
>                 }
>         }
>
> I'm also presuming your test kernel is SMP. But I'm wondering
> whether the SMP and PER_CPU_OFF flags are set?
Yes.
> The SMP flag should have been pre-set in kernel_init(), but the
> PER_CPU_OFF flag gets set in x86_64_cpu_pda_init(), which you
> have modified.
>
> You can display the kt->flags contents with a printf in
> x86_64_ist_init().
> If PER_CPU_OFF is not set, then that's probably the issue here.
>
> Can you show your new versions of x86_64_cpu_pda_init() and
> x86_64_get_smp_cpus()?
Here are the new x86_64 versions for your review.

Your changes look OK. That's why I'm wondering whether the
per-cpu scheme has changed? That's kind of scary, because
the per-cpu offset is used in many processor-independent places
in the crash code, not just here for the init_tss structures.

When your code walks through the x8664_pda, dump out some of
the contents as a sanity check, most notably the data_offset
fields.
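
Something along these lines would do as a sanity check. It is only a
sketch: dump_data_offsets() is a hypothetical debugging helper, not
something that exists in crash, and it assumes the data_offset values
have already been read into a plain array. Like the loop in
x86_64_ist_init(), it treats the first zero entry as the end of the
cpus.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical debugging helper (not part of crash): print each
 * per-cpu data_offset and its distance from the previous one,
 * stopping at the first zero entry.
 * Returns the number of non-zero entries seen.
 */
static size_t dump_data_offsets(FILE *fp, const uint64_t *offsets,
                                size_t ncpus)
{
        size_t c;

        for (c = 0; c < ncpus; c++) {
                if (offsets[c] == 0)
                        break;
                fprintf(fp, "cpu %zu: data_offset = 0x%" PRIx64,
                        c, offsets[c]);
                if (c > 0)
                        fprintf(fp, " (delta 0x%" PRIx64 ")",
                                offsets[c] - offsets[c - 1]);
                fputc('\n', fp);
        }
        return c;
}
```

Fed the four RHEL4 values shown above, it would report 4 cpus with a
delta of 0x8000 between neighbors. Offsets that are tiny, unevenly
spaced, or cut off at 32 bits would stand out right away.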

Thanks,
Dave