On Thu, 2005-10-27 at 13:17 -0400, Dave Anderson wrote:
Badari Pulavarty wrote:
> > That debug output certainly seems to pinpoint the issue at hand,
> > doesn't it? Very interesting...
> >
> > What's strange is that the usage of the cpu_pda[i].data_offset by the
> > per_cpu() macro in "include/asm-x86_64/percpu.h" is unchanged.
> >
> > It's probably something very simple going on here, but I don't have
> > any more ideas at this point.
>
> This is the reply I got from Andi Kleen..
>
> -------- Forwarded Message --------
> From: Andi Kleen <ak(a)suse.de>
> To: Badari Pulavarty <pbadari(a)us.ibm.com>
> Subject: Re: cpu_pda->data_offset changed recently ?
> Date: Thu, 27 Oct 2005 16:58:54 +0200
> On Thursday 27 October 2005 16:53, Badari Pulavarty wrote:
> > Hi Andi,
> >
> > I am trying to fix the "crash" utility to make it work on 2.6.14-rc5.
> > (It's running fine on 2.6.10). It looks like the crash utility reads
> > and uses cpu_pda->data_offset values. It looks like there is a
> > change between 2.6.10 & 2.6.14-rc5 which is causing "data_offset"
> > to be huge values - which is causing "crash" to break.
> >
> > I added printk() to find out why ? As you can see from following
> > what changed - Is this expected ? Please let me know.
>
> bootmem used to allocate from the end of the direct mapping on NUMA
> systems. Now it starts at the beginning, often before the kernel .text.
> This means it is negative. Perfectly legitimate. crash just has to
> handle it.
>
> -Andi
>
> --
>
That's what I thought it looked like, although the x8664_pda.data_offset
field is an "unsigned long". Anyway, if you take any of the per_cpu__xxx
symbols from the 2.6.14 kernel, subtract a cpu data_offset, does it come
up with a legitimate virtual address?

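To make the "negative value in an unsigned long" point concrete, here is a minimal sketch of the arithmetic. It assumes that data_offset is the per-cpu area address minus the link-time start of the per-cpu section, and that per_cpu() adds the offset to the symbol address. All addresses below are made up for illustration and do not come from a real 2.6.14 kernel.

```python
MASK64 = (1 << 64) - 1  # an x86-64 "unsigned long" is 64 bits wide


def as_signed64(value):
    """Reinterpret a 64-bit unsigned value as a signed (two's complement) one."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def per_cpu_address(symbol_addr, data_offset):
    """Add the offset to the symbol address with 64-bit wraparound, as C would."""
    return (symbol_addr + data_offset) & MASK64


if __name__ == "__main__":
    # Hypothetical values, chosen only to illustrate the arithmetic.
    per_cpu_start = 0xFFFFFFFF80500000  # assumed link-time start of the per-cpu section
    symbol_addr = per_cpu_start + 0x40  # assumed address of some per_cpu__xxx symbol
    cpu_area = 0xFFFF810001000000       # assumed bootmem area for one cpu, below .text

    # The area lies below the section start, so the difference is negative
    # and shows up as a huge value when stored in an unsigned long.
    data_offset = (cpu_area - per_cpu_start) & MASK64
    print("data_offset as unsigned long:", hex(data_offset))
    print("data_offset as signed long:  ", as_signed64(data_offset))
    print("per-cpu address for this cpu:", hex(per_cpu_address(symbol_addr, data_offset)))
```

If these assumptions hold, the huge unsigned offset is harmless as long as the addition is allowed to wrap at 64 bits. A tool that widens or range-checks the offset before adding it would see a bogus address instead.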
Unfortunately, I don't know the x86-64 kernel virtual address space
well enough to answer your question.

My understanding is that x86-64 kernel addresses look something like:

    addr: ffffffff80101000

But now (2.6.14-rc5) I do see addresses like:

    pgdat: 0xffff81000000e000

which are causing read problems:

    crash: read error: kernel virtual address: ffff81000000fa90 type: "pglist_data node_next"

I am not sure what these addresses are or whether they are valid.
Is there a way to verify these addresses, through gdb or /dev/kmem
or something like that?
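
One rough way to sanity-check such addresses is to compare them against the region bases of the kernel's virtual memory map. The sketch below assumes the 2.6.14-era x86-64 constants as I recall them (kernel text mapping at 0xffffffff80000000, direct mapping at 0xffff810000000000, vmalloc area starting at 0xffffc20000000000). These values, and the helper names classify() and direct_map_to_phys(), are assumptions for illustration and should be checked against the headers of the kernel actually being debugged.

```python
# Assumed 2.6.14-era x86-64 layout constants; verify against the kernel tree in use.
START_KERNEL_MAP = 0xFFFFFFFF80000000  # assumed base of the kernel text mapping
PAGE_OFFSET = 0xFFFF810000000000       # assumed base of the direct mapping
VMALLOC_START = 0xFFFFC20000000000     # assumed upper bound of the direct-mapping window


def classify(addr):
    """Name the (assumed) region that a kernel virtual address falls into."""
    if addr >= START_KERNEL_MAP:
        return "kernel text mapping"
    if PAGE_OFFSET <= addr < VMALLOC_START:
        return "direct mapping"
    return "unknown"


def direct_map_to_phys(addr):
    """Translate a direct-mapped virtual address to a physical address."""
    if classify(addr) != "direct mapping":
        raise ValueError("not a direct-mapped address: %#x" % addr)
    return addr - PAGE_OFFSET


if __name__ == "__main__":
    # The three addresses quoted in this message.
    for addr in (0xFFFFFFFF80101000, 0xFFFF81000000E000, 0xFFFF81000000FA90):
        print(hex(addr), "->", classify(addr))
```

Under these assumptions the ffff8100... addresses would be ordinary direct-mapped memory at low physical addresses, not garbage. That would be consistent with bootmem now allocating from the beginning of the direct mapping.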
Thanks,
Badari