On Wed, 2006-02-22 at 15:31 -0800, Haren Myneni wrote:
>
>
> crash-utility-bounces(a)redhat.com wrote on 02/22/2006 07:31:43 AM:
>
> > Rachita Kothiyal wrote:
> >
> > >
> > > > Rachita Kothiyal wrote:
> > >
> > >
> > > >
> > > > This happens in get_idle_threads() when perusing the runqueues array,
> > > > where each per-cpu runqueue data structure contains a pointer to the
> > > > idle (swapper) task for that CPU. Now, this process requires that the
> > > > per-cpu address manipulations are working correctly in order to find
> > > > each cpu's runqueue data structure. It looks like the ppc64 change
> > > > for per-cpu data accesses is suspect here:
> > > >
> > > > > > Fix to recognize post-2.6.15 ppc64 kernels moving the per_cpu_offsets
> > > > > > to the "paca" structure. Without this patch, crash fails with the
> > > > > > following error messages: "crash: cannot determine idle task addresses
> > > > > > from init_tasks[] or runqueues[]" and "crash: cannot resolve
> > > > > > init_task_union". (pbadari(a)us.ibm.com)
> > > >
> > >
> > > Right, but I thought this patch fixed this problem.
> > > (I am using crash-4.0-2.21, and it includes this patch)
> >
> > Right -- me too... ;-)
>
> Badari tested his patch on a live system. He can give more information
> anyway.
>
> However, I used his patch for testing a PPC64 vmcore before I posted my
> patch. I did not see any issue when invoking the crash tool. Tested on
> 2.6.16-rc2-gi9.
>
> I will also verify if I have the same vmcore.
>
> Thanks
> Haren
>
Dave,
I used Badari's patch (first version) for my testing on a PPC64 kdump
vmcore. Later, this patch was changed to use paca[CPU#].hw_cpu_id to
determine whether the CPU exists (CPU hotplug case). The reason it
failed on the vmcore is that, when the kdump boot happens, hw_cpu_id is
set to -1 for the secondary CPUs when they are stopped. Hence,
per_cpu_offset is not set for these CPUs, which causes this issue.
Instead of looking at hw_cpu_id, this patch looks at the
corresponding data_offset. If it is 0, the CPU does not exist.
Rachita, please let us know if this is still an issue. Badari, is there
any issue with this patch?
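
For readers following along, the check described above can be sketched
roughly as follows. This is only an illustration, not the code from the
patch or from crash itself: the struct paca_sketch type, the
fill_per_cpu_offsets() helper and the flat per_cpu_offset[] array are
hypothetical names made up for this example, and only the two paca fields
mentioned in this thread are modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of one paca entry. Only the two fields
 * discussed in this thread are modelled; the real structure has more. */
struct paca_sketch {
    int hw_cpu_id;              /* -1 for stopped secondaries after kdump boot */
    unsigned long data_offset;  /* per-cpu data offset; 0 is taken as "no CPU" */
};

/* Fill per_cpu_offset[] from the paca array, deciding CPU presence by
 * data_offset rather than hw_cpu_id. Entries for absent CPUs are zeroed.
 * Returns the number of CPUs treated as present. */
size_t fill_per_cpu_offsets(const struct paca_sketch *paca, size_t ncpus,
                            unsigned long *per_cpu_offset)
{
    size_t present = 0;

    assert(paca != NULL && per_cpu_offset != NULL);

    for (size_t i = 0; i < ncpus; i++) {
        if (paca[i].data_offset == 0) {
            /* No per-cpu area recorded: treat the CPU as nonexistent. */
            per_cpu_offset[i] = 0;
            continue;
        }
        /* hw_cpu_id is deliberately ignored here, so a secondary CPU that
         * was stopped for kdump (hw_cpu_id == -1) is still picked up. */
        per_cpu_offset[i] = paca[i].data_offset;
        present++;
    }
    return present;
}
```

With this kind of check, a secondary CPU whose hw_cpu_id was reset to -1
at kdump time still gets its per_cpu_offset, as long as its data_offset
is non-zero.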
Haren,
Yes. My first version of the patch did exactly this. Later, while
discussing it with you, we decided that checking for "-1" for the CPU ID
is the right way to identify the presence of the CPU.
Yep, you are right: on kdump boot, you do set cpuid to -1 when
they are stopped.
So, I guess I am okay with your fix.
Thanks,
Badari