Dave Anderson wrote:
>
> Sachin,
>
> This may require help from the IBM ppc64 people out there,
> but it appears that the issue at hand has something to do
> with the uniprocessor aspect.
>
> Your vmlinux is an SMP kernel, but I'm guessing that the wrong
> choice of "runq" addresses is being made below:
>
>     if (symbol_exists("per_cpu__runqueues") &&
>         VALID_MEMBER(runqueue_idle)) {
>             runqbuf = GETBUF(SIZE(runqueue));
>             for (i = 0; i < nr_cpus; i++) {
>                     if ((kt->flags & SMP) && (kt->flags & PER_CPU_OFF)) {
>                             runq = symbol_value("per_cpu__runqueues") +
>                                     kt->__per_cpu_offset[i];
>                     } else
>                             runq = symbol_value("per_cpu__runqueues");
>
>                     readmem(runq, KVADDR, runqbuf,
>                             SIZE(runqueue), "runqueues entry (per_cpu)",
>                             FAULT_ON_ERROR);
>                     tasklist[i] = ULONG(runqbuf + OFFSET(runqueue_idle));
>                     if (IS_KVADDR(tasklist[i]))
>                             cnt++;
>
> I don't know what your "kt->flags" is showing at the
> decision point above, but if you hack the code and force
> it to select the "other" runq value, does it work OK?
>
> When CONFIG_SMP is not configured into the kernel, then
> the direct value of "per_cpu__runqueues" is used, whereas
> with SMP kernels, the appropriate offset needs to be
> applied. At least that's how it works (has worked?) with
> the other architectures.
>
> Dave
Looking at ppc64_paca_init(), it appears that this might be
the problem if SMP is not set in kt->flags:
    static void
    ppc64_paca_init()
    {
            ...
            for (i = cpus = 0; i < nr_paca; i++) {
                    div_t val = div(i, BITS_FOR_LONG);
                    /*
                     * CPU online?
                     */
                    if (!(cpu_online_map[val.quot] & (0x1UL << val.rem)))
                            continue;
                    readmem(symbol_value("paca") + (i * SIZE(ppc64_paca)),
                            KVADDR, cpu_paca_buf, SIZE(ppc64_paca),
                            "paca entry", FAULT_ON_ERROR);
                    kt->__per_cpu_offset[i] = ULONG(cpu_paca_buf + data_offset);
                    kt->flags |= PER_CPU_OFF;
                    cpus++;
            }
            kt->cpus = cpus;
            if (kt->cpus > 1)
                    kt->flags |= SMP;
    }
If SMP is not already set coming into this function, and only one
CPU is online so that it does not get set above either, then the
wrong runq pointer would be selected later on in get_idle_threads().
Dave
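To make the suspected failure concrete, here is a small standalone model of the two pieces of logic above. It is only a sketch: the struct, the flag values and the function names (kt_model, model_paca_init, model_runq) are made up for illustration and are not crash's real definitions, and the online-map check is reduced to a simple count of online CPUs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for crash's kt->flags bits; values are illustrative. */
#define SMP          0x1UL
#define PER_CPU_OFF  0x2UL
#define MAX_CPUS     4

struct kt_model {
    unsigned long flags;
    int cpus;
    uint64_t per_cpu_offset[MAX_CPUS];
};

/*
 * Models the tail of ppc64_paca_init(): each online CPU records its
 * per-cpu offset and sets PER_CPU_OFF, but SMP is only added when
 * more than one CPU is online.
 */
static void model_paca_init(struct kt_model *kt, int online_cpus,
                            const uint64_t *offsets)
{
    int i;

    assert(online_cpus >= 0 && online_cpus <= MAX_CPUS);
    for (i = 0; i < online_cpus; i++) {
        kt->per_cpu_offset[i] = offsets[i];
        kt->flags |= PER_CPU_OFF;
    }
    kt->cpus = online_cpus;
    if (kt->cpus > 1)
        kt->flags |= SMP;
}

/*
 * Models the runq selection quoted earlier: the per-cpu offset is
 * applied only when both SMP and PER_CPU_OFF are set.
 */
static uint64_t model_runq(const struct kt_model *kt, uint64_t base, int cpu)
{
    assert(cpu >= 0 && cpu < MAX_CPUS);
    if ((kt->flags & SMP) && (kt->flags & PER_CPU_OFF))
        return base + kt->per_cpu_offset[cpu];
    return base;
}
```

With a single online CPU and SMP not previously set, model_runq() returns the bare symbol value even though an offset was recorded, which is the wrong address for a kernel built with CONFIG_SMP.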
And then there's this code in kernel_init():
        if ((sp1 = symbol_search("__per_cpu_start")) &&
            (sp2 = symbol_search("__per_cpu_end")) &&
            (sp1->type == 'A') && (sp2->type == 'A') &&
            (sp2->value > sp1->value))
                kt->flags |= SMP|PER_CPU_OFF;
On a RHEL5 x86_64:
crash> sym -q __per_cpu_ | grep -e start -e end
ffffffff80603000 (A) __per_cpu_start
ffffffff80607288 (A) __per_cpu_end
crash>
On a RHEL5 x86:
crash> sym -q __per_cpu | grep -e start -e end
c03100a0 (A) __per_cpu_start
c0315ae4 (A) __per_cpu_end
crash>
But on your RHEL5 ppc64 kernel:
# nm -Bn vmlinux | grep __per_cpu
c000000000430100 D __per_cpu_start
c0000000004356f0 D __per_cpu_end
#
So if you remove the two "type == 'A'" qualifiers
from the if statement above, does it work OK?
Dave
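For reference, the effect of dropping the type qualifiers can be sketched in isolation. In nm output, 'A' marks an absolute symbol and 'D' a symbol in the initialized data section, so the ppc64 symbols fail the current test. The struct and function below (sym_model, per_cpu_range_found) are hypothetical stand-ins, not crash's actual symbol-table API; the require_absolute argument switches between the current check and the proposed relaxed one.

```c
#include <stddef.h>

/* Hypothetical model of a symbol-table entry: nm-style type letter plus address. */
struct sym_model {
    char type;
    unsigned long long value;
};

/*
 * Returns 1 if the two symbols describe a non-empty per-cpu region.
 * With require_absolute set, both symbols must also be type 'A',
 * as in the kernel_init() test quoted above.
 */
static int per_cpu_range_found(const struct sym_model *sp1,
                               const struct sym_model *sp2,
                               int require_absolute)
{
    if (sp1 == NULL || sp2 == NULL)
        return 0;
    if (require_absolute && (sp1->type != 'A' || sp2->type != 'A'))
        return 0;
    return sp2->value > sp1->value;
}
```

Fed the ppc64 values shown above (type 'D'), the strict form returns 0 and the relaxed form returns 1, while the x86_64 values (type 'A') pass either way.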