----- Original Message -----
With some changes in the upstream ppc64 kernel starting from 2.6.34,
the
crash utility fails to read vmcore generated by post-v2.6.34 kernel versions.
In v2.6.34 ppc64, the upstream commit 1426d5a3bd07589534286375998c0c8c6fdc5260
(powerpc: Dynamically allocate pacas) now dynamically allocates the paca and
have changed data type of 'paca' symbol from array to pointer. With this change
in place crash utility fails to read vmcore generated for upstream kernel.
This patch fixes the crash tool to get correct address value for paca
symbol depending on its data type.
In v2.6.36 ppc64, the upstream commit fc53b4202e61c7e9008c241933ae282aab8a6082
overwrites the paca pointer variable to point to static paca (during crash
just before kexec) that contains valid data_offset only for crashing cpu.
Hence we can not rely on paca symbol anymore to get valid per cpu data_offset
values. Instead, this version introduces __per_cpu_offset array again which
was removed post v2.6.15 ppc64. This patch checks for existence of
'__per_cpu_offset' symbol before calling ppc64_paca_init().
This fix is backward compatible and works fine with vmcore generated by older
kernel (pre-2.6.34).
Hi Mahesh,
The only question I have re: this patch is that it will prevent
this part at the bottom of ppc64_paca_init() from running on
the newer kernels:
...
switch (map)
{
case POSSIBLE:
if (cpus > kt->cpus) {
i = get_highest_cpu_online() + 1;
if (i > kt->cpus)
kt->cpus = i;
}
break;
case ONLINE:
case PRESENT:
kt->cpus = cpus;
break;
}
if (kt->cpus > 1)
kt->flags |= SMP;
}
The potential kt->cpus override above was precisely this patch:
diff -r1.42 -r1.43
2514c2514,2527
< kt->cpus = cpus;
---
switch (map)
{
case POSSIBLE:
if (cpus > kt->cpus) {
i = get_highest_cpu_online() + 1;
if (i > kt->cpus)
kt->cpus = i;
}
break;
case ONLINE:
case PRESENT:
kt->cpus = cpus;
break;
}
that was introduced in crash-5.0.0:
- Fix for a 4.0-8.11 regression that introduced a bug in determining
the number of cpus in ppc64 kernels when the cpu_possible_[map/mask]
has more cpus than the cpu_online_[map/mask]. In that case, the
kernel contains per-cpu runqueue data and "swapper" tasks for the
extra cpus. Without the patch, on systems with a possible cpu count
that is larger than its online cpu count:
(1) the "sys" command will reflect the possible cpu count.
(2) the "ps" command will show the existent-but-unused
"swapper"
tasks as active on the extra cpus.
(3) the "set" command will allow the current context to be set to
any of the existent-but-unused "swapper" tasks.
(4) the "runq" command will display existent-but-unused runqueue
data for the extra cpus.
(5) the "bt" command on the existent-but-unused "swapper"
tasks will
indicate: "bt: cannot determine NT_PRSTATUS ELF note for active
task: <task>" on dumpfiles, and "(active)" on live
systems.
(anderson(a)redhat.com)
So your patch effectively reverts it for the newer kernels. Since 2.6.36 kernels
have the cpu_possible_mask, I think you should still apply the "case POSSIBLE"
logic above somewhere, correct?
Dave
Signed-off-by: Mahesh Salgaonkar <mahesh(a)linux.vnet.ibm.com>
---
ppc64.c | 34 ++++++++++++++++++++++++++++++++--
1 file changed, 32 insertions(+), 2 deletions(-)
Index: crash-5.1.7/ppc64.c
===================================================================
--- crash-5.1.7.orig/ppc64.c
+++ crash-5.1.7/ppc64.c
@@ -201,7 +201,22 @@ ppc64_init(int when)
machdep->section_size_bits = _SECTION_SIZE_BITS;
machdep->max_physmem_bits = _MAX_PHYSMEM_BITS;
- ppc64_paca_init();
+ /*
+ * starting from v2.6.36 we can not rely on paca structure
+ * to get per cpu data_offset. The upstream commit
+ * fc53b4202e61c7e9008c241933ae282aab8a6082 overwrites the
+ * paca pointer variable to point to static paca that contains
+ * valid data_offset only for crashing cpu.
+ *
+ * But the kernel v2.6.36 ppc64 introduces __per_cpu_offset
+ * symbol which was removed post v2.6.15 ppc64 and now we
+ * get the per cpu data_offset from __per_cpu_offset symbol
+ * during kernel_init() call. Hence for backward (pre-2.6.36)
+ * compatibility, call ppc64_paca_init() only if symbol
+ * __per_cpu_offset does not exist.
+ */
+ if (!symbol_exists("__per_cpu_offset"))
+ ppc64_paca_init();
machdep->vmalloc_start = ppc64_vmalloc_start;
MEMBER_OFFSET_INIT(thread_struct_pg_tables,
"thread_struct", "pg_tables");
@@ -2606,10 +2621,25 @@ ppc64_paca_init(void)
char *cpu_paca_buf;
ulong data_offset;
int map;
+ ulong paca;
if (!symbol_exists("paca"))
error(FATAL, "PPC64: Could not find 'paca' symbol\n");
+ /*
+ * In v2.6.34 ppc64, the upstream commit
+ * 1426d5a3bd07589534286375998c0c8c6fdc5260 (powerpc: Dynamically
+ * allocate pacas) now dynamically allocates the paca and have
+ * changed data type of 'paca' symbol from array to pointer. With
this
+ * change in place crash utility fails to read vmcore generated for
+ * upstream kernel.
+ * Add a check for paca variable data type before accessing.
+ */
+ if (get_symbol_type("paca", NULL, NULL) == TYPE_CODE_PTR)
+ readmem(symbol_value("paca"), KVADDR, &paca, sizeof(ulong),
+ "paca", FAULT_ON_ERROR);
+ else
+ paca = symbol_value("paca");
if (cpu_map_addr("possible"))
map = POSSIBLE;
else if (cpu_map_addr("present"))
@@ -2648,7 +2678,7 @@ ppc64_paca_init(void)
if (!in_cpu_map(map, i))
continue;
- readmem(symbol_value("paca") + (i * SIZE(ppc64_paca)),
+ readmem(paca + (i * SIZE(ppc64_paca)),
KVADDR, cpu_paca_buf, SIZE(ppc64_paca),
"paca entry", FAULT_ON_ERROR);