On 2011-09-15 10:39:12 Thu, Dave Anderson wrote:
----- Original Message -----
> With some changes in the upstream ppc64 kernel starting from 2.6.34, the
> crash utility fails to read vmcore generated by post-v2.6.34 kernel versions.
>
> In v2.6.34 ppc64, the upstream commit 1426d5a3bd07589534286375998c0c8c6fdc5260
> (powerpc: Dynamically allocate pacas) now dynamically allocates the paca and
> have changed data type of 'paca' symbol from array to pointer. With this
change
> in place crash utility fails to read vmcore generated for upstream kernel.
>
> This patch fixes the crash tool to get correct address value for paca
> symbol depending on its data type.
>
> In v2.6.36 ppc64, the upstream commit fc53b4202e61c7e9008c241933ae282aab8a6082
> overwrites the paca pointer variable to point to static paca (during crash
> just before kexec) that contains valid data_offset only for crashing cpu.
> Hence we can not rely on paca symbol anymore to get valid per cpu data_offset
> values. Instead, this version introduces __per_cpu_offset array again which
> was removed post v2.6.15 ppc64. This patch checks for existence of
> '__per_cpu_offset' symbol before calling ppc64_paca_init().
>
> This fix is backward compatible and works fine with vmcore generated by older
> kernel (pre-2.6.34).
Hi Mahesh,
The only question I have re: this patch is that it will prevent
this part at the bottom of ppc64_paca_init() from running on
the newer kernels:
...
switch (map)
{
case POSSIBLE:
if (cpus > kt->cpus) {
i = get_highest_cpu_online() + 1;
if (i > kt->cpus)
kt->cpus = i;
}
break;
case ONLINE:
case PRESENT:
kt->cpus = cpus;
break;
}
if (kt->cpus > 1)
kt->flags |= SMP;
}
The potential kt->cpus override above was precisely this patch:
diff -r1.42 -r1.43
2514c2514,2527
< kt->cpus = cpus;
---
> switch (map)
> {
> case POSSIBLE:
> if (cpus > kt->cpus) {
> i = get_highest_cpu_online() + 1;
> if (i > kt->cpus)
> kt->cpus = i;
> }
> break;
> case ONLINE:
> case PRESENT:
> kt->cpus = cpus;
> break;
> }
that was introduced in crash-5.0.0:
- Fix for a 4.0-8.11 regression that introduced a bug in determining
the number of cpus in ppc64 kernels when the cpu_possible_[map/mask]
has more cpus than the cpu_online_[map/mask]. In that case, the
kernel contains per-cpu runqueue data and "swapper" tasks for the
extra cpus. Without the patch, on systems with a possible cpu count
that is larger than its online cpu count:
(1) the "sys" command will reflect the possible cpu count.
(2) the "ps" command will show the existent-but-unused
"swapper"
tasks as active on the extra cpus.
(3) the "set" command will allow the current context to be set to
any of the existent-but-unused "swapper" tasks.
(4) the "runq" command will display existent-but-unused runqueue
data for the extra cpus.
(5) the "bt" command on the existent-but-unused "swapper"
tasks will
indicate: "bt: cannot determine NT_PRSTATUS ELF note for active
task: <task>" on dumpfiles, and "(active)" on live
systems.
(anderson(a)redhat.com)
So your patch effectively reverts it for the newer kernels. Since 2.6.36 kernels
have the cpu_possible_mask, I think you should still apply the "case POSSIBLE"
logic above somewhere, correct?
Hi Dave,
I see that the generic code kernel_init() sets the kt->cpus to online
cpu count (at kernel.c:307 => kt->cpus = machdep->get_smp_cpus())) correctly,
and
I dont see kt->cpus getting modified after that other than ppc64_paca_init().
Hence, even though this patch skips overriding of kt->cpus, the 'sys'
command correclty reflects the online cpu count for vmcore generated on
system where possible cpus are greater than online cpus.
for ppc64, machdep->get_smp_cpus = ppc64_get_smp_cpus
ppc64.c:
------------------
int
ppc64_get_smp_cpus(void)
{
return get_cpus_online();
}
----------------
However, Any change in generic code in the future would trigger the issue you
are talking about. Hence, I have modified the patch (see below) to split the
ppc64_paca_init function and separate out the paca handling from code that
gathers cpu map info. Now, we only ignore the paca handling and still
gather the cpu map info for newer kernels. Let me know your comments on
this.
Thanks,
-Mahesh.
----------
Crash-Utility: ppc64: Fix a crash issue where it fails to read vmcore
generated by post-v2.6.34 kernel version.
With some changes in the upstream ppc64 kernel starting from 2.6.34, the
crash utility fails to read vmcore generated by post-v2.6.34 kernel versions.
In v2.6.34 ppc64, the upstream commit 1426d5a3bd07589534286375998c0c8c6fdc5260
(powerpc: Dynamically allocate pacas) now dynamically allocates the paca and
have changed data type of 'paca' symbol from array to pointer. With this change
in place crash utility fails to read vmcore generated for upstream kernel.
This patch fixes the crash tool to get correct address value for paca
symbol depending on its data type.
In v2.6.36 ppc64, the upstream commit fc53b4202e61c7e9008c241933ae282aab8a6082
overwrites the paca pointer variable to point to static paca (during crash
just before kexec) that contains valid data_offset only for crashing cpu.
Hence we can not rely on paca symbol anymore to get valid per cpu data_offset
values. Instead, this version introduces __per_cpu_offset array again which
was removed post v2.6.15 ppc64. This patch checks for existence of
'__per_cpu_offset' symbol before calling ppc64_paca_init().
This fix is backward compatible and works fine with vmcore generated by older
kernel (pre-2.6.34).
Change in v2:
- Split the ppc64_paca_init function and separate out paca handling from
code that gathers cpu map info. With this change we only ignore the
paca handling and still gather cpu map info for newer kernels.
Signed-off-by: Mahesh Salgaonkar <mahesh(a)linux.vnet.ibm.com>
---
ppc64.c | 98 ++++++++++++++++++++++++++++++++++++++++++++++++++--------------
1 file changed, 78 insertions(+), 20 deletions(-)
Index: crash-5.1.7/ppc64.c
===================================================================
--- crash-5.1.7.orig/ppc64.c
+++ crash-5.1.7/ppc64.c
@@ -49,7 +49,8 @@ static char * ppc64_check_eframe(struct
static void ppc64_print_eframe(char *, struct ppc64_pt_regs *,
struct bt_info *);
static void parse_cmdline_args(void);
-static void ppc64_paca_init(void);
+static int ppc64_paca_init(int map);
+static void ppc64_init_cpu_info(void);
static void ppc64_clear_machdep_cache(void);
static void ppc64_vmemmap_init(void);
static int ppc64_get_kvaddr_ranges(struct vaddr_range *);
@@ -201,7 +202,7 @@ ppc64_init(int when)
machdep->section_size_bits = _SECTION_SIZE_BITS;
machdep->max_physmem_bits = _MAX_PHYSMEM_BITS;
- ppc64_paca_init();
+ ppc64_init_cpu_info();
machdep->vmalloc_start = ppc64_vmalloc_start;
MEMBER_OFFSET_INIT(thread_struct_pg_tables,
"thread_struct", "pg_tables");
@@ -2596,32 +2597,32 @@ parse_cmdline_args(void)
}
/*
- * Updating any smp-related items that were possibly bypassed
- * or improperly initialized in kernel_init().
+ * Initialize the per cpu data_offset values from paca structure.
*/
-static void
-ppc64_paca_init(void)
+static int
+ppc64_paca_init(int map)
{
int i, cpus, nr_paca;
char *cpu_paca_buf;
ulong data_offset;
- int map;
+ ulong paca;
if (!symbol_exists("paca"))
error(FATAL, "PPC64: Could not find 'paca' symbol\n");
- if (cpu_map_addr("possible"))
- map = POSSIBLE;
- else if (cpu_map_addr("present"))
- map = PRESENT;
- else if (cpu_map_addr("online"))
- map = ONLINE;
- else {
- map = 0;
- error(FATAL,
- "PPC64: cannot find 'cpu_possible_map' or\
- 'cpu_present_map' or 'cpu_online_map' symbols\n");
- }
+ /*
+ * In v2.6.34 ppc64, the upstream commit 1426d5a3 (powerpc: Dynamically
+ * allocate pacas) now dynamically allocates the paca and have
+ * changed data type of 'paca' symbol from array to pointer. With this
+ * change in place crash utility fails to read vmcore generated for
+ * upstream kernel.
+ * Add a check for paca variable data type before accessing.
+ */
+ if (get_symbol_type("paca", NULL, NULL) == TYPE_CODE_PTR)
+ readmem(symbol_value("paca"), KVADDR, &paca, sizeof(ulong),
+ "paca", FAULT_ON_ERROR);
+ else
+ paca = symbol_value("paca");
if (!MEMBER_EXISTS("paca_struct", "data_offset"))
return;
@@ -2648,7 +2649,7 @@ ppc64_paca_init(void)
if (!in_cpu_map(map, i))
continue;
- readmem(symbol_value("paca") + (i * SIZE(ppc64_paca)),
+ readmem(paca + (i * SIZE(ppc64_paca)),
KVADDR, cpu_paca_buf, SIZE(ppc64_paca),
"paca entry", FAULT_ON_ERROR);
@@ -2656,6 +2657,63 @@ ppc64_paca_init(void)
kt->flags |= PER_CPU_OFF;
cpus++;
}
+ return cpus;
+}
+
+static int
+ppc64_get_cpu_map()
+{
+ int map;
+
+ if (cpu_map_addr("possible"))
+ map = POSSIBLE;
+ else if (cpu_map_addr("present"))
+ map = PRESENT;
+ else if (cpu_map_addr("online"))
+ map = ONLINE;
+ else {
+ map = 0;
+ error(FATAL,
+ "PPC64: cannot find 'cpu_possible_map' or\
+ 'cpu_present_map' or 'cpu_online_map' symbols\n");
+ }
+ return map;
+}
+
+/*
+ * Updating any smp-related items that were possibly bypassed
+ * or improperly initialized in kernel_init().
+ */
+static void
+ppc64_init_cpu_info(void)
+{
+ int i, map, cpus, nr_cpus;
+
+ map = ppc64_get_cpu_map();
+ /*
+ * starting from v2.6.36 we can not rely on paca structure to get
+ * per cpu data_offset. The upstream commit fc53b420 overwrites
+ * the paca pointer variable to point to static paca that contains
+ * valid data_offset only for crashing cpu.
+ *
+ * But the kernel v2.6.36 ppc64 introduces __per_cpu_offset symbol
+ * which was removed post v2.6.15 ppc64 and now we get the per cpu
+ * data_offset from __per_cpu_offset symbol during kernel_init()
+ * call. Hence for backward (pre-2.6.36) compatibility, call
+ * ppc64_paca_init() only if symbol __per_cpu_offset does not exist.
+ */
+ if (!symbol_exists("__per_cpu_offset"))
+ cpus = ppc64_paca_init(map);
+ else {
+ if (!(nr_cpus = get_array_length("__per_cpu_offset", NULL, 0)))
+ nr_cpus = (kt->kernel_NR_CPUS ? kt->kernel_NR_CPUS :
+ NR_CPUS);
+ for (i = cpus = 0; i < nr_cpus; i++) {
+ if (!in_cpu_map(map, i))
+ continue;
+ cpus++;
+ }
+ }
switch (map)
{
case POSSIBLE: