On Tue, Oct 4, 2022 at 10:22 PM Lianbo Jiang <lijiang(a)redhat.com> wrote:
Currently, crash fails and exits if the initialization of the
emergency stack information fails. In real customer environments,
a vmcore may sometimes be partially damaged, although such vmcores
are rare. For example:
# ./crash ../3.10.0-1127.18.2.el7.ppc64le/vmcore ../3.10.0-1127.18.2.el7.ppc64le/vmlinux -s
crash: invalid kernel virtual address: 38 type: "paca->emergency_sp"
#
Let's try to keep loading the vmcore if such issues happen, so call
readmem() with RETURN_ON_ERROR instead of FAULT_ON_ERROR,
which allows crash to move on.
Reported-by: Dave Wysochanski <dwysocha(a)redhat.com>
Signed-off-by: Lianbo Jiang <lijiang(a)redhat.com>
---
ppc64.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/ppc64.c b/ppc64.c
index 4ea1f7c0c6f8..f94b402ec64d 100644
--- a/ppc64.c
+++ b/ppc64.c
@@ -1224,13 +1224,13 @@ ppc64_init_paca_info(void)
ulong paca_loc;
readmem(symbol_value("paca_ptrs"), KVADDR, &paca_loc,
sizeof(void *),
- "paca double pointer", FAULT_ON_ERROR);
+ "paca double pointer", RETURN_ON_ERROR);
readmem(paca_loc, KVADDR, paca_ptr, sizeof(void *) * kt->cpus,
- "paca pointers", FAULT_ON_ERROR);
+ "paca pointers", RETURN_ON_ERROR);
} else if (symbol_exists("paca") &&
(get_symbol_type("paca", NULL, NULL) == TYPE_CODE_PTR)) {
readmem(symbol_value("paca"), KVADDR, paca_ptr, sizeof(void *)
* kt->cpus,
- "paca pointers", FAULT_ON_ERROR);
+ "paca pointers", RETURN_ON_ERROR);
} else {
free(paca_ptr);
return;
@@ -1245,7 +1245,7 @@ ppc64_init_paca_info(void)
for (i = 0; i < kt->cpus; i++)
readmem(paca_ptr[i] + offset, KVADDR,
&ms->emergency_sp[i],
sizeof(void *), "paca->emergency_sp",
- FAULT_ON_ERROR);
+ RETURN_ON_ERROR);
}
if (MEMBER_EXISTS("paca_struct", "nmi_emergency_sp")) {
@@ -1256,7 +1256,7 @@ ppc64_init_paca_info(void)
for (i = 0; i < kt->cpus; i++)
readmem(paca_ptr[i] + offset, KVADDR,
&ms->nmi_emergency_sp[i],
sizeof(void *), "paca->nmi_emergency_sp",
- FAULT_ON_ERROR);
+ RETURN_ON_ERROR);
}
if (MEMBER_EXISTS("paca_struct", "mc_emergency_sp")) {
@@ -1267,7 +1267,7 @@ ppc64_init_paca_info(void)
for (i = 0; i < kt->cpus; i++)
readmem(paca_ptr[i] + offset, KVADDR,
&ms->mc_emergency_sp[i],
sizeof(void *), "paca->mc_emergency_sp",
- FAULT_ON_ERROR);
+ RETURN_ON_ERROR);
}
free(paca_ptr);
--
2.37.1
Consider adding a 'Fixes' tag for the patch that introduced this
problem, per the bisect in
https://bugzilla.redhat.com/show_bug.cgi?id=2127525

Fixes: cdd57e8b16ab ("ppc64: handle backtrace when CPU is in an emergency stack")
Other than that, I tested this patch, and crash now loads the vmcore
in question successfully.
I still get many "invalid kernel virtual address" messages like the
one below, but I think that is fine:
...
crash: invalid kernel virtual address: 2d0 type: "paca->mc_emergency_sp"
crash> quit
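
To illustrate why the messages repeat but loading continues, here is a
minimal, self-contained sketch of the return-on-error pattern. It is not
crash's code: sketch_readmem(), sketch_init_emergency_sp(), the fake dump
table, and the addresses are all hypothetical stand-ins, and the 0x38
offset in the usage note is chosen only to echo the address in the error
message from the commit log. The patch itself ignores readmem()'s return
value; counting failures here is an addition for demonstration.

```c
#include <stdint.h>
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for dump contents: the only readable addresses. */
struct sketch_word {
	uint64_t addr;
	uint64_t value;
};

static const struct sketch_word sketch_dump[] = {
	{ 0xc000000000001038ULL, 0xc00000000fff8000ULL },
	{ 0xc000000000003038ULL, 0xc00000000fffc000ULL },
};

/*
 * Hypothetical reader with return-on-error behavior: on a bad address it
 * prints a message and returns 0 instead of aborting the session.
 */
int sketch_readmem(uint64_t addr, uint64_t *out, const char *type)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_dump) / sizeof(sketch_dump[0]); i++) {
		if (sketch_dump[i].addr == addr) {
			*out = sketch_dump[i].value;
			return 1;
		}
	}
	fprintf(stderr,
		"sketch: invalid kernel virtual address: %llx type: \"%s\"\n",
		(unsigned long long)addr, type);
	return 0;
}

/*
 * Read one emergency stack pointer per CPU. A failed read leaves that
 * CPU's entry at 0 and the loop moves on to the next CPU.
 * Returns the number of CPUs whose read failed.
 */
int sketch_init_emergency_sp(const uint64_t *paca_ptr, uint64_t offset,
			     uint64_t *emergency_sp, int cpus)
{
	int i, failed = 0;

	for (i = 0; i < cpus; i++) {
		emergency_sp[i] = 0;
		if (!sketch_readmem(paca_ptr[i] + offset, &emergency_sp[i],
				    "paca->emergency_sp"))
			failed++;
	}
	return failed;
}
```

Calling sketch_init_emergency_sp() with four per-CPU pointers of which
two are 0 (damaged) and an offset of 0x38 prints one message per damaged
CPU, fills in the other two entries, and returns 2. With a
fault-on-error style reader, the first bad read would end the session
instead.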
Tested-and-Reviewed-by: Dave Wysochanski <dwysocha(a)redhat.com>
Good job Lianbo!