On Thu, 6 Aug 2015 11:25:29 -0400 (EDT)
Dave Anderson <anderson(a)redhat.com> wrote:
Re: your dumpfile where the erroneous "panic" address in a random
user task's exception frame register set gets picked up by mistake.
Your original patch request modified the "bt" command used for the
kernel stack searches in panic_search(). But that piece of code
is the last-ditch effort for finding a panic task, which follows
this path:
get_panic_context()
panic_search()
get_dumpfile_panic_task()
get_kdump_panic_task() (requires kdump "crashing_cpu" symbol)
get_diskdump_panic_task() (requires kdump "crashing_cpu" symbol)
On s390 we don't have the "crashing_cpu" symbol in the kernel.
get_active_set_panic_task() (bt -r raw stack dump of active cpus)
...
Only if all of the above fail does panic_search() initiate the
exhaustive walkthrough of all kernel stacks for evidence.
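
For readers following the thread, the ordering described above can be
sketched as a chain of fallbacks. This is a simplified model, not the
crash source: the name first_panic_task(), the finder_fn type, the three
stub functions, and the address 0x1000 are all assumptions made purely
for illustration.

```c
#include <stddef.h>

#define NO_TASK 0UL

/* Hypothetical finder signature: returns a task address or NO_TASK. */
typedef unsigned long (*finder_fn)(void);

/*
 * Try each finder in order and stop at the first one that yields a
 * task. Later (more expensive) finders only run if every earlier one
 * came back empty.
 */
unsigned long first_panic_task(const finder_fn *finders, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        unsigned long task = finders[i]();
        if (task != NO_TASK)
            return task;
    }
    return NO_TASK;
}

/* Stand-ins for the real lookups; the return values are made up. */
unsigned long stub_dumpfile_panic_task(void)     { return NO_TASK; }
unsigned long stub_active_set_panic_task(void)   { return NO_TASK; }
unsigned long stub_exhaustive_stack_search(void) { return 0x1000UL; }
```

In this model the exhaustive search is simply the last entry in the
array, so it is reached only when both cheaper stubs return NO_TASK.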
Since you have gotten that far, I'm wondering whether your
target dumpfile with the faulty "panic" address is from an
s390x "live dump"? In that case, there can never be any task
with any such evidence, making the backtrace search a waste of
time to begin with.
The "problem" dump is an s390 stand-alone dump of a hanging system.
All CPUs have been in "psw_idle" when the dump was generated:
PID: 0 TASK: c50f38 CPU: 0 COMMAND: "swapper/0"
LOWCORE INFO:
-psw : 0x0706c00180000000 0x000000000084410e
-function : psw_idle at 84410e
[snip]
#0 [00c1fe70] arch_cpu_idle at 104d4a
#1 [00c1fe90] cpu_startup_entry at 180430
#2 [00c1fee8] start_kernel at d1fb10
#3 [00c1ff60] _stext at 100020
And if so, I'm thinking that since s390x will have set the LIVE_DUMP
flag, if get_dumpfile_panic_task() returns NO_TASK, then
panic_search() should just return a NULL to get_panic_context()
if it's a live dump, which will just default to the idle task on
cpu 0.
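
The suggested control flow can be modelled roughly as follows. This is
a hypothetical sketch only: struct model_dump, model_panic_search() and
model_get_panic_context() are invented names, tasks are represented as
plain addresses rather than crash's task contexts, and NO_TASK stands
in for the NULL return mentioned above.

```c
#define NO_TASK 0UL

/* Hypothetical, flattened view of the state the lookup depends on. */
struct model_dump {
    int           live_dump;            /* nonzero if LIVE_DUMP is set   */
    unsigned long dumpfile_panic_task;  /* dumpfile lookup result        */
    unsigned long stack_search_result;  /* what a stack walk would find  */
    unsigned long cpu0_idle_task;       /* default context               */
    int           stack_search_calls;   /* counts expensive walks        */
};

unsigned long model_panic_search(struct model_dump *d)
{
    if (d->dumpfile_panic_task != NO_TASK)
        return d->dumpfile_panic_task;

    /* A live dump has no panic task, so skip the stack walk. */
    if (d->live_dump)
        return NO_TASK;

    d->stack_search_calls++;
    return d->stack_search_result;
}

unsigned long model_get_panic_context(struct model_dump *d)
{
    unsigned long task = model_panic_search(d);

    /* With no panic task, fall back to the idle task on cpu 0. */
    return task != NO_TASK ? task : d->cpu0_idle_task;
}
```

For a live dump the model never increments stack_search_calls and the
context ends up as the cpu 0 idle task.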
Although it does not solve the above problem, it makes sense for
live dumps. What about the following patch?
---
crash: do not search panic tasks for live dumps
Always return "NO_TASK" if the "LIVE_DUMP" flag is set because live
dumps cannot have a panic task.
Signed-off-by: Michael Holzheu <holzheu(a)linux.vnet.ibm.com>
---
task.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/task.c
+++ b/task.c
@@ -6726,7 +6726,10 @@ get_dumpfile_panic_task(void)
{
ulong task;
- if (NETDUMP_DUMPFILE()) {
+ if (pc->flags2 & LIVE_DUMP) {
+ /* No panic task because system itself created the dump */
+ return NO_TASK;
+ } else if (NETDUMP_DUMPFILE()) {
task = pc->flags & REM_NETDUMP ?
tt->panic_task : get_netdump_panic_task();
if (task)