----- Original Message -----
On Mon, 10 Aug 2015 10:32:12 -0400 (EDT)
Dave Anderson <anderson(a)redhat.com> wrote:
>
>
> ----- Original Message -----
> >
> > On Thu, 6 Aug 2015 11:25:29 -0400 (EDT)
> > Dave Anderson <anderson(a)redhat.com> wrote:
> >
> > > Re: your dumpfile where the erroneous "panic" address in a random user
> > > task's exception frame register set gets picked up by mistake.
> > >
> > > Your original patch request modified the "bt" command used for the
> > > kernel stack searches in panic_search(). But that piece of code
> > > is the last-ditch effort for finding a panic task, which follows
> > > this path:
> > >
> > > get_panic_context()
> > > panic_search()
> > > get_dumpfile_panic_task()
> > > get_kdump_panic_task() (requires kdump "crashing_cpu" symbol)
> > > get_diskdump_panic_task() (requires kdump "crashing_cpu" symbol)
> >
> > On s390 we don't have the "crashing_cpu" symbol in the kernel.
> >
> > > get_active_set_panic_task() (bt -r raw stack dump of active cpus)
> > > ...
> > >
> > > Only if all of the above fail, does panic_search() initiate the
> > > exhaustive walkthrough of all kernel stacks for evidence.
> > >
> > > Since you have gotten that far, I'm wondering whether your
> > > target dumpfile with the faulty "panic" address is from an
> > > s390x "live dump"? In that case, there can never be any task
> > > with any such evidence, making the backtrace search a waste of
> > > time to begin with.
> >
> > The "problem" dump is a s390 stand-alone dump of a hanging system.
> > All CPUs have been in "psw_idle" when the dump was generated:
> >
> > PID: 0 TASK: c50f38 CPU: 0 COMMAND: "swapper/0"
> > LOWCORE INFO:
> > -psw : 0x0706c00180000000 0x000000000084410e
> > -function : psw_idle at 84410e
> >
> > [snip]
> >
> > #0 [00c1fe70] arch_cpu_idle at 104d4a
> > #1 [00c1fe90] cpu_startup_entry at 180430
> > #2 [00c1fee8] start_kernel at d1fb10
> > #3 [00c1ff60] _stext at 100020
> >
> >
> > >
> > > And if so, I'm thinking that since s390x will have the LIVE_DUMP
> > > flag set, if get_dumpfile_panic_task() returns NO_TASK, then
> > > panic_search() should just return a NULL to get_panic_context()
> > > if it's a live dump, which will just default to the idle task on
> > > cpu 0.
> >
> > Although it does not solve the above problem it makes sense for
> > live dumps. What about the following patch?
> > ---
> > crash: do not search panic tasks for live dumps
> >
> > Always return "NO_TASK" if the "LIVE_DUMP" flag is set because live dumps
> > cannot have a panic task.
> >
> > Signed-off-by: Michael Holzheu <holzheu(a)linux.vnet.ibm.com>
> > ---
> > task.c | 5 ++++-
> > 1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > --- a/task.c
> > +++ b/task.c
> > @@ -6726,7 +6726,10 @@ get_dumpfile_panic_task(void)
> > {
> > ulong task;
> >
> > - if (NETDUMP_DUMPFILE()) {
> > + if (pc->flags2 & LIVE_DUMP) {
> > + /* No panic task because system itself created the dump */
> > + return NO_TASK;
> > + } else if (NETDUMP_DUMPFILE()) {
> > task = pc->flags & REM_NETDUMP ?
> > tt->panic_task : get_netdump_panic_task();
> > if (task)
> >
>
> That makes sense, but I'm going to move the LIVE_DUMP check farther down
> in get_dumpfile_panic_task() to just before the get_active_set() call.
>
Makes sense. That was also my first idea.
> The reason for that is that another type of "LIVE_DUMP" is from the snap.so extension
> module, and in that case, get_kdump_panic_task() finds and returns the "crash"
> task that was running the snap command on the live system.
>
> Clarify something else for me: are there actually two types of live dumps
> that can be taken by an s390x? There is the "zgetdump" facility, but is
> there also another type that is taken by the firmware and/or the
> hypervisor?
With the zgetdump tool we create live dumps from /dev/mem or /dev/crash.
These dumps get the LIVE_DUMP flag indicating that data is not consistent.
Besides this, we have two other non-disruptive live dump features:
- VMDUMP for z/VM guests
- Virsh dump for KVM guests
In contrast to the zgetdump method, here the guest system is stopped
to get consistent snapshots. Therefore I think it is fine to *not* set
the LIVE_DUMP flag.
Besides those live dump mechanisms (and kdump), we have our stand-alone dump
tools for DASD and SCSI. These dump methods are also "Linux independent" and
therefore can produce dumps without panic tasks.
You can read more on s390 dump in the documents below:
* http://www.vm.ibm.com/education/lvc/LVC1219.pdf
* http://www-01.ibm.com/support/knowledgecenter/linuxonibm/liaaf/lnz_r_dt.h...
Michael
OK, so from what I understand, there still can be s390x dumpfiles which have no
indication of the panic task or cpu (if there is one) in their headers, and
therefore may try the "bt -r" type search of the active tasks via raw_stack_dump()
in get_active_set_panic_task(), and if that fails, fall back to the "bt -t" search
of all tasks in panic_search().
In those cases, I suppose you could:
(1) restrict the raw_stack_dump() parameters in get_active_set_panic_task() to
    exclude the user register dump at the top of the stack, and
(2) plug in a MACHDEP_BT_TEXT handler for the s390x instead of using the generic
    version, and in that case, could prevent the search from entering the
    user-space register dump at the top of the stack, or
(2a) replace "bt -t" with just "bt" in panic_search() for s390x as you did in the
    original patch.
But (1) and (2) are not fool-proof, because even the kernel-only part of the stack
could simply contain "numbers" that by dumb luck fall into the zero-based virtual
address range of panic, crash_kexec, etc., and return a false positive. So I don't
know how that can be made absolutely reliable.
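
To illustrate why a raw text-address scan can produce a false positive, here is a
minimal, self-contained sketch. It is not crash utility code: find_text_hit(),
struct text_range, the "panic" address range and the last stack word are all
made up for illustration. Only the first three stack words are taken from the
backtrace quoted earlier in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical text range of one "evidence" symbol such as panic(). */
struct text_range {
    uint64_t start;
    uint64_t end;   /* exclusive */
};

/*
 * Return the index of the first stack word whose value falls inside the
 * symbol's text range, or -1 if no word does.  This is a simplified model
 * of a raw stack scan, not the crash utility's implementation.
 */
static int find_text_hit(const uint64_t *stack, size_t nwords,
                         struct text_range sym)
{
    for (size_t i = 0; i < nwords; i++) {
        if (stack[i] >= sym.start && stack[i] < sym.end)
            return (int)i;
    }
    return -1;
}

/* Made-up address range standing in for panic(). */
static const struct text_range panic_range = { 0x7f0000, 0x7f0200 };

/*
 * Words 0..2: kernel return addresses from the idle-task backtrace above.
 * Word 3: an invented value standing in for a saved user register that
 * happens to land inside panic_range.
 */
static const uint64_t sample_stack[] = {
    0x104d4a, 0x180430, 0xd1fb10, 0x7f0040
};
```

Scanning all four words with find_text_hit(sample_stack, 4, panic_range) reports
a hit at index 3, even though no panic happened. Restricting the scan to the first
three words avoids that particular hit, but nothing stops an ordinary kernel stack
word from holding a similar value.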
But at least with dumpfiles that have the live dump magic number (and I'm still
not clear which of the 4 types do so), the simple LIVE_DUMP-check patch covers
them. I'm not sure whether it's worth doing anything beyond that.
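
For reference, the ordering discussed earlier in the thread (header-based lookup
first, then the LIVE_DUMP check, and only then the raw stack search) can be
sketched as below. This is a simplified model under stated assumptions, not the
actual task.c code: struct dump_ctx, find_panic_task(), the stub finders and the
LIVE_DUMP bit value are hypothetical stand-ins.

```c
#include <stddef.h>

#define NO_TASK   0UL
#define LIVE_DUMP 0x1UL   /* placeholder bit, not the real flag value */

typedef unsigned long (*task_finder)(void);

/* Hypothetical stand-in for the state consulted during the search. */
struct dump_ctx {
    unsigned long flags2;
    task_finder header_finder;      /* models a header/symbol based lookup */
    task_finder active_set_finder;  /* models the raw active-cpu stack search */
};

static unsigned long find_panic_task(const struct dump_ctx *ctx)
{
    unsigned long task;

    /* A header-based answer wins, even for a live dump (the snap.so case). */
    if (ctx->header_finder && (task = ctx->header_finder()) != NO_TASK)
        return task;

    /* A live dump has no panic task, so skip the stack searches entirely. */
    if (ctx->flags2 & LIVE_DUMP)
        return NO_TASK;

    if (ctx->active_set_finder)
        return ctx->active_set_finder();

    return NO_TASK;
}

/* Stub finders used only to exercise the ordering. */
static unsigned long header_hit(void)  { return 0xc50f38UL; }
static unsigned long header_miss(void) { return NO_TASK; }
static unsigned long stack_hit(void)   { return 0x1000UL; }
```

With these stubs, a live dump whose header lookup succeeds still returns that
task, a live dump whose header lookup fails returns NO_TASK without touching the
stack search, and a non-live dump falls through to the stack search.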
Dave