(2012/06/20 3:25), Dave Anderson wrote:
----- Original Message -----
>
> OK, now I'm getting confused...
>
> The more I look at this patch, the more confused I get...
>
> During initialization, the ELF notes contained in the
> dumpfile file header are scanned, and if an NT_PRSTATUS
> note is seen, a pointer to its location in the dumpfile
> is saved in dd->nt_prstatus_percpu[num] and the "num"
> of valid notes is kept in dd->num_prstatus_notes.
>
> If the dd->num_prstatus_notes is equal to the online cpu
> count, then it is presumed that there is a one-to-one
> relationship, where the cpu number can be used as the
> index into the dd->nt_prstatus_percpu[num] array.
>
> If the number of notes is not equal to the number of online
> cpus, then the "mapping" function is called, where if a
> cpu is found to be offline, then its (incorrectly) associated
> entry in the dd->nt_prstatus_percpu[num] array is "pushed up"
> to the next higher entry.  But the dd->num_prstatus_notes
> does not seem to get incremented to reflect that move, so
> it seems like diskdump_get_prstatus_percpu() can
> possibly return NULL when there actually is a relevant
> NT_PRSTATUS note.
dd->num_prstatus_notes is equal to the number of NT_PRSTATUS notes in the dump file,
and unless nt_prstatus_percpu[] is filled contiguously from 0,
diskdump_get_prstatus_percpu() can indeed return NULL.
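To make the failure mode concrete, here is a minimal sketch of the lookup as I
understand it (a simplified illustration, not the actual diskdump.c code; the
struct and names are abbreviated):
---------
#include <stddef.h>

#define NR_CPUS_SKETCH 4096

/* Simplified sketch of the per-cpu NT_PRSTATUS bookkeeping being discussed.
 * The header scan fills the array contiguously from index 0, so indexing
 * by cpu number only works when notes and online cpus line up one-to-one. */
struct dd_sketch {
        void *nt_prstatus_percpu[NR_CPUS_SKETCH]; /* pointers into dumpfile header */
        int num_prstatus_notes;                    /* how many notes the scan found */
};

static void *get_prstatus_percpu_sketch(struct dd_sketch *dd, int cpu)
{
        if ((cpu < 0) || (cpu >= dd->num_prstatus_notes))
                return NULL;    /* the NULL case described above */
        return dd->nt_prstatus_percpu[cpu];
}
---------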
> That seems to be a bug (?), but it's not particularly important,
> because for x86 and x86_64, the data in the NT_PRSTATUS notes is
> only used as the starting point for backtraces if the PC/SP pair
> cannot be determined otherwise, which is the case virtually all of
> the time.  So the registers found in the NT_PRSTATUS notes are
> pretty much useless...
It seems to be a minor bug: if some of the cpus are offline at the time of the panic,
the same number of cpus cannot be backtraced, counting from the tail, because NULL is returned.
I think the current task's PC/SP cannot be obtained correctly from thread_struct,
so the NT_PRSTATUS notes need to be used for them instead.
But anyway, a panic while cpus are offline is a very unlikely condition,
much like a failed NMI.
> Now, to complicate matters, your patch does not look at the
> NT_PRSTATUS notes in the dumpfile header, but instead looks
> at the base kernel's original notes, and verifies their
> contents, and correlates what's found there against what was
> found in the dumpfile?  So I don't understand what you are
> attempting to do -- what is the difference between the notes
> that are copied into the dumpfile vs. what you are looking at
> in the base kernel?
First, my patch tries to find the tainted note areas in the base kernel.
Such notes cannot be detected from the dumpfile; the notes that were found
are simply stored contiguously into nt_prstatus_percpu[0 .. number of notes found].
I just want to repair nt_prstatus_percpu[#index] so that it corresponds to cpu#index.
[example]
  cpu#0    cpu#1      cpu#2      cpu#3
  saved    tainted    tainted    saved      [by looking at the base kernel]
=> nt_prstatus_percpu[0] = cpu#0's note, nt_prstatus_percpu[1] = cpu#3's note
I want to repair the array like this:
=> nt_prstatus_percpu[0] = cpu#0's note, nt_prstatus_percpu[3] = cpu#3's note
Since the ELF note format does not contain information about which cpu saved it,
I think there is no way to identify the mapping between a cpu and
nt_prstatus_percpu[] from the ELF notes alone when the two counts are not equal.
That's why I try to resolve it from the base kernel area.
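Roughly, the repair I have in mind would look like the following (an illustrative
sketch only, not the actual patch; the function name and the note_saved[] input
are invented for the example):
---------
#include <stddef.h>

/* Illustrative sketch only (not the actual patch): move the notes that the
 * header scan stored contiguously at [0 .. found-1] into the slots matching
 * their real cpu numbers, skipping cpus whose note area in the base kernel
 * was found to be tainted (i.e. never saved). */
static void remap_prstatus_percpu(void *percpu[], int nr_cpus,
                                  const int note_saved[])
{
        int cpu, k = 0;

        /* number of cpus that actually saved a note */
        for (cpu = 0; cpu < nr_cpus; cpu++)
                if (note_saved[cpu])
                        k++;

        /* Walk cpus from highest to lowest.  The k-th saved cpu's note
         * currently sits at percpu[k-1] and its destination index is always
         * >= k-1, so no entry is overwritten before it has been moved.
         * With the example above (cpu#0 and cpu#3 saved), the note at [1]
         * moves to [3] and [1] is cleared. */
        for (cpu = nr_cpus - 1; cpu >= 0; cpu--) {
                if (!note_saved[cpu])
                        continue;
                k--;
                percpu[cpu] = percpu[k];
                if (cpu != k)
                        percpu[k] = NULL;
        }
}
---------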
> I'm also wondering what would happen in your case if there
> were a combination of "lost" notes *and* offline cpus?  How
> would that work?
I think offline cpus never ack the IPI, so their notes are always lost (that case does not concern me).
I am only assuming the case where a cpu was online but could not ack the IPI
for some reason while it had interrupts off.
> So at this point I really don't want to add this patch
> at all because it touches common code, and I don't want to
> risk breaking the other arches.  Nobody has ever reported
> any "lost" cpus so far, probably because the kdump facility
> uses non-maskable NMIs to shut down the non-panicking cpus.
> This is such a highly-unlikely corner case that it does
> not even seem worth addressing for fear of breaking something
> else.
I can agree; the x86 architecture has the NMI, which is the most reliable way to
deliver interrupts to other cpus, while powerpc may not have an NMI.
I am going to rework the patch so that it is effective only on the ppc architecture.
> I didn't look at the reasoning behind why you ran into a
> segmentation violation, but since the PPC code path would be:
>
> ...
> back_trace()
> get_diskdump_regs()
> get_diskdump_regs_ppc()
get_diskdump_regs_ppc() touches dd->nt_prstatus_percpu[bt->tc->processor]
when "bt" is for the panic task.
---------
if (KDUMP_CMPRS_VALID() &&
    (bt->task == tt->panic_task ||
     (is_task_active(bt->task) && (dd->num_prstatus_notes > 1)))) {
        note = (Elf32_Nhdr*) dd->nt_prstatus_percpu[bt->tc->processor];
        if (!note)
                error(FATAL,
                    "cannot determine NT_PRSTATUS ELF note "
                    "for %s task: %lx\n",
                    (bt->task == tt->panic_task) ? "panic" : "active",
                    bt->task);
        len = sizeof(Elf32_Nhdr);
        len = roundup(len + note->n_namesz, 4);
        bt->machdep = (void *)((char *)note + len +
                MEMBER_OFFSET("elf_prstatus", "pr_reg"));
}

machdep->get_stack_frame(bt, eip, esp);
---------
In my dumpfile's case, bt->tc->processor is 2, but its note is contained at
dd->nt_prstatus_percpu[0], so

    note = (Elf32_Nhdr*) dd->nt_prstatus_percpu[2];

does not point at the note instance "bt" expects.
dd->nt_prstatus_percpu[] is allocated with malloc() and is not initialized to 0,
so when note->n_namesz is dereferenced, it causes the segmentation violation.
That malloc() was changed to calloc() in my first patch.
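To illustrate why the zeroed allocation matters (a trivial example, not the
diskdump.c code itself):
---------
#include <stdlib.h>

/* With malloc() the slots that never receive a note hold indeterminate
 * garbage, so the "if (!note)" check above does not fire and the later
 * note->n_namesz dereference can fault.  With calloc() the unfilled slots
 * stay NULL and the error path is taken instead. */
static void **alloc_prstatus_percpu(size_t nr_cpus)
{
        return calloc(nr_cpus, sizeof(void *));  /* zero-initialized */
}
---------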
There is another problem, which also appears in get_netdump_regs_ppc().
dd->num_prstatus_notes is 1 in my dumpfile although there were 8 online cpus.
When "bt" has to rely on the second condition, "is_task_active(bt->task)",
get_diskdump_regs_ppc() unfortunately falls through to machdep->get_stack_frame()
for active tasks, and that is sure to fail.
Reporting "cannot determine NT_PRSTATUS ELF note ..." would be the proper behavior.
> Perhaps you can rework your patch so that it is segregated
> to PPC only?
Yes, I'm going to rework it; thank you for your reviews.
I'm modifying the difficult parts this time, so they are very helpful.
I'll do more tests before sending the reworked PPC-only patch.
I also have to study kexec-tools and makedumpfile further to understand
how NT_PRSTATUS and the other notes are treated; I haven't understood that
well enough yet.
Toshi
> Dave
--
Crash-utility mailing list
Crash-utility(a)redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility