Hi Dave and all,
Thanks a lot for your suggestion.
According to the feedbacks, I made the following main changes since v2:
1) Add a new option --no_elf_notes. If specified, it'll skip processing
   elf notes in kdump-compressed dump files.
2) Dump notes related information including total number of notes, their
   buffers' pointer value and finally offset in the dump file.
3) Move nt_prstatus_percpu and nt_prstatus_percpu to diskdump_data struct.
   However, bt command may need sp/ip from the notes later, so I guess
   the buffer should not be freed after diskdump_read_header().
So the v3 patch is attached and we're looking forward to your opinions.
Thanks,
Wang Chao
Sent on 2011-4-26 23:15, Dave Anderson wrote:
 
 
 ----- Original Message -----
> Hi Dave and list,
>
> I guess you still remember that a few weeks ago Wen has sent a patch which intends
> to use register values in NT_PRSTATUS notes for backtracing if no panic task was
> found. Thanks for your review and the suggestions are very useful.
>
> However, Wen was busy with other work for these days, so I'll continue with this
> work and now the 2rd version patch is attached.
>
> v2: Address review feedbacks.
> 1) Set up a flag in pc->flags2 and it's determined during the diskdump file
init procedure.
> 2) Seperate code according to the that flag.
> 3) Also, we've done some tests on dumpfile of xen kernel and the trouble
described
> in the previous mail was gone.
>
> So we're looking forward to your feedback and if you still have any problems
with
> it, please tell us.
>
> Thanks,
> Wang Chao
 
 Hello Wang,
 
 I haven't had a chance to run a complete test of this latest patch,
 but I do have several suggestions.
 
 diskdump.c:
 
 Put the nt_prstatus_percpu array pointer and the num_prstatus_notes
 into the diskdump_data structure -- like you did with the notes_buf.
 Make nt_prstatus_percpu a pointer that gets allocated and initialized
 only if necessary -- because if it's not used, it's a waste of space.
 
 In read_dump_header(), when dd->notes_buf is malloc()'d, malloc() the
 dd->nt_prstatus_percpu array. 
 
 And at the bottom of read_dump_header(), free both the dd->notes_buf
 and the dd->nt_prstatus_percpu array if they exist, under the "err:"
 label:
 
 err:
    ...
    if (dd->notes_buf)
 	free(dd->notes_buf);
    if (dd->nt_prstatus_percpu)
         free(dd->nt_prstatus_percpu;
 
 In order to be able to debug/understand why things may fail, especially
 due to the exclusion of prstatus notes of offline cpus, it would be 
 very helpful to be able to see the dd->nt_prstatus pointers for 
 each cpu.  Similar to what is done with the vmcoreinfo data, can 
 you add a section to __diskdump_dump_header():
 
                 if (dh->header_version >= 3) {
                         fprintf(fp, "   offset_vmcoreinfo: %lx\n",
                                 (ulong)dd->sub_header_kdump->offset_vmcoreinfo);
                         fprintf(fp, "     size_vmcoreinfo: %lu\n",
                                 dd->sub_header_kdump->size_vmcoreinfo);
                         if (dd->sub_header_kdump->offset_vmcoreinfo &&
                                 dd->sub_header_kdump->size_vmcoreinfo) {
                                 dump_vmcoreinfo(fp);
                         }
                 }
                 if (dh->header_version >= 4) {
                         fprintf(fp, "         offset_note: %lx\n",
                                 (ulong)dd->sub_header_kdump->offset_note);
                         fprintf(fp, "           size_note: %lu\n",
                                 dd->sub_header_kdump->size_note);
 here ==================> 
                 }
 
 Call a function that dumps the file offset of each cpu's nt_prstatus data.  
 Then, if necessary, any cpu's register set can be read with "rd -f".
 
 
 netdump.c:
 
 In get_netdump_regs_x86(), both of these settings are incorrect:
 
                 user_regs = get_regs_from_note((char *)note, &ip, &sp);
      =====>     bt->flags |= BT_KDUMP_ELF_REGS;
                 if (is_kernel_text(ip) &&
                     (((sp >= GET_STACKBASE(bt->task)) &&
                       (sp < GET_STACKTOP(bt->task))) ||
                     in_alternate_stack(bt->tc->processor, sp))) {
                         *eip = ip;
                         *esp = sp;
      =====>             bt->flags |= BT_KERNEL_SPACE;
                         return;
                 }
 
 Neither of these flags should be set there.  BT_KDUMP_ELF_REGS
 is only applicable to x86_64, and BT_KERNEL_SPACE is only
 appicable to kvmdump dumpfiles.
 
 Finally, in the interest of paranoia, give the user the capability
 of *not* using this facility.  In main.c, create a "--no_elf_notes"
 option (similar to "--zero_excluded"), and have it set a NO_ELF_NOTES
 bit in the globally-accessible "*diskdump_flags".  As an example, see
 how ZERO_EXCLUDED is set, used, and accessed.  Then in read_dump_header()
 skip the new ELF setup if that flag is set.  It will not be necessary
 to go as creating a new environment variable like "zero_excluded", but
 it would be nice to be able to restrict its use on the crash invocation
 command line.
  
 Thanks,
   Dave