Haren Myneni wrote:
Rachita Kothiyal wrote:

>On Thu, Feb 23, 2006 at 09:49:37AM -0500, Dave Anderson wrote:
>
>
>>Ok, then I guess I'll take that as a thumbs-up.
>>
>>Waiting on Rachita's go-ahead...
>>
>>
>
>Dave,
>
>After the application of the patch (posted by Haren)
>on crash-4.0-2.21, I am now able to open the dump using crash
>for analysis.
>
>The following may be unrelated to the present discussion, but
>it is an observation:
>
>When I do 'bt -a' I get the following error on one of the cpus:
>
>PID: 2871   TASK: c000000161d05800  CPU: 4   COMMAND: "klogd"
>bt: invalid kernel virtual address: ff807a50  type: "Regs NIP value"
>
>
Rachita,
    As I mentioned before, this task should be running in user space.
You should see a similar stack trace even when using GDB. It would be
better to give a proper error message here.
 

Is ff807a50 typically a legitimate user-space stack address
in ppc64 user VM?  You could probably run the address
through IN_TASK_VMA(), and if it is a valid user-space
stack address, just indicate that the process was running
in user-space.
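
A rough sketch of that check, in standalone C rather than actual crash code. It assumes the task's user mappings are already available as a simple array. The struct vma_range type, both function names, and the message strings are hypothetical stand-ins. In crash itself the lookup would go through IN_TASK_VMA() against the task's real VMA list from the dump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one user-space mapping of a task. */
struct vma_range {
    uint64_t start;   /* first address of the mapping (inclusive) */
    uint64_t end;     /* one past the last address (exclusive) */
};

/* Return true if addr falls inside any of the given mappings. */
bool addr_in_task_vma(const struct vma_range *vmas, size_t count, uint64_t addr)
{
    assert(vmas != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (addr >= vmas[i].start && addr < vmas[i].end)
            return true;
    }
    return false;
}

/*
 * Choose the message for a saved NIP that is not a kernel address:
 * if it lies in one of the task's user mappings, report that the task
 * was in user space instead of flagging a bad kernel address.
 */
const char *describe_nip(const struct vma_range *vmas, size_t count, uint64_t nip)
{
    return addr_in_task_vma(vmas, count, nip)
        ? "process was running in user space"
        : "invalid kernel virtual address";
}
```

With a made-up mapping that covers ff807a50, describe_nip() returns the user-space message. With no covering mapping it falls back to the existing style of error.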

Now I understand why you (ppc64) dump the register set
first, because all the other processor types would show
a stack trace emanating from user-space down into the
reception of the IP interrupt issued by the panicking
processor.
 

 
About your other issue: I could not reproduce it.

crash 4.0-2.21
Copyright (C) 2002, 2003, 2004, 2005, 2006  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005  Fujitsu Limited
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.

GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "powerpc64-unknown-linux-gnu"...

crash: pglist_data.node_mem_map structure member does not exist.
crash: certain memory-related commands will fail or display invalid data

      KERNEL: /home/hbabu/2616-rc2-k1/vmlinux
    DUMPFILE: /home/vmcore_2616_rc2_0207
        CPUS: 2
        DATE: Tue Feb  7 16:56:08 2006
      UPTIME: 00:00:09
LOAD AVERAGE: 0.05, 0.24, 0.12
       TASKS: 57
    NODENAME: elm3a135
     RELEASE: 2.6.16-rc2-kexec-k1
     VERSION: #6 SMP Tue Feb 7 16:46:10 PST 2006
     MACHINE: ppc64  (unknown Mhz)
      MEMORY: 2.9 GB
       PANIC: "SysRq : Trigger a crashdump"
         PID: 11076
     COMMAND: "kpanic"
        TASK: c00000000bc6d800  [THREAD_INFO: c0000000ac504000]
         CPU: 1
       STATE: TASK_RUNNING (SYSRQ)

crash> bt
PID: 11076  TASK: c00000000bc6d800  CPU: 1   COMMAND: "kpanic"

 R0:  0000000000000000    R1:  c0000000ac507970    R2:  c00000000077a4a0
 R3:  c0000000ac5079e0    R4:  0000000000000000    R5:  0000000000000000
 R6:  756d700d0a657220    R7:  6120637261736864    R8:  0000000000000000
 R9:  c0000000007b0fa0    R10: 0000000000000000    R11: c0000000007b0fa8
 R12: 8000000000001032    R13: c0000000005a5d80    R14: 0000000000000000
 R15: 0000000000000000    R16: 00000000100bbf08    R17: 00000000100bbeb8
 R18: 0000000010070000    R19: 0000000000000000    R20: 0000000010046720
 R21: 000000000000001f    R22: 00000000100040e8    R23: 0000000010004d74
 R24: 8000000000009032    R25: 0000000000000000    R26: 0000000000000000
 R27: 0000000000000063    R28: 0000000000000009    R29: 0000000000000000
 R30: c0000000005e1560    R31: c0000000b96dd000
 NIP: c0000000000777a8    MSR: 8000000000001032    OR3: c0000000ac7202f8
 CTR: c000000000278b04    LR:  c000000000278b18    XER: 0000000000000000
 CCR: c0000000ac507b90    MQ:  0000000000000000    DAR: 0000000000000063
 DSISR: 0000000000000009     Syscall Result: 0000000000000000
 NIP [c0000000000777a8] .crash_kexec
 LR  [c000000000278b18] .sysrq_handle_crashdump

 #0 [c0000000ac507970] .crash_kexec at c0000000000777d0
 #1 [c0000000ac507b50] .sysrq_handle_crashdump at c000000000278b18
 #2 [c0000000ac507bd0] .__handle_sysrq at c0000000002789c0
 #3 [c0000000ac507c80] .write_sysrq_trigger at c000000000105478
 #4 [c0000000ac507d00] .vfs_write at c0000000000b72ec
 #5 [c0000000ac507d90] .sys_write at c0000000000b74c4
 #6 [c0000000ac507e30] syscall_exit at c0000000000086f8
 syscall  [c01] exception frame:
 R0:  0000000000000004    R1:  00000000ffd109d0    R2:  000000004001ee60
 R3:  0000000000000001    R4:  000000001004f4a8    R5:  0000000000000002
 R6:  000000001004f3a8    R7:  0000000000000011    R8:  000000001004f530
 R9:  0000000000000000    R10: 0000000000000000    R11: 0000000000000000
 R12: 0000000000000000    R13: 000000001004c9d8
 NIP: 000000000ff691e8    MSR: 000000000200f032    OR3: 0000000000000001
 CTR: 00000000100040ec    LR:  000000001000432c    XER: 0000000020000000
 CCR: 0000000048008448    MQ:  c00000000077a4a0    DAR: 00000000100040ec
 DSISR: 0000000040000000     Syscall Result: 0000000000000000

crash> set -c 0
    PID: 0
COMMAND: "swapper"
   TASK: c0000000005a5050  (1 of 2)  [THREAD_INFO: c000000000558000]
    CPU: 0
  STATE: TASK_RUNNING (ACTIVE)
crash> bt
PID: 0      TASK: c0000000005a5050  CPU: 0   COMMAND: "swapper"

 R0:  0000000000000000    R1:  c00000000055bd80    R2:  c00000000077a4a0
 R3:  0000000000000000    R4:  c0000000005a5350    R5:  0000000000000002
 R6:  0000000024004042    R7:  0000000000000000    R8:  c00000000055ba00
 R9:  c0000000005a4e88    R10: 0000008000000000    R11: 00003fef00100649
 R12: 0000000028004028    R13: c0000000005a5b80
 NIP: c000000000018648    MSR: 8000000000009032    OR3: 0000000000000000
 CTR: 0000000000000000    LR:  c0000000000186b8    XER: 0000000020000000
 CCR: 0000000044004042    MQ:  c0000000005a5050    DAR: c0000000b780b780
 DSISR: c0000000000186b8     Syscall Result: 0000000000000000
 NIP [c000000000018648] .default_idle

 #0 [c00000000055bd80] .default_idle at c0000000000186b8
 #1 [c00000000055be00] .cpu_idle at c0000000000184f4
 #2 [c00000000055be70] .rest_init at c0000000000092f4
 #3 [c00000000055bef0] .start_kernel at c000000000502760
 #4 [c00000000055bf90] .hmt_init at c000000000008574
crash> set -c 1
    PID: 11076
COMMAND: "kpanic"
   TASK: c00000000bc6d800  [THREAD_INFO: c0000000ac504000]
    CPU: 1
  STATE: TASK_RUNNING (SYSRQ)
crash> bt
PID: 11076  TASK: c00000000bc6d800  CPU: 1   COMMAND: "kpanic"

 R0:  0000000000000000    R1:  c0000000ac507970    R2:  c00000000077a4a0
 R3:  c0000000ac5079e0    R4:  0000000000000000    R5:  0000000000000000
 R6:  756d700d0a657220    R7:  6120637261736864    R8:  0000000000000000
 R9:  c0000000007b0fa0    R10: 0000000000000000    R11: c0000000007b0fa8
 R12: 8000000000001032    R13: c0000000005a5d80    R14: 0000000000000000
 R15: 0000000000000000    R16: 00000000100bbf08    R17: 00000000100bbeb8
 R18: 0000000010070000    R19: 0000000000000000    R20: 0000000010046720
 R21: 000000000000001f    R22: 00000000100040e8    R23: 0000000010004d74
 R24: 8000000000009032    R25: 0000000000000000    R26: 0000000000000000
 R27: 0000000000000063    R28: 0000000000000009    R29: 0000000000000000
 R30: c0000000005e1560    R31: c0000000b96dd000
 NIP: c0000000000777a8    MSR: 8000000000001032    OR3: c0000000ac7202f8
 CTR: c000000000278b04    LR:  c000000000278b18    XER: 0000000000000000
 CCR: c0000000ac507b90    MQ:  0000000000000000    DAR: 0000000000000063
 DSISR: 0000000000000009     Syscall Result: 0000000000000000
 NIP [c0000000000777a8] .crash_kexec
 LR  [c000000000278b18] .sysrq_handle_crashdump

 #0 [c0000000ac507970] .crash_kexec at c0000000000777d0
 #1 [c0000000ac507b50] .sysrq_handle_crashdump at c000000000278b18
 #2 [c0000000ac507bd0] .__handle_sysrq at c0000000002789c0
 #3 [c0000000ac507c80] .write_sysrq_trigger at c000000000105478
 #4 [c0000000ac507d00] .vfs_write at c0000000000b72ec
 #5 [c0000000ac507d90] .sys_write at c0000000000b74c4
 #6 [c0000000ac507e30] syscall_exit at c0000000000086f8
 syscall  [c01] exception frame:
 R0:  0000000000000004    R1:  00000000ffd109d0    R2:  000000004001ee60
 R3:  0000000000000001    R4:  000000001004f4a8    R5:  0000000000000002
 R6:  000000001004f3a8    R7:  0000000000000011    R8:  000000001004f530
 R9:  0000000000000000    R10: 0000000000000000    R11: 0000000000000000
 R12: 0000000000000000    R13: 000000001004c9d8
 NIP: 000000000ff691e8    MSR: 000000000200f032    OR3: 0000000000000001
 CTR: 00000000100040ec    LR:  000000001000432c    XER: 0000000020000000
 CCR: 0000000048008448    MQ:  c00000000077a4a0    DAR: 00000000100040ec
 DSISR: 0000000040000000     Syscall Result: 0000000000000000

crash>

This issue is probably showing up on your system because it has 8 CPUs,
whereas my system has only 2. We need to investigate.
 

That's all I could think of as well.  Rachita also didn't mention
whether "set <task|pid>" works on that same task, and whether a
backtrace can then be obtained.  But a crash-gdb backtrace would be helpful.
 
Dave, I tested very few commands on the PPC64 vmcore, whereas Rachita is
doing more testing. We might see some bugs that I have not encountered.
We will get back to you with patches as we find bugs.
 
That's understood and not a problem -- especially on kernels
that are beyond the RHEL4 era.  Do you want me to go ahead
and put out a new release with your paca fix?

Dave