Re: [Crash-utility] Unable to switch stack frames while using crash

Wednesday, 15 June 2011

Hi,

Thank you Dave for your time and help. 

As suggested, I will update my crash utility first and then go about analyzing the dump. 

...
>> I believe that something like this might work?: 
...
>>  $ makedumpfile -c -d 31 -x vmlinux_temp vmcore-old vmcore-new

I tried the command you suggested:

  makedumpfile -c -d 31 -x vmlinux_temp vmcore vmcore-new

and got the following error message:

  The kernel version is not supported.
  The created dumpfile may be incomplete.
  check_release: Can't get the kernel version

Should I update the makedumpfile utility as well, or will updating crash alone be enough?

...
>> Are you trying to re-create an ELF style dumpfile on purpose?

I tried to recreate the vmcore file in ELF format because I cannot get access to the
original uncompressed ELF dump file, which is on the customer's machine.

Thanks and Regards
Shashidhara

-----Original Message-----
From: crash-utility-bounces(a)redhat.com [mailto:crash-utility-bounces@redhat.com] On Behalf
Of Dave Anderson
Sent: Wednesday, June 15, 2011 9:33 PM
To: Discussion list for crash utility usage,maintenance and development
Subject: Re: [Crash-utility] Unable to switch stack frames while using crash

----- Original Message -----
...
 Hi Dave,

 Thanks for the help. I have further input regarding query 3. Please
 help.

 > 3) I want to retrieve the address of a data structure in the current
 > context. How can it be done? I tried using struct command, but it did
 > not help

 The struct command needs the correct virtual address of the structure
 you're trying to view. So I presume you're asking how to find the address
 of the data structure? If that's true, you're going to have to be a lot
 more specific.

 >> I need to find the virtual address of the structure tty, of type
 >> struct tty_struct, which is passed as an argument to the function
 >> n_tty_read. Below is the corresponding stack trace.

 >>PID: 13366 TASK: ffff88031b60d580 CPU: 1 COMMAND: "telnet"
 >> #0 [ffff88031ce759d0] machine_kexec at ffffffff81024486
 >> #1 [ffff88031ce75a40] crash_kexec at ffffffff8107e230
 >> #2 [ffff88031ce75b20] oops_end at ffffffff8100fa38
 >> #3 [ffff88031ce75b50] no_context at ffffffff8102d801
 >> #4 [ffff88031ce75ba0] __bad_area_nosemaphore at ffffffff8102d9c9
 >> #5 [ffff88031ce75c70] bad_area at ffffffff8102da41
 >> #6 [ffff88031ce75ca0] do_page_fault at ffffffff8102dd19
 >> #7 [ffff88031ce75cf0] page_fault at ffffffff812d7425
 >> #8 [ffff88031ce75d78] n_tty_read at ffffffff811f03b3
 >> #9 [ffff88031ce75ec0] tty_read at ffffffff811ebf7e
 >> #10 [ffff88031ce75f10] vfs_read at ffffffff810ebcc8
 >> #11 [ffff88031ce75f40] sys_read at ffffffff810ebe48
 >> #12 [ffff88031ce75f80] system_call_fastpath at ffffffff8100bbc2
 >> RIP: 00007ffff716b9e0 RSP: 00007fffffffdfc0 RFLAGS: 00010212
 >> RAX: 0000000000000000 RBX: ffffffff8100bbc2 RCX: 0000000000000000
 >> RDX: 0000000000001ff6 RSI: 000000000061c02a RDI: 0000000000000000
 >> RBP: 0000000000001ff6 R8: 0000000000000000 R9: 0000000000000000
 >> R10: 0000000000616680 R11: 0000000000000246 R12: 0000000000000000
 >> R13: 0000000000000001 R14: 000000000061c02a R15: 00000000006178a0
 >> ORIG_RAX: 0000000000000000 CS: 0033 SS: 002b 

First I would update your crash utility so that you have the exception
frame dump that was a result of the page fault, because it's possible that
the tty structure pointer is in the register dump.  But anyway, without
knowing the kernel version, it's hard to pinpoint exactly which instruction
in n_tty_read() generated the page fault.  Was the bad address generated
because the tty structure pointer was NULL?  And again, with an updated
crash utility, you'll get more information with respect to the register
contents at the time of the page fault, and also you might get some help
finding it with "bt -F".  I'm not sure where the tty structure gets
allocated from -- is it statically allocated, or is it allocated from
one of the "size-xxx" slab caches, etc.?
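
As a rough illustration of "the pointer may be in the register dump", here is
a small Python sketch that lists which registers in an exception frame hold
values that look like x86_64 kernel virtual addresses. The register values are
copied from the "reported" backtrace quoted in this thread; the function name
and the filtering rule are assumptions made for illustration, not something
crash itself provides. Whether any of these registers actually holds the
tty_struct pointer depends on how n_tty_read() was compiled for that kernel,
so each candidate would still need to be checked, for example by passing it to
the struct command.

```python
import re

# Exception-frame register dump copied from the "reported" backtrace in this thread.
EXCEPTION_FRAME = """
RIP: ffffffff811f03b3 RSP: ffff88031ce75da8 RFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8802cbd54a68 RCX: 000000000061c044
RDX: 0000000000000005 RSI: ffff88031ce75e87 RDI: ffff8802cbd54d1c
RBP: ffff88031ce75eb8 R8: 0000000000000000 R9: 0000000000000000
R10: 0000000000616680 R11: 0000000000000246 R12: 000000000061c044
R13: ffff8802cbd54800 R14: 0000000000000000 R15: 7fffffffffffffff
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
"""

# On x86_64, kernel virtual addresses occupy the upper canonical half.
KERNEL_SPACE_START = 0xFFFF800000000000

# Matches RIP, RSP, RAX..RDI, RBP and R8..R15 followed by a 16-digit hex value.
# The leading word boundary keeps ORIG_RAX out of the results.
REGISTER_RE = re.compile(r"\b(R[A-Z0-9]{1,2}):\s*([0-9a-f]{16})\b")


def kernel_pointer_candidates(dump):
    """Return {register: value} for registers that look like kernel addresses."""
    candidates = {}
    for name, value in REGISTER_RE.findall(dump):
        address = int(value, 16)
        if address >= KERNEL_SPACE_START:
            candidates[name] = address
    return candidates


if __name__ == "__main__":
    for name, address in kernel_pointer_candidates(EXCEPTION_FRAME).items():
        print(f"{name}: {address:016x}")
```

Note that RIP, RSP and RBP will always show up in such a list, and values that
fall inside the task's kernel stack are unlikely to be the structure itself.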

...

 >> I have another query. I tried to convert the vmcore file to ELF
 >> format using "makedumpfile -E -d 31 -x vmlinux_temp vmcore
 >> dumpfile", and got the error message "'-E' option is
 >> disable, because vmcore is kdump compressed format. makedumpfile Failed".

 >> Please guide me further.

Refiltering and the -E argument cannot be used together because
makedumpfile cannot regenerate an ELF vmcore file from a previously
compressed kdump dumpfile.

I believe that something like this might work?:

  $ makedumpfile -c -d 31 -x vmlinux_temp vmcore-old vmcore-new
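
For reference, the "-d 31" in that command is a bitmask of page types to leave
out of the new dumpfile. The Python sketch below decodes a dump level into the
page types it excludes. The bit meanings are taken from my reading of the
makedumpfile(8) man page, and the function and table names are hypothetical,
so treat it as an illustration rather than a description of makedumpfile
internals.

```python
# Bit values for the makedumpfile -d dump level, as described in makedumpfile(8).
DUMP_LEVEL_BITS = {
    1: "zero pages",
    2: "non-private cache pages",
    4: "private cache pages",
    8: "user process data pages",
    16: "free pages",
}


def excluded_page_types(dump_level):
    """Return the page types that a given dump level filters out."""
    if not 0 <= dump_level <= 31:
        raise ValueError("dump level must be between 0 and 31")
    return [name for bit, name in DUMP_LEVEL_BITS.items() if dump_level & bit]


if __name__ == "__main__":
    for page_type in excluded_page_types(31):
        print("excluded:", page_type)
```

With 31, all five page types are excluded, which gives the smallest dumpfile.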

Are you trying to re-create an ELF style dumpfile on purpose?

Dave

...
 Thanks and Regards
 Shashidhara

 -----Original Message-----
 From: crash-utility-bounces(a)redhat.com
 [mailto:crash-utility-bounces@redhat.com] On Behalf Of Dave Anderson
 Sent: Wednesday, June 15, 2011 8:38 PM
 To: Discussion list for crash utility usage,maintenance and
 development
 Subject: Re: [Crash-utility] Unable to switch stack frames while using
 crash

 ----- Original Message -----
 > Hi,
 >
 > I am investigating a 64-bit Linux kernel dump and have the following
 > questions regarding the usage of crash.
 >
 > 1) I want to access the intermediate kernel stack frames, to see the
 > state of each frame and the point of failure.
 >
 > When I try to access a stack frame I get the error message "crash:
 > prohibited gdb command: frame". Can you please let me know if there
 > is any other way of accessing the kernel stack frames using crash?

 Right -- the embedded gdb doesn't know anything about the core file
 (or live system) that you're running on. It's invoked as "gdb vmlinux",
 and doesn't know anything about any "frames".

 As Flavio mentioned, you can see the stack data of each frame with
 "bt -f", or better yet, "bt -F" which may illuminate what the data
 may be, because it shows symbolic translations or slab cache names
 instead of raw values where appropriate.

 > 2) When I run bt in crash, I get a stack trace. Another person from a
 > different team reported a slightly different stack trace to mine.
 > Below are the stack traces. The register contents are quite different
 > between the two
 >
 > My stack trace
 >
 > PID: 13366 TASK: ffff88031b60d580 CPU: 1 COMMAND: "telnet"
 >
 > #0 [ffff88031ce759d0] machine_kexec at ffffffff81024486
 > #1 [ffff88031ce75a40] crash_kexec at ffffffff8107e230
 > #2 [ffff88031ce75b20] oops_end at ffffffff8100fa38
 > #3 [ffff88031ce75b50] no_context at ffffffff8102d801
 > #4 [ffff88031ce75ba0] __bad_area_nosemaphore at ffffffff8102d9c9
 > #5 [ffff88031ce75c70] bad_area at ffffffff8102da41
 > #6 [ffff88031ce75ca0] do_page_fault at ffffffff8102dd19
 > #7 [ffff88031ce75cf0] page_fault at ffffffff812d7425
 > #8 [ffff88031ce75d78] n_tty_read at ffffffff811f03b3
 > #9 [ffff88031ce75ec0] tty_read at ffffffff811ebf7e
 > #10 [ffff88031ce75f10] vfs_read at ffffffff810ebcc8
 > #11 [ffff88031ce75f40] sys_read at ffffffff810ebe48
 > #12 [ffff88031ce75f80] system_call_fastpath at ffffffff8100bbc2
 > RIP: 00007ffff716b9e0 RSP: 00007fffffffdfc0 RFLAGS: 00010212
 > RAX: 0000000000000000 RBX: ffffffff8100bbc2 RCX: 0000000000000000
 > RDX: 0000000000001ff6 RSI: 000000000061c02a RDI: 0000000000000000
 > RBP: 0000000000001ff6 R8: 0000000000000000 R9: 0000000000000000
 > R10: 0000000000616680 R11: 0000000000000246 R12: 0000000000000000
 > R13: 0000000000000001 R14: 000000000061c02a R15: 00000000006178a0
 > ORIG_RAX: 0000000000000000 CS: 0033 SS: 002b
 >
 >
 > Reported stack trace
 >
 > PID: 13366 TASK: ffff88031b60d580 CPU: 1 COMMAND: "telnet"
 > #0 [ffff88031ce759d0] machine_kexec at ffffffff81024486
 > #1 [ffff88031ce75a40] crash_kexec at ffffffff8107e230
 > #2 [ffff88031ce75ad8] n_tty_read at ffffffff811f03b3
 > #3 [ffff88031ce75b20] oops_end at ffffffff8100fa38
 > #4 [ffff88031ce75b50] no_context at ffffffff8102d801
 > #5 [ffff88031ce75ba0] __bad_area_nosemaphore at ffffffff8102d9c9
 > #6 [ffff88031ce75c20] native_sched_clock at ffffffff810120aa
 > #7 [ffff88031ce75c70] bad_area at ffffffff8102da41
 > #8 [ffff88031ce75ca0] do_page_fault at ffffffff8102dd19
 > #9 [ffff88031ce75cf0] page_fault at ffffffff812d7425
 > [exception RIP: n_tty_read+1420]
 > RIP: ffffffff811f03b3 RSP: ffff88031ce75da8 RFLAGS: 00010246
 > RAX: 0000000000000000 RBX: ffff8802cbd54a68 RCX: 000000000061c044
 > RDX: 0000000000000005 RSI: ffff88031ce75e87 RDI: ffff8802cbd54d1c
 > RBP: ffff88031ce75eb8 R8: 0000000000000000 R9: 0000000000000000
 > R10: 0000000000616680 R11: 0000000000000246 R12: 000000000061c044
 > R13: ffff8802cbd54800 R14: 0000000000000000 R15: 7fffffffffffffff
 > ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
 > #10 [ffff88031ce75ec0] tty_read at ffffffff811ebf7e
 > #11 [ffff88031ce75f10] vfs_read at ffffffff810ebcc8
 > #12 [ffff88031ce75f40] sys_read at ffffffff810ebe48
 > #13 [ffff88031ce75f80] system_call_fastpath at ffffffff8100bbc2

 The first backtrace is different, apparently because you are using an
 older version of the crash utility: it does not show the page fault
 exception frame that the "reported" version does.

 >
 > 3) I want to retrieve the address of a data structure in the current
 > context. How can it be done? I tried using struct command, but it did
 > not help

 The struct command needs the correct virtual address of the structure
 you're trying to view. So I presume you're asking how to find the address
 of the data structure? If that's true, you're going to have to be a lot
 more specific.

 > 4) When I run the command readelf -a vmcore, I get an error message
 > "readelf: Error: Not an ELF file - it has the wrong magic bytes at the
 > start."

 I presume that the dumpfile is a compressed kdump dumpfile generated
 by makedumpfile, which takes the original /proc/vmcore ELF dumpfile
 and creates its own unique dumpfile format.
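
A quick way to confirm which kind of dumpfile you have is to look at its first
few bytes. The Python sketch below is a minimal illustration: the ELF magic is
standard, while the 8-byte "KDUMP   " signature for the compressed kdump
format is an assumption based on my understanding of the makedumpfile header
layout, and the function name is made up for this example.

```python
ELF_MAGIC = b"\x7fELF"
# Assumed signature of a makedumpfile compressed dumpfile:
# "KDUMP" padded with spaces to 8 bytes.
KDUMP_SIGNATURE = b"KDUMP   "


def sniff_dump_format(path):
    """Classify a dumpfile by the magic bytes at its start."""
    with open(path, "rb") as dumpfile:
        header = dumpfile.read(8)
    if header.startswith(ELF_MAGIC):
        return "ELF"
    if header == KDUMP_SIGNATURE:
        return "kdump-compressed"
    return "unknown"
```

If this reports "kdump-compressed", the readelf error above is expected,
because readelf only understands files that start with the ELF magic.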

 Dave

 --
 Crash-utility mailing list
 Crash-utility(a)redhat.com
 https://www.redhat.com/mailman/listinfo/crash-utility

 Information transmitted by this e-mail is proprietary to MphasiS, its
 associated companies and/ or its customers and is intended
 for use only by the individual or entity to which it is addressed, and
 may contain information that is privileged, confidential or
 exempt from disclosure under applicable law. If you are not the
 intended recipient or it appears that this mail has been forwarded
 to you without proper authority, you are notified that any use or
 dissemination of this information in any manner is strictly
 prohibited. In such cases, please notify us immediately at
 mailmaster(a)mphasis.com and delete this mail from your records.


