Re: [Crash-utility] dis command not correct in crash
by Per Fransson
Hi,
On Tue, Mar 5, 2013 at 9:22 AM, Lei Wen <adrian.wenl(a)gmail.com> wrote:
> Per,
>
> On Tue, Mar 5, 2013 at 3:25 PM, Per Fransson <per.fransson.ml(a)gmail.com> wrote:
>> Hi Lei,
>>
>> On Tue, Mar 5, 2013 at 1:22 AM, Lei Wen <adrian.wenl(a)gmail.com> wrote:
>>> Hi Per,
>>>
>>>
>>> On Tue, Mar 5, 2013 at 4:38 AM, Per Fransson <per.fransson.ml(a)gmail.com>
>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On Mon, Mar 4, 2013 at 8:49 PM, Mika Westerberg <mika.westerberg(a)iki.fi>
>>>> wrote:
>>>> > On Mon, Mar 04, 2013 at 10:20:42AM +0800, Lei Wen wrote:
>>>> >> I ran into a problem where the "dis" command gives incorrect output in crash. Any idea?
>>>> >> For built-in "dis" command in crash:
>>>> >> crash> dis task_rq_lock
>>>> >> 0xc015a2d8 <task_rq_lock>: rscsgt r0, sp, r3, lsl #14
>>>> >> 0xc015a2dc <task_rq_lock+4>: mrcgt 8, 7, r0, cr2, cr13, {5}
>>>> >> 0xc015a2e0 <task_rq_lock+8>: mcrvc 8, 4, r3, cr13, cr3, {6}
>>>> >> 0xc015a2e4 <task_rq_lock+12>: lslsvc r3, r10, r8
>>>> >> 0xc015a2e8 <task_rq_lock+16>: bl 0xc049fe34 <__ip_route_output_key+220>
>>>> >
>>>> > Looks weird.
>>>> >
>>>> > What is the kernel version? Does the 'dis' command work for other
>>>> > functions?
>>>> >
>>>>
>>>> You could do a check on one of the instructions - the 'bl' comes to
>>>> mind. Not sure, but I believe it should amount to:
>>>>
>>>> 0xeb000000 | (((0xc049fe34-0xc015a2f0) >> 2) & 0x00ffffff)
>>>>
>>>> i.e.
>>>>
>>>> 0xeb0d16d1
>>>>
>>>> Is that what you get with
>>>>
>>>> crash> rd 0xc015a2e8
>>>>
>>>> ?
>>>>
>>>> If not, try a
>>>>
>>>> crash> search 0xeb0d16d1
>>>>
>>>> and see if it turns up somewhere else.
>>>
>>>
>>>
>>>
>>> Yes, it is that value.
>>>
>>> crash> rd 0xc015a2e8
>>>
>>> c015a2e8: eb0d16d1 ....
>>>
>>>
>>>
>>> But in gdb, the value shown at the same address is:
>>>
>>> (gdb) x 0xc015a2e8
>>>
>>> 0xc015a2e8 <task_rq_lock+16>: 0xe1a05000
>>>
>>>
>>>
>>> Why don't the two match? Any idea?
>>>
>>>
>>
>> Nope, no idea. When you're using gdb, do you feed it the coredump as
>> well, or just the vmlinux? If you get the same strange result with
>> gdb+vmlinux+coredump, I think you should try to match some known data,
>> e.g. the 'bl', and see if the contents are offset somehow. Try the gdb
>> search command on 0xeb0d16d1.
>
> Your hypothesis is correct.
> When I feed the dump image together with vmlinux to gdb, I get exactly
> the same result as crash...
>
> How do I use the search command in gdb?
>
Oh, it's 'find' in gdb. To look for 0xeb0d16d1 in the virtual interval
0xc0000000--0xe0000000 you would run:
(gdb) find /w 0xc0000000, +0x20000000, 0xeb0d16d1
or use your favorite hex editor.
If the dump isn't offset, the memory may have been overwritten.
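For reference, the 'bl' arithmetic quoted earlier in this thread and the
hex-editor style search can be sketched in a few lines of Python. This is
only a rough sketch: the function names encode_arm_bl and find_word are
made up for illustration, the dump is assumed to be little-endian, and
"data" is assumed to be the raw bytes of some region of the dump whose
starting address ("base") you already know.

```python
import struct

def encode_arm_bl(insn_addr, target):
    """Encode an unconditional ARM-state 'bl' located at insn_addr."""
    # In ARM state the PC reads as the instruction address plus 8,
    # which is why the thread subtracts 0xc015a2f0 and not 0xc015a2e8.
    offset = target - (insn_addr + 8)
    # 0xeb000000 is condition "always" plus the BL opcode bits;
    # the low 24 bits hold the word offset.
    return 0xEB000000 | ((offset >> 2) & 0x00FFFFFF)

def find_word(data, word, base=0):
    """Yield base+offset of every 4-byte-aligned little-endian match."""
    needle = struct.pack("<I", word)  # assumption: little-endian dump
    pos = data.find(needle)
    while pos != -1:
        if pos % 4 == 0:
            yield base + pos
        pos = data.find(needle, pos + 1)

# The instruction discussed in this thread.
expected = encode_arm_bl(0xC015A2E8, 0xC049FE34)
print(hex(expected))  # 0xeb0d16d1

# Tiny made-up buffer standing in for a slice of the dump.
sample = bytes(8) + struct.pack("<I", expected) + bytes(4)
print([hex(a) for a in find_word(sample, expected, base=0x1000)])
```

If the word only turns up at an address other than the one 'dis' reports,
the difference between the two hints at how far the contents are shifted.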
/Per
> Thanks,
> Lei
11 years, 8 months
Re: [Crash-utility] SLES 9 Dump
by Dave Anderson
----- Original Message -----
> Dave,
>
> On both RHEL 5.2 X64 and RHEL 5.5 X64, it showed: (Additional
> argument such as System.map does not make any difference.)
>
> # crash vmlinux-2.6.5-7.321-bigsmp dump-pts02504
> ...
> WARNING: machine type mismatch:
> crash utility: X86_64
> vmlinux-2.6.5-7.321-bigsmp: X86
>
> crash: vmlinux-2.6.5-7.321-bigsmp: not a supported file format
Right, as the message states, you're trying to analyze a 32-bit
x86 vmlinux/vmcore with the x86_64 version of the crash utility.
So just get the 32-bit x86 crash utility. If you can't find
one, then you can try building one on your x86_64 host:
$ wget http://people.redhat.com/anderson/crash-6.1.4.tar.gz
...
$ cd crash-6.1.4
$ make target=X86
...
$ ./crash vmlinux-2.6.5-7.321-bigsmp dump-pts02504
If the build fails, you can wget the crash-6.1.4-0.src.rpm
file from the same location, and the rpmbuild -ba will alert
you to the additional packages you need.
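If you want to confirm the architecture of a vmlinux before picking a
crash binary, the ELF header can be inspected directly. The following is
a minimal sketch, not what crash itself does internally; the function
name elf_machine and the small name table are illustrative only, and the
labels are chosen to mirror the ones in the warning above.

```python
import struct

# A few e_machine values from the ELF specification.
EM_NAMES = {3: "X86", 40: "ARM", 62: "X86_64"}

def elf_machine(header):
    """Return (word size in bits, machine name) from the first 20 bytes."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}.get(header[4])       # EI_CLASS
    endian = {1: "<", 2: ">"}.get(header[5])   # EI_DATA
    if bits is None or endian is None:
        raise ValueError("unrecognized ELF class or data encoding")
    # e_machine is a 16-bit field at byte offset 18.
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return bits, EM_NAMES.get(machine, "unknown (%d)" % machine)

def describe(path):
    """Read just the start of the file at 'path' and classify it."""
    with open(path, "rb") as f:
        return elf_machine(f.read(20))

# Synthetic 32-bit little-endian x86 header, used here in place of a file.
fake = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9) + struct.pack("<HH", 2, 3)
print(elf_machine(fake))  # (32, 'X86')
```

On a real file you would call describe("vmlinux-2.6.5-7.321-bigsmp"); a
32-bit X86 result means the X86 build of crash is the one to use.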
Dave
>
>
>
>
> On an RHEL 4.7 X86, it showed:
>
> # crash vmlinux-2.6.5-7.321-bigsmp dump-pts02504
> ...
> crash: vmlinux-2.6.5-7.321-bigsmp: no debugging data available
>
>
> # crash vmlinux-2.6.5-7.321-bigsmp dump-pts02504
> System.map-2.6.5-7.321-bigsmp
> ...
> crash: vmlinux-2.6.5-7.321-bigsmp: no debugging data available
>
>
> # crash vmlinux-2.6.5-7.321-bigsmp dump-pts02504
> Kerntypes-2.6.5-7.321-bigsmp
> ...
> crash: cannot resolve "_stext"
>
>
> Thanks.
>
> Eugene
>
>
> -----Original Message-----
> From: Dave Anderson [mailto:anderson@redhat.com]
> Sent: Tuesday, March 05, 2013 5:22 PM
> To: Ku, Eugene
> Cc: Discussion list for crash utility usage, maintenance and
> development
> Subject: Re: [Crash-utility] SLES 9 Dump
>
>
>
> ----- Original Message -----
> > Dave,
> >
> > Thank you for getting back to me so quickly.
> >
> > I have tried different ways to start crash but none is working. I
> > have downloaded kernel-bigsmp-2.6.5-7.321.i586.rpm from Novell to
> > match the version of the dump. This package includes the following
> > files:
> >
> > config-2.6.5-7.321-bigsmp System.map-2.6.5-7.321-bigsmp
> > Kerntypes-2.6.5-7.321-bigsmp vmlinux-2.6.5-7.321-bigsmp
> > symtypes-2.6.5-7.321-bigsmp vmlinuz-2.6.5-7.321-bigsmp
> > symvers-2.6.5-7.321-i386-bigsmp
> >
> > I don't believe Novell provides a kernel-debuginfo package for SLES 9
> > or earlier version and I could not find it on their web site.
> >
> > What I have tried so far are all done on RHEL systems because SLES 9
> > does not come with crash. Do I need to run crash against an SLES
> > dump on a compatible SLES system? A compatible system I mean the
> > same architecture. When I tried it on RHEL X64, it complained
> > machine type mismatch.
>
> What is the mismatch error message?
>
> Dave
>
11 years, 8 months
Re: [Crash-utility] SLES 9 Dump
by Dave Anderson
----- Original Message -----
> Dave,
>
> Thank you for getting back to me so quickly.
>
> I have tried different ways to start crash but none is working. I
> have downloaded kernel-bigsmp-2.6.5-7.321.i586.rpm from Novell to
> match the version of the dump. This package includes the following
> files:
>
> config-2.6.5-7.321-bigsmp System.map-2.6.5-7.321-bigsmp
> Kerntypes-2.6.5-7.321-bigsmp vmlinux-2.6.5-7.321-bigsmp
> symtypes-2.6.5-7.321-bigsmp vmlinuz-2.6.5-7.321-bigsmp
> symvers-2.6.5-7.321-i386-bigsmp
>
> I don't believe Novell provides a kernel-debuginfo package for SLES 9
> or earlier version and I could not find it on their web site.
>
> What I have tried so far are all done on RHEL systems because SLES 9
> does not come with crash. Do I need to run crash against an SLES
> dump on a compatible SLES system? A compatible system I mean the
> same architecture. When I tried it on RHEL X64, it complained
> machine type mismatch.
What is the mismatch error message?
Dave
11 years, 8 months
SLES 9 Dump
by Ku, Eugene
Hi,
Can we use crash to analyze an LKCD dump generated on an SLES 9 SP4 system? If crash can analyze such a dump, what is the command syntax?
Thanks.
Eugene
11 years, 8 months
Re: [Crash-utility] timer: invalid list entry: 1 [ ARM ]
by Dave Anderson
----- Original Message -----
> As far as the crash dump is concerned, this is how we take it.
>
> Basically, we just dump the whole RAM (flat physical RAM), and I
> have modified the crash utility to convert the ramdump (just a plain ramdump)
> into ARM ELF32 format,
> so that it can be recognized by any debugger, such as the crash utility.
> That has been working great.
>
> I have loaded many ramdumps, and "timer" and every other command
> work perfectly fine.
> Only in this scenario did it give such output, which is why I suspected
> timer list corruption or a crash utility problem.
Do all cpus in the kernel continue to run while your "ramdump" is
taking place? That's a likely explanation for the "timer" output to
be out of sync.
>
> wfi is "wait for interrupt": we let the cpu go into idle/dormant when it has nothing to do.
> The timer is set according to the thread that is scheduled to run earliest, and it then wakes the cpu up.
> Here we are missing the timer interrupt on both cpus. That means the timer counter has gone far ahead, and it will never
> match the programmed compare values. So the system freezes, as interrupts are not happening.
I don't understand. Why is the system not receiving timer interrupts
while the cpus are in their idle state?
> In that freeze, we have a special keyboard interrupt to take a task dump and other dumps.
> On the ramdump that I have, the crash utility shows:
>
> crash> bt -a
> PID: 0 TASK: c097b8b0 CPU: 0 COMMAND: "swapper/0"
> bt: WARNING: cannot get stackframe for task
>
> PID: 0 TASK: dc84ca40 CPU: 1 COMMAND: "swapper/1"
> bt: WARNING: cannot get stackframe for task
Yeah, it appears that the ARM backtrace code presumes that the dumpfile
was taken with the kernel's kdump facility, because it gets the backtrace
starting points from the register values saved in the kdump "crash_notes".
So you might try entering "bt -t" or "bt -T". But if the cpus were
sitting in the idle state, there's probably not much to see.
One thing I do *not* like about the ARM "bt" display is that it
does not show the stack address of each frame. But I think the
ARM maintainers did it that way to simulate the kernel's log
output.
Dave
11 years, 8 months
about stack frame
by mo can
Hi,
This is part of a stack backtrace of kernel 2.6.32-279.19.1.el6.x86_64,
bt -f
...
#7 [ffff880028283b38] dev_queue_xmit at ffffffff8142dac9
ffff880028283b40: ffff880028283b80 ffffffff81445ffa
ffff880028283b50: ffff8801175ef020 ffff880079551680
ffff880028283b60: ffff880116a07480 000000000000000e
ffff880028283b70: ffff880116a074c0 ffff88007a266c80
ffff880028283b80: ffff880028283bd0 ffffffff81432b75
#8 [ffff880028283b88] neigh_resolve_output at ffffffff81432b75
ffff880028283b90: ffff880028283c10 ffffffff814547b4
ffff880028283ba0: ffffffff81464e50 00000000000055b8
ffff880028283bb0: ffff880037a7a800 000000000000000e
ffff880028283bc0: ffff880079551680 ffff880037a7a858
ffff880028283bd0: ffff880028283c10 ffffffff81464f8c
...
crash> whatis dev_queue_xmit
int dev_queue_xmit(struct sk_buff *);
Take a look at frame #7. I know that the value "ffffffff81432b75" is the
return address and that the saved RBP is ffff880028283bd0. What about the
values between address ffff880028283bd0 and address ffff880028283bd0? Are
they stack variables of the function dev_queue_xmit? How can I match them
against the source code? What I actually want is to display the variables
(both function parameters and local variables).
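To make the frame layout easier to read, the raw words from "bt -f" can
be turned into an address-to-value map. The sketch below is illustrative
only: the function name parse_stack_words is made up, the sample is
copied from frame #7 above, and 8-byte words are assumed (x86_64). It
shows which slot holds which value, but it cannot by itself name the
variables, since that needs the disassembly or debuginfo.

```python
import re

# Frame #7 as shown in the "bt -f" output above.
BT_F_SAMPLE = """\
#7 [ffff880028283b38] dev_queue_xmit at ffffffff8142dac9
ffff880028283b40: ffff880028283b80 ffffffff81445ffa
ffff880028283b50: ffff8801175ef020 ffff880079551680
ffff880028283b60: ffff880116a07480 000000000000000e
ffff880028283b70: ffff880116a074c0 ffff88007a266c80
ffff880028283b80: ffff880028283bd0 ffffffff81432b75
#8 [ffff880028283b88] neigh_resolve_output at ffffffff81432b75
"""

LINE_RE = re.compile(r"\s*([0-9a-f]{16}):((?:\s+[0-9a-f]{16})+)\s*$")

def parse_stack_words(text, word_size=8):
    """Map each stack address to the word stored there."""
    words = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip "#N [...]" frame header lines
        addr = int(m.group(1), 16)
        for i, value in enumerate(m.group(2).split()):
            words[addr + i * word_size] = int(value, 16)
    return words

words = parse_stack_words(BT_F_SAMPLE)
for addr in sorted(words):
    print("%016x: %016x" % (addr, words[addr]))
```

In this map the saved RBP (ffff880028283bd0) sits at ffff880028283b80 and
the return address (ffffffff81432b75) at ffff880028283b88; the words at
lower addresses are the candidates for saved registers and locals.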
Thanks.
11 years, 8 months