> Given that "crash" runs fine on live machine, I am going to assume
> that it's a problem with kdump format for now :(
>
No -- wait -- please don't! ;-)
I now have the dumpfile Badari has been referring to in this thread.
While there are a couple crash-utility-related items to address, the
primary problem may in fact be related to kdump.
The issue is that crash is getting confused by bizarre
disassembly of kernel text. Here's an example that takes
crash out of the picture -- I will simply use gdb alone.
If I run gdb alone on the vmlinux file, and disassemble a kernel
function, everything looks normal:
# gdb vmlinux
GNU gdb Red Hat Linux (6.3.0.0-1.63rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...Using host libthread_db
library "/lib64/tls/libthread_db.so.1".
(gdb) disassemble sys_read
Dump of assembler code for function sys_read:
0xffffffff8017b991 <sys_read+0>: push %r13
0xffffffff8017b993 <sys_read+2>: mov %rsi,%r13
0xffffffff8017b996 <sys_read+5>: push %r12
0xffffffff8017b998 <sys_read+7>: mov $0xfffffffffffffff7,%r12
0xffffffff8017b99f <sys_read+14>: push %rbp
0xffffffff8017b9a0 <sys_read+15>: mov %rdx,%rbp
0xffffffff8017b9a3 <sys_read+18>: push %rbx
0xffffffff8017b9a4 <sys_read+19>: sub $0x18,%rsp
0xffffffff8017b9a8 <sys_read+23>: lea 0x14(%rsp),%rsi
0xffffffff8017b9ad <sys_read+28>: callq 0xffffffff8017bf23 <fget_light>
0xffffffff8017b9b2 <sys_read+33>: test %rax,%rax
0xffffffff8017b9b5 <sys_read+36>: mov %rax,%rbx
0xffffffff8017b9b8 <sys_read+39>: je 0xffffffff8017b9f1 <sys_read+96>
0xffffffff8017b9ba <sys_read+41>: mov 0x38(%rax),%rax
0xffffffff8017b9be <sys_read+45>: lea 0x8(%rsp),%rcx
0xffffffff8017b9c3 <sys_read+50>: mov %rbp,%rdx
0xffffffff8017b9c6 <sys_read+53>: mov %r13,%rsi
0xffffffff8017b9c9 <sys_read+56>: mov %rbx,%rdi
0xffffffff8017b9cc <sys_read+59>: mov %rax,0x8(%rsp)
0xffffffff8017b9d1 <sys_read+64>: callq 0xffffffff8017b52f <vfs_read>
0xffffffff8017b9d6 <sys_read+69>: mov %rax,%r12
0xffffffff8017b9d9 <sys_read+72>: mov 0x8(%rsp),%rax
0xffffffff8017b9de <sys_read+77>: mov %rax,0x38(%rbx)
0xffffffff8017b9e2 <sys_read+81>: cmpl $0x0,0x14(%rsp)
0xffffffff8017b9e7 <sys_read+86>: je 0xffffffff8017b9f1 <sys_read+96>
0xffffffff8017b9e9 <sys_read+88>: mov %rbx,%rdi
0xffffffff8017b9ec <sys_read+91>: callq 0xffffffff8017be05 <fput>
0xffffffff8017b9f1 <sys_read+96>: add $0x18,%rsp
0xffffffff8017b9f5 <sys_read+100>: mov %r12,%rax
0xffffffff8017b9f8 <sys_read+103>: pop %rbx
0xffffffff8017b9f9 <sys_read+104>: pop %rbp
0xffffffff8017b9fa <sys_read+105>: pop %r12
0xffffffff8017b9fc <sys_read+107>: pop %r13
0xffffffff8017b9fe <sys_read+109>: retq
End of assembler dump.
(gdb)
Now, if I bring the suspect kdump vmcore into the picture,
check this out:
# gdb vmlinux vmcore
GNU gdb Red Hat Linux (6.3.0.0-1.63rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...Using host libthread_db
library "/lib64/tls/libthread_db.so.1".
#0 mwait_idle () at include/asm/thread_info.h:63
63 include/asm/thread_info.h: No such file or directory.
in include/asm/thread_info.h
(gdb) disassemble sys_read
Dump of assembler code for function sys_read:
0xffffffff8017b991 <sys_read+0>: add %cl,0xffffffffffffff83(%rax)
0xffffffff8017b994 <sys_read+3>: (bad)
0xffffffff8017b995 <sys_read+4>: sbb %cl,0xffffffffffffff89(%rax)
0xffffffff8017b998 <sys_read+7>: callq 0xffffffffdc5916f8
0xffffffff8017b99d <sys_read+12>: pop %r13
0xffffffff8017b99f <sys_read+14>: retq
0xffffffff8017b9a0 <sys_read+15>: mov 1862306(%rip),%eax # 0xffffffff80342448 <files_stat+8>
0xffffffff8017b9a6 <sys_read+21>: retq
0xffffffff8017b9a7 <sys_read+22>: push %rbx
0xffffffff8017b9a8 <sys_read+23>: cmp %rdi,(%rdi)
0xffffffff8017b9ab <sys_read+26>: mov %rdi,%rbx
0xffffffff8017b9ae <sys_read+29>: je 0xffffffff8017b9db <sys_read+74>
0xffffffff8017b9b0 <sys_read+31>: mov $0xffffffff80456b80,%rdi
0xffffffff8017b9b7 <sys_read+38>: callq 0xffffffff802d0aad <__down_interruptible+75>
0xffffffff8017b9bc <sys_read+43>: mov (%rbx),%rdx
0xffffffff8017b9bf <sys_read+46>: mov 0x8(%rbx),%rax
0xffffffff8017b9c3 <sys_read+50>: mov %rax,0x8(%rdx)
0xffffffff8017b9c7 <sys_read+54>: mov %rdx,(%rax)
0xffffffff8017b9ca <sys_read+57>: mov %rbx,0x8(%rbx)
0xffffffff8017b9ce <sys_read+61>: mov %rbx,(%rbx)
0xffffffff8017b9d1 <sys_read+64>: movl $0x1,2994597(%rip) # 0xffffffff80456b80 <files_lock>
0xffffffff8017b9db <sys_read+74>: pop %rbx
0xffffffff8017b9dc <sys_read+75>: retq
0xffffffff8017b9dd <sys_read+76>: push %rbx
0xffffffff8017b9de <sys_read+77>: mov %rdi,%rbx
0xffffffff8017b9e1 <sys_read+80>: lock decl 0x28(%rdi)
0xffffffff8017b9e5 <sys_read+84>: sete %al
0xffffffff8017b9e8 <sys_read+87>: test %al,%al
0xffffffff8017b9ea <sys_read+89>: je 0xffffffff8017ba21 <sys_write+34>
0xffffffff8017b9ec <sys_read+91>: mov 2787309(%rip),%rax # 0xffffffff804241e0 <security_ops>
0xffffffff8017b9f3 <sys_read+98>: callq *0x1e8(%rax)
0xffffffff8017b9f9 <sys_read+104>: mov %rbx,%rdi
0xffffffff8017b9fc <sys_read+107>: callq 0xffffffff8017b9a7 <sys_read+22>
End of assembler dump.
(gdb)
I see exactly the same behaviour using the embedded gdb-6.1
in the crash utility.
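One way to take gdb out of the picture as well would be to compare
the raw bytes that each file maps at the address of sys_read. The
following is only a minimal sketch and has not been run against this
dumpfile. It assumes that both files are little-endian ELF64 images
and that the vmcore's PT_LOAD headers carry the kernel text virtual
addresses. The helper names (load_segments, read_vaddr,
first_mismatch) are hypothetical.

```python
import struct

PT_LOAD = 1  # program header type for loadable segments


def load_segments(image):
    """Return (p_vaddr, p_offset, p_filesz) for every PT_LOAD segment."""
    if image[:4] != b"\x7fELF" or image[4] != 2 or image[5] != 1:
        raise ValueError("not a little-endian ELF64 image")
    # ELF64 header: e_phoff at 0x20, e_phentsize at 0x36, e_phnum at 0x38.
    (e_phoff,) = struct.unpack_from("<Q", image, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 0x36)
    segments = []
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # Elf64_Phdr starts: p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz
        p_type, _flags, p_offset, p_vaddr, _paddr, p_filesz = struct.unpack_from(
            "<IIQQQQ", image, base
        )
        if p_type == PT_LOAD:
            segments.append((p_vaddr, p_offset, p_filesz))
    return segments


def read_vaddr(image, vaddr, length):
    """Return the file bytes backing [vaddr, vaddr + length), or None."""
    for p_vaddr, p_offset, p_filesz in load_segments(image):
        if p_vaddr <= vaddr and vaddr + length <= p_vaddr + p_filesz:
            start = p_offset + (vaddr - p_vaddr)
            return image[start:start + length]
    return None  # no single PT_LOAD segment covers the whole range


def first_mismatch(vmlinux, vmcore, vaddr, length):
    """Return the first virtual address where the images differ, or None."""
    a = read_vaddr(vmlinux, vaddr, length)
    b = read_vaddr(vmcore, vaddr, length)
    if a is None or b is None:
        raise LookupError("address range not covered by a PT_LOAD segment")
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return vaddr + i
    return None
```

Both arguments are the file contents as bytes-like objects. For a
large vmcore, passing an mmap object rather than reading the whole
file into memory would probably be the better choice. With the
addresses above, the call would be something like
first_mismatch(vmlinux, vmcore, 0xffffffff8017b991, 110).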
I'm digging into it more as we speak, but I just wanted to
throw this out for now.
Thanks,
Dave