Re: [Crash-utility] crash 4.0-2.8 fails on 2.6.14-rc5 (EM64T)
by Dave Anderson
> Is there no simple way to add #if KERNEL_VERSION > 2.6.10
> in the header file and leave the hardcoded values there?
>
THIS_KERNEL_VERSION is derived from crash-internal variables in the
kernel_table data structure, which kernel_init(PRE_GDB) initializes
from the contents of the kernel's "system_utsname" data structure as
read from memory or the dumpfile. It is therefore a run-time value and
cannot be tested with a preprocessor #if.
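To illustrate why this is a run-time test, here is a minimal sketch of
turning a utsname-style release string into a comparable number. The
names parse_release(), kver_code(), KVER_CODE and struct kver are
hypothetical and only stand in for whatever crash does internally; the
packing scheme is an assumption chosen so that plain integer comparison
orders releases correctly.

```c
#include <stdio.h>

/* Hypothetical stand-in for the version fields kept in kernel_table. */
struct kver {
    unsigned int major, minor, patch;
};

/* Pack x.y.z so that ordinary integer comparison orders releases. */
#define KVER_CODE(x, y, z) \
    (((unsigned int)(x) << 16) + ((unsigned int)(y) << 8) + (unsigned int)(z))

/*
 * Parse a release string such as "2.6.14-rc5" (the kind of value found
 * in system_utsname). Any suffix after the third number is ignored.
 * Returns 0 on success, -1 if three numeric fields are not found.
 */
static int parse_release(const char *release, struct kver *out)
{
    if (sscanf(release, "%u.%u.%u",
               &out->major, &out->minor, &out->patch) != 3)
        return -1;
    return 0;
}

/* Comparable code for a parsed version. */
static unsigned int kver_code(const struct kver *v)
{
    return KVER_CODE(v->major, v->minor, v->patch);
}
```

With something like this, the check becomes an ordinary "if" at run time,
for example kver_code(&v) > KVER_CODE(2, 6, 10), made after the release
string has been read from the dumpfile.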
I was mistaken in using the value of "_stext" as the qualifier, though,
since the __START_KERNEL_map value of 0xffffffff80000000 is still the same.
But there must be *some* difference in the symbol list that can be used
to determine which set of address values to use. It could even be just
the *existence* of some new kernel variable introduced as part of the
change to the new scheme. Doing an "nm -Bn" on the old and new
vmlinux files should yield something obvious.
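As a rough sketch of the "existence of a symbol" idea, the selection
could look like the following. Everything here is hypothetical: the
symbol list is a plain array of names (as might be taken from "nm"
output), and the marker symbol is whatever new variable the nm
comparison turns up; no real kernel symbol name is implied.

```c
#include <stddef.h>
#include <string.h>

/* Which set of address values to use. */
enum layout { LAYOUT_OLD, LAYOUT_NEW };

/* Return 1 if 'name' appears in the symbol list, 0 otherwise. */
static int symbol_in_list(const char *const *symbols, size_t count,
                          const char *name)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (strcmp(symbols[i], name) == 0)
            return 1;
    return 0;
}

/*
 * Pick the address layout based only on whether the marker symbol,
 * assumed to have been introduced together with the new scheme, exists.
 */
static enum layout pick_layout(const char *const *symbols, size_t count,
                               const char *marker)
{
    return symbol_in_list(symbols, count, marker) ? LAYOUT_NEW : LAYOUT_OLD;
}
```

The point of keying on a symbol rather than on "_stext" is that the
test keeps working even when the text start address is identical in
both schemes.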
> bt -t seems to be better.
>
> crash> bt 3144
> PID: 3144 TASK: ffff81011dd1e100 CPU: 0 COMMAND: "mingetty"
> #0 [ffff81011d6b9c68] schedule at ffffffff803b12b3
> RIP: 000000377c7b85b2 RSP: 00007fffff87a110 RFLAGS: 00010246
> RAX: 0000000000000000 RBX: ffffffff8010dc26 RCX: 00007fffff87a7b0
> RDX: 0000000000000001 RSI: 00007fffff87a8c7 RDI: 0000000000000000
> RBP: 00007fffff87aca0 R8: 00002aaaaaac9b00 R9: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000246 R12: 00007fffff87a900
> R13: 0000000000502d20 R14: 0000000000000000 R15: 000000007c92d8c0
> ORIG_RAX: 0000000000000000 CS: 0033 SS: 002b
> crash> bt -t 3144
> PID: 3144 TASK: ffff81011dd1e100 CPU: 0 COMMAND: "mingetty"
> START: thread_return (schedule) at ffffffff803b12b3
> [ffff81011d6b9d10] do_con_write at ffffffff802689da
> [ffff81011d6b9d80] schedule_timeout at ffffffff803b1e4e
> [ffff81011d6b9db0] _spin_lock_irqsave at ffffffff803b28ce
> [ffff81011d6b9dc0] add_wait_queue at ffffffff8014cf5c
> [ffff81011d6b9de0] read_chan at ffffffff8025d1f7
> [ffff81011d6b9e48] default_wake_function at ffffffff80130c90
> [ffff81011d6b9e78] default_wake_function at ffffffff80130c90
> [ffff81011d6b9e90] tty_ldisc_deref at ffffffff802571c4
> [ffff81011d6b9ed0] tty_read at ffffffff802575ee
> [ffff81011d6b9f10] vfs_read at ffffffff80183a46
> [ffff81011d6b9f40] sys_read at ffffffff80183e03
> [ffff81011d6b9f80] system_call at ffffffff8010dc26
> RIP: 000000377c7b85b2 RSP: 00007fffff87a110 RFLAGS: 00010246
> RAX: 0000000000000000 RBX: ffffffff8010dc26 RCX: 00007fffff87a7b0
> RDX: 0000000000000001 RSI: 00007fffff87a8c7 RDI: 0000000000000000
> RBP: 00007fffff87aca0 R8: 00002aaaaaac9b00 R9: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000246 R12: 00007fffff87a900
> R13: 0000000000502d20 R14: 0000000000000000 R15: 000000007c92d8c0
> ORIG_RAX: 0000000000000000 CS: 0033 SS: 002b
> crash>
>
>
I still don't understand what happens in x86_64_low_budget_back_trace_cmd()
that causes the "bt" command to skip from the starting point in schedule()
to the end, where it dumps the user-mode entry exception frame, unless
the rsp has been bumped too high by the time it gets to this point:
        /*
         *  Walk the process stack.
         */
        for (i = (rsp - bt->stackbase)/sizeof(ulong);
             !done && (rsp < bt->stacktop); i++, rsp += sizeof(ulong)) {
...and that conceivably may have something to do with the exception stack
problem. It's hard to say without being there...
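To make the suspicion concrete, here is a small self-contained sketch of
a loop with the same shape. It is not the code from
x86_64_low_budget_back_trace_cmd(); struct stack_window, stack_word and
count_text_refs() are hypothetical, and the "done" flag is left out. It
only shows that when rsp is already at or beyond stacktop on entry, the
body never runs and no intermediate frames can be reported.

```c
#include <stddef.h>

typedef unsigned long stack_word;

/* Hypothetical, trimmed-down stand-in for the bt fields used by the loop. */
struct stack_window {
    stack_word stackbase;      /* lowest address of the stack */
    stack_word stacktop;       /* one past the highest address */
    const stack_word *words;   /* stack contents; words[0] is at stackbase */
};

/*
 * Walk the stack from rsp up to stacktop and count the words whose value
 * falls inside [text_start, text_end), i.e. candidate return addresses.
 */
static size_t count_text_refs(const struct stack_window *bt, stack_word rsp,
                              stack_word text_start, stack_word text_end)
{
    size_t hits = 0;
    size_t i;

    if (rsp < bt->stackbase)   /* guard against an underflowing index */
        return 0;

    for (i = (rsp - bt->stackbase) / sizeof(stack_word);
         rsp < bt->stacktop; i++, rsp += sizeof(stack_word)) {
        if (bt->words[i] >= text_start && bt->words[i] < text_end)
            hits++;
    }
    return hits;
}
```

Called with rsp equal to stacktop, count_text_refs() returns 0 without
looking at a single word, which is the behaviour one would see if rsp
had been bumped too high before the walk starts.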
Thanks,
Dave