On Fri, 2005-10-28 at 17:55 -0400, Dave Anderson wrote:
Badari Pulavarty wrote:
> On Thu, 2005-10-27 at 14:36 -0400, Dave Anderson wrote:
> >
> >
> > #ifdef X86_64
> > #define _64BIT_
> > #define MACHINE_TYPE "X86_64"
> >
> > #define USERSPACE_TOP 0x0000008000000000
> > #define __START_KERNEL_map 0xffffffff80000000
> > #define PAGE_OFFSET 0x0000010000000000
> >
> > #define VMALLOC_START 0xffffff0000000000
> > #define VMALLOC_END 0xffffff7fffffffff
> > #define MODULES_VADDR 0xffffffffa0000000
> > #define MODULES_END 0xffffffffafffffff
> > #define MODULES_LEN (MODULES_END - MODULES_VADDR)
> >
> > So I believe the place to start would be to make these
> > values into x86_64-specific variables that get initialized
> > early on based upon the symbol values gathered during
> > symtab_init(), which is called by main(). After it
> > completes, machdep_init(PRE_GDB) is called, i.e. x86_64_init():
> >
> > /*
> > * Initialize various subsystems.
> > */
> > fd_init();
> > buf_init();
> > cmdline_init();
> > mem_init();
> > machdep_init(PRE_SYMTAB);
> > symtab_init();
> > machdep_init(PRE_GDB);
> > kernel_init(PRE_GDB);
> > verify_version();
> > datatype_init();
> >
> > In x86_64_init(PRE_GDB), the former hardwired #defines would need
> > to be variables, initialized properly based upon clues in the symbol
> > list.
> >
> > Interested in taking a look into this?
> >
> > Dave
>
> Well, I took a stab at it. Here are the changes I made to "defs.h"
> looking at Documentation/x86_64/mm.txt. We need to somehow put
> this under "#if THIS_KERNEL_VERSION > 2.6.10".
>
>
First off -- thanks very much for all you've done so far. I
really appreciate the effort.
Anyway, what I meant was that -- for x86_64 specifically -- things
like USERSPACE_TOP, PAGE_OFFSET, VMALLOC_START, etc. should no longer
be hardwired #defines, but instead, they should be references to
x86_64 data variables defined in x86_64.c. So, for example,
USERSPACE_TOP would be defined something like:
#define USERSPACE_TOP (x86_64_userspace_top)
and there would be one x86_64_xxx variable per virtual address
item. And each of the variables would be initialized in
machdep_init(PRE_GDB), which is called just after symtab_init().
The fact that symtab_init() has been done is important because,
at that point, the symbol values are available but the variables
behind "THIS_KERNEL_VERSION" haven't even been initialized yet.
So instead, I would look at the symbol_value()
of "_stext", or some known kernel text symbol, and based upon its
value, it would be obvious whether to use the "old" or "new"
virtual address values to then set up each of the x86_64_xxx
virtual address variables.
But for testing the new addresses, what you've done below should
suffice.
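A minimal sketch of the arrangement described above, for illustration only. The two sets of values are the ones from the defs.h diff in this thread. The helper name x86_64_vm_init() and its use_new_layout argument are hypothetical, and how that flag would be derived from symbol_value("_stext") is deliberately left out.

```c
#include <stdint.h>

/* One variable per virtual address item (names are illustrative). */
static uint64_t x86_64_userspace_top;
static uint64_t x86_64_page_offset;
static uint64_t x86_64_vmalloc_start;
static uint64_t x86_64_vmalloc_end;
static uint64_t x86_64_modules_vaddr;
static uint64_t x86_64_modules_end;

/* The former hardwired #defines become references to the variables. */
#define USERSPACE_TOP   (x86_64_userspace_top)
#define PAGE_OFFSET     (x86_64_page_offset)
#define VMALLOC_START   (x86_64_vmalloc_start)
#define VMALLOC_END     (x86_64_vmalloc_end)
#define MODULES_VADDR   (x86_64_modules_vaddr)
#define MODULES_END     (x86_64_modules_end)
#define MODULES_LEN     (MODULES_END - MODULES_VADDR)

/*
 * Hypothetical helper, meant to be called from the PRE_GDB stage of
 * x86_64_init(), i.e. after symtab_init() has run. The caller decides
 * which layout applies; that decision is not shown here.
 */
static void x86_64_vm_init(int use_new_layout)
{
	if (use_new_layout) {
		/* "new" values from the diff in this thread */
		x86_64_userspace_top = 0x0000800000000000ULL;
		x86_64_page_offset   = 0xffff810000000000ULL;
		x86_64_vmalloc_start = 0xffffc20000000000ULL;
		x86_64_vmalloc_end   = 0xffffe1ffffffffffULL;
		x86_64_modules_vaddr = 0xffffffff88000000ULL;
		x86_64_modules_end   = 0xfffffffffff00000ULL;
	} else {
		/* "old" values, as currently hardwired in defs.h */
		x86_64_userspace_top = 0x0000008000000000ULL;
		x86_64_page_offset   = 0x0000010000000000ULL;
		x86_64_vmalloc_start = 0xffffff0000000000ULL;
		x86_64_vmalloc_end   = 0xffffff7fffffffffULL;
		x86_64_modules_vaddr = 0xffffffffa0000000ULL;
		x86_64_modules_end   = 0xffffffffafffffffULL;
	}
}
```

Code that already uses USERSPACE_TOP, PAGE_OFFSET and friends would not need to change, since the macro names stay the same and only what they expand to differs.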
>
Is there no simple way to add #if KERNEL_VERSION > 2.6.10
in the header file and leave the hardcoded values there?
>
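On the #if question: a preprocessor #if is evaluated when crash itself is built, while the kernel version belongs to whichever kernel is being analyzed, so any version test has to happen at run time (and, as noted above, the version variables are not yet set up when machdep_init(PRE_GDB) runs). A rough sketch of what a run-time comparison looks like, with a hypothetical KVER() packing macro and a hypothetical kernel_version struct that are not taken from the crash sources:

```c
#include <stdint.h>

/* Hypothetical: pack a version triple into one comparable integer. */
#define KVER(x, y, z) \
	(((uint32_t)(x) << 16) + ((uint32_t)(y) << 8) + (uint32_t)(z))

/* Hypothetical: filled in at run time from the kernel being analyzed. */
struct kernel_version {
	uint32_t major;
	uint32_t minor;
	uint32_t patch;
};

/* Returns nonzero if the analyzed kernel is strictly newer than x.y.z. */
static int kernel_newer_than(const struct kernel_version *kv,
			     uint32_t x, uint32_t y, uint32_t z)
{
	return KVER(kv->major, kv->minor, kv->patch) > KVER(x, y, z);
}
```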
> --- defs.h.org 2005-10-28 13:43:11.000000000 -0700
> +++ defs.h 2005-10-28 13:53:58.000000000 -0700
> @@ -1740,14 +1740,14 @@ struct load_module {
> #define _64BIT_
> #define MACHINE_TYPE "X86_64"
>
> -#define USERSPACE_TOP 0x0000008000000000
> +#define USERSPACE_TOP 0x0000800000000000
> #define __START_KERNEL_map 0xffffffff80000000
> -#define PAGE_OFFSET 0x0000010000000000
> +#define PAGE_OFFSET 0xffff810000000000
>
> -#define VMALLOC_START 0xffffff0000000000
> -#define VMALLOC_END 0xffffff7fffffffff
> -#define MODULES_VADDR 0xffffffffa0000000
> -#define MODULES_END 0xffffffffafffffff
> +#define VMALLOC_START 0xffffc20000000000
> +#define VMALLOC_END 0xffffe1ffffffffff
> +#define MODULES_VADDR 0xffffffff88000000
> +#define MODULES_END 0xfffffffffff00000
> #define MODULES_LEN (MODULES_END - MODULES_VADDR)
>
> #define PTOV(X) ((unsigned long)(X)+(machdep->kvbase))
>
> Even with these changes, I am not sure if crash is running
> fine. It doesn't seem to show any useful stacks, and there is a
> warning on start (about exception stacks).
>
>
I'm wondering whether the per-cpu calculations are being
done correctly? The exception stack addresses come from the
same per-cpu tss_struct code that started this whole mess,
and if the per-cpu address calculations needed to find those data
structures were incorrect, it would lead to the exception stack
error message that you're seeing. This is the old code, but if
the readmem() of the 7 ebase addresses below came from the
wrong place, the error message you're seeing would result:
	} else if (symbol_exists("per_cpu__init_tss")) {
		for (c = 0; c < NR_CPUS; c++) {
			if ((kt->flags & SMP) && (kt->flags & PER_CPU_OFF)) {
				if (kt->__per_cpu_offset[c] == 0)
					break;
				vaddr = symbol_value("per_cpu__init_tss") +
					kt->__per_cpu_offset[c];
			} else
				vaddr = symbol_value("per_cpu__init_tss");

			vaddr += OFFSET(tss_struct_ist);

			readmem(vaddr, KVADDR, &ms->stkinfo.ebase[c][0],
				sizeof(ulong) * 7, "tss_struct ist array",
				FAULT_ON_ERROR);

			if (ms->stkinfo.ebase[c][0] == 0)
				break;
		}
	}
The check behind the error message only looks at the contents of cpu 0's array
of exception stack addresses, the first of which should be a
pointer to the "boot_exception_stacks" array in the kernel.
I will take a closer look.
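To make the per-cpu calculation and the cpu 0 sanity check described above concrete, here is a small stand-alone sketch. It does not call into crash: per_cpu_vaddr() and first_estack_plausible() are hypothetical helpers, and it assumes the check is a plain equality test between the first exception stack pointer and the address of boot_exception_stacks.

```c
#include <stdint.h>

#define NR_ISTACKS 7	/* the readmem() above fetches 7 ebase entries per cpu */

/*
 * Address of a per-cpu symbol for a given cpu: the symbol's link-time
 * address plus that cpu's offset when per-cpu offsets are in use,
 * otherwise the symbol address itself.
 */
static uint64_t per_cpu_vaddr(uint64_t symbol_addr,
			      const uint64_t *per_cpu_offset,
			      int cpu, int use_offsets)
{
	if (use_offsets)
		return symbol_addr + per_cpu_offset[cpu];
	return symbol_addr;
}

/*
 * Assumed form of the sanity check: cpu 0's first exception stack
 * should point at the boot_exception_stacks array.
 */
static int first_estack_plausible(const uint64_t ebase0[NR_ISTACKS],
				  uint64_t boot_exception_stacks)
{
	return ebase0[0] == boot_exception_stacks;
}
```

With the values from the warning in this thread (first exception stack cccccccccccccccc, boot_exception_stacks ffffffff8052ce80) the check fails, which is consistent with the ist array having been read from the wrong address.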
>
>
> [root@localhost crash-4.0-2.8]# ./crash
>
> crash 4.0-2.8
> Copyright (C) 2002, 2003, 2004, 2005 Red Hat, Inc.
> Copyright (C) 2004, 2005 IBM Corporation
> Copyright (C) 1999-2005 Hewlett-Packard Co
> Copyright (C) 1999, 2002 Silicon Graphics, Inc.
> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
> This program is free software, covered by the GNU General Public License,
> and you are welcome to change it and/or distribute copies of it under
> certain conditions. Enter "help copying" to see the conditions.
> This program has absolutely no warranty. Enter "help warranty" for details.
>
> GNU gdb 6.1
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "x86_64-unknown-linux-gnu"...
>
> WARNING: cpu 0 first exception stack: cccccccccccccccc
> boot_exception_stacks: ffffffff8052ce80
>
> KERNEL: /usr/src/linux-2.6.14-rc5-madv/vmlinux
> DUMPFILE: /dev/mem
> CPUS: 2
> DATE: Fri Oct 28 13:58:50 2005
> UPTIME: 06:32:12
> LOAD AVERAGE: 0.11, 0.10, 0.06
> TASKS: 66
> NODENAME: localhost.localdomain
> RELEASE: 2.6.14-rc5
> VERSION: #10 SMP Wed Oct 26 15:58:51 PDT 2005
> MACHINE: x86_64 (3000 Mhz)
> MEMORY: 4.6 GB
> PID: 1460
> COMMAND: "crash"
> TASK: ffff810122c9f0c0 [THREAD_INFO: ffff810113442000]
> CPU: 0
> STATE: TASK_RUNNING (ACTIVE)
>
> crash>
> crash> bt 13939
> PID: 13939 TASK: ffff810119123740 CPU: 0 COMMAND: "vi"
> #0 [ffff810114535c78] schedule at ffffffff803b12b3
> RIP: 000000377c7beb95 RSP: 00007ffffff402d8 RFLAGS: 00010246
> RAX: 0000000000000017 RBX: ffffffff8010dc26 RCX: 00007ffffff40388
> RDX: 0000000000000000 RSI: 00007ffffff400a0 RDI: 0000000000000001
> RBP: 0000000000000000 R8: 0000000000000000 R9: 00007ffffff40020
> R10: 00007ffffff40020 R11: 0000000000000246 R12: 000000000058b0e0
> R13: 000000000058b0e0 R14: 0000000000000058 R15: 0000000000000001
> ORIG_RAX: 0000000000000017 CS: 0033 SS: 002b
>
> It shows only "schedule" for all processes. Doesn't seem to show
> any more stack traces.
>
>
I don't really have any suggestions here, other than to determine
why the x86_64_low_budget_back_trace_cmd() section that walks the
process stack is only finding/printing the schedule() line.
Does "bt -t" work?
I note that this one doesn't show the "cannot access vmalloc space"
message. Can you read vmalloc and user space addresses? Does "mod"
work? How about "runq", which is one of the places that depends
upon being able to read per-cpu data?
bt -t seems to work better.
crash> bt 3144
PID: 3144 TASK: ffff81011dd1e100 CPU: 0 COMMAND: "mingetty"
#0 [ffff81011d6b9c68] schedule at ffffffff803b12b3
RIP: 000000377c7b85b2 RSP: 00007fffff87a110 RFLAGS: 00010246
RAX: 0000000000000000 RBX: ffffffff8010dc26 RCX: 00007fffff87a7b0
RDX: 0000000000000001 RSI: 00007fffff87a8c7 RDI: 0000000000000000
RBP: 00007fffff87aca0 R8: 00002aaaaaac9b00 R9: 0000000000000000
R10: 0000000000000001 R11: 0000000000000246 R12: 00007fffff87a900
R13: 0000000000502d20 R14: 0000000000000000 R15: 000000007c92d8c0
ORIG_RAX: 0000000000000000 CS: 0033 SS: 002b
crash> bt -t 3144
PID: 3144 TASK: ffff81011dd1e100 CPU: 0 COMMAND: "mingetty"
START: thread_return (schedule) at ffffffff803b12b3
[ffff81011d6b9d10] do_con_write at ffffffff802689da
[ffff81011d6b9d80] schedule_timeout at ffffffff803b1e4e
[ffff81011d6b9db0] _spin_lock_irqsave at ffffffff803b28ce
[ffff81011d6b9dc0] add_wait_queue at ffffffff8014cf5c
[ffff81011d6b9de0] read_chan at ffffffff8025d1f7
[ffff81011d6b9e48] default_wake_function at ffffffff80130c90
[ffff81011d6b9e78] default_wake_function at ffffffff80130c90
[ffff81011d6b9e90] tty_ldisc_deref at ffffffff802571c4
[ffff81011d6b9ed0] tty_read at ffffffff802575ee
[ffff81011d6b9f10] vfs_read at ffffffff80183a46
[ffff81011d6b9f40] sys_read at ffffffff80183e03
[ffff81011d6b9f80] system_call at ffffffff8010dc26
RIP: 000000377c7b85b2 RSP: 00007fffff87a110 RFLAGS: 00010246
RAX: 0000000000000000 RBX: ffffffff8010dc26 RCX: 00007fffff87a7b0
RDX: 0000000000000001 RSI: 00007fffff87a8c7 RDI: 0000000000000000
RBP: 00007fffff87aca0 R8: 00002aaaaaac9b00 R9: 0000000000000000
R10: 0000000000000001 R11: 0000000000000246 R12: 00007fffff87a900
R13: 0000000000502d20 R14: 0000000000000000 R15: 000000007c92d8c0
ORIG_RAX: 0000000000000000 CS: 0033 SS: 002b
crash>
vmalloc space seems ok:
crash> mod
MODULE NAME SIZE OBJECT FILE
ffffffff88011f80 floppy 77896 (not loaded) [CONFIG_KALLSYMS]
ffffffff8801db80 i2c_core 29056 (not loaded) [CONFIG_KALLSYMS]
ffffffff88022800 i2c_i801 11796 (not loaded) [CONFIG_KALLSYMS]
ffffffff88025900 hw_random 7968 (not loaded) [CONFIG_KALLSYMS]
ffffffff88030500 ehci_hcd 39688 (not loaded) [CONFIG_KALLSYMS]
ffffffff8803ae80 uhci_hcd 38048 (not loaded) [CONFIG_KALLSYMS]
ffffffff88086380 ipv6 309760 (not loaded) [CONFIG_KALLSYMS]
ffffffff8809a380 dm_mod 70232 (not loaded) [CONFIG_KALLSYMS]
ffffffff880a3100 dm_mirror 26504 (not loaded) [CONFIG_KALLSYMS]
ffffffff880cc300 sunrpc 177096 (not loaded) [CONFIG_KALLSYMS]
ffffffff880d8100 autofs4 26376 (not loaded) [CONFIG_KALLSYMS]
ffffffff880e5100 parport 46988 (not loaded) [CONFIG_KALLSYMS]
ffffffff880ea800 lp 17616 (not loaded) [CONFIG_KALLSYMS]
ffffffff880f6c80 parport_pc 33768 (not loaded) [CONFIG_KALLSYMS]
crash> runq
RUNQUEUES[0]: ffff8100050ee6e0
ACTIVE PRIO_ARRAY: ffff8100050ee760
[115] PID: 30383 TASK: ffff810110c720c0 CPU: 0 COMMAND: "crash"
PID: 30227 TASK: ffff810115f1f8c0 CPU: 0 COMMAND: "sshd"
EXPIRED PRIO_ARRAY: ffff8100050ef040
RUNQUEUES[1]: ffff8100050f66e0
ACTIVE PRIO_ARRAY: ffff8100050f6760
[117] PID: 3505 TASK: ffff81011cec4780 CPU: 1 COMMAND: "crash"
EXPIRED PRIO_ARRAY: ffff8100050f7040
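As a quick cross-check of the addresses in the output above against the new values from the defs.h diff, something like the following could be used. It is only a sketch: classify_vaddr() is a hypothetical helper, the upper bound of the direct-mapped region (taken here as VMALLOC_START) and of the kernel text mapping (taken here as MODULES_VADDR) are assumptions, and MODULES_END is treated as exclusive.

```c
#include <stdint.h>

enum vm_region {
	VM_USER,
	VM_DIRECT_MAP,
	VM_VMALLOC,
	VM_KERNEL_TEXT,
	VM_MODULES,
	VM_UNKNOWN
};

/* Classify a virtual address using the "new" x86_64 values. */
static enum vm_region classify_vaddr(uint64_t a)
{
	if (a < 0x0000800000000000ULL)			/* USERSPACE_TOP */
		return VM_USER;
	if (a >= 0xffff810000000000ULL &&		/* PAGE_OFFSET */
	    a <  0xffffc20000000000ULL)			/* assumed upper bound */
		return VM_DIRECT_MAP;
	if (a >= 0xffffc20000000000ULL &&		/* VMALLOC_START */
	    a <= 0xffffe1ffffffffffULL)			/* VMALLOC_END */
		return VM_VMALLOC;
	if (a >= 0xffffffff80000000ULL &&		/* __START_KERNEL_map */
	    a <  0xffffffff88000000ULL)			/* assumed upper bound */
		return VM_KERNEL_TEXT;
	if (a >= 0xffffffff88000000ULL &&		/* MODULES_VADDR */
	    a <  0xfffffffffff00000ULL)			/* MODULES_END */
		return VM_MODULES;
	return VM_UNKNOWN;
}
```

Under these assumptions the task and runqueue addresses (ffff81...) land in the direct map, the module addresses (ffffffff88...) in the module range, and the text addresses (ffffffff80...) in the kernel text mapping, while the cccccccccccccccc value from the startup warning falls in none of the regions.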
Have a nice weekend, we can take a look at it on Monday.
Thanks,
Badari