----- "Paul Cahier" <pkc(a)f1-photo.com> wrote:
Hello,
I have finished setting up kdump and kexec today, recompiling my kernel
to add everything needed in there.
I have triggered a kernel panic by echo c>/proc/sysrq-trigger, and found
that the vmcore dump was indeed there after all was done.
However, I cannot get any traces out of that crash dump (short version
here, long version at the end of the email):
crash /usr/src/linux-2.6.35.3/vmlinux vmcore.201008231930
[...]
crash: read error: kernel virtual address: c148a9a0 type: "kernel_config_data"
WARNING: cannot read kernel_config_data
crash: read error: kernel virtual address: c1487e28 type: "cpu_possible_mask"
The virtual addresses for "kernel_config_data" and "cpu_possible_mask" are
strange (too high?). I'll continue the analysis at the end of your "d7"
output below...
If I try crash --minimal, things do load, but I'm stuck with the minimal
error set that's not very helpful.
All I'm looking at is getting a full trace of the kernel panic.
- Paul-Kenji Cahier
PS, the full version:
crash -d7 /usr/src/linux-2.6.35.3/vmlinux vmcore.201008231930
crash 5.0.6
Copyright (C) 2002-2010 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
vmcore_data:
flags: a0 (KDUMP_LOCAL|KDUMP_ELF32)
ndfd: 3
ofp: b77344c0
header_size: 1860
num_pt_load_segments: 9
pt_load_segment[0]:
file_offset: 744
phys_start: 0
phys_end: a0000
zero_fill: 0
pt_load_segment[1]:
file_offset: a0744
phys_start: 100000
phys_end: 1000000
zero_fill: 0
pt_load_segment[2]:
file_offset: fa0744
phys_start: 5000000
phys_end: 38000000
zero_fill: 0
pt_load_segment[3]:
file_offset: 33fa0744
phys_start: 38000000
phys_end: 3e5ff000
zero_fill: 0
pt_load_segment[4]:
file_offset: 3a59f744
phys_start: 3e6c6000
phys_end: 3f594000
zero_fill: 0
pt_load_segment[5]:
file_offset: 3b46d744
phys_start: 3f59c000
phys_end: 3f62a000
zero_fill: 0
pt_load_segment[6]:
file_offset: 3b4fb744
phys_start: 3f62e000
phys_end: 3f6a9000
zero_fill: 0
pt_load_segment[7]:
file_offset: 3b576744
phys_start: 3f6e9000
phys_end: 3f6ed000
zero_fill: 0
pt_load_segment[8]:
file_offset: 3b57a744
phys_start: 3f6ff000
phys_end: 3f700000
zero_fill: 0
elf_header: 85368c0
elf32: 85368c0
notes32: 85368f4
load32: 8536914
elf64: 0
notes64: 0
load64: 0
nt_prstatus: 8536a34
nt_prpsinfo: 0
nt_taskstruct: 0
task_struct: 0
page_size: 0
switch_stack: 0
xen_kdump_data: (unused)
num_prstatus_notes: 2
vmcoreinfo: 0
size_vmcoreinfo: 0
nt_prstatus_percpu:
08536a34 08536ad8
Elf32_Ehdr:
e_ident: \177ELF
e_ident[EI_CLASS]: 1 (ELFCLASS32)
e_ident[EI_DATA]: 1 (ELFDATA2LSB)
e_ident[EI_VERSION]: 1 (EV_CURRENT)
e_ident[EI_OSABI]: 0 (ELFOSABI_SYSV)
e_ident[EI_ABIVERSION]: 0
e_type: 4 (ET_CORE)
e_machine: 3 (EM_386)
e_version: 1 (EV_CURRENT)
e_entry: 0
e_phoff: 34
e_shoff: 0
e_flags: 0
e_ehsize: 34
e_phentsize: 20
e_phnum: a
e_shentsize: 0
e_shnum: 0
e_shstrndx: 0
Elf32_Phdr:
p_type: 4 (PT_NOTE)
p_offset: 372 (174)
p_vaddr: 0
p_paddr: 0
p_filesz: 1488 (5d0)
p_memsz: 1488 (5d0)
p_flags: 0 ()
p_align: 0
Elf32_Phdr:
p_type: 1 (PT_LOAD)
p_offset: 1860 (744)
p_vaddr: c0000000
p_paddr: 0
p_filesz: 655360 (a0000)
p_memsz: 655360 (a0000)
p_flags: 7 (PF_X|PF_W|PF_R)
p_align: 0
Elf32_Phdr:
p_type: 1 (PT_LOAD)
p_offset: 657220 (a0744)
p_vaddr: c0100000
p_paddr: 100000
p_filesz: 15728640 (f00000)
p_memsz: 15728640 (f00000)
p_flags: 7 (PF_X|PF_W|PF_R)
p_align: 0
Elf32_Phdr:
p_type: 1 (PT_LOAD)
p_offset: 16385860 (fa0744)
p_vaddr: c5000000
p_paddr: 5000000
p_filesz: 855638016 (33000000)
p_memsz: 855638016 (33000000)
p_flags: 7 (PF_X|PF_W|PF_R)
p_align: 0
Elf32_Phdr:
p_type: 1 (PT_LOAD)
p_offset: 872023876 (33fa0744)
p_vaddr: ffffffff
p_paddr: 38000000
p_filesz: 106950656 (65ff000)
p_memsz: 106950656 (65ff000)
p_flags: 7 (PF_X|PF_W|PF_R)
p_align: 0
Elf32_Phdr:
p_type: 1 (PT_LOAD)
p_offset: 978974532 (3a59f744)
p_vaddr: ffffffff
p_paddr: 3e6c6000
p_filesz: 15523840 (ece000)
p_memsz: 15523840 (ece000)
p_flags: 7 (PF_X|PF_W|PF_R)
p_align: 0
Elf32_Phdr:
p_type: 1 (PT_LOAD)
p_offset: 994498372 (3b46d744)
p_vaddr: ffffffff
p_paddr: 3f59c000
p_filesz: 581632 (8e000)
p_memsz: 581632 (8e000)
p_flags: 7 (PF_X|PF_W|PF_R)
p_align: 0
Elf32_Phdr:
p_type: 1 (PT_LOAD)
p_offset: 995080004 (3b4fb744)
p_vaddr: ffffffff
p_paddr: 3f62e000
p_filesz: 503808 (7b000)
p_memsz: 503808 (7b000)
p_flags: 7 (PF_X|PF_W|PF_R)
p_align: 0
Elf32_Phdr:
p_type: 1 (PT_LOAD)
p_offset: 995583812 (3b576744)
p_vaddr: ffffffff
p_paddr: 3f6e9000
p_filesz: 16384 (4000)
p_memsz: 16384 (4000)
p_flags: 7 (PF_X|PF_W|PF_R)
p_align: 0
Elf32_Phdr:
p_type: 1 (PT_LOAD)
p_offset: 995600196 (3b57a744)
p_vaddr: ffffffff
p_paddr: 3f6ff000
p_filesz: 4096 (1000)
p_memsz: 4096 (1000)
p_flags: 7 (PF_X|PF_W|PF_R)
p_align: 0
Elf32_Nhdr:
n_namesz: 5 ("CORE")
n_descsz: 144
n_type: 1 (NT_PRSTATUS)
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000401
c06fef00 00000001 0000f004 0000f008
00000000 c06e3ecc 00000282 00000282
00000024 c06e3fa4 00000068 00000000
Elf32_Nhdr:
n_namesz: 5 ("CORE")
n_descsz: 144
n_type: 1 (NT_PRSTATUS)
00000000 00000000 00000000 00000000
00000000 00000000 00000dbd 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 c0712420 00007e7e
00000000 00000063 00000000 f0639f0c
00000063 0000007b 0000007b 000000d8
00000033 ffffffff c03145b2 00000060
00010086 f0639f0c 00000068 00000000
Elf32_Nhdr:
n_namesz: 11 ("VMCOREINFO")
n_descsz: 1134
n_type: 0 (unused)
OSRELEASE=2.6.35.3-saber
PAGESIZE=4096
SYMBOL(init_uts_ns)=c06f9120
SYMBOL(node_online_map)=c0730644
SYMBOL(swapper_pg_dir)=c06e4000
SYMBOL(_stext)=c0101000
SYMBOL(vmlist)=c07d3540
SYMBOL(mem_map)=c07d3500
SYMBOL(contig_page_data)=c072ce80
SIZE(page)=32
SIZE(pglist_data)=4224
SIZE(zone)=1024
SIZE(free_area)=44
SIZE(list_head)=8
SIZE(nodemask_t)=4
OFFSET(page.flags)=0
OFFSET(page._count)=4
OFFSET(page.mapping)=16
OFFSET(page.lru)=24
OFFSET(pglist_data.node_zones)=0
OFFSET(pglist_data.nr_zones)=4140
OFFSET(pglist_data.node_mem_map)=4144
OFFSET(pglist_data.node_start_pfn)=4148
OFFSET(pglist_data.node_spanned_pages)=4156
OFFSET(pglist_data.node_id)=4160
OFFSET(zone.free_area)=40
OFFSET(zone.vm_stat)=728
OFFSET(zone.spanned_pages)=916
OFFSET(free_area.free_list)=0
OFFSET(list_head.next)=0
OFFSET(list_head.prev)=4
OFFSET(vm_struct.addr)=4
LENGTH(zone.free_area)=11
SYMBOL(log_buf)=c06fc83c
SYMBOL(log_end)=c07bb7ec
SYMBOL(log_buf_len)=c06fc838
SYMBOL(logged_chars)=c07c38a0
LENGTH(free_area.free_list)=5
NUMBER(NR_FREE_PAGES)=0
NUMBER(PG_lru)=5
NUMBER(PG_private)=11
NUMBER(PG_swapcache)=16
CONFIG_X86_PAE=y
CRASHTIME=1282584565
cannot determine relocation value: not a live system
gdb /usr/src/linux-2.6.35.3/vmlinux
GNU gdb (GDB) 7.0
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
<readmem: c148a9a0, KVADDR, "kernel_config_data", 32768, (ROE), 8bad3d8>
crash: read error: kernel virtual address: c148a9a0 type: "kernel_config_data"
WARNING: cannot read kernel_config_data
<readmem: c1487e28, KVADDR, "cpu_possible_mask", 4, (FOE), bfed4bbc>
crash: read error: kernel virtual address: c1487e28 type: "cpu_possible_mask"
The read errors for the "kernel_config_data" symbol at c148a9a0 (which
returns on error; that's what ROE means) and then for the
"cpu_possible_mask" symbol at c1487e28 (which causes the session to fault
or bail out; that's FOE) mean the following: after translating those
virtual addresses to physical addresses by stripping off the c0000000
unity-map identifier, the resulting physical addresses (148a9a0 and 1487e28
respectively) were not found in the dumpfile.
And that's because the ELF header of the vmcore does not show a PT_LOAD
segment that contains those physical addresses: both fall in the gap
between the segment that ends at 1000000 and the one that starts at 5000000.
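To make that lookup concrete, here is a rough Python sketch of the check. It is only an illustration, not what crash does internally: the segment table is copied by hand from the pt_load_segment entries in your -d7 output, and the helper names (kvaddr_to_paddr, find_segment) and the fixed c0000000 offset are assumptions made for the example.

```python
# (phys_start, phys_end, file_offset) copied from the pt_load_segment[]
# entries in the "crash -d7" output quoted above.
PT_LOAD_SEGMENTS = [
    (0x00000000, 0x000a0000, 0x00000744),
    (0x00100000, 0x01000000, 0x000a0744),
    (0x05000000, 0x38000000, 0x00fa0744),
    (0x38000000, 0x3e5ff000, 0x33fa0744),
    (0x3e6c6000, 0x3f594000, 0x3a59f744),
    (0x3f59c000, 0x3f62a000, 0x3b46d744),
    (0x3f62e000, 0x3f6a9000, 0x3b4fb744),
    (0x3f6e9000, 0x3f6ed000, 0x3b576744),
    (0x3f6ff000, 0x3f700000, 0x3b57a744),
]

# Assumption for this sketch: unity-mapped kernel addresses start at c0000000.
PAGE_OFFSET = 0xc0000000


def kvaddr_to_paddr(kvaddr):
    """Strip the unity-map base from a kernel virtual address."""
    return kvaddr - PAGE_OFFSET


def find_segment(paddr):
    """Return (segment index, offset in the vmcore file) or None if unmapped."""
    for index, (start, end, file_offset) in enumerate(PT_LOAD_SEGMENTS):
        if start <= paddr < end:
            return index, file_offset + (paddr - start)
    return None


if __name__ == "__main__":
    for name, kvaddr in [
        ("kernel_config_data", 0xc148a9a0),
        ("cpu_possible_mask", 0xc1487e28),
        ("swapper_pg_dir", 0xc06e4000),
    ]:
        paddr = kvaddr_to_paddr(kvaddr)
        print(name, hex(paddr), find_segment(paddr))
```

With the values from this dump, the two failing addresses match no segment, while a VMCOREINFO symbol such as swapper_pg_dir lands in pt_load_segment[1].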
But as I mentioned before, the virtual addresses seem to be too high for
static kernel data symbols. If you run --minimal, does the "sym" command
show "cpu_possible_mask" at that address? I don't have anything later than
a 2.6.34 x86 dumpfile to use as a reference, but the symbol is much lower
in value in that kernel:
crash> sym cpu_possible_mask
c07ffa28 (R) cpu_possible_mask
crash>
And if I dump all of the symbols from within a --minimal session with that
dumpfile, I see this, where the "_end" of the static kernel virtual memory
is at c0c77000:
crash> sym -l
... [ cut ] ...
c0b50ffc (b) netlbl_unlhsh_lock
c0b51000 (b) klist_remove_lock
c0b51004 (B) __bss_stop
c0b52000 (b) .brk
c0b52000 (B) __brk_base
c0b62000 (b) .brk.pagetables
c0c67000 (b) .brk.dmi_alloc
c0c77000 (B) __brk_limit
c0c77000 (A) _end
crash>
And if you look at the "VMCOREINFO" data above in your dump for items that are
kernel symbol values, they make sense, i.e.,
SYMBOL(node_online_map)=c0730644
SYMBOL(swapper_pg_dir)=c06e4000
SYMBOL(_stext)=c0101000
SYMBOL(vmlist)=c07d3540
SYMBOL(mem_map)=c07d3500
SYMBOL(contig_page_data)=c072ce80
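As a rough way to see how far out of line the two failing addresses are, the SYMBOL() entries can be pulled out of the VMCOREINFO note and compared against them. The Python below is only a sketch: the text is pasted by hand from the -d7 output, parse_symbols is a made-up helper name, and "higher than every VMCOREINFO symbol" is just a heuristic, since VMCOREINFO exports only a handful of symbols.

```python
import re

# SYMBOL() lines (plus two other entries) pasted from the VMCOREINFO note above.
VMCOREINFO = """\
OSRELEASE=2.6.35.3-saber
PAGESIZE=4096
SYMBOL(init_uts_ns)=c06f9120
SYMBOL(node_online_map)=c0730644
SYMBOL(swapper_pg_dir)=c06e4000
SYMBOL(_stext)=c0101000
SYMBOL(vmlist)=c07d3540
SYMBOL(mem_map)=c07d3500
SYMBOL(contig_page_data)=c072ce80
SYMBOL(log_buf)=c06fc83c
SYMBOL(log_end)=c07bb7ec
SYMBOL(log_buf_len)=c06fc838
SYMBOL(logged_chars)=c07c38a0
"""

# The two addresses that crash failed to read.
SUSPECT = {
    "kernel_config_data": 0xc148a9a0,
    "cpu_possible_mask": 0xc1487e28,
}


def parse_symbols(text):
    """Collect SYMBOL(name)=hexvalue lines into a {name: address} dict."""
    symbols = {}
    for line in text.splitlines():
        match = re.match(r"SYMBOL\((\w+)\)=([0-9a-fA-F]+)$", line.strip())
        if match:
            symbols[match.group(1)] = int(match.group(2), 16)
    return symbols


symbols = parse_symbols(VMCOREINFO)
highest = max(symbols.values())

if __name__ == "__main__":
    print(f"highest VMCOREINFO symbol: {highest:#x}")
    for name, addr in SUSPECT.items():
        print(f"{name}: {addr:#x} is {addr - highest:#x} above it")
```

Every VMCOREINFO symbol in this dump sits below c0800000, whereas both failing addresses are above c1480000.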
If you run a --minimal session, what do you see when you run the two
commands that I show above (i.e., "sym cpu_possible_mask" and the tail end
of the "sym -l" output)?
Dave
But for starters, if you run the --minimal session and then execute the