August 2010 - Crash-utility - Crash Utility List Archives

Re: [Crash-utility] [PATCH 0/4] crash utility: add ARM crashdump support

by Dave Anderson

----- "Lei Wen" <adrian.wenl(a)gmail.com> wrote:

> Hi Dave,
>
> What the status of this patch series now? Could crash utilities
> support analyzing arm machine core dump in the x86 host?

That is the plan, i.e., supporting the analysis of ARM dumpfiles on both x86 and ARM hosts (and by extension on x86_64 hosts using the x86 binary), and presumably "live" on ARM hosts.

> This is very useful feature, since the arm kernel already support the
> core dump by kdump enabled.

I'm waiting for the results of the Nokia/Sony-Ericsson collaboration efforts. I've added Jan and Thomas's names to the cc: list.

Dave


Re: [Crash-utility] mount cmd crashes crash

by Dave Anderson

----- "Bob Montgomery" <bob.montgomery(a)hp.com> wrote:

> Sorry, forgot to reply all:
> ---------------------------
>
> On Wed, 2010-08-18 at 20:57 +0000, Dave Anderson wrote:
> > ----- "Bob Montgomery" <bob.montgomery(a)hp.com> wrote:
> >
> > > I'm working on a dump of a system that did not have a PID 1. I don't
> > > think it's relevant to the crash itself, but it does cause crash get
> > > a seg fault.
> > >
> > > I don't know if it was important to have the context of pid 1 for
> > > reporting mounts, or just any context, but this hack makes the problem
> > > go away, although not a very efficient way to find the lowest existing
> > > PID above 0.
> >
> > Yeah, it's not important to use the context of pid 1, but it just needs
> > some context, and I had presumed that init would always exist. I thought
> > that the panic("Attempted to kill the idle task!") in do_exit() would
> > prevent pid 1 from ever going away -- but apparently your kernel figured
> > out how to do it elsewhere... ;-)
>
> That test is for PID 0, not PID 1 (at least on the kernel I'm
> debugging.) However, there is this also:
>
>         if (unlikely(tsk == child_reaper))
>                 panic("Attempted to kill init!");

That's the one I *meant*... ;-)

> And child_reaper in the dump points to a task struct for init that isn't
> in the ps listing. Hmmm. Maybe that part *is* interesting in this dump...
>
> > Your patch would pick a kernel thread pid, and apparently everything still
> > works OK? That being the case, it's fine with me.
>
> With the patch, these commands all produce the same output:
> crash-5.0.6-fix> mount >mount.out
> crash-5.0.6-fix> mount -n 2 >mount2.out
> crash-5.0.6-fix> mount -n 1459 >mount1459.out
>
> I discovered the -n option as my first workaround.

Actually, it looks like pid 0 could be used as well.

Anyway, queued for the next release.

Thanks,
  Dave


Re: [Crash-utility] Crash issue when loading vmcore

by Dave Anderson

----- "Paul-Kenji Cahier Furuya" <pkc(a)f1-photo.com> wrote:

> On 08/23/2010 22:51, Dave Anderson wrote:
> > So that doesn't make any sense unless the vmlinux file and the vmlinux that
> > was running on the crashed kernel are not the same kernels. Are you using
> > a different kernel as the secondary kdump kernel?
>
> Just checked kdump's config and it says:
> # If these are not set, kdump-config will try to use the current
> # and initrd if it is relocatable.
> And I did not set those variables.
>
> However I checked and found out the vmlinuz(bzImage, 7.7MB extracted)
> being run seems to be stripped, while the vmlinux from the kernel
> directory(124MB) is not.
>
> Could this affect the result? Is there any way to deal properly with
> that situation?(I am using my own kernel builds, so I do not have any
> "debug kernel" packages)

It appears that the kdump configuration should be using the same kernel as the crashed kernel, but would relocate it when it gets run as the kdump kernel. But that does not explain the discrepancy between the symbol values listed by the "VMCOREINFO" data and those of the vmlinux file that you are using.

The vmlinuz file (with a "z" at the end) is useless for crash. Crash needs the debuginfo-full vmlinux file that was created by compiling the kernel with -g, and which is located at the topmost directory in the kernel source build tree.

In any case, it would be trivial to figure this out if you could log into the live system and try to run crash there -- or even simpler -- run "cat /proc/kallsyms" on that live system.

Other than that, I don't know what else to suggest at this point.

Dave
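
One way to act on this is to compare each SYMBOL(...) value in the VMCOREINFO note with the address the same symbol has in the vmlinux file or in /proc/kallsyms. The C sketch below covers only the first step, pulling the name and hex value out of one VMCOREINFO line. The function name parse_symbol_line and its buffer handling are assumptions made for illustration; this is not code from crash.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one VMCOREINFO line of the form "SYMBOL(name)=hexvalue".
 * Returns 1 and fills name/value on success, 0 for any other line
 * (for example "SIZE(page)=32") or if the name does not fit.
 * Hypothetical helper, not part of crash.
 */
static int parse_symbol_line(const char *line, char *name, size_t name_len,
                             unsigned long *value)
{
    char buf[64];
    unsigned long v;

    assert(line != NULL && name != NULL && value != NULL);

    /* %63[^)] reads the symbol name up to the closing parenthesis. */
    if (sscanf(line, "SYMBOL(%63[^)])=%lx", buf, &v) != 2)
        return 0;
    if (strlen(buf) >= name_len)
        return 0;

    strcpy(name, buf);
    *value = v;
    return 1;
}
```

Feeding it "SYMBOL(_stext)=c0101000" yields the name "_stext" and the value 0xc0101000, which can then be looked up in a symbol list taken from the vmlinux file.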


Re: [Crash-utility] [PATCH 0/4] crash utility: add ARM crashdump support

by Dave Anderson

----- "Mika Westerberg" <ext-mika.1.westerberg(a)nokia.com> wrote:

> On Wed, Jun 30, 2010 at 03:10:58PM +0200, ext Dave Anderson wrote:
> >
> > In any case, I'm more than happy to fold in ARM support, but I don't know what
> > to do in this case.
> >
> > I wonder if it would it be possible for you, Jan and Thomas to somehow collaborate
> > on this effort? It seems that both sides would benefit from the work of the other
> > side. I've added them to the cc list.
>
> Sure. Can those patches be found in some public ML? I quickly searched but
> couldn't find anything.
>
> Regards,
> MW

Jan is contacting you off-list.

Thanks,
  Dave


Re: [Crash-utility] Crash issue when loading vmcore

by Dave Anderson

----- "Paul-Kenji Cahier Furuya" <pkc(a)f1-photo.com> wrote:

> On 08/23/2010 22:20, Dave Anderson wrote:
> > Well, yes, you'd have to be able to log into the machine, and
> > then just run:
> >
> > # crash
> >
> > or if the /vmlinux file is not in a common location, do this:
> >
> > # crash /path/to/vmlinux
> >
> > And that presumes you've got crash installed on the system as well.
> >
> I might be able to get a physical access at some point, but right now I
> have none.
>
> Anything that helps from the logs?

Well, this part is still unexplainable -- in the crashd8.txt, the symbol addresses that were seen by the crashed kernel are as shown:

  # grep SYMBOL crashd8.txt
  SYMBOL(init_uts_ns)=c06f9120
  SYMBOL(node_online_map)=c0730644
  SYMBOL(swapper_pg_dir)=c06e4000
  SYMBOL(_stext)=c0101000
  SYMBOL(vmlist)=c07d3540
  SYMBOL(mem_map)=c07d3500
  SYMBOL(contig_page_data)=c072ce80
  SYMBOL(log_buf)=c06fc83c
  SYMBOL(log_end)=c07bb7ec
  SYMBOL(log_buf_len)=c06fc838
  SYMBOL(logged_chars)=c07c38a0
  #

But the "sym.l" list starts with a unity-mapped PAGE_OFFSET value of c1000000 (instead of the more common c0000000)

  c1000000 (T) _text
  c1000000 (T) startup_32
  c1000054 (t) default_entry
  c1001000 (T) _stext
  c1001010 (T) do_one_initcall
  c1001180 (t) init_post
  c10012c0 (T) name_to_dev_t
  c1001500 (T) thread_saved_pc
  c1001510 (T) prepare_to_copy
  c1001590 (T) get_wchan
  c1001640 (T) __switch_to
  ...

So that being the case, the symbol values for "init_uts_ns", "node_online_map", and so on, don't even match those of the crashed kernel:

  c1662120 (D) init_uts_ns
  c1716000 (B) swapper_pg_dir
  c1001000 (T) _stext
  c173ed3c (B) vmlist
  c173ed00 (B) mem_map
  c1696180 (D) contig_page_data
  c17248a0 (b) __log_buf
  c17247ec (b) log_end
  c16656b8 (d) log_buf_len
  c172c8a0 (b) logged_chars

So that doesn't make any sense unless the vmlinux file and the vmlinux that was running on the crashed kernel are not the same kernels. Are you using a different kernel as the secondary kdump kernel?

Dave

node_online_map doesn't even exist in your sym.l file.
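
A relocated kernel moves every symbol by the same amount, so one quick sanity check on the two lists in this message is whether the VMCOREINFO values and the sym.l values differ by a single constant. The C sketch below does that for six symbols that appear in both lists. The names struct sym_pair and same_offset are hypothetical and exist only for this illustration; this is not how crash itself compares symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One symbol as seen in VMCOREINFO and in the sym.l list. */
struct sym_pair {
    const char *name;
    uint32_t vmcoreinfo;  /* SYMBOL(name)= value from the dump */
    uint32_t vmlinux;     /* address listed in sym.l */
};

/* Values copied from the two lists quoted in this message. */
static const struct sym_pair pairs[] = {
    { "init_uts_ns",      0xc06f9120u, 0xc1662120u },
    { "swapper_pg_dir",   0xc06e4000u, 0xc1716000u },
    { "_stext",           0xc0101000u, 0xc1001000u },
    { "vmlist",           0xc07d3540u, 0xc173ed3cu },
    { "mem_map",          0xc07d3500u, 0xc173ed00u },
    { "contig_page_data", 0xc072ce80u, 0xc1696180u },
};
#define NUM_PAIRS (sizeof(pairs) / sizeof(pairs[0]))

/* Return 1 if every pair differs by the same offset, 0 otherwise. */
static int same_offset(const struct sym_pair *p, size_t n)
{
    assert(p != NULL && n > 0);

    uint32_t delta = p[0].vmlinux - p[0].vmcoreinfo;
    for (size_t i = 1; i < n; i++) {
        if (p[i].vmlinux - p[i].vmcoreinfo != delta)
            return 0;
    }
    return 1;
}
```

With these values the offsets are not constant (0xf00000 for _stext, 0xf69000 for init_uts_ns), so same_offset(pairs, NUM_PAIRS) returns 0. That is consistent with the two symbol sets coming from different kernel builds rather than from one relocated kernel.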


Re: [Crash-utility] Crash issue when loading vmcore

by Dave Anderson

----- "Paul Cahier" <pkc(a)f1-photo.com> wrote:

> Hello,
>
> I have finished setting up kdump and kexec today, recompiling my kernel
> to add everything needed in there.
> I have triggered a kernel panic by echo c>/proc/sysrq-trigger, and found
> that the vmcore dump was indeed there after all was done.
>
> However I can not get any traces out of that crash dump(short version,
> long version at the end of the email):
>
> crash /usr/src/linux-2.6.35.3/vmlinux vmcore.201008231930
> [...]
> crash: read error: kernel virtual address: c148a9a0 type: "kernel_config_data"
> WARNING: cannot read kernel_config_data
> crash: read error: kernel virtual address: c1487e28 type: "cpu_possible_mask"

The virtual addresses for "kernel_config_data" and "cpu_possible_mask" are strange (too high?) -- I'll continue the analysis at the end of your "d7" output below...

> If I try crash --minimal things do load but I'm stuck with the minimal
> error set that's not very helpful.
> All I'm looking at is getting a full trace of the kernel panic.
>
>
> - Paul-Kenji Cahier
>
>
> PS, the full version:
> crash -d7 /usr/src/linux-2.6.35.3/vmlinux vmcore.201008231930
>
> crash 5.0.6
> Copyright (C) 2002-2010 Red Hat, Inc.
> Copyright (C) 2004, 2005, 2006 IBM Corporation
> Copyright (C) 1999-2006 Hewlett-Packard Co
> Copyright (C) 2005, 2006 Fujitsu Limited
> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
> Copyright (C) 2005 NEC Corporation
> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
> This program is free software, covered by the GNU General Public
> License,
> and you are welcome to change it and/or distribute copies of it under
> certain conditions. Enter "help copying" to see the conditions.
> This program has absolutely no warranty. Enter "help warranty" for
> details.
>
> vmcore_data:
>   flags: a0 (KDUMP_LOCAL|KDUMP_ELF32)
>   ndfd: 3
>   ofp: b77344c0
>   header_size: 1860
>   num_pt_load_segments: 9
>   pt_load_segment[0]:
>     file_offset: 744
>     phys_start: 0
>     phys_end: a0000
>     zero_fill: 0
>   pt_load_segment[1]:
>     file_offset: a0744
>     phys_start: 100000
>     phys_end: 1000000
>     zero_fill: 0
>   pt_load_segment[2]:
>     file_offset: fa0744
>     phys_start: 5000000
>     phys_end: 38000000
>     zero_fill: 0
>   pt_load_segment[3]:
>     file_offset: 33fa0744
>     phys_start: 38000000
>     phys_end: 3e5ff000
>     zero_fill: 0
>   pt_load_segment[4]:
>     file_offset: 3a59f744
>     phys_start: 3e6c6000
>     phys_end: 3f594000
>     zero_fill: 0
>   pt_load_segment[5]:
>     file_offset: 3b46d744
>     phys_start: 3f59c000
>     phys_end: 3f62a000
>     zero_fill: 0
>   pt_load_segment[6]:
>     file_offset: 3b4fb744
>     phys_start: 3f62e000
>     phys_end: 3f6a9000
>     zero_fill: 0
>   pt_load_segment[7]:
>     file_offset: 3b576744
>     phys_start: 3f6e9000
>     phys_end: 3f6ed000
>     zero_fill: 0
>   pt_load_segment[8]:
>     file_offset: 3b57a744
>     phys_start: 3f6ff000
>     phys_end: 3f700000
>     zero_fill: 0
>   elf_header: 85368c0
>   elf32: 85368c0
>   notes32: 85368f4
>   load32: 8536914
>   elf64: 0
>   notes64: 0
>   load64: 0
>   nt_prstatus: 8536a34
>   nt_prpsinfo: 0
>   nt_taskstruct: 0
>   task_struct: 0
>   page_size: 0
>   switch_stack: 0
>   xen_kdump_data: (unused)
>   num_prstatus_notes: 2
>   vmcoreinfo: 0
>   size_vmcoreinfo: 0
>   nt_prstatus_percpu:
>     08536a34 08536ad8
>
> Elf32_Ehdr:
>   e_ident: \177ELF
>   e_ident[EI_CLASS]: 1 (ELFCLASS32)
>   e_ident[EI_DATA]: 1 (ELFDATA2LSB)
>   e_ident[EI_VERSION]: 1 (EV_CURRENT)
>   e_ident[EI_OSABI]: 0 (ELFOSABI_SYSV)
>   e_ident[EI_ABIVERSION]: 0
>   e_type: 4 (ET_CORE)
>   e_machine: 3 (EM_386)
>   e_version: 1 (EV_CURRENT)
>   e_entry: 0
>   e_phoff: 34
>   e_shoff: 0
>   e_flags: 0
>   e_ehsize: 34
>   e_phentsize: 20
>   e_phnum: a
>   e_shentsize: 0
>   e_shnum: 0
>   e_shstrndx: 0
> Elf32_Phdr:
>   p_type: 4 (PT_NOTE)
>   p_offset: 372 (174)
>   p_vaddr: 0
>   p_paddr: 0
>   p_filesz: 1488 (5d0)
>   p_memsz: 1488 (5d0)
>   p_flags: 0 ()
>   p_align: 0
> Elf32_Phdr:
>   p_type: 1 (PT_LOAD)
>   p_offset: 1860 (744)
>   p_vaddr: c0000000
>   p_paddr: 0
>   p_filesz: 655360 (a0000)
>   p_memsz: 655360 (a0000)
>   p_flags: 7 (PF_X|PF_W|PF_R)
>   p_align: 0
> Elf32_Phdr:
>   p_type: 1 (PT_LOAD)
>   p_offset: 657220 (a0744)
>   p_vaddr: c0100000
>   p_paddr: 100000
>   p_filesz: 15728640 (f00000)
>   p_memsz: 15728640 (f00000)
>   p_flags: 7 (PF_X|PF_W|PF_R)
>   p_align: 0
> Elf32_Phdr:
>   p_type: 1 (PT_LOAD)
>   p_offset: 16385860 (fa0744)
>   p_vaddr: c5000000
>   p_paddr: 5000000
>   p_filesz: 855638016 (33000000)
>   p_memsz: 855638016 (33000000)
>   p_flags: 7 (PF_X|PF_W|PF_R)
>   p_align: 0
> Elf32_Phdr:
>   p_type: 1 (PT_LOAD)
>   p_offset: 872023876 (33fa0744)
>   p_vaddr: ffffffff
>   p_paddr: 38000000
>   p_filesz: 106950656 (65ff000)
>   p_memsz: 106950656 (65ff000)
>   p_flags: 7 (PF_X|PF_W|PF_R)
>   p_align: 0
> Elf32_Phdr:
>   p_type: 1 (PT_LOAD)
>   p_offset: 978974532 (3a59f744)
>   p_vaddr: ffffffff
>   p_paddr: 3e6c6000
>   p_filesz: 15523840 (ece000)
>   p_memsz: 15523840 (ece000)
>   p_flags: 7 (PF_X|PF_W|PF_R)
>   p_align: 0
> Elf32_Phdr:
>   p_type: 1 (PT_LOAD)
>   p_offset: 994498372 (3b46d744)
>   p_vaddr: ffffffff
>   p_paddr: 3f59c000
>   p_filesz: 581632 (8e000)
>   p_memsz: 581632 (8e000)
>   p_flags: 7 (PF_X|PF_W|PF_R)
>   p_align: 0
> Elf32_Phdr:
>   p_type: 1 (PT_LOAD)
>   p_offset: 995080004 (3b4fb744)
>   p_vaddr: ffffffff
>   p_paddr: 3f62e000
>   p_filesz: 503808 (7b000)
>   p_memsz: 503808 (7b000)
>   p_flags: 7 (PF_X|PF_W|PF_R)
>   p_align: 0
> Elf32_Phdr:
>   p_type: 1 (PT_LOAD)
>   p_offset: 995583812 (3b576744)
>   p_vaddr: ffffffff
>   p_paddr: 3f6e9000
>   p_filesz: 16384 (4000)
>   p_memsz: 16384 (4000)
>   p_flags: 7 (PF_X|PF_W|PF_R)
>   p_align: 0
> Elf32_Phdr:
>   p_type: 1 (PT_LOAD)
>   p_offset: 995600196 (3b57a744)
>   p_vaddr: ffffffff
>   p_paddr: 3f6ff000
>   p_filesz: 4096 (1000)
>   p_memsz: 4096 (1000)
>   p_flags: 7 (PF_X|PF_W|PF_R)
>   p_align: 0
> Elf32_Nhdr:
>   n_namesz: 5 ("CORE")
>   n_descsz: 144
>   n_type: 1 (NT_PRSTATUS)
>     00000000 00000000 00000000 00000000
>     00000000 00000000 00000000 00000000
>     00000000 00000000 00000000 00000000
>     00000000 00000000 00000000 00000000
>     00000000 00000000 00000000 00000000
>     00000000 00000000 00000000 00000401
>     c06fef00 00000001 0000f004 0000f008
>     00000000 c06e3ecc 00000282 00000282
>     00000024 c06e3fa4 00000068 00000000
> Elf32_Nhdr:
>   n_namesz: 5 ("CORE")
>   n_descsz: 144
>   n_type: 1 (NT_PRSTATUS)
>     00000000 00000000 00000000 00000000
>     00000000 00000000 00000dbd 00000000
>     00000000 00000000 00000000 00000000
>     00000000 00000000 00000000 00000000
>     00000000 00000000 c0712420 00007e7e
>     00000000 00000063 00000000 f0639f0c
>     00000063 0000007b 0000007b 000000d8
>     00000033 ffffffff c03145b2 00000060
>     00010086 f0639f0c 00000068 00000000
> Elf32_Nhdr:
>   n_namesz: 11 ("VMCOREINFO")
>   n_descsz: 1134
>   n_type: 0 (unused)
>     OSRELEASE=2.6.35.3-saber
>     PAGESIZE=4096
>     SYMBOL(init_uts_ns)=c06f9120
>     SYMBOL(node_online_map)=c0730644
>     SYMBOL(swapper_pg_dir)=c06e4000
>     SYMBOL(_stext)=c0101000
>     SYMBOL(vmlist)=c07d3540
>     SYMBOL(mem_map)=c07d3500
>     SYMBOL(contig_page_data)=c072ce80
>     SIZE(page)=32
>     SIZE(pglist_data)=4224
>     SIZE(zone)=1024
>     SIZE(free_area)=44
>     SIZE(list_head)=8
>     SIZE(nodemask_t)=4
>     OFFSET(page.flags)=0
>     OFFSET(page._count)=4
>     OFFSET(page.mapping)=16
>     OFFSET(page.lru)=24
>     OFFSET(pglist_data.node_zones)=0
>     OFFSET(pglist_data.nr_zones)=4140
>     OFFSET(pglist_data.node_mem_map)=4144
>     OFFSET(pglist_data.node_start_pfn)=4148
>     OFFSET(pglist_data.node_spanned_pages)=4156
>     OFFSET(pglist_data.node_id)=4160
>     OFFSET(zone.free_area)=40
>     OFFSET(zone.vm_stat)=728
>     OFFSET(zone.spanned_pages)=916
>     OFFSET(free_area.free_list)=0
>     OFFSET(list_head.next)=0
>     OFFSET(list_head.prev)=4
>     OFFSET(vm_struct.addr)=4
>     LENGTH(zone.free_area)=11
>     SYMBOL(log_buf)=c06fc83c
>     SYMBOL(log_end)=c07bb7ec
>     SYMBOL(log_buf_len)=c06fc838
>     SYMBOL(logged_chars)=c07c38a0
>     LENGTH(free_area.free_list)=5
>     NUMBER(NR_FREE_PAGES)=0
>     NUMBER(PG_lru)=5
>     NUMBER(PG_private)=11
>     NUMBER(PG_swapcache)=16
>     CONFIG_X86_PAE=y
>     CRASHTIME=1282584565
> cannot determine relocation value: not a live system
> gdb /usr/src/linux-2.6.35.3/vmlinux
> GNU gdb (GDB) 7.0
> Copyright (C) 2009 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "i686-pc-linux-gnu"...
>
> <readmem: c148a9a0, KVADDR, "kernel_config_data", 32768, (ROE), 8bad3d8>
> crash: read error: kernel virtual address: c148a9a0 type: "kernel_config_data"
> WARNING: cannot read kernel_config_data
> <readmem: c1487e28, KVADDR, "cpu_possible_mask", 4, (FOE), bfed4bbc>
> crash: read error: kernel virtual address: c1487e28 type: "cpu_possible_mask"

The read errors for the "kernel_config_data" symbol at c148a9a0 (which returns on error -- that's what ROE means), and then for the "cpu_possible_mask" symbol at c1487e28 (which causes the session to fault or bail out -- FOE), mean that -- after translating those virtual addresses to physical addresses by stripping off the c0000000 unity-map identifier -- those physical addresses (at 148a9a0 and 1487e28 respectively) were not found in the dumpfile. And that's because the ELF header of the vmcore does not show a PT_LOAD segment that contains those physical addresses.

But as I mentioned before, the virtual addresses seem to be too high for static kernel data symbols. If you run --minimal, does the "sym" command show "cpu_possible_mask" at that address? I don't have anything later than a 2.6.34 x86 dumpfile to use as a reference, but the symbol is much lower in value in that kernel:

  crash> sym cpu_possible_mask
  c07ffa28 (R) cpu_possible_mask
  crash>

And if I dump all of the symbols from within a --minimal session with that dumpfile, I see this, where the "_end" of the static kernel virtual memory is at c0c77000:

  crash> sym -l
  ... [ cut ] ...
  c0b50ffc (b) netlbl_unlhsh_lock
  c0b51000 (b) klist_remove_lock
  c0b51004 (B) __bss_stop
  c0b52000 (b) .brk
  c0b52000 (B) __brk_base
  c0b62000 (b) .brk.pagetables
  c0c67000 (b) .brk.dmi_alloc
  c0c77000 (B) __brk_limit
  c0c77000 (A) _end
  crash>

And if you look at the "VMCOREINFO" data above in your dump for items that are kernel symbol values, they make sense, i.e.,

> SYMBOL(node_online_map)=c0730644
> SYMBOL(swapper_pg_dir)=c06e4000
> SYMBOL(_stext)=c0101000
> SYMBOL(vmlist)=c07d3540
> SYMBOL(mem_map)=c07d3500
> SYMBOL(contig_page_data)=c072ce80

If you run a --minimal session, what do you see when you run the two commands that I show above? (i.e., "sym cpu_possible_mask" & the output of the tail end of "sym -l")

Dave

But for starters, if you run the --minimal session and then execute the
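
The lookup described in this message (strip the unity-map base, then look for a PT_LOAD segment that covers the resulting physical address) can be sketched in C as follows. The segment table is copied from the vmcore_data output quoted above. The names struct pt_load, kvaddr_to_paddr, find_segment and ASSUMED_PAGE_OFFSET are made up for this illustration and are not crash's internal API; the sketch also assumes the simple c0000000 unity mapping mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One PT_LOAD segment, covering the physical range [phys_start, phys_end). */
struct pt_load {
    uint32_t phys_start;
    uint32_t phys_end;
};

/* Physical ranges copied from the vmcore_data output in this message. */
static const struct pt_load vmcore_segments[] = {
    { 0x00000000u, 0x000a0000u },
    { 0x00100000u, 0x01000000u },
    { 0x05000000u, 0x38000000u },
    { 0x38000000u, 0x3e5ff000u },
    { 0x3e6c6000u, 0x3f594000u },
    { 0x3f59c000u, 0x3f62a000u },
    { 0x3f62e000u, 0x3f6a9000u },
    { 0x3f6e9000u, 0x3f6ed000u },
    { 0x3f6ff000u, 0x3f700000u },
};
#define NUM_SEGMENTS (sizeof(vmcore_segments) / sizeof(vmcore_segments[0]))

/* Assumed unity-map base for this 32-bit x86 kernel. */
#define ASSUMED_PAGE_OFFSET 0xc0000000u

/* Strip the unity-map base from a kernel virtual address. */
static uint32_t kvaddr_to_paddr(uint32_t kvaddr)
{
    return kvaddr - ASSUMED_PAGE_OFFSET;
}

/* Return the index of the segment containing paddr, or -1 if none does. */
static int find_segment(const struct pt_load *segs, size_t nsegs, uint32_t paddr)
{
    assert(segs != NULL);

    for (size_t i = 0; i < nsegs; i++) {
        if (paddr >= segs[i].phys_start && paddr < segs[i].phys_end)
            return (int)i;
    }
    return -1;
}
```

Both failing addresses translate to physical addresses (148a9a0 and 1487e28) that fall in the gap between segment 1, which ends at 1000000, and segment 2, which starts at 5000000, so find_segment returns -1 for them. The 2.6.34 reference address c07ffa28 translates to 7ffa28, which this table would place in segment 1.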


Re: [Crash-utility] Crash issue when loading vmcore

by Dave Anderson

----- "Paul-Kenji Cahier Furuya" <pkc(a)f1-photo.com> wrote:

> Here's for sym cpu_possible_mask in minimal mode:
> crash> sym cpu_possible_mask
> c14e7d88 (R) cpu_possible_mask
>
> And here's the tail of sym -l:
> c175d4e0 (b) sunrpc_table_header
> c175d4e4 (B) sctp_assocs_id_lock
> c175d4e8 (B) proc_net_sctp
> c175d4ec (B) sctp_assocs_id
> c175d500 (B) sysctl_sctp_mem
> c175d50c (B) sysctl_sctp_rmem
> c175d518 (B) sysctl_sctp_wmem
> c175d524 (b) __key.46606
> c175d524 (b) sctp_ctl_sock
> c175d528 (b) sctp_pf_inet6_specific
> c175d52c (b) sctp_pf_inet_specific
> c175d530 (b) sctp_af_v4_specific
> c175d534 (b) sctp_af_v6_specific
> c175d538 (b) __key.44408
> c175d538 (b) sctp_rand.42824
> c175d53c (B) sctp_sockets_allocated
> c175d54c (b) sctp_memory_pressure
> c175d550 (b) sctp_memory_allocated
> c175d554 (b) sctp_sysctl_header
> c175d558 (b) zero
> c175d55c (b) klist_remove_lock
> c175d560 (B) __bss_stop
> c175e000 (b) .brk
> c175e000 (B) __brk_base
> c176e000 (b) .brk.pagetables
> c17ee000 (b) .brk.dmi_alloc
> c17fe000 (B) __brk_limit
> c17fe000 (A) _end

That's interesting -- did you add some huge data structure or something to the kernel?

OK -- three more requests -- can you bring up the --minimal session, and then do this:

  crash> sym -l > sym.l

and send the "sym.l" file? (It's long, so send it as an attachment)

Secondly, send the output of:

  # crash -d8 /usr/src/linux-2.6.35.3/vmlinux vmcore.201008231930

The -d8 output will also show the physical address translation, like this:

  <readmem: c07ffa28, KVADDR, "cpu_possible_mask", 4, (FOE), bff3d43c>
  addr: c07ffa28 paddr: 7ffa28 cnt: 4

And third, send the output of:

  # readelf -a vmcore.201008231930

Dave


Crash issue when loading vmcore

by Paul Cahier

Hello,

I have finished setting up kdump and kexec today, recompiling my kernel to add everything needed in there.
I have triggered a kernel panic by echo c>/proc/sysrq-trigger, and found that the vmcore dump was indeed there after all was done.

However I can not get any traces out of that crash dump(short version, long version at the end of the email):

crash /usr/src/linux-2.6.35.3/vmlinux vmcore.201008231930
[...]
crash: read error: kernel virtual address: c148a9a0 type: "kernel_config_data"
WARNING: cannot read kernel_config_data
crash: read error: kernel virtual address: c1487e28 type: "cpu_possible_mask"

If I try crash --minimal things do load but I'm stuck with the minimal error set that's not very helpful.
All I'm looking at is getting a full trace of the kernel panic.


- Paul-Kenji Cahier


PS, the full version:
crash -d7 /usr/src/linux-2.6.35.3/vmlinux vmcore.201008231930

crash 5.0.6
Copyright (C) 2002-2010 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

vmcore_data:
  flags: a0 (KDUMP_LOCAL|KDUMP_ELF32)
  ndfd: 3
  ofp: b77344c0
  header_size: 1860
  num_pt_load_segments: 9
  pt_load_segment[0]:
    file_offset: 744
    phys_start: 0
    phys_end: a0000
    zero_fill: 0
  pt_load_segment[1]:
    file_offset: a0744
    phys_start: 100000
    phys_end: 1000000
    zero_fill: 0
  pt_load_segment[2]:
    file_offset: fa0744
    phys_start: 5000000
    phys_end: 38000000
    zero_fill: 0
  pt_load_segment[3]:
    file_offset: 33fa0744
    phys_start: 38000000
    phys_end: 3e5ff000
    zero_fill: 0
  pt_load_segment[4]:
    file_offset: 3a59f744
    phys_start: 3e6c6000
    phys_end: 3f594000
    zero_fill: 0
  pt_load_segment[5]:
    file_offset: 3b46d744
    phys_start: 3f59c000
    phys_end: 3f62a000
    zero_fill: 0
  pt_load_segment[6]:
    file_offset: 3b4fb744
    phys_start: 3f62e000
    phys_end: 3f6a9000
    zero_fill: 0
  pt_load_segment[7]:
    file_offset: 3b576744
    phys_start: 3f6e9000
    phys_end: 3f6ed000
    zero_fill: 0
  pt_load_segment[8]:
    file_offset: 3b57a744
    phys_start: 3f6ff000
    phys_end: 3f700000
    zero_fill: 0
  elf_header: 85368c0
  elf32: 85368c0
  notes32: 85368f4
  load32: 8536914
  elf64: 0
  notes64: 0
  load64: 0
  nt_prstatus: 8536a34
  nt_prpsinfo: 0
  nt_taskstruct: 0
  task_struct: 0
  page_size: 0
  switch_stack: 0
  xen_kdump_data: (unused)
  num_prstatus_notes: 2
  vmcoreinfo: 0
  size_vmcoreinfo: 0
  nt_prstatus_percpu:
    08536a34 08536ad8

Elf32_Ehdr:
  e_ident: \177ELF
  e_ident[EI_CLASS]: 1 (ELFCLASS32)
  e_ident[EI_DATA]: 1 (ELFDATA2LSB)
  e_ident[EI_VERSION]: 1 (EV_CURRENT)
  e_ident[EI_OSABI]: 0 (ELFOSABI_SYSV)
  e_ident[EI_ABIVERSION]: 0
  e_type: 4 (ET_CORE)
  e_machine: 3 (EM_386)
  e_version: 1 (EV_CURRENT)
  e_entry: 0
  e_phoff: 34
  e_shoff: 0
  e_flags: 0
  e_ehsize: 34
  e_phentsize: 20
  e_phnum: a
  e_shentsize: 0
  e_shnum: 0
  e_shstrndx: 0
Elf32_Phdr:
  p_type: 4 (PT_NOTE)
  p_offset: 372 (174)
  p_vaddr: 0
  p_paddr: 0
  p_filesz: 1488 (5d0)
  p_memsz: 1488 (5d0)
  p_flags: 0 ()
  p_align: 0
Elf32_Phdr:
  p_type: 1 (PT_LOAD)
  p_offset: 1860 (744)
  p_vaddr: c0000000
  p_paddr: 0
  p_filesz: 655360 (a0000)
  p_memsz: 655360 (a0000)
  p_flags: 7 (PF_X|PF_W|PF_R)
  p_align: 0
Elf32_Phdr:
  p_type: 1 (PT_LOAD)
  p_offset: 657220 (a0744)
  p_vaddr: c0100000
  p_paddr: 100000
  p_filesz: 15728640 (f00000)
  p_memsz: 15728640 (f00000)
  p_flags: 7 (PF_X|PF_W|PF_R)
  p_align: 0
Elf32_Phdr:
  p_type: 1 (PT_LOAD)
  p_offset: 16385860 (fa0744)
  p_vaddr: c5000000
  p_paddr: 5000000
  p_filesz: 855638016 (33000000)
  p_memsz: 855638016 (33000000)
  p_flags: 7 (PF_X|PF_W|PF_R)
  p_align: 0
Elf32_Phdr:
  p_type: 1 (PT_LOAD)
  p_offset: 872023876 (33fa0744)
  p_vaddr: ffffffff
  p_paddr: 38000000
  p_filesz: 106950656 (65ff000)
  p_memsz: 106950656 (65ff000)
  p_flags: 7 (PF_X|PF_W|PF_R)
  p_align: 0
Elf32_Phdr:
  p_type: 1 (PT_LOAD)
  p_offset: 978974532 (3a59f744)
  p_vaddr: ffffffff
  p_paddr: 3e6c6000
  p_filesz: 15523840 (ece000)
  p_memsz: 15523840 (ece000)
  p_flags: 7 (PF_X|PF_W|PF_R)
  p_align: 0
Elf32_Phdr:
  p_type: 1 (PT_LOAD)
  p_offset: 994498372 (3b46d744)
  p_vaddr: ffffffff
  p_paddr: 3f59c000
  p_filesz: 581632 (8e000)
  p_memsz: 581632 (8e000)
  p_flags: 7 (PF_X|PF_W|PF_R)
  p_align: 0
Elf32_Phdr:
  p_type: 1 (PT_LOAD)
  p_offset: 995080004 (3b4fb744)
  p_vaddr: ffffffff
  p_paddr: 3f62e000
  p_filesz: 503808 (7b000)
  p_memsz: 503808 (7b000)
  p_flags: 7 (PF_X|PF_W|PF_R)
  p_align: 0
Elf32_Phdr:
  p_type: 1 (PT_LOAD)
  p_offset: 995583812 (3b576744)
  p_vaddr: ffffffff
  p_paddr: 3f6e9000
  p_filesz: 16384 (4000)
  p_memsz: 16384 (4000)
  p_flags: 7 (PF_X|PF_W|PF_R)
  p_align: 0
Elf32_Phdr:
  p_type: 1 (PT_LOAD)
  p_offset: 995600196 (3b57a744)
  p_vaddr: ffffffff
  p_paddr: 3f6ff000
  p_filesz: 4096 (1000)
  p_memsz: 4096 (1000)
  p_flags: 7 (PF_X|PF_W|PF_R)
  p_align: 0
Elf32_Nhdr:
  n_namesz: 5 ("CORE")
  n_descsz: 144
  n_type: 1 (NT_PRSTATUS)
    00000000 00000000 00000000 00000000
    00000000 00000000 00000000 00000000
    00000000 00000000 00000000 00000000
    00000000 00000000 00000000 00000000
    00000000 00000000 00000000 00000000
    00000000 00000000 00000000 00000401
    c06fef00 00000001 0000f004 0000f008
    00000000 c06e3ecc 00000282 00000282
    00000024 c06e3fa4 00000068 00000000
Elf32_Nhdr:
  n_namesz: 5 ("CORE")
  n_descsz: 144
  n_type: 1 (NT_PRSTATUS)
    00000000 00000000 00000000 00000000
    00000000 00000000 00000dbd 00000000
    00000000 00000000 00000000 00000000
    00000000 00000000 00000000 00000000
    00000000 00000000 c0712420 00007e7e
    00000000 00000063 00000000 f0639f0c
    00000063 0000007b 0000007b 000000d8
    00000033 ffffffff c03145b2 00000060
    00010086 f0639f0c 00000068 00000000
Elf32_Nhdr:
  n_namesz: 11 ("VMCOREINFO")
  n_descsz: 1134
  n_type: 0 (unused)
    OSRELEASE=2.6.35.3-saber
    PAGESIZE=4096
    SYMBOL(init_uts_ns)=c06f9120
    SYMBOL(node_online_map)=c0730644
    SYMBOL(swapper_pg_dir)=c06e4000
    SYMBOL(_stext)=c0101000
    SYMBOL(vmlist)=c07d3540
    SYMBOL(mem_map)=c07d3500
    SYMBOL(contig_page_data)=c072ce80
    SIZE(page)=32
    SIZE(pglist_data)=4224
    SIZE(zone)=1024
    SIZE(free_area)=44
    SIZE(list_head)=8
    SIZE(nodemask_t)=4
    OFFSET(page.flags)=0
    OFFSET(page._count)=4
    OFFSET(page.mapping)=16
    OFFSET(page.lru)=24
    OFFSET(pglist_data.node_zones)=0
    OFFSET(pglist_data.nr_zones)=4140
    OFFSET(pglist_data.node_mem_map)=4144
    OFFSET(pglist_data.node_start_pfn)=4148
    OFFSET(pglist_data.node_spanned_pages)=4156
    OFFSET(pglist_data.node_id)=4160
    OFFSET(zone.free_area)=40
    OFFSET(zone.vm_stat)=728
    OFFSET(zone.spanned_pages)=916
    OFFSET(free_area.free_list)=0
    OFFSET(list_head.next)=0
    OFFSET(list_head.prev)=4
    OFFSET(vm_struct.addr)=4
    LENGTH(zone.free_area)=11
    SYMBOL(log_buf)=c06fc83c
    SYMBOL(log_end)=c07bb7ec
    SYMBOL(log_buf_len)=c06fc838
    SYMBOL(logged_chars)=c07c38a0
    LENGTH(free_area.free_list)=5
    NUMBER(NR_FREE_PAGES)=0
    NUMBER(PG_lru)=5
    NUMBER(PG_private)=11
    NUMBER(PG_swapcache)=16
    CONFIG_X86_PAE=y
    CRASHTIME=1282584565
cannot determine relocation value: not a live system
gdb /usr/src/linux-2.6.35.3/vmlinux
GNU gdb (GDB) 7.0
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...

<readmem: c148a9a0, KVADDR, "kernel_config_data", 32768, (ROE), 8bad3d8>
crash: read error: kernel virtual address: c148a9a0 type: "kernel_config_data"
WARNING: cannot read kernel_config_data
<readmem: c1487e28, KVADDR, "cpu_possible_mask", 4, (FOE), bfed4bbc>
crash: read error: kernel virtual address: c1487e28 type: "cpu_possible_mask"


Re: [Crash-utility] mount cmd crashes crash

by Bob Montgomery

Sorry, forgot to reply all:
---------------------------

On Wed, 2010-08-18 at 20:57 +0000, Dave Anderson wrote:
> ----- "Bob Montgomery" <bob.montgomery(a)hp.com> wrote:
>
> > I'm working on a dump of a system that did not have a PID 1. I don't
> > think it's relevant to the crash itself, but it does cause crash get
> > a seg fault.
> >
> > I don't know if it was important to have the context of pid 1 for
> > reporting mounts, or just any context, but this hack makes the problem
> > go away, although not a very efficient way to find the lowest existing
> > PID above 0.
>
> Yeah, it's not important to use the context of pid 1, but it just needs
> some context, and I had presumed that init would always exist. I thought
> that the panic("Attempted to kill the idle task!") in do_exit() would
> prevent pid 1 from ever going away -- but apparently your kernel figured
> out how to do it elsewhere... ;-)

That test is for PID 0, not PID 1 (at least on the kernel I'm debugging.) However, there is this also:

        if (unlikely(tsk == child_reaper))
                panic("Attempted to kill init!");

And child_reaper in the dump points to a task struct for init that isn't in the ps listing. Hmmm. Maybe that part *is* interesting in this dump...

> Your patch would pick a kernel thread pid, and apparently everything still
> works OK? That being the case, it's fine with me.

With the patch, these commands all produce the same output:
crash-5.0.6-fix> mount >mount.out
crash-5.0.6-fix> mount -n 2 >mount2.out
crash-5.0.6-fix> mount -n 1459 >mount1459.out

I discovered the -n option as my first workaround.

Bob M.


Re: [Crash-utility] mount cmd crashes crash

by Dave Anderson

----- "Bob Montgomery" <bob.montgomery(a)hp.com> wrote:

> I'm working on a dump of a system that did not have a PID 1. I don't
> think it's relevant to the crash itself, but it does cause crash get
> a seg fault.
>
> crash> ps | head
>    PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
>       0      0   0  ffffffff805144c0  RU   0.0       0      0  [swapper]
>       0     -1   1  ffff81012bc0a100  RU   0.0       0      0  [swapper]
>       2     -1   0  ffff81012bd3c040  IN   0.0       0      0  [migration/0]
>       3     -1   0  ffff81012bd3e7c0  RU   0.0       0      0  [ksoftirqd/0]
>       4     -1   0  ffff81012bd3e080  IN   0.0       0      0  [watchdog/0]
>       5     -1   1  ffff81012bd3f800  IN   0.0       0      0  [migration/1]
>       6     -1   1  ffff81012bd3f0c0  RU   0.0       0      0  [ksoftirqd/1]
>       7     -1   1  ffff81012bc0a840  IN   0.0       0      0  [watchdog/1]
>       8     -1   0  ffff81012af02880  IN   0.0       0      0  [events/0]
> crash> mount
> Segmentation fault (core dumped)
>
> In cmd_mount, this returns null and subsequent use causes the seg fault:
>
> 1156
> 1157         namespace_context = pid_to_context(1);
>
> I don't know if it was important to have the context of pid 1 for
> reporting mounts, or just any context, but this hack makes the problem
> go away, although not a very efficient way to find the lowest existing
> PID above 0.

Yeah, it's not important to use the context of pid 1, but it just needs some context, and I had presumed that init would always exist. I thought that the panic("Attempted to kill the idle task!") in do_exit() would prevent pid 1 from ever going away -- but apparently your kernel figured out how to do it elsewhere... ;-)

Your patch would pick a kernel thread pid, and apparently everything still works OK? That being the case, it's fine with me.

Thanks,
  Dave

> --- filesys.c.orig 2010-08-18 14:03:26.000000000 -0600
> +++ filesys.c 2010-08-18 14:10:02.000000000 -0600
> @@ -1153,8 +1153,12 @@ cmd_mount(void)
>          ulong vfsmount = 0;
>          int flags = 0;
>          int save_next;
> +        ulong pid;
>
> -        namespace_context = pid_to_context(1);
> +        /* find a context */
> +        pid = 1;
> +        while ((namespace_context = pid_to_context(pid)) == NULL)
> +                pid++;
>
>          while ((c = getopt(argcnt, args, "ifn:")) != EOF) {
>                  switch(c)
>
> Bob Montgomery
> At HP
>
> --
> Crash-utility mailing list
> Crash-utility(a)redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility
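
The loop in the patch keeps incrementing pid until a lookup succeeds and has no explicit upper bound. A bounded variant of the same idea might look like the C sketch below. It is only an illustration: context_lookup_fn is a hypothetical stand-in for crash's pid_to_context() (assumed here to return NULL when no task has the given PID), and first_context, max_pid, demo_lookup and empty_lookup are invented names, not the code that was queued for the next release.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for pid_to_context(): NULL means "no such PID". */
typedef void *(*context_lookup_fn)(unsigned long pid);

/*
 * Return the context of the lowest existing PID in [1, max_pid],
 * or NULL if none exists. If found_pid is non-NULL it receives the PID.
 */
static void *first_context(context_lookup_fn lookup, unsigned long max_pid,
                           unsigned long *found_pid)
{
    assert(lookup != NULL);

    for (unsigned long pid = 1; pid <= max_pid; pid++) {
        void *ctx = lookup(pid);
        if (ctx != NULL) {
            if (found_pid != NULL)
                *found_pid = pid;
            return ctx;
        }
    }
    return NULL;  /* caller must handle "no context" instead of dereferencing */
}

/* Toy lookup modeled on the ps listing above: PID 1 is missing, PID 2 exists. */
static int demo_task;
static void *demo_lookup(unsigned long pid)
{
    return pid == 2 ? (void *)&demo_task : NULL;
}

/* Toy lookup for a dump in which no PID in the searched range exists. */
static void *empty_lookup(unsigned long pid)
{
    (void)pid;
    return NULL;
}
```

With demo_lookup the search stops at PID 2, the first entry after the swapper tasks in the listing. With empty_lookup it returns NULL after max_pid attempts, so the caller can report an error rather than loop or fault.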

