Hi Dave,
Before the patch mentioned below, makedumpfile hardcoded nr_cpus to 1 in
the header.
..........................
author Ken'ichi Ohmichi <oomichi(a)mxs.nes.nec.co.jp>
Thu, 11 Nov 2010 03:53:16 +0000 (12:53 +0900)
Before applying this patch, makedumpfile sets "1" to nr_cpus in kdump
main header always even if a machine has multiple cpus.
As the result, the subcommand "help -n" of the crash utility prints
an invalid value "1" as nr_cpus.
...........................
Could this be part of Joe's issue?
Thanks,
Jeff Hagen
Interesting -- and depending upon which version of kexec-tools he has
installed, it certainly could affect the display of "nr_cpus" shown
by "help -n".
But as it turns out, the setting of "nr_cpus" in the header is irrelevant
with respect to compressed kdumps. It was used by the original diskdump-format
dumpfiles, upon which the compressed kdump dumpfile format is based. So that's
a red herring, regardless of whether Ken'ichi's patch was applied.
What's unexplainable here is the dump of the note information:
The determination of the number of ELF nt_prstatus notes is based
upon the contents of the kdump_sub_header, where "size_note" describes
a single buffer in the dumpfile that contains an array of nt_prstatus
notes. Each note consists of a small Elf64_Nhdr header, a name string,
and a register dump. Here's an example of one taken from an ELF-format
kdump:
Elf64_Nhdr:
n_namesz: 5 ("CORE")
n_descsz: 336
n_type: 1 (NT_PRSTATUS)
0000000000000000 0000000000000000
0000000000000000 0000000000000000
000000000000544d 0000000000000000
0000000000000000 0000000000000000
0000000000000000 0000000000000000
0000000000000000 0000000000000000
0000000000000000 0000000000000000
0000000000000001 00007fffb894e76f
00007fffb894e170 ffff88012969ebf0
ffff880128541f88 ffff880108870a00
0000000000000000 0000000000000000
ffffffff8184f8f0 0000000000000000
ffff880108870ab0 0000000000000003
0000000000000004 ffffffff81ad7fd0
ffff8801280e44b0 ffffffffffffffff
ffffffff8108d378 0000000000000010
0000000000010202 ffff88012854de68
0000000000000018 00007fa3283af700
0000000000000000 0000000000000000
0000000000000000 0000000000000000
0000000000000000 0000000000000000
Rounded up, each note is roughly 350 bytes. So while a "size_note"
of 1780 bytes wouldn't be large enough to contain the notes for 16 cpus,
it would seem to contain more than 1 note. (???) But the note-gathering
code was only able to come up with a "num_prstatus_notes" of 1.
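
The size estimate can be checked with a few lines of C. This is only a
sketch: the helper names are made up for illustration, and it assumes the
usual 4-byte padding of the name and descriptor fields and that every note
has the same n_namesz/n_descsz as the example shown. Under those
assumptions one note occupies 12 + 8 + 336 = 356 bytes, so a 1780-byte
buffer has room for 5 such notes, while 16 of them would need 5696 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: note name and descriptor are padded to 4-byte boundaries. */
static size_t align4(size_t n)
{
    return (n + 3) & ~(size_t)3;
}

/* Size of one note: 12-byte header + padded name + padded descriptor. */
static size_t elf_note_size(uint32_t n_namesz, uint32_t n_descsz)
{
    /* The header holds three 32-bit fields: n_namesz, n_descsz, n_type. */
    const size_t nhdr_size = 3 * sizeof(uint32_t);

    return nhdr_size + align4(n_namesz) + align4(n_descsz);
}

/* Number of whole notes of that shape that fit in buf_size bytes. */
static size_t notes_that_fit(size_t buf_size, uint32_t n_namesz,
                             uint32_t n_descsz)
{
    return buf_size / elf_note_size(n_namesz, n_descsz);
}
```

Calling notes_that_fit(1780, 5, 336) uses the size_note value from the
dumpfile and the n_namesz/n_descsz values from the example note.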
It would be interesting to find out what happened in the
x86_process_elf_notes() function.
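
For reference, walking a buffer of notes and counting the NT_PRSTATUS
entries can be sketched as below. This is not the code in
x86_process_elf_notes(); the struct and function names are hypothetical,
the struct simply mirrors the three 32-bit fields of Elf64_Nhdr, and
4-byte padding of the name and descriptor is assumed. The walk stops as
soon as a header is truncated or claims sizes that run past the end of
the buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NT_PRSTATUS 1u   /* n_type value shown in the example note */

struct sketch_nhdr {            /* same field layout as Elf64_Nhdr */
    uint32_t n_namesz;
    uint32_t n_descsz;
    uint32_t n_type;
};

static size_t sketch_align4(size_t n)
{
    return (n + 3) & ~(size_t)3;
}

/* Count NT_PRSTATUS notes in buf[0..size), stopping at the first bad note. */
static size_t count_prstatus_notes(const unsigned char *buf, size_t size)
{
    size_t off = 0, count = 0;

    while (size - off >= sizeof(struct sketch_nhdr)) {
        struct sketch_nhdr nh;
        size_t remaining, body;

        memcpy(&nh, buf + off, sizeof(nh));   /* avoid unaligned reads */
        remaining = size - off - sizeof(nh);

        /* Reject sizes that cannot fit in what is left of the buffer. */
        if (nh.n_namesz > remaining || nh.n_descsz > remaining)
            break;
        body = sketch_align4(nh.n_namesz) + sketch_align4(nh.n_descsz);
        if (body > remaining)
            break;

        if (nh.n_type == SKETCH_NT_PRSTATUS)
            count++;
        off += sizeof(nh) + body;
    }
    return count;
}
```

A header with an implausibly large n_descsz ends the walk instead of
sending the reader far outside the buffer.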
Dave
-----Original Message-----
From: crash-utility-bounces(a)redhat.com
[mailto:crash-utility-bounces@redhat.com] On Behalf Of Dave Anderson
Sent: Thursday, September 29, 2011 2:04 PM
To: Discussion list for crash utility usage, maintenance and development
Subject: Re: [Crash-utility] Crash faults when determining panic task
----- Original Message -----
>
> Hi Dave,
>
> I hope I have captured everything you asked for here, if remote
> debugging over e-mail is too tedious, I can arrange to post a
> vmlinux/vmcore on our FTP site (roughly 600MB together).
Sure, you can do that if you'd like.
But anyway, the crash -d1 output is illuminating. You've got a 16-cpu
system, with all cpus online. But the compressed kdump header only
saw 1 cpu when it was created:
> header: 2cc1fe0
> signature: "KDUMP "
> header_version: 4
> utsname:
> sysname: Linux
> nodename: bahamut.mno.stratus.com
> release: 2.6.32-131.0.15.el6.exp10.bz16586.x86_64
> version: #1 SMP Thu Jun 16 13:13:45 EDT 2011
> machine: x86_64
> domainname: sraeng
> timestamp:
> tv_sec: 4e4fe6e3
> tv_usec: 0
> status: 0 ()
> block_size: 4096
> sub_hdr_size: 1
> bitmap_blocks: 288
> max_mapnr: 4718592
> total_ram_blocks: 0
> device_blocks: 0
> written_blocks: 0
> current_cpu: 0
> nr_cpus: 1 <== should be 16
> tasks[nr_cpus]: 0
and farther on, here is the array of note pointers that I
was asking about:
> num_prstatus_notes: 1
> notes_buf: 2cc4000
> notes[0]: 2cc4000
Since the dumpfile header's nr_cpus was 1, the array has only one entry.
I cannot explain that. But the crash utility can only deal with what
it finds in the dumpfile.
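
With only one populated entry, a lookup for any cpu other than 0 indexes
past the valid part of the array. A bounds-checked lookup might look like
the sketch below. It is not a patch to crash: the struct is a hypothetical
stand-in that carries only the two relevant fields (the field names are
borrowed from the diskdump_data listing quoted further down), and the
function name is made up.

```c
#include <stddef.h>

/* Hypothetical stand-in holding only the two fields of interest. */
struct notes_table {
    void **nt_prstatus_percpu;        /* one pointer per note found */
    unsigned int num_prstatus_notes;  /* number of valid entries */
};

/* Return the note for 'cpu', or NULL when no note was recorded for it. */
static void *prstatus_for_cpu(const struct notes_table *t, unsigned int cpu)
{
    if (t == NULL || t->nt_prstatus_percpu == NULL)
        return NULL;
    if (cpu >= t->num_prstatus_notes)
        return NULL;                  /* past the populated entries */
    return t->nt_prstatus_percpu[cpu];
}
```

A caller that gets NULL back can then fall back to some other way of
finding the registers instead of dereferencing an unrelated pointer.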
Furthermore, these two error messages, which indicate that memory
containing per-cpu data was "excluded", are of prime importance here:
> crash: page excluded: kernel virtual address: ffffffff81bb3b00 type: "cpu number (per_cpu)"
> crash: page excluded: kernel virtual address: ffffffff81bb3b00 type: "cpu number (per_cpu)"
The fact that the page was specifically "page excluded" is troubling,
because it should *never* have been filtered by makedumpfile -d<level>.
But since the dumpfile indicates that the crucial per-cpu page was
filtered, there's nothing that the crash utility can do about it.
I'm guessing that, even though you are able to get to a prompt
with --no_elf_notes, any command that depends upon per-cpu data
would fail. Although, it might be interesting to know *which*
cpu was in play when those two error messages were displayed
in x86_64_per_cpu_init() and x86_64_get_smp_cpus(). There is
a loop in both functions -- can you dump out which cpu's
per-cpu data was inaccessible?
Thanks,
Dave
>
>
> *** Setup some breakpoints to watch bt->machdep:
>
> get_netdump_regs_x86_64(struct bt_info *bt, ulong *ripp, ulong *rspp)
> {
> ...
>
> if (((NETDUMP_DUMPFILE() || KDUMP_DUMPFILE()) &&
> VALID_STRUCT(user_regs_struct) && (bt->task == tt->panic_task)) ||
> (KDUMP_DUMPFILE() && (kt->flags & DWARF_UNWIND) &&
> (bt->flags & BT_DUMPFILE_SEARCH))) {
> ...
> 2287 bt->machdep = (void *)user_regs;
> ...
>
> if (ELF_NOTES_VALID() &&
> (bt->flags & BT_DUMPFILE_SEARCH) && DISKDUMP_DUMPFILE() &&
> (note = (Elf64_Nhdr *)
> diskdump_get_prstatus_percpu(bt->tc->processor))) {
> ...
> 2306 bt->machdep = (void *)user_regs;
> ...
>
> (gdb) break get_netdump_regs_x86_64
> Breakpoint 1 at 0x519740: file netdump.c, line 2238.
> (gdb) break netdump.c:2287
> Breakpoint 2 at 0x519970: file netdump.c, line 2287.
> (gdb) break netdump.c:2306
> Breakpoint 3 at 0x5199e7: file netdump.c, line 2306.
> (gdb) r
>
> please wait... (determining panic task)
> Breakpoint 1, get_netdump_regs_x86_64 (bt=0x7fffffffcd70,
> ripp=0x7fffffffcce0,
> rspp=0x7fffffffcce8) at netdump.c:2238
> 2238 {
> (gdb) c
> Continuing.
>
> Breakpoint 3, get_netdump_regs_x86_64 (bt=0x7fffffffcd70,
> ripp=0x7fffffffcce0,
> rspp=0x7fffffffcce8) at netdump.c:2306
> 2306 bt->machdep = (void *)user_regs;
> (gdb) p user_regs
> $1 = 0xd14084 ""
> (gdb) c
> Continuing.
>
> Breakpoint 1, get_netdump_regs_x86_64 (bt=0x7fffffffcd70,
> ripp=0x7fffffffcce0,
> rspp=0x7fffffffcce8) at netdump.c:2238
> 2238 {
> (gdb) c
> Continuing.
>
> Program received signal SIGSEGV, Segmentation fault.
> x86_64_get_dumpfile_stack_frame (rsp=0x7fffffffcce8,
> rip=0x7fffffffcce0,
> bt_in=0x7fffffffcd70) at x86_64.c:4183
> 4183 ur_rip = ULONG(user_regs +
>
>
> *** So in its second invocation, get_netdump_regs_x86_64() never sets
> bt->machdep (only breakpoint 1 fired)
>
> *** Let's see what diskdump_get_prstatus_percpu() is returning
>
> (gdb) break diskdump_get_prstatus_percpu
> Breakpoint 1 at 0x526070: file diskdump.c, line 1451.
> (gdb) r
> please wait... (determining panic task)
> Breakpoint 1, diskdump_get_prstatus_percpu (cpu=0) at
> diskdump.c:1451
> 1451 return dd->nt_prstatus_percpu[cpu];
> (gdb) display dd->nt_prstatus_percpu[0]@16
> 1: dd->nt_prstatus_percpu[0]@16 = {0xd1c000, 0x0, 0x0, 0xd26472,
> 0xbf35ab2,
> 0xd26472, 0x200000012, 0xd1c850, 0xd1c600, 0x1010000012b,
> 0xffffffff814e4fa0, 0x14e4fa0, 0x4270, 0x0, 0x0, 0x0}
> (gdb) c
> Continuing.
>
> Breakpoint 1, diskdump_get_prstatus_percpu (cpu=1) at
> diskdump.c:1451
> 1451 return dd->nt_prstatus_percpu[cpu];
> 1: dd->nt_prstatus_percpu[0]@16 = {0xd1c000, 0x0, 0x0, 0xd26472,
> 0xbf35ab2,
> 0xd26472, 0x200000012, 0xd1c850, 0xd1c600, 0x1010000012b,
> 0xffffffff814e4fa0, 0x14e4fa0, 0x4270, 0x0, 0x0, 0x0}
>
>
> *** See crash -d1 vmlinux vmcore output at the bottom of the mail,
> particularly the part that says...
>
> crash: page excluded: kernel virtual address: ffffffff81bb3b00 type: "cpu number (per_cpu)"
> crash: get_cpus_present: present: 16
>
>
>
> *** Bogus note->n_descsz value
> *** Apply first patch to get us further into ELF Note processing
>
> From inside netdump.c :: get_regs_from_note() at the point of the
> fault, I don't see dd->nt_prstatus[], for dd is now type
> *diskdump_data... The *note passed in can be found in
> dd->nt_prstatus_percpu[] however...
>
> please wait... (determining panic task)
> Program received signal SIGSEGV, Segmentation fault.
> get_regs_from_note (note=0xd26472 "\b", ip=0x7fffffffc590,
> sp=0x7fffffffc598)
> at netdump.c:2221
> 2221 *sp = ULONG(user_regs + offset_sp);
> (gdb) p/x *((Elf64_Nhdr *)note)
> $1 = {n_namesz = 0x8, n_descsz = 0xccf80000, n_type = 0x8}
> (gdb) p dd->nt_prstatus_percpu[0]@16
> $2 = {0xd1c000, 0x0, 0x0, 0xd26472, 0xbf35ab2, 0xd26472,
> 0x200000012,
> 0xd1c850, 0xd1c600, 0x1010000012b, 0xffffffff814e4fa0, 0x14e4fa0,
> 0x4270,
> 0x0, 0x0, 0x0}
> (gdb) ptype dd
> type = struct diskdump_data {
> char *filename;
> ulong flags;
> int dfd;
> FILE *ofp;
> int machine_type;
> struct disk_dump_header *header;
> struct disk_dump_sub_header *sub_header;
> struct kdump_sub_header *sub_header_kdump;
> size_t data_offset;
> int block_size;
> int block_shift;
> char *bitmap;
> int bitmap_len;
> char *dumpable_bitmap;
> int byte;
> int bit;
> char *compressed_page;
> char *curbufptr;
> unsigned char *notes_buf;
> void **nt_prstatus_percpu;
> uint num_prstatus_notes;
> struct page_cache_hdr page_cache_hdr[16];
> char *page_cache_buf;
> int evict_index;
> ulong evictions;
> ulong cached_reads;
> ulong *valid_pages;
> ulong accesses;
> } *
>
>
>
> *** Unpatched crash -d1 vmlinux vmcore output:
>
> crash 5.1.8
> Copyright (C) 2002-2011 Red Hat, Inc.
> Copyright (C) 2004, 2005, 2006 IBM Corporation
> Copyright (C) 1999-2006 Hewlett-Packard Co
> Copyright (C) 2005, 2006 Fujitsu Limited
> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
> Copyright (C) 2005 NEC Corporation
> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
> This program is free software, covered by the GNU General Public License,
> and you are welcome to change it and/or distribute copies of it under
> certain conditions. Enter "help copying" to see the conditions.
> This program has absolutely no warranty. Enter "help warranty" for details.
>
> compressed kdump: header->utsname.machine: x86_64
> diskdump_data:
> filename: vmcore
> flags: 6 (KDUMP_CMPRS_LOCAL|ERROR_EXCLUDED)
> dfd: 3
> ofp: 0
> machine_type: 62 (EM_X86_64)
>
> header: 2cc1fe0
> signature: "KDUMP "
> header_version: 4
> utsname:
> sysname: Linux
> nodename: bahamut.mno.stratus.com
> release: 2.6.32-131.0.15.el6.exp10.bz16586.x86_64
> version: #1 SMP Thu Jun 16 13:13:45 EDT 2011
> machine: x86_64
> domainname: sraeng
> timestamp:
> tv_sec: 4e4fe6e3
> tv_usec: 0
> status: 0 ()
> block_size: 4096
> sub_hdr_size: 1
> bitmap_blocks: 288
> max_mapnr: 4718592
> total_ram_blocks: 0
> device_blocks: 0
> written_blocks: 0
> current_cpu: 0
> nr_cpus: 1
> tasks[nr_cpus]: 0
>
> sub_header: 0 (n/a)
>
> sub_header_kdump: 2cc2ff0
> phys_base: 0
> dump_level: 31 (0x1f)
> (DUMP_EXCLUDE_ZERO|DUMP_EXCLUDE_CACHE|DUMP_EXCLUDE_CACHE_PRI|DUMP_EXCLUDE_USER_DATA|DUMP_EXCLUDE_FREE)
> offset_vmcoreinfo: 11bc
> size_vmcoreinfo: 1392
> OSRELEASE=2.6.32-131.0.15.el6.exp10.bz16586.x86_64
> PAGESIZE=4096
> SYMBOL(init_uts_ns)=ffffffff81a2e8c0
> SYMBOL(node_online_map)=ffffffff81ba0860
> SYMBOL(swapper_pg_dir)=ffffffff81a25000
> SYMBOL(_stext)=ffffffff81000198
> SYMBOL(vmlist)=ffffffff81ee60b8
> SYMBOL(mem_section)=ffffffff81ef03c0
> LENGTH(mem_section)=4096
> SIZE(mem_section)=32
> OFFSET(mem_section.section_mem_map)=0
> SIZE(page)=56
> SIZE(pglist_data)=212416
> SIZE(zone)=34496
> SIZE(free_area)=88
> SIZE(list_head)=16
> SIZE(nodemask_t)=64
> OFFSET(page.flags)=0
> OFFSET(page._count)=8
> OFFSET(page.mapping)=24
> OFFSET(page.lru)=40
> OFFSET(pglist_data.node_zones)=0
> OFFSET(pglist_data.nr_zones)=212288
> OFFSET(pglist_data.node_start_pfn)=212312
> OFFSET(pglist_data.node_spanned_pages)=212328
> OFFSET(pglist_data.node_id)=212336
> OFFSET(zone.free_area)=32864
> OFFSET(zone.vm_stat)=34032
> OFFSET(zone.spanned_pages)=34344
> OFFSET(free_area.free_list)=0
> OFFSET(list_head.next)=0
> OFFSET(list_head.prev)=8
> OFFSET(vm_struct.addr)=8
> LENGTH(zone.free_area)=11
> SYMBOL(log_buf)=ffffffff81a37210
> SYMBOL(log_end)=ffffffff81d5b820
> SYMBOL(log_buf_len)=ffffffff81a37208
> SYMBOL(logged_chars)=ffffffff81ddb920
> LENGTH(free_area.free_list)=5
> NUMBER(NR_FREE_PAGES)=0
> NUMBER(PG_lru)=5
> NUMBER(PG_private)=11
> NUMBER(PG_swapcache)=16
> SYMBOL(phys_base)=ffffffff81a2d010
> SYMBOL(init_level4_pgt)=ffffffff81a25000
> SYMBOL(node_data)=ffffffff81b9cda0
> LENGTH(node_data)=512
> CRASHTIME=1313859299
> offset_note: 1040
> size_note: 1780
> num_prstatus_notes: 1
> notes_buf: 2cc4000
> notes[0]: 2cc4000
> NT_PRSTATUS_offset: 1040
>
> data_offset: 122000
> block_size: 4096
> block_shift: 12
> bitmap: 7fa5296fc010
> bitmap_len: 1179648
> dumpable_bitmap: 7fa528890010
> byte: 0
> bit: 0
> compressed_page: 2cdeb30
> curbufptr: 0
>
> page_cache_hdr[0]:
> pg_flags: 0 ()
> pg_addr: 0
> pg_bufptr: 2cceb20
> pg_hit_count: 0
> page_cache_hdr[1]:
> pg_flags: 0 ()
> pg_addr: 0
> pg_bufptr: 2ccfb20
> pg_hit_count: 0
> page_cache_hdr[2]:
> pg_flags: 0 ()
> pg_addr: 0
> pg_bufptr: 2cd0b20
> pg_hit_count: 0
> page_cache_hdr[3]:
> pg_flags: 0 ()
> pg_addr: 0
> pg_bufptr: 2cd1b20
> pg_hit_count: 0
> page_cache_hdr[4]:
> pg_flags: 0 ()
> pg_addr: 0
> pg_bufptr: 2cd2b20
> pg_hit_count: 0
> page_cache_hdr[5]:
> pg_flags: 0 ()
> pg_addr: 0
> pg_bufptr: 2cd3b20
> pg_hit_count: 0
> page_cache_hdr[6]:
> pg_flags: 0 ()
> pg_addr: 0
> pg_bufptr: 2cd4b20
> pg_hit_count: 0
> page_cache_hdr[7]:
> pg_flags: 0 ()
> pg_addr: 0
> pg_bufptr: 2cd5b20
> pg_hit_count: 0
> page_cache_hdr[8]:
> pg_flags: 0 ()
> pg_addr: 0
> pg_bufptr: 2cd6b20
> pg_hit_count: 0
> page_cache_hdr[9]:
> pg_flags: 0 ()
> pg_addr: 0
> pg_bufptr: 2cd7b20
> pg_hit_count: 0
> page_cache_hdr[10]:
> pg_flags: 0 ()
> pg_addr: 0
> pg_bufptr: 2cd8b20
> pg_hit_count: 0
> page_cache_hdr[11]:
> pg_flags: 0 ()
> pg_addr: 0
> pg_bufptr: 2cd9b20
> pg_hit_count: 0
> page_cache_hdr[12]:
> pg_flags: 0 ()
> pg_addr: 0
> pg_bufptr: 2cdab20
> pg_hit_count: 0
> page_cache_hdr[13]:
> pg_flags: 0 ()
> pg_addr: 0
> pg_bufptr: 2cdbb20
> pg_hit_count: 0
> page_cache_hdr[14]:
> pg_flags: 0 ()
> pg_addr: 0
> pg_bufptr: 2cdcb20
> pg_hit_count: 0
> page_cache_hdr[15]:
> pg_flags: 0 ()
> pg_addr: 0
> pg_bufptr: 2cddb20
> pg_hit_count: 0
>
> page_cache_buf: 2cceb20
> evict_index: 0
> evictions: 0
> accesses: 0
> cached_reads: 0
> valid_pages: 2ccc710
> crash: pv_init_ops exists: ARCH_PVOPS
> compressed kdump: phys_base: 0
> gdb vmlinux
> GNU gdb (GDB) 7.0
> Copyright (C) 2009 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-unknown-linux-gnu"...
>
> cpu_possible_map: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
> cpu_present_map: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
> cpu_online_map: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
> base kernel version: 2.6.32
> verify_namelist:
> dumpfile /proc/version:
> Linux version 2.6.32-131.0.15.el6.exp10.bz16586.x86_64
> (root(a)druk.mno.stratus.com) (gcc version 4.4.5 20110214 (Red Hat
> 4.4.5-6) (GCC) ) #1 SMP Thu Jun 16 13:13:45 EDT 2011
> vmlinux:
> Linux version 2.6.32-131.0.15.el6.exp10.bz16586.x86_64
> (root(a)druk.mno.stratus.com) (gcc version 4.4.5 20110214 (Red Hat
> 4.4.5-6) (GCC) ) #1 SMP Thu Jun 16 13:13:45 EDT 2011
>
> crash: page excluded: kernel virtual address: ffffffff81bb3b00 type: "cpu number (per_cpu)"
> crash: get_cpus_present: present: 16
> crash: page excluded: kernel virtual address: ffffffff81bb3b00 type: "cpu number (per_cpu)"
> crash: get_cpus_present: present: 16
> IRQ stack link register: undetermined
> PAGESIZE=4096
> mem_section_size = 32768
> NR_SECTION_ROOTS = 4096
> NR_MEM_SECTIONS = 524288
> SECTIONS_PER_ROOT = 128
> SECTION_ROOT_MASK = 0x7f
> PAGES_PER_SECTION = 32768
> node_online_map: [3, 0, 0, 0, 0, 0, 0, 0] -> nodes online: 2
> node_table[0]:
> id: 0
> pgdat: ffff880000020040
> size: 0
> present: 0
> mem_map: ffffea0000000000
> start_paddr: 0
> start_mapnr: 0
> WARNING: sparsemem: invalid section number: 137438888923
> WARNING: sparsemem: invalid section number: 137438888923
> crash: invalid kernel virtual address: 0 type: "readstring characters"
> crash: invalid kernel virtual address: 0 type: "readstring characters"
> node_table[1]:
> id: 1
> pgdat: ffff880280000040
> size: 2097152
> present: 2097152
> mem_map: ffffea0008c00000
> start_paddr: 280000000
> start_mapnr: 2621440
> NOTE: page_hash_table does not exist in this kernel
> ^Mplease wait... (gathering kmem slab cache data)
> kmem_cache_downsize: SIZE(kmem_cache_s): 36968
> cache_cache.buffer_size: 32896
> kmem_cache_downsize: nr_node_ids: 2
> ^M ^MNOTE: unwind_table structure has changed, or does not exist in
> this kernel
> init_unwind_table: DWARF_UNWIND_EH_FRAME
> ^Mplease wait... (gathering module symbol data)^M ^M^Mplease
> wait...
> (gathering task table data)^M ^Mcrash: get_cpus_online: online: 16
> ^Mplease wait... (determining panic task)
> crash: get_active_set_panic_task: failed
>
>
> Thanks,
>
> -- Joe Lawrence
> --
> Crash-utility mailing list
> Crash-utility(a)redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility
>