crash-utility-bounces(a)redhat.com wrote on 15.01.2010 18:54:48:
 From: Dave Anderson <anderson(a)redhat.com>
 Date: 15.01.2010 18:59
 Subject: Re: [Crash-utility] crash-5.0: Segmentation fault with x86_64_get_active_set
 Sent by: crash-utility-bounces(a)redhat.com
 
 
 ----- "ville mattila" <ville.mattila(a)stonesoft.com> wrote:
 
 > crash-utility-bounces(a)redhat.com wrote on 14.01.2010 16:08:41:
 > 
 > > From: Dave Anderson <anderson(a)redhat.com>
 > >
 > > ----- "ville mattila" <ville.mattila(a)stonesoft.com> wrote:
 > >
 > > > Hello,
 > > >
 > > > I get a segmentation fault with our 64-bit kernel crash.
 > > > This crash was triggered by "echo c > /proc/sysrq-trigger".
 > > > The reason seems to be that x86_64_cpu_pda_init is
 > > > not called; at least gdb does not break there.
 > > >
 > > > Here is a little patch that fixes it. Everything seems to
 > > > work correctly. I'll provide more info if needed.
 > > >
 > > >
 > > > --- crash-5.0.0/x86_64.c 2010-01-06 21:38:27.000000000 +0200
 > > > +++ crash-5.0.0-64bit/x86_64.c 2010-01-14 08:24:13.679603706 +0200
 > > > @@ -6325,6 +6325,12 @@ x86_64_get_active_set(void)
 > > >
 > > > ms = machdep->machspec;
 > > >
 > > > + if (!ms->current) {
 > > > + error(INFO, "%s: Cannot get active set, ms->current is NULL\n",
 > > > + __func__);
 > > > + return;
 > > > + }
 > > > +
 > >
 > > That patch just masks the real problem.
 > >
 > > What kernel version is it?
 > >
 > > If it's 2.6.30 or later, then x86_64_per_cpu_init() should
 > > be called, otherwise x86_64_cpu_pda_init() is called. And
 > > whichever one that gets called should allocate the array.
 > >
 > > 2.6.30 or later kernels should show:
 > >
 > > crash> struct x8664_pda
 > > struct: invalid data structure reference: x8664_pda
 > > crash>
 > >
 > > and they will use x86_64_per_cpu_init().
 > >
 > > Kernels prior to 2.6.30 should show:
 > >
 > > crash> struct x8664_pda
 > > struct x8664_pda {
 > > struct task_struct *pcurrent;
 > > long unsigned int data_offset;
 > > long unsigned int kernelstack;
 > > long unsigned int oldrsp;
 > > long unsigned int debugstack;
 > > int irqcount;
 > > int cpunumber;
 > > char *irqstackptr;
 > > int nodenumber;
 > > unsigned int __softirq_pending;
 > > unsigned int __nmi_count;
 > > int mmu_state;
 > > struct mm_struct *active_mm;
 > > unsigned int apic_timer_irqs;
 > > }
 > > SIZE: 128
 > > crash>
 > >
 > > and they will use x86_64_cpu_pda_init().
 > >
 > > If you're having trouble with gdb, can you put some fprintf(fp, ...)
 > > calls in the relevant function and find out why it isn't doing
 > > the calloc() call?
 > 
 > 
 > Yes, I thought so. This is a customized 2.6.31.7 kernel.org
 > kernel. This is a UP configuration, i.e. CONFIG_SMP is n.
 > I think the problem is that the PER_CPU_OFF is not set.
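
The point made in the quoted reply (whichever init routine runs, it has to allocate the "current task" array, so a NULL check in the consumer only hides a missed allocation) can be sketched as below. This is a minimal illustration, not crash's actual code: `machspec_sketch`, `init_current_sketch` and the `has_x8664_pda` flag are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the relevant part of the machine-specific data. */
struct machspec_sketch {
    unsigned long *current;   /* one "current task" slot per cpu */
    int ncpus;
};

enum init_path { INIT_CPU_PDA, INIT_PER_CPU };

/*
 * Pick the init path from whether the kernel still has struct x8664_pda,
 * and allocate the array on either path. Returns the path taken;
 * ms->current is NULL afterwards only if calloc() itself failed.
 */
static enum init_path
init_current_sketch(struct machspec_sketch *ms, int has_x8664_pda, int ncpus)
{
    assert(ms != NULL && ncpus > 0);
    ms->ncpus = ncpus;
    ms->current = calloc((size_t)ncpus, sizeof(*ms->current));
    return has_x8664_pda ? INIT_CPU_PDA : INIT_PER_CPU;
}
```

With the allocation done unconditionally, a UP kernel (ncpus of 1) still ends up with a valid one-element array, which is the behaviour the reply expects.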
 
 Ahah -- that would do it.  UP x86_64 kernels are so rare
 that apparently nobody ever noticed, and I don't have a UP
 x86_64 vmcore to even test with. (RHEL5 doesn't even ship
 a UP x86_64 kernel).
 
 Anyway, that change went into 4.0-8.11.  And as far as I
 can tell, x86_64_per_cpu_init() should still populate the
 single "ms->current[0]" task from the "per_cpu__current_task"
 symbol from UP kernels -- which doesn't need the PER_CPU_OFF
 translation mechanism.  In other words, I think you should
 be able to do this on your UP kernel:
 
  crash> px per_cpu__current_task
 
 and it should show the panic task address that comes up as the
 current task upon invocation.  Is that right? 
        Yes this works correctly.
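
As a rough sketch of why the UP case needs no PER_CPU_OFF translation: on an SMP kernel a per-cpu variable for a given cpu is found by adding that cpu's per-cpu offset to the symbol value, while on a UP kernel the symbol value is already the address. The struct and function names below are hypothetical and only illustrate that arithmetic; they are not taken from crash.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of where a per-cpu variable lives. */
struct percpu_sketch {
    unsigned long symbol_value;     /* value of the per_cpu__ symbol */
    const unsigned long *offsets;   /* per-cpu offsets; NULL on a UP kernel */
    int ncpus;
};

/* Resolve the address of the variable for one cpu. */
static unsigned long
percpu_addr_sketch(const struct percpu_sketch *pc, int cpu)
{
    assert(pc != NULL && cpu >= 0 && cpu < pc->ncpus);
    if (pc->offsets == NULL)        /* UP: the symbol is the address itself */
        return pc->symbol_value;
    return pc->symbol_value + pc->offsets[cpu];
}
```

For a UP kernel the caller would pass `offsets` as NULL and cpu 0, so the result is simply the symbol value, which matches reading per_cpu__current_task directly.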
 
 > Btw, the "struct" command caused another segmentation fault.
 > Here is gdb bt:
 > 
 > (gdb) bt
 > #0 0x00007f74b3524a92 in strcmp () from /lib/libc.so.6
 > #1 0x0000000000534284 in lookup_partial_symtab (name=0x120e3c0
 > "x8664_pda")
 > at symtab.c:276
 > #2 0x00000000005344ed in lookup_symtab (name=0x120e3c0 "x8664_pda")
 > at symtab.c:228
 > #3 0x000000000060019d in c_lex () at c-exp.y:2149
 > #4 0x00000000006008f5 in c_parse_internal () at c-exp.c.tmp:1468
 > #5 0x00000000006022dd in c_parse () at c-exp.y:2225
 > #6 0x000000000055f614 in parse_exp_in_context
 > (stringptr=0x7fffbc2f2260,
 > block=<value optimized out>, comma=<value optimized out>,
 > void_context_p=0, out_subexp=0x0) at parse.c:1094
 > #7 0x000000000055f924 in parse_expression (string=0x7fffbc2f2950
 > "x8664_pda")
 > at parse.c:1144
 > #8 0x000000000053291b in gdb_command_funnel (req=0xca2c00) at
 > symtab.c:4992
 > #9 0x00000000004c1740 in gdb_interface (req=0xca2c00) at
 > gdb_interface.c:407
 > #10 0x00000000004e9dca in datatype_info (name=0xb618a7 "x8664_pda",
 > member=0x0, dm=0x7fffbc2f3620) at symbols.c:4146
 > #11 0x00000000004eb1ee in arg_to_datatype (s=0xb618a7 "x8664_pda",
 > dm=0x7fffbc2f3620, flags=524290) at symbols.c:4867
 > #12 0x00000000004efa1b in cmd_datatype_common (flags=2048) at
 > symbols.c:4664
 > #13 0x000000000045efd9 in exec_command () at main.c:644
 > #14 0x000000000045f1fa in main_loop () at main.c:603
 > #15 0x00000000005452a9 in captured_command_loop (data=0x120e3c0)
 > at ./main.c:226
 > #16 0x00000000005434e4 in catch_errors (func=0x5452a0
 > <captured_command_loop>,
 > func_args=0x0, errstring=0x7f9d7c "", mask=<value optimized out>)
 > at exceptions.c:520
 > #17 0x0000000000544d36 in captured_main (data=<value optimized out>)
 > at ./main.c:924
 > #18 0x00000000005434e4 in catch_errors (func=0x544340 <captured_main>,
 > func_args=0x7fffbc2f38b0, errstring=0x7f9d7c "",
 > mask=<value optimized out>) at exceptions.c:520
 > #19 0x000000000054412f in gdb_main_entry (argc=<value optimized out>,
 > argv=<value optimized out>) at ./main.c:939
 > #20 0x000000000045fece in main (argc=3, argv=0x7fffbc2f3a08) at
 > main.c:517
 > (gdb) frame 1
 > #1 0x0000000000534284 in lookup_partial_symtab (name=0x120e3c0
 > "x8664_pda")
 > at symtab.c:276
 > 276 if (FILENAME_CMP (name, pst->filename) == 0)
 > (gdb) p name
 > $4 = 0x120e3c0 "x8664_pda"
 > (gdb) p pst
 > $5 = (struct partial_symtab *) 0x14d6040
 > (gdb) p pst->filename
 > $6 = 0x0
 > (gdb) p *pst
 > $7 = {next = 0x0, filename = 0x0, fullname = 0x0, dirname = 0x0,
 > objfile = 0x0, section_offsets = 0x0, textlow = 0, texthigh = 0,
 > dependencies = 0x0, number_of_dependencies = 0, globals_offset = 0,
 > n_global_syms = 0, statics_offset = 0, n_static_syms = 0, symtab =
 > 0x0,
 > read_symtab = 0, read_symtab_private = 0x0, readin = 0 '\0'}
 > (gdb)
 > 
 > 
 > I fixed it with the patch below:
 > --- crash-5.0.0/gdb-7.0/gdb/symtab.c 2010-01-15 10:41:00.919973440 +0200
 > +++ crash-5.0.0-64bit/gdb-7.0/gdb/symtab.c 2010-01-15 10:19:21.436128740 +0200
 > @@ -256,7 +256,7 @@ got_symtab:
 > struct partial_symtab *
 > lookup_partial_symtab (const char *name)
 > {
 > - struct partial_symtab *pst;
 > + struct partial_symtab *pst = NULL;
 > struct objfile *objfile;
 > char *full_path = NULL;
 > char *real_path = NULL;
 > @@ -273,7 +273,7 @@ lookup_partial_symtab (const char *name)
 > 
 > ALL_PSYMTABS (objfile, pst)
 > {
 > - if (FILENAME_CMP (name, pst->filename) == 0)
 > + if (pst->filename && FILENAME_CMP (name, pst->filename) == 0)
 > {
 > return (pst);
 > }
 > @@ -311,7 +311,7 @@ lookup_partial_symtab (const char *name)
 > if (lbasename (name) == name)
 > ALL_PSYMTABS (objfile, pst)
 > {
 > - if (FILENAME_CMP (lbasename (pst->filename), name) == 0)
 > + if (pst->filename && FILENAME_CMP (lbasename (pst->filename), name) == 0)
 > return (pst);
 > }
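
The guard used in the patch above can be shown in isolation. The following is a simplified sketch, assuming a plain linked list and strcmp() in place of gdb's ALL_PSYMTABS and FILENAME_CMP macros; `psymtab_sketch` and `lookup_sketch` are hypothetical names, not gdb identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for gdb's struct partial_symtab. */
struct psymtab_sketch {
    struct psymtab_sketch *next;
    const char *filename;   /* may be NULL, as in the dump of *pst above */
};

/*
 * Return the first entry whose filename equals name. Entries with a
 * NULL filename are skipped rather than handed to strcmp().
 */
static struct psymtab_sketch *
lookup_sketch(struct psymtab_sketch *head, const char *name)
{
    struct psymtab_sketch *pst;

    assert(name != NULL);
    for (pst = head; pst != NULL; pst = pst->next) {
        if (pst->filename && strcmp(name, pst->filename) == 0)
            return pst;
    }
    return NULL;
}
```

Note that this only avoids the crash; it does not explain why an all-zero entry is on the list in the first place.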
 
 Weird -- so you're apparently able to do that when running any
 "struct <non-existent>" command from the crash command line?
 
 But I can't reproduce that -- this is what should happen:
 
   crash> struct this_is_junk
   struct: invalid data structure reference: this_is_junk
   crash>
  
        Yes, before patching I always got a segmentation fault when
      using "struct". After the patch everything seems to be fine.
 and I don't understand what could be different with your
 custom kernel?
 
 > >
 > > Either that, or if you can make the vmlinux/vmcore pair available
 > > for me to download, I can look at it.
 > 
 > I'll arrange this if the above information is not enough.
 
 Yes please -- can you put the vmlinux/vmcore pair somewhere
 where I can download it?  You can send me the particulars
 off-line to anderson(a)redhat.com. 
        I've sent you an email about the location.
 
 
 - Ville