crash-utility-bounces@redhat.com wrote on 15.01.2010 18:54:48:
> From: Dave Anderson <anderson@redhat.com>
> Date: 15.01.2010 18:59
> Subject: Re: [Crash-utility] crash-5.0: Segmentation fault with x86_64_get_active_set
> Sent by: crash-utility-bounces@redhat.com
>
>
> ----- "ville mattila" <ville.mattila@stonesoft.com> wrote:
>
> > crash-utility-bounces@redhat.com wrote on 14.01.2010 16:08:41:
> >
> > > From:
> > >
> > > Dave Anderson <anderson@redhat.com>
> > >
> > > To:
> > >
> > > ----- "ville mattila" <ville.mattila@stonesoft.com> wrote:
> > >
> > > > Hello,
> > > >
> > > > I get a segmentation fault from our 64-bit kernel crash.
> > > > This crash is caused by "echo c > /proc/sysrq-trigger".
> > > > The reason seems to be that x86_64_cpu_pda_init is not
> > > > called; at least gdb does not break there.
> > > >
> > > > Here is a little patch that fixes it. Everything seems to
> > > > work correctly. I'll provide more info if needed.
> > > >
> > > >
> > > > --- crash-5.0.0/x86_64.c 2010-01-06 21:38:27.000000000 +0200
> > > > +++ crash-5.0.0-64bit/x86_64.c 2010-01-14 08:24:13.679603706 +0200
> > > > @@ -6325,6 +6325,12 @@ x86_64_get_active_set(void)
> > > >
> > > > ms = machdep->machspec;
> > > >
> > > > + if (!ms->current) {
> > > > + error(INFO, "%s: Cannot get active set, ms->current is NULL\n",
> > > > + __func__);
> > > > + return;
> > > > + }
> > > > +
> > >
> > > That patch just masks the real problem.
> > >
> > > What kernel version is it?
> > >
> > > If it's 2.6.30 or later, then x86_64_per_cpu_init() should
> > > be called, otherwise x86_64_cpu_pda_init() is called. And
> > > whichever one that gets called should allocate the array.
> > >
> > > 2.6.30 or later kernels should show:
> > >
> > > crash> struct x8664_pda
> > > struct: invalid data structure reference: x8664_pda
> > > crash>
> > >
> > > and they will use x86_64_per_cpu_init().
> > >
> > > Kernels prior to 2.6.30 should show:
> > >
> > > crash> struct x8664_pda
> > > struct x8664_pda {
> > > struct task_struct *pcurrent;
> > > long unsigned int data_offset;
> > > long unsigned int kernelstack;
> > > long unsigned int oldrsp;
> > > long unsigned int debugstack;
> > > int irqcount;
> > > int cpunumber;
> > > char *irqstackptr;
> > > int nodenumber;
> > > unsigned int __softirq_pending;
> > > unsigned int __nmi_count;
> > > int mmu_state;
> > > struct mm_struct *active_mm;
> > > unsigned int apic_timer_irqs;
> > > }
> > > SIZE: 128
> > > crash>
> > >
> > > and they will use x86_64_cpu_pda_init().
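The selection described above can be sketched roughly as follows. This is only an illustration under stated assumptions: KVER, enum init_path, struct machspec_sketch, choose_init_path() and alloc_current() are hypothetical names, not crash's real code, and crash itself may decide based on whether the x8664_pda structure exists rather than on a version number.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not crash's real types or macros. */
#define KVER(major, minor, patch) (((major) << 16) | ((minor) << 8) | (patch))

enum init_path { USE_CPU_PDA_INIT, USE_PER_CPU_INIT };

struct machspec_sketch {
    unsigned long *current;   /* one current-task address per CPU */
    int ncpus;
};

/* Pick the init path from the kernel version, using the 2.6.30 cutoff. */
enum init_path choose_init_path(int kernel_version)
{
    return kernel_version >= KVER(2, 6, 30) ? USE_PER_CPU_INIT
                                            : USE_CPU_PDA_INIT;
}

/*
 * Whichever path runs must leave ms->current allocated, even when
 * there is only one CPU. Returns 0 on success, -1 on failure.
 */
int alloc_current(struct machspec_sketch *ms, int ncpus)
{
    assert(ncpus > 0);
    ms->current = calloc((size_t)ncpus, sizeof(*ms->current));
    if (ms->current == NULL)
        return -1;
    ms->ncpus = ncpus;
    return 0;
}
```

The point of the sketch is that the allocation should not depend on any SMP-only condition: a later reader of ms->current[0] expects the array to exist regardless of which path was taken.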
> > >
> > > If you're having trouble with gdb, can you put some fprintf(fp, ...)
> > > calls in the relevant function and find out why it isn't doing
> > > the calloc() call?
> >
> >
> > Yes, I thought so. This is a customized 2.6.31.7 kernel.org
> > kernel. It is a UP configuration, i.e. CONFIG_SMP is n.
> > I think the problem is that PER_CPU_OFF is not set.
>
> Ahah -- that would do it. UP x86_64 kernels are so rare
> that apparently nobody ever noticed, and I don't have a UP
> x86_64 vmcore to even test with. (RHEL5 doesn't even ship
> a UP x86_64 kernel).
>
> Anyway, that change went into 4.0-8.11. And as far as I
> can tell, x86_64_per_cpu_init() should still populate the
> single "ms->current[0]" task from the "per_cpu__current_task"
> symbol from UP kernels -- which doesn't need the PER_CPU_OFF
> translation mechanism. In other words, I think you should
> be able to do this on your UP kernel:
>
> crash> px per_cpu__current_task
>
> and it should show the panic task address that comes up as the
> current task upon invocation. Is that right?
Yes, this works correctly.
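
To illustrate why the UP case needs no offset translation, here is a minimal sketch, assuming the usual Linux per-cpu layout in which an SMP kernel adds a per-CPU offset to the symbol's address and a UP kernel uses the symbol's address directly. The function per_cpu_addr() and its parameters are hypothetical and are not part of crash.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: resolve the address of a per-cpu variable.
 * per_cpu_offsets is NULL on a UP kernel, where no offset table exists.
 */
unsigned long per_cpu_addr(unsigned long symbol_value,
                           const unsigned long *per_cpu_offsets,
                           int ncpus, int cpu)
{
    assert(cpu >= 0 && cpu < ncpus);
    if (per_cpu_offsets == NULL) {
        /* UP: the symbol value is already the variable's address. */
        return symbol_value;
    }
    /* SMP: each CPU's copy sits at symbol + that CPU's offset. */
    return symbol_value + per_cpu_offsets[cpu];
}
```

With a NULL offset table and cpu 0, the function returns the symbol value unchanged, which corresponds to reading per_cpu__current_task directly.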
>
> > Btw, the "struct" command caused another segmentation fault.
> > Here is gdb bt:
> >
> > (gdb) bt
> > #0 0x00007f74b3524a92 in strcmp () from /lib/libc.so.6
> > #1 0x0000000000534284 in lookup_partial_symtab (name=0x120e3c0
> > "x8664_pda")
> > at symtab.c:276
> > #2 0x00000000005344ed in lookup_symtab (name=0x120e3c0 "x8664_pda")
> > at symtab.c:228
> > #3 0x000000000060019d in c_lex () at c-exp.y:2149
> > #4 0x00000000006008f5 in c_parse_internal () at c-exp.c.tmp:1468
> > #5 0x00000000006022dd in c_parse () at c-exp.y:2225
> > #6 0x000000000055f614 in parse_exp_in_context
> > (stringptr=0x7fffbc2f2260,
> > block=<value optimized out>, comma=<value optimized out>,
> > void_context_p=0, out_subexp=0x0) at parse.c:1094
> > #7 0x000000000055f924 in parse_expression (string=0x7fffbc2f2950
> > "x8664_pda")
> > at parse.c:1144
> > #8 0x000000000053291b in gdb_command_funnel (req=0xca2c00) at
> > symtab.c:4992
> > #9 0x00000000004c1740 in gdb_interface (req=0xca2c00) at
> > gdb_interface.c:407
> > #10 0x00000000004e9dca in datatype_info (name=0xb618a7 "x8664_pda",
> > member=0x0, dm=0x7fffbc2f3620) at symbols.c:4146
> > #11 0x00000000004eb1ee in arg_to_datatype (s=0xb618a7 "x8664_pda",
> > dm=0x7fffbc2f3620, flags=524290) at symbols.c:4867
> > #12 0x00000000004efa1b in cmd_datatype_common (flags=2048) at
> > symbols.c:4664
> > #13 0x000000000045efd9 in exec_command () at main.c:644
> > #14 0x000000000045f1fa in main_loop () at main.c:603
> > #15 0x00000000005452a9 in captured_command_loop (data=0x120e3c0)
> > at ./main.c:226
> > #16 0x00000000005434e4 in catch_errors (func=0x5452a0
> > <captured_command_loop>,
> > func_args=0x0, errstring=0x7f9d7c "", mask=<value optimized out>)
> > at exceptions.c:520
> > #17 0x0000000000544d36 in captured_main (data=<value optimized out>)
> > at ./main.c:924
> > #18 0x00000000005434e4 in catch_errors (func=0x544340 <captured_main>,
> > func_args=0x7fffbc2f38b0, errstring=0x7f9d7c "",
> > mask=<value optimized out>) at exceptions.c:520
> > #19 0x000000000054412f in gdb_main_entry (argc=<value optimized out>,
> > argv=<value optimized out>) at ./main.c:939
> > #20 0x000000000045fece in main (argc=3, argv=0x7fffbc2f3a08) at
> > main.c:517
> > (gdb) frame 1
> > #1 0x0000000000534284 in lookup_partial_symtab (name=0x120e3c0
> > "x8664_pda")
> > at symtab.c:276
> > 276 if (FILENAME_CMP (name, pst->filename) == 0)
> > (gdb) p name
> > $4 = 0x120e3c0 "x8664_pda"
> > (gdb) p pst
> > $5 = (struct partial_symtab *) 0x14d6040
> > (gdb) p pst->filename
> > $6 = 0x0
> > (gdb) p *pst
> > $7 = {next = 0x0, filename = 0x0, fullname = 0x0, dirname = 0x0,
> > objfile = 0x0, section_offsets = 0x0, textlow = 0, texthigh = 0,
> > dependencies = 0x0, number_of_dependencies = 0, globals_offset = 0,
> > n_global_syms = 0, statics_offset = 0, n_static_syms = 0, symtab = 0x0,
> > read_symtab = 0, read_symtab_private = 0x0, readin = 0 '\0'}
> > (gdb)
> >
> >
> > I fixed it with the patch below:
> > --- crash-5.0.0/gdb-7.0/gdb/symtab.c 2010-01-15 10:41:00.919973440 +0200
> > +++ crash-5.0.0-64bit/gdb-7.0/gdb/symtab.c 2010-01-15 10:19:21.436128740 +0200
> > @@ -256,7 +256,7 @@ got_symtab:
> > struct partial_symtab *
> > lookup_partial_symtab (const char *name)
> > {
> > - struct partial_symtab *pst;
> > + struct partial_symtab *pst = NULL;
> > struct objfile *objfile;
> > char *full_path = NULL;
> > char *real_path = NULL;
> > @@ -273,7 +273,7 @@ lookup_partial_symtab (const char *name)
> >
> > ALL_PSYMTABS (objfile, pst)
> > {
> > - if (FILENAME_CMP (name, pst->filename) == 0)
> > + if (pst->filename && FILENAME_CMP (name, pst->filename) == 0)
> > {
> > return (pst);
> > }
> > @@ -311,7 +311,7 @@ lookup_partial_symtab (const char *name)
> > if (lbasename (name) == name)
> > ALL_PSYMTABS (objfile, pst)
> > {
> > - if (FILENAME_CMP (lbasename (pst->filename), name) == 0)
> > + if (pst->filename && FILENAME_CMP (lbasename (pst->filename), name) == 0)
> > return (pst);
> > }
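
The guard used in the patch can be shown in isolation. This is a minimal sketch, not gdb code: struct psymtab_sketch and find_psymtab() are hypothetical, cut-down stand-ins for gdb's struct partial_symtab and lookup_partial_symtab(), and plain strcmp() is assumed in place of FILENAME_CMP.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for gdb's struct partial_symtab. */
struct psymtab_sketch {
    struct psymtab_sketch *next;
    const char *filename;      /* may be NULL, as seen in the gdb session */
};

/* Return the first entry whose filename equals name, or NULL if none. */
struct psymtab_sketch *find_psymtab(struct psymtab_sketch *head,
                                    const char *name)
{
    struct psymtab_sketch *pst;

    for (pst = head; pst != NULL; pst = pst->next) {
        /* Check filename first: strcmp() on a NULL pointer is undefined. */
        if (pst->filename != NULL && strcmp(name, pst->filename) == 0)
            return pst;
    }
    return NULL;
}
```

The guard only skips entries without a filename. It does not explain why an all-zero entry ends up in the list in the first place, so it avoids the crash without identifying its cause.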
>
> Weird -- so you're apparently able to do that when running any
> "struct <non-existent>" command from the crash command line?
>
> But I can't reproduce that -- this is what should happen:
>
> crash> struct this_is_junk
> struct: invalid data structure reference: this_is_junk
> crash>
>
Yes, before patching I always got a segmentation fault when
using "struct". After the patch everything seems to be fine.
> and I don't understand what could be different with your
> custom kernel?
>
> > >
> > > Either that, or if you can make the vmlinux/vmcore pair available
> > > for me to download, I can look at it.
> >
> > I'll arrange this if the above information is not enough.
>
> Yes please -- can you put the vmlinux/vmcore pair somewhere
> where I can download it? You can send me the particulars
> off-line to anderson@redhat.com.
I've sent you an email about the location.
- Ville