Re: [Crash-utility] crash-5.0: Segmentation fault with x86_64_get_active_set
by Dave Anderson
----- "ville mattila" <ville.mattila(a)stonesoft.com> wrote:
> crash-utility-bounces(a)redhat.com wrote on 14.01.2010 16:08:41:
>
> > From: Dave Anderson <anderson(a)redhat.com>
> >
> > ----- "ville mattila" <ville.mattila(a)stonesoft.com> wrote:
> >
> > > Hello,
> > >
> > > I get a segmentation fault from crash with our 64-bit kernel dump.
> > > The crash was triggered by "echo c > /proc/sysrq-trigger".
> > > The reason seems to be that x86_64_cpu_pda_init() is
> > > never called; at least gdb does not break there.
> > >
> > > Here is a little patch that fixes it. Everything seems to
> > > work correctly. I'll provide more info if needed.
> > >
> > >
> > > --- crash-5.0.0/x86_64.c 2010-01-06 21:38:27.000000000 +0200
> > > +++ crash-5.0.0-64bit/x86_64.c 2010-01-14 08:24:13.679603706 +0200
> > > @@ -6325,6 +6325,12 @@ x86_64_get_active_set(void)
> > >
> > > ms = machdep->machspec;
> > >
> > > + if (!ms->current) {
> > > + error(INFO, "%s: Cannot get active set, ms->current is NULL\n",
> > > + __func__);
> > > + return;
> > > + }
> > > +
> >
> > That patch just masks the real problem.
> >
> > What kernel version is it?
> >
> > If it's 2.6.30 or later, then x86_64_per_cpu_init() should
> > be called, otherwise x86_64_cpu_pda_init() is called. And
> > whichever one that gets called should allocate the array.
> >
> > 2.6.30 or later kernels should show:
> >
> > crash> struct x8664_pda
> > struct: invalid data structure reference: x8664_pda
> > crash>
> >
> > and they will use x86_64_per_cpu_init().
> >
> > Kernels prior to 2.6.30 should show:
> >
> > crash> struct x8664_pda
> > struct x8664_pda {
> > struct task_struct *pcurrent;
> > long unsigned int data_offset;
> > long unsigned int kernelstack;
> > long unsigned int oldrsp;
> > long unsigned int debugstack;
> > int irqcount;
> > int cpunumber;
> > char *irqstackptr;
> > int nodenumber;
> > unsigned int __softirq_pending;
> > unsigned int __nmi_count;
> > int mmu_state;
> > struct mm_struct *active_mm;
> > unsigned int apic_timer_irqs;
> > }
> > SIZE: 128
> > crash>
> >
> > and they will use x86_64_cpu_pda_init().
> >
> > If you're having trouble with gdb, can you put some fprintf(fp, ...)
> > calls in the relevant function and find out why it isn't doing
> > the calloc() call?
>
>
> Yes, I thought so. This is a customized 2.6.31.7 kernel.org
> kernel with a UP configuration, i.e. CONFIG_SMP is n.
> I think the problem is that PER_CPU_OFF is not set.
Ahah -- that would do it. UP x86_64 kernels are so rare
that apparently nobody ever noticed, and I don't have a UP
x86_64 vmcore to even test with. (RHEL5 doesn't even ship
a UP x86_64 kernel).
Anyway, that change went into 4.0-8.11. And as far as I
can tell, x86_64_per_cpu_init() should still populate the
single "ms->current[0]" task from the "per_cpu__current_task"
symbol in UP kernels -- which doesn't need the PER_CPU_OFF
translation mechanism. In other words, I think you should
be able to do this on your UP kernel:
crash> px per_cpu__current_task
and it should show the panic task address that comes up as the
current task upon invocation. Is that right?
> Btw, the "struct" command caused another segmentation fault.
> Here is gdb bt:
>
> (gdb) bt
> #0  0x00007f74b3524a92 in strcmp () from /lib/libc.so.6
> #1  0x0000000000534284 in lookup_partial_symtab (name=0x120e3c0 "x8664_pda") at symtab.c:276
> #2  0x00000000005344ed in lookup_symtab (name=0x120e3c0 "x8664_pda") at symtab.c:228
> #3  0x000000000060019d in c_lex () at c-exp.y:2149
> #4  0x00000000006008f5 in c_parse_internal () at c-exp.c.tmp:1468
> #5  0x00000000006022dd in c_parse () at c-exp.y:2225
> #6  0x000000000055f614 in parse_exp_in_context (stringptr=0x7fffbc2f2260, block=<value optimized out>, comma=<value optimized out>, void_context_p=0, out_subexp=0x0) at parse.c:1094
> #7  0x000000000055f924 in parse_expression (string=0x7fffbc2f2950 "x8664_pda") at parse.c:1144
> #8  0x000000000053291b in gdb_command_funnel (req=0xca2c00) at symtab.c:4992
> #9  0x00000000004c1740 in gdb_interface (req=0xca2c00) at gdb_interface.c:407
> #10 0x00000000004e9dca in datatype_info (name=0xb618a7 "x8664_pda", member=0x0, dm=0x7fffbc2f3620) at symbols.c:4146
> #11 0x00000000004eb1ee in arg_to_datatype (s=0xb618a7 "x8664_pda", dm=0x7fffbc2f3620, flags=524290) at symbols.c:4867
> #12 0x00000000004efa1b in cmd_datatype_common (flags=2048) at symbols.c:4664
> #13 0x000000000045efd9 in exec_command () at main.c:644
> #14 0x000000000045f1fa in main_loop () at main.c:603
> #15 0x00000000005452a9 in captured_command_loop (data=0x120e3c0) at ./main.c:226
> #16 0x00000000005434e4 in catch_errors (func=0x5452a0 <captured_command_loop>, func_args=0x0, errstring=0x7f9d7c "", mask=<value optimized out>) at exceptions.c:520
> #17 0x0000000000544d36 in captured_main (data=<value optimized out>) at ./main.c:924
> #18 0x00000000005434e4 in catch_errors (func=0x544340 <captured_main>, func_args=0x7fffbc2f38b0, errstring=0x7f9d7c "", mask=<value optimized out>) at exceptions.c:520
> #19 0x000000000054412f in gdb_main_entry (argc=<value optimized out>, argv=<value optimized out>) at ./main.c:939
> #20 0x000000000045fece in main (argc=3, argv=0x7fffbc2f3a08) at main.c:517
> (gdb) frame 1
> #1  0x0000000000534284 in lookup_partial_symtab (name=0x120e3c0 "x8664_pda") at symtab.c:276
> 276             if (FILENAME_CMP (name, pst->filename) == 0)
> (gdb) p name
> $4 = 0x120e3c0 "x8664_pda"
> (gdb) p pst
> $5 = (struct partial_symtab *) 0x14d6040
> (gdb) p pst->filename
> $6 = 0x0
> (gdb) p *pst
> $7 = {next = 0x0, filename = 0x0, fullname = 0x0, dirname = 0x0, objfile = 0x0, section_offsets = 0x0, textlow = 0, texthigh = 0, dependencies = 0x0, number_of_dependencies = 0, globals_offset = 0, n_global_syms = 0, statics_offset = 0, n_static_syms = 0, symtab = 0x0, read_symtab = 0, read_symtab_private = 0x0, readin = 0 '\0'}
> (gdb)
>
>
> I fixed it with the patch below:
> --- crash-5.0.0/gdb-7.0/gdb/symtab.c	2010-01-15 10:41:00.919973440 +0200
> +++ crash-5.0.0-64bit/gdb-7.0/gdb/symtab.c	2010-01-15 10:19:21.436128740 +0200
> @@ -256,7 +256,7 @@ got_symtab:
> struct partial_symtab *
> lookup_partial_symtab (const char *name)
> {
> - struct partial_symtab *pst;
> + struct partial_symtab *pst = NULL;
> struct objfile *objfile;
> char *full_path = NULL;
> char *real_path = NULL;
> @@ -273,7 +273,7 @@ lookup_partial_symtab (const char *name)
>
> ALL_PSYMTABS (objfile, pst)
> {
> - if (FILENAME_CMP (name, pst->filename) == 0)
> + if (pst->filename && FILENAME_CMP (name, pst->filename) == 0)
> {
> return (pst);
> }
> @@ -311,7 +311,7 @@ lookup_partial_symtab (const char *name)
> if (lbasename (name) == name)
> ALL_PSYMTABS (objfile, pst)
> {
> - if (FILENAME_CMP (lbasename (pst->filename), name) == 0)
> + if (pst->filename && FILENAME_CMP (lbasename (pst->filename), name) == 0)
> return (pst);
> }
Weird -- so you're apparently able to do that when running any
"struct <non-existent>" command from the crash command line?
But I can't reproduce that -- this is what should happen:
crash> struct this_is_junk
struct: invalid data structure reference: this_is_junk
crash>
and I don't understand what could be different with your
custom kernel?
> >
> > Either that, or if you can make the vmlinux/vmcore pair available
> > for me to download, I can look at it.
>
> I'll arrange this if the above information is not enough.
Yes please -- can you put the vmlinux/vmcore pair somewhere
where I can download it? You can send me the particulars
off-line to anderson(a)redhat.com.
Thanks,
Dave
14 years, 10 months
Re: [Crash-utility] crash-5.0: zero-size memory-allocation
by Dave Anderson
----- "ville mattila" <ville.mattila(a)stonesoft.com> wrote:
> That would be useful -- just warn that some major corruption seems to
> have happened. It is always good to get at least some crash info out,
> for example dmesg and bt. I'll gladly test patches, if needed.
Patch attached...
> Also one question. Is there some hidden option that will show all the
> hidden crash command line options, e.g. --no_kmem_cache and alike?
No, for the most part they are there for debugging crash itself,
or were put in place as a result of specific odd-ball vmcores,
or short-time kernels that were missing a key ingredient, etc.
So, for example, with the attached patch, --no_kmem_cache should
not be needed, even with your horrifically corrupted vmcore...
Dave
Re: [Crash-utility] crash-5.0: Segmentation fault with x86_64_get_active_set
by Dave Anderson
----- "ville mattila" <ville.mattila(a)stonesoft.com> wrote:
> Hello,
>
> I get a segmentation fault from crash with our 64-bit kernel dump.
> The crash was triggered by "echo c > /proc/sysrq-trigger".
> The reason seems to be that x86_64_cpu_pda_init() is
> never called; at least gdb does not break there.
>
> Here is a little patch that fixes it. Everything seems to
> work correctly. I'll provide more info if needed.
>
>
> --- crash-5.0.0/x86_64.c 2010-01-06 21:38:27.000000000 +0200
> +++ crash-5.0.0-64bit/x86_64.c 2010-01-14 08:24:13.679603706 +0200
> @@ -6325,6 +6325,12 @@ x86_64_get_active_set(void)
>
> ms = machdep->machspec;
>
> + if (!ms->current) {
> + error(INFO, "%s: Cannot get active set, ms->current is NULL\n",
> + __func__);
> + return;
> + }
> +
That patch just masks the real problem.
What kernel version is it?
If it's 2.6.30 or later, then x86_64_per_cpu_init() should
be called, otherwise x86_64_cpu_pda_init() is called. And
whichever one that gets called should allocate the array.
2.6.30 or later kernels should show:
crash> struct x8664_pda
struct: invalid data structure reference: x8664_pda
crash>
and they will use x86_64_per_cpu_init().
Kernels prior to 2.6.30 should show:
crash> struct x8664_pda
struct x8664_pda {
struct task_struct *pcurrent;
long unsigned int data_offset;
long unsigned int kernelstack;
long unsigned int oldrsp;
long unsigned int debugstack;
int irqcount;
int cpunumber;
char *irqstackptr;
int nodenumber;
unsigned int __softirq_pending;
unsigned int __nmi_count;
int mmu_state;
struct mm_struct *active_mm;
unsigned int apic_timer_irqs;
}
SIZE: 128
crash>
and they will use x86_64_cpu_pda_init().
If you're having trouble with gdb, can you put some fprintf(fp, ...)
calls in the relevant function and find out why it isn't doing
the calloc() call?
Either that, or if you can make the vmlinux/vmcore pair available
for me to download, I can look at it.
Dave
crash-5.0: Segmentation fault with x86_64_get_active_set
by ville.mattila@stonesoft.com
Hello,
I get a segmentation fault from crash with our 64-bit kernel dump.
The crash was triggered by "echo c > /proc/sysrq-trigger".
The reason seems to be that x86_64_cpu_pda_init() is
never called; at least gdb does not break there.
Here is a little patch that fixes it. Everything seems to
work correctly. I'll provide more info if needed.
--- crash-5.0.0/x86_64.c 2010-01-06 21:38:27.000000000 +0200
+++ crash-5.0.0-64bit/x86_64.c 2010-01-14 08:24:13.679603706 +0200
@@ -6325,6 +6325,12 @@ x86_64_get_active_set(void)
ms = machdep->machspec;
+ if (!ms->current) {
+ error(INFO, "%s: Cannot get active set, ms->current is NULL\n",
+ __func__);
+ return;
+ }
+
Re: [Crash-utility] crash-5.0: zero-size memory-allocation
by Dave Anderson
----- "ville mattila" <ville.mattila(a)stonesoft.com> wrote:
> > From:
> >
> > Dave Anderson <anderson(a)redhat.com>
> >
> ...
> > But your kernel shows cache_cache.buffer_size set to zero -- and the
> > ASSIGN_SIZE(kmem_cache_s) above dutifully downsized the data structure
> > size from 204 to zero. Later on, that size was used to allocate a
> > kmem_cache buffer, which failed when a GETBUF() was called with a zero-size.
> >
> > I guess a check could be made above for a zero cache_cache.buffer_size,
> > but why would that ever be?
> >
> > Try this:
> >
> > # crash --no_kmem_cache vmlinux vmcore
> >
> > which will allow you to get past the kmem_cache initialization.
> >
> > Then enter:
> >
> > crash> p cache_cache
> >
> > Does the "buffer_size" member really show zero?
>
> Yes it seems so!
> initialize_task_state: using old defaults
> <readmem: 8067a300, KVADDR, "fill_task_struct", 868, (ROE), 86e3f78>
> addr: 8067a300 paddr: 67a300 cnt: 868
> STATE: TASK_RUNNING (PANIC)
>
> crash> p cache_cache
> cache_cache = GETBUF(128 -> 0)
> <readmem: 8067f1c0, KVADDR, "gdb_readmem_callback", 204, (ROE), 8ac00d8>
> addr: 8067f1c0 paddr: 67f1c0 cnt: 204
> $3 = {
> array = {0x0, 0x8067f1c4, 0x8067f1c4, 0x0, 0x0, 0x0, 0x0, 0x0,
> 0xf7813e00, 0xf7849400, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
> batchcount = 0,
> limit = 0,
> shared = 0,
> buffer_size = 0,
> reciprocal_buffer_size = 0,
> flags = 0,
> num = 0,
> gfporder = 0,
> gfpflags = 60,
> colour = 120,
> colour_off = 8,
> slabp_cache = 0x100,
> slab_size = 16777216,
> dflags = 0,
> ctor = 0xf,
> name = 0x0,
> next = {
> next = 0x0,
> prev = 0x2
> },
> nodelists = {0x40}
> }
> FREEBUF(0)
That's some serious corruption!
> >
> > BTW, you can work around the problem by commenting out the call
> > to kmem_cache_downsize() in vm_init().
>
> This workaround works ok.
But even then, if you comment out the call to kmem_cache_downsize(),
the kmem_cache_init() function could not have done anything useful
because the "cache_cache.next.next" pointer is corrupted with a NULL,
which points to the first of the chain of kmem_cache slab cache headers.
I'm surprised it managed to continue without running into another
roadblock -- did it display the "crash: unable to initialize kmem
slab cache subsystem" error message?
> > (And if you're using makedumpfile with excluded pages, hope that
> > the problem I described above doesn't occur...)
> >
> We are not excluding pages, so this is not a big issue. Also,
> --no_kmem_cache lets me open the dump and do quite a few things
> already.
Like I mentioned before, I could put a check in kmem_cache_downsize()
to check for a zero buffer_size, but the odds of that happening are
absurdly small. I suppose I could check whether the value is less
than the kmem_cache.nodelists structure offset.
Dave
Re: [Crash-utility] crash-5.0: zero-size memory-allocation
by Dave Anderson
----- "ville mattila" <ville.mattila(a)stonesoft.com> wrote:
> Hello,
>
> We have a custom kernel based on 2.6.27.39. This kernel
> has a 2/2 memory split. Now we have one crash dump that can be
> successfully opened with crash 4.0-8.8 but not with crash 5.0.
> The crash happened because of a double free of a memory block, so there
> might be some memory corruption in the cache data area.
>
> Unfortunately I cannot pinpoint the exact version where this
> starts to happen because I could not find older crash releases.
>
> Here is some debug info.
>
> The tail of crash -d 10 output
> ...
> NOTE: page_hash_table does not exist in this kernel
> please wait... (gathering kmem slab cache data)<readmem: 8075801c,
> KVADDR,
> "cache_chain", 4, (FOE), ffb944f8>
> addr: 8075801c paddr: 75801c cnt: 4
> GETBUF(128 -> 0)
> FREEBUF(0)
> GETBUF(204 -> 0)
> <readmem: 8067f1c0, KVADDR, "kmem_cache buffer", 204, (FOE), 8520f00>
> addr: 8067f1c0 paddr: 67f1c0 cnt: 204
> GETBUF(128 -> 1)
> FREEBUF(1)
> GETBUF(128 -> 1)
> FREEBUF(1)
>
> kmem_cache_downsize: SIZE(kmem_cache_s): 204 cache_cache.buffer_size: 0
> kmem_cache_downsize: nr_node_ids: 1
> FREEBUF(0)
>
> crash: zero-size memory allocation! (called from 80b7b7b)
> >
> addr2line -e crash 80b7b7b
> /workarea/build/packages/crash/crash-5.0.0-32bit/memory.c:7439
>
> I'm happy to test patches.
Nice bug report!
Here's what's happening:
It's related to this patch that went into 4.1.0:
- Fix for a potential failure to initialize the kmem slab cache
subsystem on 2.6.22 and later CONFIG_SLAB kernels if the dumpfile
has pages excluded by the makedumpfile facility. Without the patch,
the following error message would be displayed during initialization:
"crash: page excluded: kernel virtual address: <address> type:
kmem_cache_s buffer", followed by "crash: unable to initialize kmem
slab cache subsystem".
(anderson(a)redhat.com)
The patch was put in place due to this definition of the kmem_cache data structure:
struct kmem_cache {
/* 1) per-cpu data, touched during every alloc/free */
struct array_cache *array[NR_CPUS];
/* 2) Cache tunables. Protected by cache_chain_mutex */
unsigned int batchcount;
unsigned int limit;
... [ snip ] ...
* We put nodelists[] at the end of kmem_cache, because we want to size
* this array to nr_node_ids slots instead of MAX_NUMNODES
* (see kmem_cache_init())
* We still use [MAX_NUMNODES] and not [1] or [0] because cache_cache
* is statically defined, so we reserve the max number of nodes.
*/
struct kmem_list3 *nodelists[MAX_NUMNODES];
/*
* Do not add fields after nodelists[]
*/
};
where every kernel instance of the kmem_cache data structure *except* the
head "cache_cache" structure has its nodelists[] array downsized to whatever "nr_node_ids"
is initialized to. The actual size of all of the downsized kmem_cache data
structures can be found in the head "cache_cache.buffer_size" field.
But when the crash utility queries gdb for the size of a kmem_cache
structure it gets the "full" size as declared in the vmlinux debuginfo
data. And so whenever a kmem_cache structure was read by crash, it
was using the "full" size instead of the downsized size. Doing that
type of over-sized read could potentially extend into the next page,
and there was a reported case where doing that happened to extend into
a page that was excluded by makedumpfile. Hence the kmem_cache_downsize()
function added to memory.c.
Anyway, given that your debug output shows:
kmem_cache_downsize: SIZE(kmem_cache_s): 204 cache_cache.buffer_size: 0
kmem_cache_downsize: nr_node_ids: 1
In vm_init() there was an initial STRUCT_SIZE_INIT(kmem_cache_s, ...)
that set the size to 204 bytes. But then kmem_cache_downsize() was
called to downsize to whatever cache_cache.buffer_size contains:
...
buffer_size = UINT(cache_buf +
MEMBER_OFFSET("kmem_cache", "buffer_size"));
if (buffer_size < SIZE(kmem_cache_s)) {
ASSIGN_SIZE(kmem_cache_s) = buffer_size;
if (kernel_symbol_exists("nr_node_ids")) {
get_symbol_data("nr_node_ids", sizeof(int),
&nr_node_ids);
vt->kmem_cache_len_nodes = nr_node_ids;
} else
vt->kmem_cache_len_nodes = 1;
if (CRASHDEBUG(1)) {
fprintf(fp,
"\nkmem_cache_downsize: SIZE(kmem_cache_s): %ld "
"cache_cache.buffer_size: %d\n",
STRUCT_SIZE("kmem_cache"), buffer_size);
fprintf(fp,
"kmem_cache_downsize: nr_node_ids: %ld\n",
vt->kmem_cache_len_nodes);
}
}
But your kernel shows cache_cache.buffer_size set to zero -- and the
ASSIGN_SIZE(kmem_cache_s) above dutifully downsized the data structure
size from 204 to zero. Later on, that size was used to allocate a
kmem_cache buffer, which failed when a GETBUF() was called with a zero-size.
I guess a check could be made above for a zero cache_cache.buffer_size,
but why would that ever be?
Try this:
# crash --no_kmem_cache vmlinux vmcore
which will allow you to get past the kmem_cache initialization.
Then enter:
crash> p cache_cache
Does the "buffer_size" member really show zero?
BTW, you can work around the problem by commenting out the call
to kmem_cache_downsize() in vm_init(). (And if you're using
makedumpfile with excluded pages, hope that the problem I described
above doesn't occur...)
Dave
[PATCH] Display "irqaction mask" only if available
by Bernhard Walle
Display "irqaction mask" only if available
The member "mask" has been removed from "struct irqaction" in the kernel per
commit ef79f8e191722dbc1fc33bdfc448f572266c37e9
Author: Rusty Russell <rusty(a)rustcorp.com.au>
Date: Thu Sep 24 09:34:37 2009 -0600
cpumask: remove unused mask field from struct irqaction.
Up until 1.1.83, the primitive human tribes used struct sigaction for
interrupts. The sa_mask field was overloaded to hold a pointer to the
name.
When someone created the new "struct irqaction" they carried across
the "mask" field as a kind of ancestor worship: the fact that it was
unused makes clear its spiritual significance.
Signed-off-by: Rusty Russell <rusty(a)rustcorp.com.au>
This patch only displays the "irqaction mask" in the "irq" command if the
member is present. It fixes the following error (kernel was 2.6.33):
crash> irq
irq: invalid structure member offset: irqaction_mask
FILE: kernel.c LINE: 5001 FUNCTION: generic_dump_irq()
[./crash.orig] error trace: 8097e44 => 8109541 => 810c0ec => 8156299
Signed-off-by: Bernhard Walle <bernhard(a)bwalle.de>
crash-5.0: zero-size memory-allocation
by ville.mattila@stonesoft.com
Hello,
We have a custom kernel based on 2.6.27.39. This kernel
has a 2/2 memory split. Now we have one crash dump that can be
successfully opened with crash 4.0-8.8 but not with crash 5.0.
The crash happened because of a double free of a memory block, so there
might be some memory corruption in the cache data area.
Unfortunately I cannot pinpoint the exact version where this
starts to happen because I could not find older crash releases.
Here is some debug info.
The tail of crash -d 10 output
...
NOTE: page_hash_table does not exist in this kernel
please wait... (gathering kmem slab cache data)<readmem: 8075801c, KVADDR,
"cache_chain", 4, (FOE), ffb944f8>
addr: 8075801c paddr: 75801c cnt: 4
GETBUF(128 -> 0)
FREEBUF(0)
GETBUF(204 -> 0)
<readmem: 8067f1c0, KVADDR, "kmem_cache buffer", 204, (FOE), 8520f00>
addr: 8067f1c0 paddr: 67f1c0 cnt: 204
GETBUF(128 -> 1)
FREEBUF(1)
GETBUF(128 -> 1)
FREEBUF(1)
kmem_cache_downsize: SIZE(kmem_cache_s): 204 cache_cache.buffer_size: 0
kmem_cache_downsize: nr_node_ids: 1
FREEBUF(0)
crash: zero-size memory allocation! (called from 80b7b7b)
>
addr2line -e crash 80b7b7b
/workarea/build/packages/crash/crash-5.0.0-32bit/memory.c:7439
I'm happy to test patches.
Re: [Crash-utility] Degradation with crash 5.0.0 on x86 -- [PATCH]
by Dave Anderson
----- "Dave Anderson" <anderson(a)redhat.com> wrote:
> ----- "Shahar Luxenberg" <shahar(a)checkpoint.com> wrote:
>
> > Hi,
> >
> >
> >
> > Environment: Red Hat Enterprise Linux Server release 5.2 (Tikanga),
> > x86, 2.6.18-92.el5
> >
> > I’ve installed crash 5.0.0 and noticed lots of error messages during
> > startup of the form:
> >
> > ‘crash: input string too large: "804328c4:" (9 vs 8)’
> >
> > This doesn’t happen with crash 4.1.2
> >
> >
> >
> > While debugging it a little, I’ve noticed that BUG_x86 is calling gdb
> > with the x/i command:
> >
> > sprintf(buf1, "x/%ldi 0x%lx", spn->value - sp->value, sp->value);
> >
> > The return buffer (buf2) is: 0x80430800: push %ebp
> >
> > On 4.1.2, the return buffer (buf2) is: 0x80430800 <do_exit>: push %ebp
> >
> > This explains the problem since parse_line will parse the line
> > differently returning ‘0x80430800:’ on arglist[0] and nothing on
> > arglist[2] (crash 5.0.0) while returning 0x80430800 on arglist[0] and
> > ‘push’ on arglist[2].
> >
> > Have you noticed this kind of problem?
>
> I see it now, at least on 2.6.18-era kernels. It doesn't seem to happen
> with earlier RHEL4 (2.6.9-era) vmlinux files for some reason. And on anything
> later than 2.6.20, the code in question isn't run. Anyway, as you tracked
> it down, the x86 code disassembly output is different, but should be trivial
> to fix.
>
> Thanks for the report,
> Dave
Patch attached, and queued for the next release.
Dave
Re: [Crash-utility] Degradation with crash 5.0.0 on x86
by Dave Anderson
----- "Shahar Luxenberg" <shahar(a)checkpoint.com> wrote:
> Hi,
>
>
>
> Environment: Red Hat Enterprise Linux Server release 5.2 (Tikanga),
> x86, 2.6.18-92.el5
>
> I’ve installed crash 5.0.0 and noticed lots of error messages during
> startup of the form:
>
> ‘crash: input string too large: "804328c4:" (9 vs 8)’
>
> This doesn’t happen with crash 4.1.2
>
>
>
> While debugging it a little, I’ve noticed that BUG_x86 is calling gdb
> with the x/i command:
>
> sprintf(buf1, "x/%ldi 0x%lx", spn->value - sp->value, sp->value);
>
> The return buffer (buf2) is: 0x80430800: push %ebp
>
> On 4.1.2, the return buffer (buf2) is: 0x80430800 <do_exit>: push %ebp
>
> This explains the problem since parse_line will parse the line
> differently returning ‘0x80430800:’ on arglist[0] and nothing on
> arglist[2] (crash 5.0.0) while returning 0x80430800 on arglist[0] and
> ‘push’ on arglist[2].
>
> Have you noticed this kind of problem?
I see it now, at least on 2.6.18-era kernels. It doesn't seem to happen
with earlier RHEL4 (2.6.9-era) vmlinux files for some reason. And on anything
later than 2.6.20, the code in question isn't run. Anyway, as you tracked
it down, the x86 code disassembly output is different, but should be trivial
to fix.
Thanks for the report,
Dave