Dave Anderson wrote:
Hi Castor,
Testing this latest patch on an ia64 shows improvement over
the original, but for one particular module I get an abort
generated by glibc that I have never encountered before.
I saw it the first time when running "mod -S", and subsequently
narrowed it down to the ipv6 module. Check this out:
Typically a realloc() error like this is caused by an overflow of a previously
allocated buffer that trashed the malloc bookkeeping. I thought at first
that the supplied command string buffer of 1500 (BUFSIZE) bytes had
overflowed, but that is a stack buffer and is not malloc'd. And we
don't overflow it anyway.
In any case, with a few debugging lines added, here is the command
buffer for the ipv6 add-symbol-file command, which contains 707 bytes:
add-symbol-file /lib/modules/2.6.18-1.2767.el5/kernel/net/ipv6/ipv6.ko 0xa00000021ed605b0
  -s .exit.text 0xa00000021edb49a0
  -s .rodata 0xa00000021edbd4c8
  -s __ksymtab_strings 0xa00000021edbdc08
  -s __versions 0xa00000021edbdf98
  -s .data 0xa00000021edd6a20
  -s .data.rel.ro 0xa00000021edd6c00
  -s __ksymtab_gpl 0xa00000021edd6df8
  -s __kcrctab_gpl 0xa00000021edd6ed8
  -s .data.rel 0xa00000021edd6f48
  -s .data.rel.local 0xa00000021ee39940
  -s .data.rel.ro.local 0xa00000021ee3a9c0
  -s .data.read_mostly 0xa00000021ee3a9e0
  -s __ksymtab 0xa00000021ee3aa60
  -s __kcrctab 0xa00000021ee3ac30
  -s .gnu.linkonce.this_module 0xa00000021ee3ad80
  -s .sdata 0xa00000021ee5d730
  -s .bss 0xa00000021ee5b000
  -s .sbss 0xa00000021ee5e8b8
and the realloc() call that triggered the abort was made here, in
this part of the upper while loop in add_symbol_file_command()
in gdb-6.1/gdb/symfile.c:
if (expecting_sec_addr)
  {
    sect_opts[section_index].value = arg;
    expecting_sec_addr = 0;
    if (++section_index > num_sect_opts)
      {
        num_sect_opts *= 2;
        sect_opts = ((struct sect_opt *)
                     xrealloc (sect_opts,
                               num_sect_opts
                               * sizeof (struct sect_opt)));
      }
  }
else
  error ("USAGE: add-symbol-file <filename> <textaddress> [-mapped] [-readnow] [-s <secname> <addr>]*");
At the point of failure, the argcnt had gone from 0 to 47, and the
current arg was pointing to "0xa00000021ee5d730", which is the .sdata
section address. This was the first xrealloc() call made from the
loop above -- so it almost made it through the whole command string
without even having to do an xrealloc().
Anyway, that's as far as I took it -- but perhaps you can
artificially re-create an add-symbol-file command with a command
line size approaching the 700-odd bytes that this one consumes and
which has ~36 sections?
I did put a debug printf prior to each call to execute_command() that
shows the strlen of the command buffer and the lm->mod_sections, and
then did a "mod -S" to load all modules. Note that the ipv6 module has
both the longest command string (707 bytes) and the highest section count (36):
crash> mod -S | grep strlen
strlen(req->buf): 433 mod_sections: 23
strlen(req->buf): 374 mod_sections: 22
strlen(req->buf): 433 mod_sections: 23
strlen(req->buf): 240 mod_sections: 19
strlen(req->buf): 277 mod_sections: 29
strlen(req->buf): 375 mod_sections: 28
strlen(req->buf): 422 mod_sections: 25
strlen(req->buf): 421 mod_sections: 28
strlen(req->buf): 317 mod_sections: 36
strlen(req->buf): 326 mod_sections: 28
strlen(req->buf): 600 mod_sections: 29
strlen(req->buf): 180 mod_sections: 19
strlen(req->buf): 404 mod_sections: 23
strlen(req->buf): 615 mod_sections: 29
strlen(req->buf): 313 mod_sections: 35
strlen(req->buf): 501 mod_sections: 29
strlen(req->buf): 286 mod_sections: 30
strlen(req->buf): 397 mod_sections: 28
strlen(req->buf): 299 mod_sections: 24
strlen(req->buf): 419 mod_sections: 24
strlen(req->buf): 374 mod_sections: 27
strlen(req->buf): 420 mod_sections: 24
strlen(req->buf): 463 mod_sections: 25
strlen(req->buf): 420 mod_sections: 24
strlen(req->buf): 413 mod_sections: 31
strlen(req->buf): 310 mod_sections: 21
strlen(req->buf): 322 mod_sections: 30
strlen(req->buf): 302 mod_sections: 26
strlen(req->buf): 306 mod_sections: 31
strlen(req->buf): 313 mod_sections: 32
strlen(req->buf): 707 mod_sections: 36
*** glibc detected *** ./crash: realloc(): invalid next size: 0x600000000432f430 ***
So the 707-byte ipv6 command string, or maybe its 36 sections,
or a combination of the two, sends crash/gdb into the weeds...
Dave
# ./crash
crash 4.0-3.16
Copyright (C) 2002, 2003, 2004, 2005, 2006 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006 Fujitsu Limited
Copyright (C) 2006 VA Linux Systems Japan K.K.
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "ia64-unknown-linux-gnu"...
KERNEL: /usr/lib/debug/lib/modules/2.6.18-1.2767.el5/vmlinux
DUMPFILE: /dev/mem
CPUS: 64
DATE: Wed Jan 3 10:43:04 2007
UPTIME: 01:40:46
LOAD AVERAGE: 0.15, 0.11, 0.17
TASKS: 629
NODENAME: altix3.lab.boston.redhat.com
RELEASE: 2.6.18-1.2767.el5
VERSION: #1 SMP Wed Nov 29 17:38:14 EST 2006
MACHINE: ia64 (1500 Mhz)
MEMORY: 122.5 GB
PID: 10699
COMMAND: "crash"
TASK: e00000b04e300000 [THREAD_INFO: e00000b04e301040]
CPU: 7
STATE: TASK_RUNNING (ACTIVE)
crash> mod -l
MODULE            NAME                   SIZE  OBJECT FILE
a00000021e189c00  ehci_hcd             204860  (not loaded)  [CONFIG_KALLSYMS]
a00000021e1bd100  uhci_hcd             185680  (not loaded)  [CONFIG_KALLSYMS]
a00000021e1efb00  ohci_hcd             179956  (not loaded)  [CONFIG_KALLSYMS]
a00000021e23cb00  dm_zero              134640  (not loaded)  [CONFIG_KALLSYMS]
a00000021e287700  jbd                  262432  (not loaded)  [CONFIG_KALLSYMS]
a00000021e2b9580  sd_mod               170772  (not loaded)  [CONFIG_KALLSYMS]
a00000021e303380  qla1280              276848  (not loaded)  [CONFIG_KALLSYMS]
a00000021e374300  ext3                 414624  (not loaded)  [CONFIG_KALLSYMS]
a00000021e3d9600  scsi_mod             387008  (not loaded)  [CONFIG_KALLSYMS]
a00000021e418980  mptbase              235792  (not loaded)  [CONFIG_KALLSYMS]
a00000021e44c780  scsi_transport_spi   183672  (not loaded)  [CONFIG_KALLSYMS]
a00000021e47ec80  mptscsih             176288  (not loaded)  [CONFIG_KALLSYMS]
a00000021e4ab700  mptspi               162536  (not loaded)  [CONFIG_KALLSYMS]
a00000021e4e1580  scsi_transport_fc    203748  (not loaded)  [CONFIG_KALLSYMS]
a00000021e5fcd00  dm_mod               253328  (not loaded)  [CONFIG_KALLSYMS]
a00000021e718e80  qla2xxx             1090472  (not loaded)  [CONFIG_KALLSYMS]
a00000021e825680  dm_mirror            187608  (not loaded)  [CONFIG_KALLSYMS]
a00000021e857480  autofs4              178336  (not loaded)  [CONFIG_KALLSYMS]
a00000021e884800  dm_snapshot          167224  (not loaded)  [CONFIG_KALLSYMS]
a00000021e8b1a00  lp                   156512  (not loaded)  [CONFIG_KALLSYMS]
a00000021e8fdc00  cdrom                206776  (not loaded)  [CONFIG_KALLSYMS]
a00000021e935400  sg                   203464  (not loaded)  [CONFIG_KALLSYMS]
a00000021e96f080  ide_cd               211824  (not loaded)  [CONFIG_KALLSYMS]
a00000021ea6bf80  tg3                  362244  (not loaded)  [CONFIG_KALLSYMS]
a00000021eaa6780  parport              208284  (not loaded)  [CONFIG_KALLSYMS]
a00000021eacef00  button               144200  (not loaded)  [CONFIG_KALLSYMS]
a00000021eb10980  parport_pc           184504  (not loaded)  [CONFIG_KALLSYMS]
a00000021eb3e400  vfat                 157504  (not loaded)  [CONFIG_KALLSYMS]
a00000021eb9a280  fat                  239936  (not loaded)  [CONFIG_KALLSYMS]
a00000021ecd6880  sunrpc               468360  (not loaded)  [CONFIG_KALLSYMS]
a00000021ee3ad80  ipv6                1141140  (not loaded)  [CONFIG_KALLSYMS]
a00000021ee9f580  bluetooth            375704  (not loaded)  [CONFIG_KALLSYMS]
a00000021eeef880  l2cap                310456  (not loaded)  [CONFIG_KALLSYMS]
a00000021ef48680  rfcomm               347144  (not loaded)  [CONFIG_KALLSYMS]
a00000021ef97980  hidp                 294256  (not loaded)  [CONFIG_KALLSYMS]
crash> mod -s ipv6
*** glibc detected *** ./crash: realloc(): invalid next size: 0x6000000001921fc0 ***
======= Backtrace: =========
/lib/libc.so.6.1[0x20000000002f2a70]
/lib/libc.so.6.1(realloc-0x1cb0b0)[0x20000000002f5e20]
./crash(xmrealloc+0x1fffffffffee6c40)[0x40000000003a7b20]
./crash[0x40000000002ff3a0]
./crash[0x4000000000422000]
./crash(cmd_func+0x1ffffffffff61430)[0x4000000000422320]
./crash(execute_command+0x1fffffffffee2410)[0x40000000003a3310]
./crash(gdb_command_funnel+0x1fffffffffe2f900)[0x40000000002f0810]
./crash(gdb_interface+0x1fffffffffcd7590)[0x40000000001984b0]
./crash[0x4000000000235af0]
./crash(load_module_symbols+0x1fffffffffd748f0)[0x4000000000235820]
./crash[0x4000000000175820]
./crash(cmd_mod+0x2000000000129d68)[0x4000000000174930]
./crash(exec_command+0x1fffffffffb99db0)[0x400000000005acf0]
./crash(main_loop+0x1fffffffffb9a2e0)[0x400000000005a8e0]
./crash(current_interp_command_loop+0x200000000001fb90)[0x40000000004e0ae0]
./crash[0x4000000000319820]
./crash[0x400000000039f1d0]
./crash[0x40000000003a4080]
./crash(catch_errors+0x1fffffffffee31e0)[0x40000000003a4140]
./crash[0x400000000031a790]
./crash[0x400000000039f1d0]
./crash[0x40000000003a4080]
./crash(catch_errors+0x1fffffffffee31e0)[0x40000000003a4140]
./crash(gdb_main+0x1fffffffffe587d0)[0x4000000000319740]
./crash(gdb_main_entry+0x1fffffffffe58860)[0x40000000003197e0]
./crash(gdb_main_loop+0x1fffffffffcd54e0)[0x4000000000196470]
./crash(main+0x1fffffffffb99820)[0x400000000005a330]
/lib/libc.so.6.1(__libc_start_main-0x2818e0)[0x200000000023f6c0]
./crash(_start+0x1fffffffffb95250)[0x4000000000056200]
======= Memory map: ========
00000000-00004000 r--p 00000000 00:00 0
2000000000000000-2000000000038000 r-xp 00000000 fd:00 10256390 /lib/ld-2.5.so
2000000000044000-2000000000050000 rw-p 00034000 fd:00 10256390 /lib/ld-2.5.so
2000000000050000-2000000000114000 r-xp 00000000 fd:00 10256405 /lib/libm-2.5.so
2000000000114000-2000000000120000 ---p 000c4000 fd:00 10256405 /lib/libm-2.5.so
2000000000120000-2000000000124000 rw-p 000c0000 fd:00 10256405 /lib/libm-2.5.so
2000000000124000-20000000001b0000 r-xp 00000000 fd:00 10883077 /usr/lib/libncurses.so.5.5
20000000001b0000-20000000001bc000 ---p 0008c000 fd:00 10883077 /usr/lib/libncurses.so.5.5
20000000001bc000-20000000001cc000 rw-p 00088000 fd:00 10883077 /usr/lib/libncurses.so.5.5
20000000001cc000-20000000001d0000 rw-p 20000000001cc000 00:00 0
20000000001d0000-20000000001d8000 r-xp 00000000 fd:00 10256403 /lib/libdl-2.5.so
20000000001d8000-20000000001e4000 ---p 00008000 fd:00 10256403 /lib/libdl-2.5.so
20000000001e4000-20000000001e8000 rw-p 00004000 fd:00 10256403 /lib/libdl-2.5.so
20000000001e8000-200000000020c000 r-xp 00000000 fd:00 10882711 /usr/lib/libz.so.1.2.3
200000000020c000-2000000000218000 ---p 00024000 fd:00 10882711 /usr/lib/libz.so.1.2.3
2000000000218000-200000000021c000 rw-p 00020000 fd:00 10882711 /usr/lib/libz.so.1.2.3
200000000021c000-2000000000480000 r-xp 00000000 fd:00 10256397 /lib/libc-2.5.so
2000000000480000-200000000048c000 ---p 00264000 fd:00 10256397 /lib/libc-2.5.so
200000000048c000-2000000000498000 rw-p 00260000 fd:00 10256397 /lib/libc-2.5.so
2000000000498000-20000000004d8000 rw-p 2000000000498000 00:00 0
20000000004d8000-2000000003c1c000 r--p 00000000 fd:00 10882710 /usr/lib/locale/locale-archive
2000000003c1c000-2000000003c2c000 rw-p 2000000003c1c000 00:00 0
2000000003c38000-2000000003c44000 r-xp 00000000 fd:00 10256427 /lib/libthread_db-1.0.so
2000000003c44000-2000000003c50000 ---p 0000c000 fd:00 10256427 /lib/libthread_db-1.0.so
2000000003c50000-2000000003c54000 rw-p 00008000 fd:00 10256427 /lib/libthread_db-1.0.so
2000000003c54000-2000000003c58000 rw-p 2000000003c54000 00:00 0
2000000003c6c000-2000000003da0000 rw-p 2000000003c6c000 00:00 0
2000000003da0000-2000000003dbc000 r-xp 00000000 fd:00 10884674 /usr/lib/libunwind.so.7.0.0
2000000003dbc000-2000000003dc8000 ---p 0001c000 fd:00 10884674 /usr/lib/libunwind.so.7.0.0
2000000003dc8000-2000000003dcc000 rw-p 00018000 fd:00 10884674 /usr/lib/libunwind.so.7.0.0
2000000003dcc000-2000000003df0000 rw-p 2000000003dcc000 00:00 0
2000000003e00000-2000000003e08000 r--s 00000000 fd:00 10977539 /usr/lib/gconv/gconv-modules.cache
2000000003e08000-2000000003e18000 rw-p 2000000003e08000 00:00 0
2000000003e1c000-2000000006ecc000 rw-p 2000000003e1c000 00:00 0
2000000006ed8000-2000000006ef4000 r-xp 00000000 fd:00 10256386 /lib/libgcc_s-4.1.1-20061130.so.1
2000000006ef4000-2000000006f00000 ---p 0001c000 fd:00 10256386 /lib/libgcc_s-4.1.1-20061130.so.1
2000000006f00000-2000000006f04000 rw-p 00018000 fd:00 10256386 /lib/libgcc_s-4.1.1-20061130.so.1
2000000006f04000-2000000006f14000 rw-p 2000000006f04000 00:00 0
2000000008000000-2000000008024000 rw-p 2000000008000000 00:00 0
2000000008024000-200000000c000000 ---p 2000000008024000 00:00 0
4000000000000000-40000000007e0000 r-xp 00000000 fd:00 9639915 /var/tmp/crash-4.0-3.16/crash
600000000000c000-600000000006c000 rw-p 007dc000 fd:00 9639915 /var/tmp/crash-4.0-3.16/crash
600000000006c000-6000000001ffc000 rw-p 600000000006c000 00:00 0 [heap]
60000fff7fffc000-60000fff80004000 rw-p 60000fff7fffc000 00:00 0
60000ffffecc0000-60000ffffed14000 rw-p 60000ffffecc0000 00:00 0 [stack]
a000000000000000-a000000000020000 ---p 00000000 00:00 0 [vdso]
Aborted
#
So I set debug to 3, and redirected the debug output to a file.
It's big enough (866K) that I don't want to clutter up everybody's
mailbox, so I copied it here:
http://people.redhat.com/anderson/junk
It reproduces the same debug output each time, which gets followed
immediately by the glibc abort.
Maybe it will contain some clues?
Thanks,
Dave
-----------------------------------------------------------------------------------------------------------
--
Crash-utility mailing list
Crash-utility(a)redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility