Dave Anderson wrote:
 
Hi Castor,

Testing this latest patch on an ia64, there is improvement over
the original, but for one particular module, I get an abort that
was generated from glibc that I have never encountered before.
I saw it the first time when running "mod -S", and subsequently
narrowed it down to the ipv6 module.  Check this out:
 

Typically a realloc() error would occur due to an overflow of a previously
allocated buffer which trashed the malloc-bookkeeping.  I thought at first
that the supplied command string buffer of 1500 (BUFSIZE) bytes had
overflowed, but that is a stack buffer and is not malloc'd.  And we
don't overflow it anyway.

In any case, with a few debugging lines added, here is the command
buffer for the ipv6 add-symbol-file command, which contains 707 bytes:

add-symbol-file /lib/modules/2.6.18-1.2767.el5/kernel/net/ipv6/ipv6.ko 0xa00000021ed605b0 -s .exit.text 0xa00000021edb49a0 -s .rodata 0xa00000021edbd4c8 -s __ksymtab_strings 0xa00000021edbdc08 -s __versions 0xa00000021edbdf98 -s .data 0xa00000021edd6a20 -s .data.rel.ro 0xa00000021edd6c00 -s __ksymtab_gpl 0xa00000021edd6df8 -s __kcrctab_gpl 0xa00000021edd6ed8 -s .data.rel 0xa00000021edd6f48 -s .data.rel.local 0xa00000021ee39940 -s .data.rel.ro.local 0xa00000021ee3a9c0 -s .data.read_mostly 0xa00000021ee3a9e0 -s __ksymtab 0xa00000021ee3aa60 -s __kcrctab 0xa00000021ee3ac30 -s .gnu.linkonce.this_module 0xa00000021ee3ad80 -s .sdata 0xa00000021ee5d730 -s .bss 0xa00000021ee5b000 -s .sbss 0xa00000021ee5e8b8

and the xrealloc() call that triggered the abort was made here, in
this part of the upper while loop in add_symbol_file_command()
in gdb-6.1/gdb/symfile.c:
 

                  if (expecting_sec_addr)
                    {
                      sect_opts[section_index].value = arg;
                      expecting_sec_addr = 0;
                      if (++section_index > num_sect_opts)
                        {
                          num_sect_opts *= 2;
                          sect_opts = ((struct sect_opt *)
                                       xrealloc (sect_opts,
                                                 num_sect_opts
                                                 * sizeof (struct sect_opt)));
                        }
                    }
                  else
                    error ("USAGE: add-symbol-file <filename> <textaddress> [-mapped] [-readnow] [-s <secname> <addr>]*");

At the point of failure, the argcnt had gone from 0 to 47, and the
current arg was pointing to "0xa00000021ee5d730", which is the .sdata
section address.  This was the first xrealloc() call made from the
loop above -- so it almost made it through the whole command string
without even having to do an xrealloc().
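
One thing that may be worth checking in the loop above is the growth
test itself. The sketch below is standalone and is not gdb code. It only
simulates the index/capacity bookkeeping and counts how many entries
would be stored at an index past the current capacity when the test is
">" (as quoted) versus ">=". The starting capacity of 16 used in the
example is an assumption (the initial value of num_sect_opts is not
shown in the excerpt), and count_oob_entries, initial_cap and use_ge are
hypothetical names. None of this has been verified against gdb-6.1.

```c
#include <assert.h>
#include <stddef.h>

/* Simulate the bookkeeping of the quoted loop without touching memory.
   Returns the number of entries that would be stored at an index that
   is >= the capacity in force at the time of the store.
   initial_cap and use_ge are illustrative parameters, not gdb names.  */
static size_t
count_oob_entries (size_t n_entries, size_t initial_cap, int use_ge)
{
  size_t cap = initial_cap;   /* plays the role of num_sect_opts */
  size_t index = 0;           /* plays the role of section_index */
  size_t oob = 0;
  size_t i;

  assert (initial_cap > 0);   /* doubling a zero capacity never grows */

  for (i = 0; i < n_entries; i++)
    {
      if (index >= cap)       /* sect_opts[index] would be out of bounds */
        oob++;
      ++index;
      if (use_ge ? (index >= cap) : (index > cap))
        cap *= 2;             /* stands in for the xrealloc() call */
    }
  return oob;
}
```

Under the assumed capacity of 16, count_oob_entries (37, 16, 0) reports
two out-of-bounds entries (the .text entry plus 36 sections), while
count_oob_entries (37, 16, 1) reports none. If the real initial value is
similar, an overflow just before the first xrealloc() would be
consistent with glibc complaining on that first call.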

Anyway, that's as far as I took it -- but perhaps you can
artificially re-create an add-symbol-file command with a command
line size approaching the 700-odd bytes that this one consumes and
which has ~36 sections?
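
A minimal sketch of how such a command string could be generated for
testing follows. The function name build_add_symbol_file_cmd, the
".fakeN" section names and the 0x1000 address stride are all made up
for illustration. A real test would presumably need section names that
actually exist in the object file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "add-symbol-file <file> <textaddr> -s <name> <addr> ..." in buf.
   Returns the string length, or -1 if the command would not fit.
   Section names and addresses are synthetic placeholders.  */
static int
build_add_symbol_file_cmd (char *buf, size_t bufsize, const char *file,
                           unsigned long long base, int nsections)
{
  int len, n, i;

  assert (buf != NULL && bufsize > 0);

  n = snprintf (buf, bufsize, "add-symbol-file %s 0x%llx", file, base);
  if (n < 0 || (size_t) n >= bufsize)
    return -1;
  len = n;

  for (i = 0; i < nsections; i++)
    {
      /* Append one "-s <secname> <addr>" pair per synthetic section.  */
      n = snprintf (buf + len, bufsize - len, " -s .fake%d 0x%llx",
                    i, base + 0x1000ULL * (unsigned long long) (i + 1));
      if (n < 0 || (size_t) n >= bufsize - len)
        return -1;
      len += n;
    }
  return len;
}
```

Calling it with a 1500-byte buffer and nsections set to 36 gives a
command with the same number of -s options as the ipv6 case. The string
length can be adjusted by changing the section-name format.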

I did put a debug printf prior to each call to execute_command() that
shows the strlen of the command buffer and the lm->mod_sections, and
then did a "mod -S" to load all modules.  Note that the 707-byte length
and the section count (36) of the ipv6 module were both the largest:

crash> mod -S | grep strlen
strlen(req->buf): 433 mod_sections: 23
strlen(req->buf): 374 mod_sections: 22
strlen(req->buf): 433 mod_sections: 23
strlen(req->buf): 240 mod_sections: 19
strlen(req->buf): 277 mod_sections: 29
strlen(req->buf): 375 mod_sections: 28
strlen(req->buf): 422 mod_sections: 25
strlen(req->buf): 421 mod_sections: 28
strlen(req->buf): 317 mod_sections: 36
strlen(req->buf): 326 mod_sections: 28
strlen(req->buf): 600 mod_sections: 29
strlen(req->buf): 180 mod_sections: 19
strlen(req->buf): 404 mod_sections: 23
strlen(req->buf): 615 mod_sections: 29
strlen(req->buf): 313 mod_sections: 35
strlen(req->buf): 501 mod_sections: 29
strlen(req->buf): 286 mod_sections: 30
strlen(req->buf): 397 mod_sections: 28
strlen(req->buf): 299 mod_sections: 24
strlen(req->buf): 419 mod_sections: 24
strlen(req->buf): 374 mod_sections: 27
strlen(req->buf): 420 mod_sections: 24
strlen(req->buf): 463 mod_sections: 25
strlen(req->buf): 420 mod_sections: 24
strlen(req->buf): 413 mod_sections: 31
strlen(req->buf): 310 mod_sections: 21
strlen(req->buf): 322 mod_sections: 30
strlen(req->buf): 302 mod_sections: 26
strlen(req->buf): 306 mod_sections: 31
strlen(req->buf): 313 mod_sections: 32
strlen(req->buf): 707 mod_sections: 36
*** glibc detected *** ./crash: realloc(): invalid next size: 0x600000000432f430 ***

So the 707-byte ipv6 command string, or maybe its 36 sections,
or a combination of the two, sends crash/gdb into the weeds...

Dave
 

 
# ./crash

crash 4.0-3.16
Copyright (C) 2002, 2003, 2004, 2005, 2006  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006  Fujitsu Limited
Copyright (C) 2006  VA Linux Systems Japan K.K.
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.

GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "ia64-unknown-linux-gnu"...

      KERNEL: /usr/lib/debug/lib/modules/2.6.18-1.2767.el5/vmlinux
    DUMPFILE: /dev/mem
        CPUS: 64
        DATE: Wed Jan  3 10:43:04 2007
      UPTIME: 01:40:46
LOAD AVERAGE: 0.15, 0.11, 0.17
       TASKS: 629
    NODENAME: altix3.lab.boston.redhat.com
     RELEASE: 2.6.18-1.2767.el5
     VERSION: #1 SMP Wed Nov 29 17:38:14 EST 2006
     MACHINE: ia64  (1500 Mhz)
      MEMORY: 122.5 GB
         PID: 10699
     COMMAND: "crash"
        TASK: e00000b04e300000  [THREAD_INFO: e00000b04e301040]
         CPU: 7
       STATE: TASK_RUNNING (ACTIVE)

crash> mod -l
     MODULE       NAME                   SIZE  OBJECT FILE
a00000021e189c00  ehci_hcd             204860  (not loaded)  [CONFIG_KALLSYMS]
a00000021e1bd100  uhci_hcd             185680  (not loaded)  [CONFIG_KALLSYMS]
a00000021e1efb00  ohci_hcd             179956  (not loaded)  [CONFIG_KALLSYMS]
a00000021e23cb00  dm_zero              134640  (not loaded)  [CONFIG_KALLSYMS]
a00000021e287700  jbd                  262432  (not loaded)  [CONFIG_KALLSYMS]
a00000021e2b9580  sd_mod               170772  (not loaded)  [CONFIG_KALLSYMS]
a00000021e303380  qla1280              276848  (not loaded)  [CONFIG_KALLSYMS]
a00000021e374300  ext3                 414624  (not loaded)  [CONFIG_KALLSYMS]
a00000021e3d9600  scsi_mod             387008  (not loaded)  [CONFIG_KALLSYMS]
a00000021e418980  mptbase              235792  (not loaded)  [CONFIG_KALLSYMS]
a00000021e44c780  scsi_transport_spi   183672  (not loaded)  [CONFIG_KALLSYMS]
a00000021e47ec80  mptscsih             176288  (not loaded)  [CONFIG_KALLSYMS]
a00000021e4ab700  mptspi               162536  (not loaded)  [CONFIG_KALLSYMS]
a00000021e4e1580  scsi_transport_fc    203748  (not loaded)  [CONFIG_KALLSYMS]
a00000021e5fcd00  dm_mod               253328  (not loaded)  [CONFIG_KALLSYMS]
a00000021e718e80  qla2xxx             1090472  (not loaded)  [CONFIG_KALLSYMS]
a00000021e825680  dm_mirror            187608  (not loaded)  [CONFIG_KALLSYMS]
a00000021e857480  autofs4              178336  (not loaded)  [CONFIG_KALLSYMS]
a00000021e884800  dm_snapshot          167224  (not loaded)  [CONFIG_KALLSYMS]
a00000021e8b1a00  lp                   156512  (not loaded)  [CONFIG_KALLSYMS]
a00000021e8fdc00  cdrom                206776  (not loaded)  [CONFIG_KALLSYMS]
a00000021e935400  sg                   203464  (not loaded)  [CONFIG_KALLSYMS]
a00000021e96f080  ide_cd               211824  (not loaded)  [CONFIG_KALLSYMS]
a00000021ea6bf80  tg3                  362244  (not loaded)  [CONFIG_KALLSYMS]
a00000021eaa6780  parport              208284  (not loaded)  [CONFIG_KALLSYMS]
a00000021eacef00  button               144200  (not loaded)  [CONFIG_KALLSYMS]
a00000021eb10980  parport_pc           184504  (not loaded)  [CONFIG_KALLSYMS]
a00000021eb3e400  vfat                 157504  (not loaded)  [CONFIG_KALLSYMS]
a00000021eb9a280  fat                  239936  (not loaded)  [CONFIG_KALLSYMS]
a00000021ecd6880  sunrpc               468360  (not loaded)  [CONFIG_KALLSYMS]
a00000021ee3ad80  ipv6                1141140  (not loaded)  [CONFIG_KALLSYMS]
a00000021ee9f580  bluetooth            375704  (not loaded)  [CONFIG_KALLSYMS]
a00000021eeef880  l2cap                310456  (not loaded)  [CONFIG_KALLSYMS]
a00000021ef48680  rfcomm               347144  (not loaded)  [CONFIG_KALLSYMS]
a00000021ef97980  hidp                 294256  (not loaded)  [CONFIG_KALLSYMS]
crash> mod -s ipv6
*** glibc detected *** ./crash: realloc(): invalid next size: 0x6000000001921fc0 ***
======= Backtrace: =========
/lib/libc.so.6.1[0x20000000002f2a70]
/lib/libc.so.6.1(realloc-0x1cb0b0)[0x20000000002f5e20]
./crash(xmrealloc+0x1fffffffffee6c40)[0x40000000003a7b20]
./crash[0x40000000002ff3a0]
./crash[0x4000000000422000]
./crash(cmd_func+0x1ffffffffff61430)[0x4000000000422320]
./crash(execute_command+0x1fffffffffee2410)[0x40000000003a3310]
./crash(gdb_command_funnel+0x1fffffffffe2f900)[0x40000000002f0810]
./crash(gdb_interface+0x1fffffffffcd7590)[0x40000000001984b0]
./crash[0x4000000000235af0]
./crash(load_module_symbols+0x1fffffffffd748f0)[0x4000000000235820]
./crash[0x4000000000175820]
./crash(cmd_mod+0x2000000000129d68)[0x4000000000174930]
./crash(exec_command+0x1fffffffffb99db0)[0x400000000005acf0]
./crash(main_loop+0x1fffffffffb9a2e0)[0x400000000005a8e0]
./crash(current_interp_command_loop+0x200000000001fb90)[0x40000000004e0ae0]
./crash[0x4000000000319820]
./crash[0x400000000039f1d0]
./crash[0x40000000003a4080]
./crash(catch_errors+0x1fffffffffee31e0)[0x40000000003a4140]
./crash[0x400000000031a790]
./crash[0x400000000039f1d0]
./crash[0x40000000003a4080]
./crash(catch_errors+0x1fffffffffee31e0)[0x40000000003a4140]
./crash(gdb_main+0x1fffffffffe587d0)[0x4000000000319740]
./crash(gdb_main_entry+0x1fffffffffe58860)[0x40000000003197e0]
./crash(gdb_main_loop+0x1fffffffffcd54e0)[0x4000000000196470]
./crash(main+0x1fffffffffb99820)[0x400000000005a330]
/lib/libc.so.6.1(__libc_start_main-0x2818e0)[0x200000000023f6c0]
./crash(_start+0x1fffffffffb95250)[0x4000000000056200]
======= Memory map: ========
00000000-00004000 r--p 00000000 00:00 0
2000000000000000-2000000000038000 r-xp 00000000 fd:00 10256390           /lib/ld-2.5.so
2000000000044000-2000000000050000 rw-p 00034000 fd:00 10256390           /lib/ld-2.5.so
2000000000050000-2000000000114000 r-xp 00000000 fd:00 10256405           /lib/libm-2.5.so
2000000000114000-2000000000120000 ---p 000c4000 fd:00 10256405           /lib/libm-2.5.so
2000000000120000-2000000000124000 rw-p 000c0000 fd:00 10256405           /lib/libm-2.5.so
2000000000124000-20000000001b0000 r-xp 00000000 fd:00 10883077           /usr/lib/libncurses.so.5.5
20000000001b0000-20000000001bc000 ---p 0008c000 fd:00 10883077           /usr/lib/libncurses.so.5.5
20000000001bc000-20000000001cc000 rw-p 00088000 fd:00 10883077           /usr/lib/libncurses.so.5.5
20000000001cc000-20000000001d0000 rw-p 20000000001cc000 00:00 0
20000000001d0000-20000000001d8000 r-xp 00000000 fd:00 10256403           /lib/libdl-2.5.so
20000000001d8000-20000000001e4000 ---p 00008000 fd:00 10256403           /lib/libdl-2.5.so
20000000001e4000-20000000001e8000 rw-p 00004000 fd:00 10256403           /lib/libdl-2.5.so
20000000001e8000-200000000020c000 r-xp 00000000 fd:00 10882711           /usr/lib/libz.so.1.2.3
200000000020c000-2000000000218000 ---p 00024000 fd:00 10882711           /usr/lib/libz.so.1.2.3
2000000000218000-200000000021c000 rw-p 00020000 fd:00 10882711           /usr/lib/libz.so.1.2.3
200000000021c000-2000000000480000 r-xp 00000000 fd:00 10256397           /lib/libc-2.5.so
2000000000480000-200000000048c000 ---p 00264000 fd:00 10256397           /lib/libc-2.5.so
200000000048c000-2000000000498000 rw-p 00260000 fd:00 10256397           /lib/libc-2.5.so
2000000000498000-20000000004d8000 rw-p 2000000000498000 00:00 0
20000000004d8000-2000000003c1c000 r--p 00000000 fd:00 10882710           /usr/lib/locale/locale-archive
2000000003c1c000-2000000003c2c000 rw-p 2000000003c1c000 00:00 0
2000000003c38000-2000000003c44000 r-xp 00000000 fd:00 10256427           /lib/libthread_db-1.0.so
2000000003c44000-2000000003c50000 ---p 0000c000 fd:00 10256427           /lib/libthread_db-1.0.so
2000000003c50000-2000000003c54000 rw-p 00008000 fd:00 10256427           /lib/libthread_db-1.0.so
2000000003c54000-2000000003c58000 rw-p 2000000003c54000 00:00 0
2000000003c6c000-2000000003da0000 rw-p 2000000003c6c000 00:00 0
2000000003da0000-2000000003dbc000 r-xp 00000000 fd:00 10884674           /usr/lib/libunwind.so.7.0.0
2000000003dbc000-2000000003dc8000 ---p 0001c000 fd:00 10884674           /usr/lib/libunwind.so.7.0.0
2000000003dc8000-2000000003dcc000 rw-p 00018000 fd:00 10884674           /usr/lib/libunwind.so.7.0.0
2000000003dcc000-2000000003df0000 rw-p 2000000003dcc000 00:00 0
2000000003e00000-2000000003e08000 r--s 00000000 fd:00 10977539           /usr/lib/gconv/gconv-modules.cache
2000000003e08000-2000000003e18000 rw-p 2000000003e08000 00:00 0
2000000003e1c000-2000000006ecc000 rw-p 2000000003e1c000 00:00 0
2000000006ed8000-2000000006ef4000 r-xp 00000000 fd:00 10256386           /lib/libgcc_s-4.1.1-20061130.so.1
2000000006ef4000-2000000006f00000 ---p 0001c000 fd:00 10256386           /lib/libgcc_s-4.1.1-20061130.so.1
2000000006f00000-2000000006f04000 rw-p 00018000 fd:00 10256386           /lib/libgcc_s-4.1.1-20061130.so.1
2000000006f04000-2000000006f14000 rw-p 2000000006f04000 00:00 0
2000000008000000-2000000008024000 rw-p 2000000008000000 00:00 0
2000000008024000-200000000c000000 ---p 2000000008024000 00:00 0
4000000000000000-40000000007e0000 r-xp 00000000 fd:00 9639915            /var/tmp/crash-4.0-3.16/crash
600000000000c000-600000000006c000 rw-p 007dc000 fd:00 9639915            /var/tmp/crash-4.0-3.16/crash
600000000006c000-6000000001ffc000 rw-p 600000000006c000 00:00 0          [heap]
60000fff7fffc000-60000fff80004000 rw-p 60000fff7fffc000 00:00 0
60000ffffecc0000-60000ffffed14000 rw-p 60000ffffecc0000 00:00 0          [stack]
a000000000000000-a000000000020000 ---p 00000000 00:00 0                  [vdso]
Aborted
#
 

So I set debug to 3, and redirected the debug output to a file.
It's big enough (866K) that I don't want to clutter up everybody's
mailbox, so I copied it here:

  http://people.redhat.com/anderson/junk

It reproduces the same debug output each time, which gets followed
immediately by the glibc abort.

Maybe it will contain some clues?

Thanks,
  Dave
 
 


--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility