Context:
- crash-5.0.1
- glibc 2.4
- vmcore produced by x86_64 sles11 2.6.27.19-5-default
Problem:
crash> mod -s xfs /usr/people/hedi/xfs.ko.debug
mod: xfs: last symbol is not _MODULE_END_xfs?
*** glibc detected *** /tr/x86_64/bin/crash: double free or corruption (!prev): 0x0000000001558760 ***
<segmentation violation in gdb>
mod: /usr/people/hedi/xfs.ko.debug
gdb add-symbol-file command failed
At this point crash hangs solid and has to be killed with SIGKILL.
Grabbing a core reveals the following:
(gdb) bt f
#0 0x00002b628cd0ebb5 in raise () from /lib64/libc.so.6
#1 0x00002b628cd0ffb0 in abort () from /lib64/libc.so.6
#2 0x00002b628cd4a340 in malloc_printerr () from /lib64/libc.so.6
#3 0x00000000005454af in parse_exp_in_context (stringptr=0x400000000, block=<value optimized out>, comma=<value optimized out>, void_context_p=0, out_subexp=0x7b4760) at parse.c:1101
        except = {reason = RETURN_ERROR, error = GENERIC_ERROR, message = 0x1c790a0 "Dwarf Error: Could not find abbrev number 188 [in module /usr/people/hedi/xfs.ko.debug]"}
        old_chain = (struct cleanup *) 0x0
        subexp = <value optimized out>
#4 0x000000060000000b in ?? ()
#5 0x0000000000000000 in ?? ()
(gdb) f 3
#3 0x00000000005454af in parse_exp_in_context (stringptr=0x400000000, block=<value optimized out>, comma=<value optimized out>, void_context_p=0, out_subexp=0x7b4760) at parse.c:1101
1101 xfree (expout);
(gdb) list
1096 }
1097 if (except.reason < 0)
1098 {
1099 if (! in_parse_field)
1100 {
1101 xfree (expout);
1102 throw_exception (except);
1103 }
1104 }
1105
I am not sure (yet) whether the errors
mod: xfs: last symbol is not _MODULE_END_xfs?
Dwarf Error: Could not find abbrev number 188 [in module /usr/people/hedi/xfs.ko.debug]
point to a problem in crash or in the xfs.ko.debug objfile, but that's another
story. The problem here is that crash shouldn't crash.
FWIW, this problem is most definitely a regression: crash version 4.0-8.11,
for example, fails to load the objfile with exactly the same error message,
with the notable difference that it does *not* crash.
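For illustration only: glibc prints "double free or corruption" when free() is handed a pointer that has already been freed or whose heap metadata is damaged. The trace above does not establish which of the two happened to expout. Assuming it was a double free, a minimal sketch of the usual defensive idiom looks like the following; demo_buf, demo_release and demo_parse are hypothetical names, not gdb or crash code.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a global parse buffer such as gdb's expout. */
static char *demo_buf;

/* Release the buffer and clear the pointer, so that a later release
   on another error path becomes a harmless free(NULL). */
static void demo_release(void)
{
    free(demo_buf);
    demo_buf = NULL;
}

/* Simulated parse: allocate the buffer, then take the error path when
   `fail` is non-zero.  Returns 0 on success, -1 on error. */
static int demo_parse(int fail)
{
    demo_buf = malloc(64);
    if (demo_buf == NULL)
        return -1;
    if (fail) {
        demo_release();   /* analogous to xfree (expout) at parse.c:1101 */
        return -1;
    }
    return 0;
}
```

With the pointer cleared on release, calling demo_release() again after a failed demo_parse(1) is a no-op rather than an abort.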
Cheers,
Hedi.
P.S. The "last symbol is not _MODULE_END_<modulename>" error was reported
back in Jan 2009 (albeit with the difference that crash would load the
objfile despite the error message):
https://www.redhat.com/archives/crash-utility/2009-January/msg00070.html
I am not sure the root cause was identified back then; at least, I cannot
find any evidence of that in the list archives.
--
Hedi Berriche
Global Product Support
http://www.sgi.com/support