I give up. How do I build with CFLAGS='-g3 -O0' ??
by Bruce Korb
I've tried all the easy ways, but gdb just absolutely,
positively insists upon being built with -O2 and I am
just too lazy to enjoy the torture I'm put through with
all the jumping around that results.
Is there any way short of sedding the generated Makefiles?
Capturing the configure step and inserting my own is too
obvious and insufficient.
(FYI, I am still trying to figure out why certain sections
of memory are unfindable in the crash dump.)
Thank you! Cheers - Bruce
11 years, 9 months
Problem with irq command on ARM core file
by Karlsson, Jan
Hi Dave
I tried the irq command on an ARM vmcore file with a fairly new kernel (3.4.0) and it did not work.
crash> irq
irq: cannot determine number of IRQs
The problem is in arm.c
    if (symbol_exists("irq_desc"))
        ARRAY_LENGTH_INIT(machdep->nr_irqs, irq_desc,
            "irq_desc", NULL, 0);
as the symbol irq_desc does not exist any longer. Looking at x86_64.c I changed this to
    if (symbol_exists("irq_desc"))
        ARRAY_LENGTH_INIT(machdep->nr_irqs, irq_desc,
            "irq_desc", NULL, 0);
    else if (kernel_symbol_exists("nr_irqs"))
        get_symbol_data("nr_irqs", sizeof(unsigned int),
            &machdep->nr_irqs);
and then the command worked as it should again.
Jan
Jan Karlsson
Senior Software Engineer
MIB
Sony Mobile Communications
Tel: +46703062174
sonymobile.com<http://sonymobile.com/>
11 years, 9 months
[ANNOUNCE] crash-6.1.4 is available
by Dave Anderson
Note: this expedited release cycle is due to this issue:
[Crash-utility] Heads Up -- crash-6.1.3 may fail to load extension modules
https://www.redhat.com/archives/crash-utility/2013-February/msg00014.html
Download from: http://people.redhat.com/anderson
Changelog:
- Fix for a crash-6.1.3 regression with respect to the loading of
extension modules. Because of the change that replaced the obsolete
_init() and _fini() functions with constructor and destructor
functions, extension modules may fail to load when the extension
modules are built with older compiler/linkers. The problem is
due to the continued usage of the -nostartfiles compiler option
regardless of whether the extension module has replaced its _init()
function with a constructor function; with older compiler/linkers,
the module may fail to load. The fix predetermines whether an
extension module still uses _init() or if it has been updated to
use a constructor function, and will use the -nostartfiles option
only on older "legacy" modules.
(anderson(a)redhat.com)
- Implemented a new "list -r" option that can be used with lists
that are linked with list_head structures. When invoked, the
command will traverse the linked list in the reverse order by
using the "prev" pointer instead of "next".
(rabin(a)rab.in)
- Fix for the "swap" command's FILENAME display. In some kernels
between 2.6.32 and 2.6.38 the swap partition's pathname may not
show the "/dev" filename component.
(anderson(a)redhat.com)
- Fix for the "swap" command's PCT display, which will display a
negative percentage value if more than 5368709 swap pages are
in use.
(anderson(a)redhat.com)
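
The changelog does not say why 5368709 is the threshold in the "swap" PCT entry above, but the number is consistent with a signed 32-bit overflow. Assuming 4 KB pages, 5368709 pages is 21474836 KB, and multiplying that by 100 to form a percentage stays just under INT32_MAX (2147483647), while one more page exceeds it. The following is a minimal C sketch of that arithmetic only; the helper names are hypothetical and are not taken from the crash sources, and the 4 KB page size is an assumption.

```c
#include <stdint.h>

/* Assumption: 4 KB pages, so each swap page contributes 4 KB. */
#define KB_PER_PAGE 4

/* Largest page count whose (kilobytes * 100) still fits in an int32_t. */
static int64_t max_safe_pages(void)
{
    return INT32_MAX / (100 * KB_PER_PAGE);
}

/* Returns 1 if a 32-bit "used_kb * 100" intermediate would overflow. */
static int pct_intermediate_overflows(int64_t used_pages)
{
    int64_t scaled = used_pages * KB_PER_PAGE * 100;

    return scaled > INT32_MAX;
}

/* Overflow-free percentage, using 64-bit arithmetic throughout. */
static int64_t swap_pct(int64_t used_pages, int64_t total_pages)
{
    if (total_pages <= 0)
        return 0;
    return (used_pages * 100) / total_pages;
}
```

Under these assumptions, max_safe_pages() lands exactly on the figure quoted in the changelog, which is why widening the intermediate value is the usual remedy for this class of bug.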
11 years, 9 months
How to fire crash command from extension directly
by Vivek Satpute
Hi,
I have written a crash extension (shared object) for dumping a few data
structures from crash. I want to trigger a
standard crash command, e.g. "bt", from the extension itself. Can anyone
please help me
work out how this can be achieved? I am using the APIs exported by the crash header file
defs.h.
Your suggestions would be helpful.
Thanks in advance.
Vivek Satpute
11 years, 9 months
A bug in 'bt' triggered by 'gdb set disassembly-flavor intel'
by Alex Sidorenko
Hi Dave,
a colleague of mine has noticed a strange thing using crash-6.1.3: if we
execute 'gdb set disassembly-flavor intel' first thing after starting crash (I
duplicated this using this command interactively, he has it in his
~/.crashrc), this might add a bogus frame to 'bt' output.
The default behaviour (crash64 is crash-6.1.3 compiled from sources on
x86_64):
$ crash64 vmlinux vmcore-2013-02-13-00.53.52
crash64 6.1.3
...
crash64> bt 4131
PID: 4131 TASK: ffff88041369a040 CPU: 0 COMMAND: "jbd2/dm-14-8"
#0 [ffff88041443d810] machine_kexec at ffffffff8103284b
#1 [ffff88041443d870] crash_kexec at ffffffff810ba982
#2 [ffff88041443d940] oops_end at ffffffff81501b00
#3 [ffff88041443d970] no_context at ffffffff81043bfb
#4 [ffff88041443d9c0] __bad_area_nosemaphore at ffffffff81043e85
#5 [ffff88041443da10] bad_area_nosemaphore at ffffffff81043f53
#6 [ffff88041443da20] __do_page_fault at ffffffff810446b1
#7 [ffff88041443db40] do_page_fault at ffffffff81503ade
#8 [ffff88041443db70] page_fault at ffffffff81500e95
[exception RIP: __jbd2_journal_remove_checkpoint+201]
RIP: ffffffffa0299a09 RSP: ffff88041443dc20 RFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8805e4c59bc0 RCX: 0000000000000000
RDX: ffff8807fb6a4278 RSI: ffff88041443dcec RDI: ffff8807fb6a4278
RBP: ffff88041443dc60 R8: 0000000000000001 R9: 00000000ffffffff
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000292
R13: ffff8805e4c59c48 R14: ffff88041443dfd8 R15: ffff8807fb6a4278
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#9 [ffff88041443dc68] journal_clean_one_cp_list at ffffffffa0299d84 [jbd2]
#10 [ffff88041443dcc8] __jbd2_journal_clean_checkpoint_list at ffffffffa0299e3c [jbd2]
#11 [ffff88041443dd28] jbd2_journal_commit_transaction at ffffffffa02978d1 [jbd2]
#12 [ffff88041443de68] kjournald2 at ffffffffa029df78 [jbd2]
#13 [ffff88041443dee8] kthread at ffffffff81091e06
#14 [ffff88041443df48] kernel_thread at ffffffff8100c14a
Now exiting crash and starting it again:
crash64> gdb set disassembly-flavor intel
crash64> bt 4131
PID: 4131 TASK: ffff88041369a040 CPU: 0 COMMAND: "jbd2/dm-14-8"
#0 [ffff88041443d810] machine_kexec at ffffffff8103284b
#1 [ffff88041443d870] crash_kexec at ffffffff810ba982
#2 [ffff88041443d940] oops_end at ffffffff81501b00
#3 [ffff88041443d970] no_context at ffffffff81043bfb
#4 [ffff88041443d9c0] __bad_area_nosemaphore at ffffffff81043e85
#5 [ffff88041443da10] bad_area_nosemaphore at ffffffff81043f53
#6 [ffff88041443da20] __do_page_fault at ffffffff810446b1
#7 [ffff88041443db40] do_page_fault at ffffffff81503ade
#8 [ffff88041443db70] page_fault at ffffffff81500e95
[exception RIP: __jbd2_journal_remove_checkpoint+201]
RIP: ffffffffa0299a09 RSP: ffff88041443dc20 RFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8805e4c59bc0 RCX: 0000000000000000
RDX: ffff8807fb6a4278 RSI: ffff88041443dcec RDI: ffff8807fb6a4278
RBP: ffff88041443dc60 R8: 0000000000000001 R9: 00000000ffffffff
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000292
R13: ffff8805e4c59c48 R14: ffff88041443dfd8 R15: ffff8807fb6a4278
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#9 [ffff88041443dc28] __journal_remove_journal_head at ffffffffa029ee22 [jbd2]
#10 [ffff88041443dc68] journal_clean_one_cp_list at ffffffffa0299d84 [jbd2]
#11 [ffff88041443dcc8] __jbd2_journal_clean_checkpoint_list at ffffffffa0299e3c [jbd2]
#12 [ffff88041443dd28] jbd2_journal_commit_transaction at ffffffffa02978d1 [jbd2]
#13 [ffff88041443de68] kjournald2 at ffffffffa029df78 [jbd2]
#14 [ffff88041443dee8] kthread at ffffffff81091e06
#15 [ffff88041443df48] kernel_thread at ffffffff8100c14a
As you see, now we have #9 that was not present while using the default
flavour.
This is not a 6.1.3 regression - I tried an older binary (6.1.1) on the same
vmcore and it behaves similarly.
Regards,
Alex
--
------------------------------------------------------------------
Alex Sidorenko email: asid(a)hp.com
BCS ERT Linux Hewlett-Packard (Canada)
------------------------------------------------------------------
11 years, 9 months
Getting disassembly from .o or .ko file
by Ahmed Al-Mehdi
Hello,
I have a kernel panic that prints a backtrace, but no kernel
dump. The lines in the backtrace have the usual format:
[<addr1>] ? func1+num1/num2 [module1]
I understand that num1 is the address offset from the beginning of func1.
What is num2?
I tried to narrow down the location in func1() by doing the following steps:
loaded file1.o file into gdb and issued a "disassemble func1".
In the disassembled version of func1, the lines pertaining to function calls
in func1() have the following format:
...............callq num3 <func1+num4> <====
And NOT the following format I am used to:
.................callq <addr2> <func2>
(func2 is a function called from within func1)
My question is related to line marked with <====
- Looking closely at the values of num3 and num4, the instruction seems to
point to a location somewhere in func1 itself, and not the called function,
func2. I must be reading the instruction wrong? How does one interpret the
"callq" instruction?
- I understand I can't get something like addr2 in the line marked with
<==== as the object file is not linked to the kernel. However, is there
any way or tool I can use so the function name shows up in the line
<====? That would make it easier for me to understand the disassembled
code.
Using gdb on the kernel module (*.ko) did not make a difference in the
disassemble output.
I apologize for my cryptic question and for posting this question here as
this is not related to crash, however, I felt the audience of this mailing
list might be able to help.
Thank you,
Ahmed.
11 years, 9 months
[PATCH] Reverse traversal of linked list
by Rabin Vincent
I recently received a crash involving a linked list, one of whose entries had
a corrupted next pointer. While debugging this, I found it useful to
examine the list by traversing it using the prev pointers instead of next.
The attached patch is what I used for this.
Rabin
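
The idea in the message above, walking a list_head-style list through "prev" when a "next" pointer is corrupted, can be sketched in plain C. This is an illustrative stand-in, not the attached patch: the struct mirrors the shape of the kernel's circular doubly linked list_head, and the helper names are hypothetical.

```c
#include <stddef.h>

/* Minimal stand-in for the kernel's struct list_head (circular, doubly linked). */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/* An empty list is a head that points at itself in both directions. */
static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Link entry in just before head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/*
 * Walk backward from the head using only the prev pointers.
 * Stores up to max visited nodes in out[] and returns the count;
 * the max bound keeps a damaged list from looping forever.
 */
static size_t walk_reverse(struct list_head *head,
                           struct list_head **out, size_t max)
{
    size_t n = 0;
    struct list_head *p;

    for (p = head->prev; p != head && n < max; p = p->prev)
        out[n++] = p;
    return n;
}
```

Because walk_reverse() never reads "next", setting one entry's "next" to NULL after building the list does not stop it from visiting every entry, which is the situation the message describes.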
11 years, 9 months
Heads Up -- crash-6.1.3 may fail to load extension modules
by Dave Anderson
It has come to my attention that the extension modules may fail to load
when running on older host systems with older compiler/linker versions.
This is due to these crash-6.1.3 changes:
- Update of the extensions/echo.c extension module example, and the
"extend" help page, to utilize a constructor function to call the
register_extension() function. The _init() and _fini() functions
have been designated as obsolete for usage by dlopen() and dlclose().
The echo.c example module has been modified to contain echo_init()
and echo_fini() functions marked as __attribute__((constructor)) and
__attribute__((destructor)) respectively.
(anderson(a)redhat.com)
- Updated extensions/dminfo.c, extensions/snap.c and extensions/trace.c
to replace their _init() and _fini() functions with constructor and
destructor functions.
(anderson(a)redhat.com)
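
As a rough illustration of the constructor/destructor pattern these changelog entries describe, here is a minimal C sketch for GCC-compatible compilers. It is not the echo.c code: the function names and the flag are hypothetical, and the flag merely stands in for the point where a real module would call register_extension().

```c
/* Hypothetical flag standing in for "the module has registered itself". */
static int demo_registered = 0;

/*
 * Runs automatically when the object is loaded: before main() in a
 * normal program, or before dlopen() returns for a shared object.
 */
void __attribute__((constructor)) demo_init(void)
{
    demo_registered = 1;
}

/*
 * Runs automatically at unload: at normal program exit, or before
 * dlclose() returns for a shared object.
 */
void __attribute__((destructor)) demo_fini(void)
{
    demo_registered = 0;
}

/* Lets a caller check whether the constructor has already run. */
int demo_is_registered(void)
{
    return demo_registered;
}
```

No explicit call to demo_init() appears anywhere; the attribute alone arranges for it to run at load time, which is what replaces the older _init() entry point.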
I made the change because the dlopen(3) man page has had this section
for quite some time now:
The obsolete symbols _init and _fini
The linker recognizes special symbols _init and _fini. If a dynamic
library exports a routine named _init, then that code is executed after
the loading, before dlopen() returns. If the dynamic library exports a
routine named _fini, then that routine is called just before the
library is unloaded. In case you need to avoid linking against the
system startup files, this can be done by giving gcc the "-nostartfiles"
parameter on the command line.
Using these routines, or the gcc -nostartfiles or -nostdlib options, is
not recommended. Their use may result in undesired behavior, since the
constructor/destructor routines will not be executed (unless special
measures are taken).
Instead, libraries should export routines using the
__attribute__((constructor)) and __attribute__((destructor)) function
attributes. See the gcc info pages for information on these. Constructor routines are
executed before dlopen() returns, and destructor routines are executed
before dlclose() returns.
But when making the change, I did not take the "recommendation" to remove
the -nostartfiles option, for a couple reasons:
(1) Because of what are now "legacy" extension modules that still use
_init() and _fini() functions. Those extension modules still
require the -nostartfiles option.
(2) It didn't make a difference on Fedora 17 systems, which is where I
did my testing.
However, I do not want to force everybody to update their extension modules.
So instead I'm going to modify the extensions/Makefile to grep the C file
for the ((constructor)) attribute, and use or not-use the -nostartfiles option
based upon that.
It should also be noted that this only applies to extension modules that
do *not* utilize their own "<module>.mk" makefile. Accordingly, I will be
modifying the "snap.mk" file to remove the -nostartfiles option, but on the
other hand, the "eppic.mk" file can remain unchanged until and unless the
owner wants to update it.
Since this is important enough, I'll come out with a crash-6.1.4 version
shortly.
Thanks,
Dave
11 years, 9 months
[ANNOUNCE] crash version 6.1.3 is available
by Dave Anderson
Download from: http://people.redhat.com/anderson
Changelog:
- Implemented a new "crash --log dumpfile" option which dumps the
kernel log buffer and exits. A kernel namelist is not required,
but the dumpfile must contain the VMCOREINFO data from the ELF
header of the original /proc/vmcore file that was created by the
kexec/kdump facility. Accordingly, this option supports kdump ELF
vmcores and compressed kdump vmcores created by the makedumpfile
facility, including those that are in makedumpfile's intermediary
"vmcore.flat" format.
(anderson(a)redhat.com)
- Fixes for the ppc64.c file to handle gcc-4.7.2 compiler warnings when
building crash with "make warn", or compiler failures when building
with "make Warn" on a PPC64 machine. Without the patch, gcc-4.7.2
generates three "error: variable ‘<variable>’ set but not used
[-Werror=unused-but-set-variable]" messages.
(anderson(a)redhat.com)
- Update the PPC64 architecture's internal storage of the kernel's
MAX_PHYSMEM_BITS value for Linux 3.7 and later kernels, which changed
from 44 to 46 for 64TB support. Without the patch, there is no
known issue, but the stored value should be correct.
(anderson(a)redhat.com)
- Fix for the "mount" command's header display to indicate "MOUNT"
instead of "VFSMOUNT" on Linux 3.3 and later kernels because
the first column contains a mount structure address instead of a
vfsmount structure address. For those later kernels, it is
permissible to enter either the mount structure address, or the
address of the vfsmount structure that is embedded within it, as
an optional argument. The output has also been tightened up so
that the DIRNAME field is not shifted to the right based upon the
DEVNAME field length.
(anderson(a)redhat.com)
- Fix for the "mount <superblock>" search option on 2.6.32 and later
kernels. Without the patch, it is possible that multiple filesystems
will be displayed.
(anderson(a)redhat.com)
- Update to the "mount" help page to indicate that a dentry address
may be used as a search option.
(anderson(a)redhat.com)
- Fix for the "ps -l [pid|task|command]" option to display the
specified tasks sorted with the most recently-run task (the largest
last_run/timestamp) shown first, as is done with the "ps -l" option
with no arguments. Without the patch, the timestamp data gets
displayed in the order of the "[pid|task|command]" arguments.
(anderson(a)redhat.com)
- Added the "ps" command to the set of supported "foreach" commands,
serving as an alternative manner of passing task-identifying
arguments to the "ps" command. For example, a command such as
"foreach RU ps" can be accomplished without having to pipe normal
"ps" output to "grep RU". All "ps" options are supported from the
"foreach" framework.
(anderson(a)redhat.com)
- Fix for the "ps -G" restrictor option such that it also takes effect
if the -p, -c, -l, -a, -r or -g options are used. Without the
patch, thread group filtering would only take effect when the default
"ps" command is used without any of the options above.
(anderson(a)redhat.com)
- Fortify the internal hq_open() function to return FALSE if it is
already open, and have restore_sanity() and restore_ifile_sanity()
call hq_close() unconditionally.
(anderson(a)redhat.com)
- Added the "extend" command to the set of built-in commands that
support minimal mode. A new MINIMAL flag has been created for
extension modules to set in their command_table_entry.flags field(s)
to signal that a command supports minimal mode. If the crash session
has been invoked with --minimal, then the "extend" command will
require that the module registers at least one command that has
the MINIMAL bit set.
(per.fransson.ml(a)gmail.com)
- Prevent the "__crc_*" symbols from being added to the ARM kernel
symbol list.
(per.fransson.ml(a)gmail.com, rabin(a)rab.in)
- Prevent the "PRRR" and "NMRR" absolute symbols from being added to
the ARM kernel symbol list. Without the patch, it allows an invalid
set of addresses to pass the check in the in_ksymbol_range() function.
(per.fransson.ml(a)gmail.com)
- Fix for the ppc.c file to handle a gcc-4.7.2 compiler warning when
building crash with "make warn", or compiler failures when building
with "make Warn" on a PPC machine. Without the patch, gcc-4.7.2
generates the message "error: variable ‘dm’ set but not used
[-Werror=unused-but-set-variable]".
(anderson(a)redhat.com)
- Workaround for the "crash --osrelease dumpfile" option to be able
to work with malformed ARM compressed kdump headers. ARM compressed
kdumps that indicate header version 3 may contain a malformed
kdump_sub_header structure with offset_vmcoreinfo and size_vmcoreinfo
fields offset by 4 bytes, and the actual vmcoreinfo data is not
preceded by its ELF note header and its "VMCOREINFO" string. This
workaround finds the vmcoreinfo data and patches the stored header's
offset_vmcoreinfo and size_vmcoreinfo values. Without the patch, the
"--osrelease dumpfile" command line option fails with the message
"crash: compressed kdump: cannot lseek dump vmcoreinfo", followed by
"unknown".
(anderson(a)redhat.com)
- Fix for the "help -n" option on 32-bit compressed kdumps. Without
the patch, the offset_vmcoreinfo, offset_eraseinfo, and offset_note
fields of the kdump_sub_header have their upper 32-bits clipped off
when displayed. However, it should be harmless since the offset
values point into the first few pages of the dumpfile.
(anderson(a)redhat.com)
- Update of the extensions/echo.c extension module example, and the
"extend" help page, to utilize a constructor function to call the
register_extension() function. The _init() and _fini() functions
have been designated as obsolete for usage by dlopen() and dlclose().
The echo.c example module has been modified to contain echo_init()
and echo_fini() functions marked as __attribute__((constructor)) and
__attribute__((destructor)) respectively.
(anderson(a)redhat.com)
- Updated extensions/dminfo.c, extensions/snap.c and extensions/trace.c
to replace their _init() and _fini() functions with constructor and
destructor functions.
(anderson(a)redhat.com)
- Fix for the "bt" command on the PPC64 architecture when running
on Linux 3.7 kernel threads. Without the patch, some kernel threads
may fail to terminate on the final ".ret_from_kernel_thread" frame,
repeating that frame endlessly, because the stack linkage pointer
points back to itself instead of being NULL.
(anderson(a)redhat.com)
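
The MAX_PHYSMEM_BITS entry in this changelog pairs the change from 44 to 46 with 64TB support. The arithmetic behind that is simply that N physical address bits cover 2^N bytes, so 46 bits gives 64 TB and 44 bits gives 16 TB. A small C sketch of the calculation, with hypothetical helper names and 1 TB taken as 2^40 bytes:

```c
#include <stdint.h>

/* Bytes addressable with the given number of physical address bits (bits < 64). */
static uint64_t physmem_limit_bytes(unsigned bits)
{
    return (uint64_t)1 << bits;
}

/* The same limit expressed in terabytes, taking 1 TB as 2^40 bytes. */
static uint64_t physmem_limit_tb(unsigned bits)
{
    return physmem_limit_bytes(bits) >> 40;
}
```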
11 years, 9 months
Thoughts on swap_usage Crash extension?
by Aaron Tomlin
Hi,
This is a trivial crash extension to report the actual swap consumption of each user process.
What do you think, any suggestions [1]?
For example:
crash> ps tuned
PID PPID CPU TASK ST %MEM VSZ RSS COMM
1237 1 20 ffff8805418d7500 IN 0.0 174728 1664 tuned
crash> vm -p 1237 | grep SWAP | wc -l
974
crash> extend swap_usage.so
./swap_usage.so: shared object loaded
crash> swap_usage | grep tuned
1237 3896 tuned
crash> p/d 974 << 2
$3 = 3896
crash>
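
The "p/d 974 << 2" check in the session above converts a count of swapped-out pages into kilobytes. Assuming 4 KB pages, which is what a shift by 2 implies, each page is 2^(12-10) = 4 KB. As a C sketch with a hypothetical helper name:

```c
#include <stdint.h>

#define PAGE_SHIFT 12   /* assumption: 4 KB pages */

/* Convert a page count to kilobytes: pages << (PAGE_SHIFT - 10). */
static uint64_t pages_to_kb(uint64_t pages)
{
    return pages << (PAGE_SHIFT - 10);
}
```

With 974 pages this gives the same 3896 figure that swap_usage reports for the tuned process.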
Thanks,
Aaron
---
[1]: https://github.com/aktlin115/crash-extension/blob/f5667ca9e4a521c0aaa3130...
11 years, 9 months