[PATCH] Remove lkcd_speedo() output spinner
by Sam Watters
This patch removes the lkcd_speedo() function and its use.
The lkcd_speedo() function produces a 'spinner' in output when crash is
busy processing kernel crash dump files. The spinner makes the output of
crash commands unpredictable and complicates the parsing of output.
---
diff -Nurp crash-7.1.0/lkcd_common.c crash-7.1.0+/lkcd_common.c
--- crash-7.1.0/lkcd_common.c 2015-02-06 12:44:11.000000000 -0600
+++ crash-7.1.0+/lkcd_common.c 2015-02-12 11:32:19.000000000 -0600
@@ -618,32 +618,6 @@ lkcd_memory_dump(FILE *fp)
}
-static void
-lkcd_speedo(void)
-{
- static int i = 0;
-
- if (pc->flags & SILENT) {
- return;
- }
-
- switch (++i%4) {
- case 0:
- lkcd_print("|\b");
- break;
- case 1:
- lkcd_print("\\\b");
- break;
- case 2:
- lkcd_print("-\b");
- break;
- case 3:
- lkcd_print("/\b");
- break;
- }
- fflush(stdout);
-}
-
/*
* The lkcd_lseek() routine does the bulk of the work setting things up
@@ -856,10 +830,6 @@ lkcd_lseek(physaddr_t paddr)
lseek(lkcd->fd, lkcd->page_offset_max, SEEK_SET);
eof = FALSE;
while (!eof) {
- if( (i++%2048) == 0) {
- lkcd_speedo();
- }
-
switch (lkcd_load_dump_page_header(dp, page))
{
case LKCD_DUMPFILE_OK:
9 years, 10 months
[PATCH 0/5] new options for memory debug commands
by Yu Zhao
search:
-f struct-page.flags-mask
When searching kernel memory, crash walks through the whole identity-mapped
space, which includes all physical memory. Machines with more than 256G of
memory are now common, and going through all of their pages takes a long
time, especially when most of them are used by user space programs
(e.g. LRU pages). This option allows us to skip pages with certain
struct page flags.
-g
Skip whole compound pages instead of only skipping head pages when
using -f. Some flags are only set on head pages, while the tail pages
share the same properties as their head pages.
kmem:
-k
Verify compound page order. Useful when debugging memory leaks
caused by freeing compound pages with the wrong order.
-r
Only print used pages (i.e. the page count is not zero). Makes
the output smaller (and faster to grep/vim/emacs).
Yu Zhao (5):
memory: better compound page support
memory: struct page.flags based filter
memory: skip compound pages when using search filter
memory: display and verify compound page order
memory: kmem option to skip free pages
defs.h | 4 +
help.c | 24 +++-
memory.c | 418 ++++++++++++++++++++++++++++++++++++++++++++++++---------------
3 files changed, 345 insertions(+), 101 deletions(-)
--
2.2.0.rc0.207.ga3a616c
9 years, 10 months
[PATCH] RESEND: Suppress lkcd_speedo() spinner in redirected output
by Sam Watters
Resent using a benign email client
When redirecting output to a file or script the 'spinner' written by the
lkcd_speedo() function complicates parsing output.
Currently the spinner is only suppressed if crash is run with the '-s'
option or the command 'set silent on' has been executed. The change in
this patch suppresses the spinner when the output is redirected to a
pipe or file.
---
diff -Nurp crash-7.1.0//lkcd_common.c crash-7.1.0+//lkcd_common.c
--- crash-7.1.0//lkcd_common.c 2015-02-06 12:44:11.000000000 -0600
+++ crash-7.1.0+//lkcd_common.c 2015-02-11 13:50:11.000000000 -0600
@@ -623,7 +623,8 @@ lkcd_speedo(void)
{
static int i = 0;
- if (pc->flags & SILENT) {
+ if (pc->flags & SILENT ||
+ pc->redirect & (REDIRECT_TO_PIPE | REDIRECT_TO_FILE)) {
return;
}
9 years, 10 months
[PATCH] Suppress lkcd_speedo() spinner in redirected output
by Sam Watters
When redirecting output to a file or script the 'spinner' written by the
lkcd_speedo() function complicates parsing output.
Currently the spinner is only suppressed if crash is run with the '-s'
option or the command 'set silent on' has been executed. The change in
this patch suppresses the spinner when the output is redirected to a
pipe or file.
----
diff -Nurp crash-7.1.0//lkcd_common.c crash-7.1.0+//lkcd_common.c
--- crash-7.1.0//lkcd_common.c 2015-02-06 12:44:11.000000000 -0600
+++ crash-7.1.0+//lkcd_common.c 2015-02-11 13:50:11.000000000 -0600
@@ -623,7 +623,8 @@ lkcd_speedo(void)
{
static int i = 0;
- if (pc->flags & SILENT) {
+ if (pc->flags & SILENT ||
+ pc->redirect & (REDIRECT_TO_PIPE | REDIRECT_TO_FILE)) {
return;
}
9 years, 10 months
[ANNOUNCE] crash version 7.1.0 is available
by Dave Anderson
Download from: http://people.redhat.com/anderson
or
https://github.com/crash-utility/crash/releases
The master branch serves as a development branch that will contain all
patches that are queued for the next release:
$ git clone git://github.com/crash-utility/crash.git
Changelog:
- Support for "irq" and "irq -u" on the S390 and S390X architectures
if they are running Linux 3.12 and later kernels. Older kernels
without GENERIC_HARDIRQ support will fail with the error message
"irq: cannot determine number of IRQs".
(sebott(a)linux.vnet.ibm.com)
- Fix for the handling of multiple ramdump images. Without the patch,
entering more than one ramdump image on the command line may result
in a segmentation violation.
(oza(a)broadcom.com)
- Implemented the capability of building crash as an x86_64 binary
for analyzing little-endian PPC64 dumpfiles on an x86_64 host, which
can be done by entering "make target=PPC64". After the initial build
is complete, subsequent builds can be done by entering "make" alone.
(anderson(a)redhat.com)
- Fix for the "crash --log <dumpfile>" option on both of the PPC64
architectures. Without the patch, the command fails with the message
"crash: seek error: physical address: <address> type: log_buf
pointer", followed by "crash: cannot read log_buf value". This bug
was introduced in crash-7.0.0 by a patch that added support for the
PPC64 BOOK3E processor family.
(anderson(a)redhat.com)
- Fix for a misleading fatal error message if a 32-bit crash binary
built on an X86_64 host with "make target=X86" or "make target=ARM"
is used on a live X86_64 system without specifying a vmlinux
namelist. Without the patch, the session fails with the message
"crash: cannot find booted kernel -- please enter namelist argument".
With the patch, the error message will be either "crash: compiled for
the X86 architecture" or "crash: compiled for the ARM architecture".
(anderson(a)redhat.com)
- Fix for finding the starting stack and instruction pointer hooks for
the active tasks in x86_64 ELF or compressed dumpfiles created by the
KVM "virsh dump --memory-only" facility. Without the patch, the
backtraces of active tasks may show an invalid starting frame that
indicates "__schedule". The fix displays the exception RIP and dumps
the register contents that are stored in the dumpfile header. If the
active task was operating in the kernel, the backtrace continues from
there; if the task was operating in user-space, the backtrace is
complete at that point.
(anderson(a)redhat.com)
- Fix for the "waitq" command when it is passed the address of a
wait_queue_head_t structure. Without the patch, if the entries
on the list are dynamically-created __wait_queue structures on
kernel stacks, the tasks owning the kernel stack are not displayed.
(anderson(a)redhat.com)
- Implemented a new "net -n [pid|task]" option that displays the list
of network devices with respect to the network namespace of the current
context, or that of a task specified by the optional "pid" or "task"
argument. The former "net -n <address>" option that translates
an IPv4 address expressed as a decimal or hexadecimal value into a
standard numbers-and-dots notation has been changed to "net -N".
(vvs(a)parallels.com)
- Fix for the kernel virtual address to symbol name translation for
special text region delimiter symbols declared in vmlinux.lds.S with
VMLINUX_SYMBOL(), such as __sched_text_start, __lock_text_start,
__kprobes_text_start, __entry_text_start and __irqentry_text_start.
Without the patch, if the addresses of those symbols are the same
value as the first "real" symbol in those text regions, commands
such as "dis" and "sym" may show the "_text_start" symbol name
instead of the desired text symbol name.
(qiaonuohan(a)cn.fujitsu.com, anderson(a)redhat.com)
- Enhancement of the "kmem -i" option to display memory overcommit
information, which will be appended to the traditional output of
the command. For example:
crash> kmem -i
PAGES TOTAL PERCENTAGE
TOTAL MEM 1965332 7.5 GB ----
FREE 78080 305 MB 3% of TOTAL MEM
USED 1887252 7.2 GB 96% of TOTAL MEM
SHARED 789954 3 GB 40% of TOTAL MEM
BUFFERS 110606 432.1 MB 5% of TOTAL MEM
CACHED 1212645 4.6 GB 61% of TOTAL MEM
SLAB 146563 572.5 MB 7% of TOTAL MEM
TOTAL SWAP 1970175 7.5 GB ----
SWAP USED 5 20 KB 0% of TOTAL SWAP
SWAP FREE 1970170 7.5 GB 99% of TOTAL SWAP
COMMIT LIMIT 2952841 11.3 GB ----
COMMITTED 1150595 4.4 GB 38% of TOTAL LIMIT
The COMMIT LIMIT and COMMITTED information is similar to that
displayed by the CommitLimit and Committed_AS lines in /proc/meminfo.
(atomlin(a)redhat.com)
- Fix for the "kmem [-s|-S] <address>" command, and the "rd -S[S]"
and "bt -F[F]" options. Without the patch, if the page structure
associated with a memory address still contains a (stale) pointer to
the address of a kmem_cache structure, but whose page.flags does not
have the PG_slab bit set, the address is incorrectly presumed to be
contained within that slab cache. As a result, the "kmem" command
may display one or more messages indicating a "bad inuse counter", a
"bad next pointer" or a "bad s_mem pointer", followed by an "address
not found in cache" error message. The "rd -S[S]" and "bt -F[F]"
commands may mislabel memory locations as belonging to slab caches.
(anderson(a)redhat.com)
- Added a new "vm -M <mm_struct>" option. When a task is exiting,
the mm_struct address pointer in its task_struct is NULL'd out, and
as a result, the "vm" command looks like this:
crash> vm
PID: 4563 TASK: ffff88049863f500 CPU: 8 COMMAND: "postgres"
MM PGD RSS TOTAL_VM
0 0 0k 0k
However, the mm_struct address can be retrieved from the task's
kernel stack and entered manually with this option, which allows the
"vm" command to attempt to dump the virtual memory data of the task.
It may, or may not, work, depending upon how far the virtual memory
deconstruction has proceeded. This option only verifies that the
address entered is from the "mm_struct" slab cache, and that
its mm_struct.mm_count is non-zero.
(qiaonuohan(a)cn.fujitsu.com, anderson(a)redhat.com)
- Fix for the X86_64 "bt" and "mach" commands when running against
kernels that have the following Linux 3.18 commit, which addresses
CVE-2014-9322. The kernel patch removes the per-cpu exception stack
used for handling stack segment faults:
commit 6f442be2fb22be02cafa606f1769fa1e6f894441
x86_64, traps: Stop using IST for #SS
Without this patch, backtraces that originate on any of the other 4
per-cpu exception stacks will be mis-labeled at the transition point
back to the previous stack. For example, backtraces that
originate on the NMI stack will indicate that they are coming from
the "DOUBLEFAULT" stack. The patch examines all idt_table entries
during initialization, looking for gate descriptors that have
non-zero index values, and when found, pulls out the handler
function address; from that information, the exception stack name
string array is properly initialized rather than being hard-coded.
This fix also properly labels the exception stack names on x86_64
CONFIG_PREEMPT_RT realtime kernels, which only utilize 3 exception
stacks instead of the traditional 5 (now 4 with this kernel commit),
instead of just showing "RT". Also, without the patch, the "mach"
command will mis-label the stack names when it displays the base
addresses of each per-cpu exception stack.
(anderson(a)redhat.com)
- Additional output for the "help [-D|-n]" options on X86 and X86_64
architectures. For compressed kdumps, the elf_prstatus structure in
each per-cpu NT_PRSTATUS note will be translated. For ELF kdumps,
the elf_prstatus structure in each per-cpu NT_PRSTATUS note, and
the QEMUCPUState structure in each per-cpu QEMU note, will be
translated.
(zhouwj-fnst(a)cn.fujitsu.com, anderson(a)redhat.com)
- Implemented a new "bt -A" option for the S390X architecture, which
adds support for displaying the new s390x vector registers. For
ELF dumps, the registers are taken from the VX ELF notes; for s390
dumps, the registers are taken from memory. The option produces the
same output as the -a option, but also displays the vector registers
for all active tasks.
(holzheu(a)linux.vnet.ibm.com)
- Fix for the 32-bit ARM virtual-to-physical address translation of
unity-mapped kernel virtual addresses in kernels configured with
CONFIG_ARM_LPAE if the system's phys_base exceeds 4GB.
(sdu.liu(a)huawei.com)
- Fix for the "help [-D|-n]" option on 32-bit X86 kernels that use the
64-bit ELF vmcore format generated by "virsh dump --memory-only".
Without the patch, the QEMUCPUState structures in QEMU notes are not
translated.
(qiaonuohan(a)cn.fujitsu.com)
- Additional output for the "help [-D|-n]" options on X86 and X86_64
architectures. For compressed kdumps generated by "virsh dump
--memory-only", the QEMUCPUState structure in each per-cpu QEMU
note will be translated, and the dumpfile offset address of each
QEMU note will be displayed.
(qiaonuohan(a)cn.fujitsu.com, anderson(a)redhat.com)
- Introduction of support for the 32-bit MIPS architecture. This
initial support is restricted to 32-bit MIPS kernels that are
configured as little-endian. With respect to dumpfile types, only
ELF vmcores are recognized. In addition to building crash as a
32-bit MIPS binary, it is also possible to build crash as an x86
binary on an x86 or x86_64 host so that crash analysis of MIPS
dumpfiles can be performed on an x86 or x86_64 host. The x86 binary
can be built by entering "make target=MIPS" for the initial build;
subsequent builds with MIPS support can be accomplished by entering
"make" alone.
(rabin(a)rab.in)
- Added support for big-endian 32-bit MIPS kernels. Only native MIPS
crash binaries may be built with big-endian support; running the
"make target=MIPS" build option on an x86 or x86_64 host creates
x86 binaries with little-endian support only.
(rabin(a)rab.in)
- Update the "ps" help page to reflect that the "ps -l" option may be
based upon the task_struct's sched_entity.last_arrival. Without the
patch, it indicates that either the task_struct's last_run or
timestamp value are used.
(anderson(a)redhat.com)
- Fix for the "kmem -z" option output to change the zone structure's
pages_scanned field from a signed to an unsigned long integer.
(Alexandr_Terekhov(a)epam.com)
- Fix for "kmem -z" option on Linux 2.6.30 and later kernels. Without
the patch, the zone structure's all_unreclaimable and pages_scanned
fields are not dumped.
(anderson(a)redhat.com)
- Fix for the PPC64 "bt" command on both big-endian and little-endian
architectures. Without the patch, backtraces of the active tasks
may be "empty" on little-endian machines, or show a one-liner of
the form: "#0 [c0000005f4db7a60] (null) at 501 (unreliable)" on
big-endian machines.
(anderson(a)redhat.com)
- Additional output for the "help [-D|-n]" options for the PPC64
architecture. For compressed kdump and ELF kdump dumpfiles, the
elf_prstatus structure in each per-cpu NT_PRSTATUS note will be
translated.
(anderson(a)redhat.com)
- The "help -r" option has been extended to dump the PPC64 registers
stored in each per-cpu NT_PRSTATUS note in compressed kdump and
ELF kdump dumpfiles.
(anderson(a)redhat.com)
- Prevent "help -r" and "help -[D|n]" from generating a segmentation
violation when attempting to access non-existent NT_PRSTATUS notes
for offline cpus in ELF or compressed kdumps.
(anderson(a)redhat.com)
- Fix for the "kmem -V" option output to change the display of the
vm_event_states fields from signed to unsigned long integers.
(adobriyan(a)gmail.com)
- Fix to allow the "ps -G" qualifier to be used in conjunction with
the "ps -p" option. Without the patch, "ps -G -p" fails with the
error message "ps: do_list: hash queue is in use?"
(anderson(a)redhat.com)
- Fix for the "runq" command on kernels that are configured with
CONFIG_RT_GROUP_SCHED=n. Without the patch, real-time tasks queued
on a per-cpu rt_rq.rt_prio_array will not be displayed under the
"RT PRIO_ARRAY" header.
(mty.shibata(a)gmail.com)
- Fix for a regression introduced in crash-7.0.9 when running on a live
32-bit ARM machine. Without the patch, a segmentation violation
is generated during session initialization.
(anderson(a)redhat.com)
- Enhancement of the "PANIC:" message displayed by the initial system
banner and by the "sys" command. Without the patch, many panic types
are categorized under the same generic message of the form:
PANIC: "Oops: 0000 [#1] SMP " (check log for details)
or in other types of crashes, no message is displayed at all. With
this patch, a more comprehensive search is made of the kernel log for
a more informative panic message.
(drc(a)yahoo-inc.com, anderson(a)redhat.com)
- Add appropriate checks for the MIPS architecture to allow extension
modules to be loaded with the "extend" command.
(rabin(a)rab.in)
- Update the extensions/trace.c extension module to account for the
movement of the ftrace_event_call.name member into an anonymous
union in Linux 3.15, commit de7b2973903c6cc50b31ee5682a69b2219b9919d.
(rabin(a)rab.in)
- Added support for VMware .vmss suspended state files as dumpfiles.
Similar to all other supported dumpfile types, it is invoked as:
$ crash vmlinux <vmname>.vmss
A "<vmname>.vmss" file created by the VMware vSphere ESX hypervisor
contains a header and the full memory image. A "<vmname>.vmss" file
created by the VMware Workstation facility only contains the header,
and must be accompanied by a companion "<vmname>.vmem" memory image
that is located in the same directory as the "<vmname>.vmss" file.
(hfu(a)vmware.com)
9 years, 10 months
[PATCH] Fix trace extension for v3.15+
by Rabin Vincent
Since Linux v3.15 (specifically, the following commit), the event name is
optionally moved to another structure.
commit de7b2973903c6cc50b31ee5682a69b2219b9919d
Author: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Date: Tue Apr 8 17:26:21 2014 -0400
tracepoint: Use struct pointer instead of name hash for reg/unreg tracepoints
- char *name;
+ union {
+ char *name;
+ /* Set TRACE_EVENT_FL_TRACEPOINT flag when using "tp" */
+ struct tracepoint *tp;
+ };
This patch handles this in the trace extension so that both kernels with and
without that commit work.
---
extensions/trace.c | 46 ++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 42 insertions(+), 4 deletions(-)
diff --git a/extensions/trace.c b/extensions/trace.c
index a09f6a1..8639fb2 100644
--- a/extensions/trace.c
+++ b/extensions/trace.c
@@ -988,19 +988,42 @@ static void ftrace_destroy_event_types(void)
free(ftrace_common_fields);
}
+#define TRACE_EVENT_FL_TRACEPOINT 0x40
+
static
int ftrace_get_event_type_name(ulong call, char *name, int len)
{
static int inited;
static int name_offset;
+ static int flags_offset;
+ static int tp_name_offset;
+ uint flags;
ulong name_addr;
- if (!inited) {
- inited = 1;
- name_offset = MEMBER_OFFSET("ftrace_event_call", "name");
- }
+ if (inited)
+ goto work;
+
+ inited = 1;
+ name_offset = MEMBER_OFFSET("ftrace_event_call", "name");
+ if (name_offset >= 0)
+ goto work;
+ name_offset = ANON_MEMBER_OFFSET("ftrace_event_call", "name");
+ if (name_offset < 0)
+ return -1;
+
+ flags_offset = MEMBER_OFFSET("ftrace_event_call", "flags");
+ if (flags_offset < 0)
+ return -1;
+
+ tp_name_offset = MEMBER_OFFSET("tracepoint", "name");
+ if (tp_name_offset < 0)
+ return -1;
+
+ inited = 2;
+
+work:
if (name_offset < 0)
return -1;
@@ -1008,6 +1031,21 @@ int ftrace_get_event_type_name(ulong call, char *name, int len)
"read ftrace_event_call name_addr", RETURN_ON_ERROR))
return -1;
+ if (inited == 2) {
+ if (!readmem(call + flags_offset, KVADDR, &flags,
+ sizeof(flags), "read ftrace_event_call flags",
+ RETURN_ON_ERROR))
+ return -1;
+
+ if (flags & TRACE_EVENT_FL_TRACEPOINT) {
+ if (!readmem(name_addr + tp_name_offset, KVADDR,
+ &name_addr, sizeof(name_addr),
+ "read tracepoint name", RETURN_ON_ERROR))
+ return -1;
+ }
+
+ }
+
if (!read_string(name_addr, name, len))
return -1;
--
2.1.4
9 years, 10 months
[PATCH] Support extensions for MIPS
by Rabin Vincent
Add appropriate checks for MIPS to is_shared_object() so that extensions
work.
---
symbols.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/symbols.c b/symbols.c
index 59312e6..c3883f0 100644
--- a/symbols.c
+++ b/symbols.c
@@ -3452,7 +3452,8 @@ is_shared_object(char *file)
switch (swap16(elf32->e_machine, swap))
{
case EM_386:
- if (machine_type("X86") || machine_type("ARM"))
+ if (machine_type("X86") || machine_type("ARM") ||
+ machine_type("MIPS"))
return TRUE;
break;
@@ -3466,6 +3467,11 @@ is_shared_object(char *file)
return TRUE;
break;
+ case EM_MIPS:
+ if (machine_type("MIPS"))
+ return TRUE;
+ break;
+
case EM_PPC:
if (machine_type("PPC"))
return TRUE;
--
2.1.4
9 years, 10 months
[PATCH] take Hardware Error & kernel pointer bug as separate panicmsg
by drc@yahoo-inc.com
Too many kinds of panic are categorized under the same "Oops: xxxx"
message, which makes this field ambiguous and not very useful:
PANIC: "Oops: 0000 [#1] SMP " (check log for details)
This patch separates out 3 kinds of panicmsg, which are the most common
cases among the machines I manage. The match strings are copied exactly
from the kernel source code. After applying the patch, I get panicmsg output like:
include/linux/kernel.h:#define HW_ERR
panicmsg: "[Hardware Error]: CPU 7: Machine Check Exception: 5 Bank 11: f200003f000100b2"
drivers/char/sysrq.c:__handle_sysrq
panicmsg: "SysRq : Trigger a crash"
arch/x86/kernel/traps.c:do_general_protection
panicmsg: "general protection fault: 8800 [#1] SMP"
arch/x86/mm/fault.c:show_fault_oops
panicmsg: "BUG: unable to handle kernel paging request at 00001248a68eb328"
Signed-off-by: Derek Che <drc(a)yahoo-inc.com>
---
task.c | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)
diff --git a/task.c b/task.c
index 4214d7f..6ecfbcf 100644
--- a/task.c
+++ b/task.c
@@ -5509,8 +5509,24 @@ get_panicmsg(char *buf)
}
rewind(pc->tmpfile);
while (!msg_found && fgets(buf, BUFSIZE, pc->tmpfile)) {
- if (strstr(buf, "Oops: ") ||
- strstr(buf, "kernel BUG at"))
+ if (strstr(buf, "[Hardware Error]: "))
+ msg_found = TRUE;
+ }
+ rewind(pc->tmpfile);
+ while (!msg_found && fgets(buf, BUFSIZE, pc->tmpfile)) {
+ if (strstr(buf, "SysRq : "))
+ msg_found = TRUE;
+ }
+ rewind(pc->tmpfile);
+ while (!msg_found && fgets(buf, BUFSIZE, pc->tmpfile)) {
+ if (strstr(buf, "general protection fault"))
+ msg_found = TRUE;
+ }
+ rewind(pc->tmpfile);
+ while (!msg_found && fgets(buf, BUFSIZE, pc->tmpfile)) {
+ if (strstr(buf, "Oops: ") ||
+ strstr(buf, "kernel BUG at") ||
+ strstr(buf, "BUG: unable to handle kernel "))
msg_found = TRUE;
}
rewind(pc->tmpfile);
9 years, 10 months