RFE: run crash in "help mode"
by Bruce Korb
Hi,
Sometimes it is useful to not clutter up my active screen and run
crash in another window so I can examine "help mod" and type the
command in the real session. That, or extract all the help
text into a .texi doc? Something, please, anyway....
Thank you!
Regards, Bruce
example usage:
$ crash --help-mode
crash-help-for>
12 years, 10 months
[PATCH V2] Add -C option for search
by zhangyanfei
Hello Dave,
I made a new patch to implement the -C option for search. This time the
patch is implemented more simply, following your advice.
Thanks.
Zhang Yanfei
12 years, 10 months
mod cmd fixes for -g -r
by Bob Montgomery
6.0.3 added -g and -r options to the mod command, but they can't be used
together. This patch allows "-r -g" or "-rg" or "-Srg" etc.
crash-6.0.3> mod -r -g -S /usr/lib/debug/lib/modules/3.1.4-clim-3-amd64
mod: invalid option -- 'g'
Usage:
mod -s module [objfile] | -d module | -S [directory] | -D | -r | -R | -o | -g
Enter "help mod" for details.
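The fix comes down to accumulating option flags instead of treating them as mutually exclusive. Below is a minimal sketch of that idea, not the actual patch: the flag names and parse_mod_flags() are hypothetical, and the loop is hand-rolled only to keep the example self-contained (the error message above suggests crash uses getopt-style parsing).

```c
#include <stddef.h>

/* Hypothetical flag bits; the names are illustrative, not taken from crash. */
#define MOD_FLAG_S   (1u << 0)   /* -S seen */
#define MOD_FLAG_R   (1u << 1)   /* -r seen */
#define MOD_FLAG_G   (1u << 2)   /* -g seen */
#define MOD_FLAG_BAD (1u << 31)  /* an unrecognized option letter was seen */

/*
 * Collect -S, -r and -g into one bitmask, whether they are given
 * separately ("-r -g") or bundled ("-Srg"). Arguments that do not
 * start with '-' are treated as operands and skipped.
 */
unsigned int parse_mod_flags(int argc, char **argv)
{
    unsigned int flags = 0;

    for (int i = 1; i < argc; i++) {
        const char *arg = argv[i];

        if (arg[0] != '-' || arg[1] == '\0')
            continue;               /* operand, e.g. a directory name */

        for (const char *p = arg + 1; *p != '\0'; p++) {
            switch (*p) {
            case 'S': flags |= MOD_FLAG_S; break;
            case 'r': flags |= MOD_FLAG_R; break;
            case 'g': flags |= MOD_FLAG_G; break;
            default:  flags |= MOD_FLAG_BAD; break;
            }
        }
    }
    return flags;
}
```

With this shape, "mod -r -g -S dir" and "mod -Srg dir" yield the same mask, and the caller can act on each bit independently.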
Still working on the flaky module symbol problem...
Bob Montgomery
12 years, 10 months
[PATCH] Add -C option for search
by zhangyanfei
Hello Dave,
I added a new option, -C, for the search sub-command to display the memory
contents just before and after the search target. The number of memory units
displayed on each side is specified by the -C argument.
For example:
crash> search -p -C 3 dddd
17b5100: dddddddddddddddd dddddddddddddddd dddddddddddddddd
17b5118: dddd cdcdcdcdcdcdcdcd cdcdcdcdcdcdcdcd
17b5130: cdcdcdcdcdcdcdcd
--
85d04f818: 29 ffff88085d04f818 ffffea001d3f7e30
85d04f830: dddd 1000 ffff88085b48a000
85d04f848: ffff880876383b80
--
8752cfec8: ddd9 dddb dddb
8752cfee0: dddd dddd dddf
8752cfef8: dddf
--
8752cfed0: dddb dddb dddd
8752cfee8: dddd dddf dddf
8752cff00: dde1
--
crash>
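In the output above, each hit is shown with three memory units on either side. Here is a rough sketch of how such a window could be computed. The function name context_window() is hypothetical, and clamping at the buffer edges is an assumption; the actual patch may handle page boundaries differently.

```c
#include <stddef.h>

/*
 * Compute the inclusive index range [*first, *last] of memory units to
 * display around a hit, clamped to the bounds of a buffer that holds
 * nunits units. The caller must ensure hit < nunits.
 */
void context_window(size_t hit, size_t context, size_t nunits,
                    size_t *first, size_t *last)
{
    /* Units before the hit: stop at index 0. */
    *first = (hit >= context) ? hit - context : 0;

    /* Units after the hit: stop at the last valid index. */
    *last = (nunits - 1 - hit >= context) ? hit + context : nunits - 1;
}
```

For a hit at index 10 with a context of 3, the window covers indexes 7 through 13, which corresponds to the seven values printed for each match.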
Thanks.
Zhang Yanfei
12 years, 10 months
[PATCH v2 0/3] PPC32 vmalloc address translation
by Suzuki K. Poulose
The following series implements:
* An infrastructure for platform based vmalloc translation for PPC32
* vmalloc translation support for PPC440
We maintain a list of platforms with the relevant definitions for Virtual
Address translation bits. Based on the 'powerpc_base_platform' kernel global
variable, we could dynamically find the platform the core was generated
for and match it to the definitions.
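As a rough illustration of the platform list idea, here is a table lookup keyed by platform name. The struct layout, the "ppc440" string and find_platform() are hypothetical; the shift values are borrowed from the BookE patch posted earlier to this list (21/512 for PPC44x, 22/1024 as the previous defaults), not from this series.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-platform translation parameters. */
struct platform_defs {
    const char *name;       /* name to match against the kernel's platform string */
    int pgdir_shift;
    int ptrs_per_pte;
};

static const struct platform_defs platforms[] = {
    { "ppc440", 21, 512 },  /* illustrative name; values from the PPC44x patch */
    { NULL,     22, 1024 }, /* sentinel entry, doubles as the fallback */
};

/* Return the entry matching name, or the fallback entry if none matches. */
const struct platform_defs *find_platform(const char *name)
{
    const struct platform_defs *p = platforms;

    for (; p->name != NULL; p++) {
        if (name != NULL && strcmp(p->name, name) == 0)
            return p;
    }
    return p;               /* p now points at the sentinel/fallback */
}
```

In a real implementation the name would come from reading the powerpc_base_platform variable out of the dump rather than from a string literal.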
---
Suzuki K. Poulose (3):
[ppc] virtual address translation bits for PPC440
[ppc] Support for platform based Virtual address translation
[ppc] Non-linear address translation routine
defs.h | 19 ++++-
ppc.c | 264 +++++++++++++++++++++++++++++++++++-----------------------------
2 files changed, 160 insertions(+), 123 deletions(-)
--
Suzuki K. Poulose
12 years, 10 months
[ANNOUNCE] crash version 6.0.3 is available
by Dave Anderson
Download from: http://people.redhat.com/anderson
Changelog:
- Fix to gdb-7.3.1/bfd/bfdio.c to properly zero out a complete struct
stat with a corrected memset argument; caught when compiling with
the Clang Static Analyzer.
(idoenmez(a)suse.de)
- Fix for the SIAL extension module to remove a call to sial_free()
for an uninitialised variable that can result in a segmentation
violation when unloading a sial script.
(lmcilroy(a)redhat.com)
- Fix for the "runq" command for kernels that are configured with
CONFIG_FAIR_GROUP_SCHED. Without the patch, tasks contained within
the task-group of a cpu's currently-running task may not be displayed.
(d.hatayama(a)jp.fujitsu.com, anderson(a)redhat.com)
- Implemented support for the analysis of 32-bit PPC ELF kdump vmcores.
(suzuki(a)in.ibm.com)
- Implemented the capability of building a PPC crash binary on a PPC64
host, which can be done by entering "make target=PPC". After the
initial build is complete, subsequent builds can be done by entering
"make" alone.
(suzuki(a)in.ibm.com, anderson(a)redhat.com)
- Determine the PPC page size from the kdump PAGESIZE vmcoreinfo data.
(suzuki(a)in.ibm.com, anderson(a)redhat.com)
- Fix for the "kmem -[sS]", "kmem -[fF]" and "kmem <address>"
options in 3.2 kernels. Without the patch, the commands fail with
the error "kmem: invalid structure member offset: page_lru".
(anderson(a)redhat.com)
- Addition of a set of dumpfile read diagnostic debug statements. They
are primarily of use when dealing with kdump invocation or runtime
read failures (ELF kdumps or compressed kdumps), and can serve to
help pinpoint the problem as a faulty/corrupted dumpfile vs. a crash
utility bug. Some statements are seen when invoking crash with "-d1",
more with "-d4", and all of them with "-d8". During runtime, debug
statements may be seen by entering "set debug <level>".
(dmair(a)suse.com, anderson(a)redhat.com)
- Fix for X86 kernels that have CONFIG_X86_32, CONFIG_DISCONTIGMEM,
CONFIG_DISCONTIGMEM_MANUAL and CONFIG_NUMA all configured. Without
the patch, the VM subsystem fails to initialize properly because the
pgdat structures are allocated by the remap allocator.
(ptesarik(a)suse.cz)
- Fix for the "vtop" command on large NUMA X86 kernels where a node's
starting physical address is larger than 32-bits. Without the patch,
the page struct contents of a virtual address may not be displayed.
Associated with that fix, the "kmem -n" line that displays a node's
MEM_MAP, START_PADDR and START_MAPNR values has been adjusted to more
properly handle large physical addresses.
(dmair(a)suse.com)
- Update for the ARM architecture to recognize a recent change of
its vmlinux section name from ".init" to ".init.text". Without the
patch, a warning message indicating "crash: cannot determine text
init space" is displayed during initialization.
(rabin(a)rab.in)
- Significant speed increase of the "kmem -p" command, especially on
large-memory systems.
(qiaonuohan(a)cn.fujitsu.com)
- Implemented new "irq -a" and "irq -s" options. The "irq -a" option
displays the cpu affinity for in-use IRQs. The "irq -s" option
displays per-cpu IRQ stats in a similar manner to /proc/interrupts
for all cpus. To show a limited set of per-cpu IRQ stats, there is
an associated "-c" option that limits the cpus shown, which can be
expressed as "-c 1,3,5", "-c 1-3", or "-c 1,3,5-7,10". The options
are currently restricted to X86, X86_64, ARM, PPC64 and IA64.
(zhangyanfei(a)cn.fujitsu.com, anderson(a)redhat.com,
per.xx.fransson(a)stericsson.com)
- Removal of a redundant read of the kernel's __per_cpu_offset pointers
in the ARM architecture's arm_get_crash_notes() function.
(per.xx.fransson(a)stericsson.com)
- Fix for an ARM architecture segmentation violation because of a stack
overflow due to recursion in the page table translation code. This
was seen when analyzing a dumpfile where the page tables had been
corrupted.
(rabin(a)rab.in)
- Fix for the "FREE HIGH" tally in the X86 "kmem -i" display.
Without the patch, the PAGES, TOTAL and PERCENTAGE values would
always show zero values.
(anderson(a)redhat.com)
- Fix for the "kmem -n" output display for 32-bit architectures that
are configured with CONFIG_SPARSEMEM. Without the patch, the values
under the CODED_MEM_MAP, MEM_MAP and PFN columns are all shifted to
the left.
(anderson(a)redhat.com)
- Cleanup of several SIAL extension module files to address bison 2.5
and gcc 4.4.3 compile-time warnings.
(lchouinard(a)s2sys.com)
- Fix for "net -[sS]" command options on the ARM architecture. Without
the patch, invalid data would be displayed because the calculation of
the socket address was off by 4 bytes.
(Jan.Karlsson(a)sonyericsson.com, anderson(a)redhat.com)
- Fix for the ARM "bt" command to allow the core kernel unwind tables
to be used in cases where the module unwind tables are inaccessible.
(rabin(a)rab.in)
- Implementation of a new "dev -d" option that displays disk device
I/O statistics. For each disk device, its major number, gendisk and
request_queue addresses are displayed along with the total number of
allocated I/O requests that are in-progress. The total I/O requests
are then split out into synchronous vs. asynchronous counts (or reads
vs. writes in older kernels), and the number that are in-flight in
the device driver.
(wency(a)cn.fujitsu.com)
- Update for 3.1.x and later kernels configured with CONFIG_SLAB, which
have replaced the kmem_cache.nodelists[] array with a pointer to an
outside array. Without the patch, the crash session fails during
invocation with the error "crash: zero-size memory allocation!".
(bob.montgomery(a)hp.com, anderson(a)redhat.com)
- Implemented support for the analysis of 32-bit PPC compressed kdump
vmcores.
(suzuki(a)in.ibm.com)
- Prevent the "runq" command from dumping an unending loop of tasks if
the CFS runqueue has been corrupted. If the output of a cpu's
runqueue would display a duplicate task, the output will stop with
the message "WARNING: duplicate CFS runqueue node: task <address>".
(dmair(a)suse.com)
- Repurposed/renamed the rarely-used and rarely-needed "mod -r" option
to "mod -R". The option is used to reinitialize the module data; all
currently-loaded symbolic and debugging data is deleted, and the
installed module list will be updated (live systems only).
(anderson(a)redhat.com)
- Implemented a new "mod -r" option, which will pass the "-readnow"
flag to the embedded gdb module, which will override the two-stage
strategy that it uses for reading symbol tables from module object
files. If the crash session was invoked with the "--readnow" flag,
then the same override will occur automatically. It should be noted
that doing so will increase the virtual and resident memory set size.
(anderson(a)redhat.com)
- Performance increase for the "kmem -s <address>" option on
kernels configured with CONFIG_SLAB, most notably on kernels
whose kmem_cache.array[NR_CPUS] array is several pages in size.
(qiaonuohan(a)cn.fujitsu.com)
- Require that the "<slabname>" argument to "kmem -s <slabname>"
be escaped with a '\' character in two situations:
(1) in the highly-unlikely case of a kmem_cache slab named "list",
to prevent the ambiguity with the "kmem -s list" command option.
(2) if the first character of the <slabname> actually is a '\'
character.
(anderson(a)redhat.com)
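The cpu list syntax accepted by "irq -s -c" in the changelog above ("-c 1,3,5", "-c 1-3", "-c 1,3,5-7,10") can be parsed along the following lines. This is only a sketch: parse_cpu_list() and its byte-per-cpu mask are hypothetical and are not how crash implements the option.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a list such as "1,3,5-7,10" into mask[], where mask[n] is set
 * to 1 for every selected cpu n. mask must hold ncpus bytes.
 * Returns 0 on success, -1 on a syntax error or out-of-range cpu.
 */
int parse_cpu_list(const char *spec, unsigned char *mask, int ncpus)
{
    const char *p = spec;

    memset(mask, 0, (size_t)ncpus);
    if (*p == '\0')
        return -1;

    while (*p != '\0') {
        char *end;
        long lo, hi;

        if (!isdigit((unsigned char)*p))
            return -1;
        lo = strtol(p, &end, 10);
        hi = lo;
        p = end;

        if (*p == '-') {            /* range form: lo-hi */
            p++;
            if (!isdigit((unsigned char)*p))
                return -1;
            hi = strtol(p, &end, 10);
            p = end;
        }

        if (lo > hi || hi >= ncpus)
            return -1;
        for (long cpu = lo; cpu <= hi; cpu++)
            mask[cpu] = 1;

        if (*p == ',') {
            p++;
            if (*p == '\0')
                return -1;          /* trailing comma */
        } else if (*p != '\0') {
            return -1;              /* unexpected character */
        }
    }
    return 0;
}
```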
12 years, 10 months
[PATCH] improve the performance of kmem -s [address]
by qiaonuohan
Hello Dave,
These two patches are used to improve the performance of "kmem -s
[address]".
The current code needs to search all kmem_caches twice, which wastes
time. With my patch, the address is passed to the "dump_kmem_cache" function
so that the second search is restricted to the kmem_cache found in the
first search. In effect, the second full search is skipped.
I have implemented the improvement for three types of kmem_cache. The
search for "slub" does not waste as much time as its peers, so I did
not modify the code related to "slub". Additionally, I only have an
environment in which to test my first patch, so the second patch has not been tested.
Here are the statistics from my test on RHEL6.2 x86_64. The vmcore used for
testing has about 150000 kmem_caches.
commands in file:
1. only 'quit'
original code : about 36s
with patch : about 36s
2. 1 time of searching the 100000th kmem_cache
original code : about 59s
with patch : about 39s
3. 100 times of searching the 100000th kmem_cache
original code : about 38min30s
with patch : about 5min
p.s.
If I create a kmem_cache called "list", some confusion may arise
when using "kmem -s list". I wonder whether it is necessary to introduce
another option to replace "list" and avoid such a collision.
--
Regards
Qiao Nuohan
12 years, 11 months
[PATCH v2 1/1] CFS runqueue loop detection
by David Mair
Here is a patch against crash v6.0.3rc24 that adds duplicate node
detection per-CPU for the CFS runqueue display in dump_CFS_runqueues()
for the runq command.
For that 6.0.3 release candidate, this resolves the failure to bail out of
the unending display loop that I see with a crash dump whose corrupted
CFS runqueue contains a loop.
Signed-off-by: David Mair <dmair(a)suse.com>
---
task.c | 11 ++++++++++-
1 files changed, 10 insertions(+), 1 deletions(-)
diff --git a/task.c b/task.c
index c81cb74..7a3e8e1 100755
--- a/task.c
+++ b/task.c
@@ -7060,7 +7060,14 @@ dump_tasks_in_cfs_rq(ulong cfs_rq)
OFFSET(sched_entity_run_node));
if (!tc)
continue;
- dump_task_runq_entry(tc);
+ if (hq_enter((ulong)tc)) {
+ dump_task_runq_entry(tc);
+ } else {
+ error(WARNING, "Duplicate CFS runqueue node, task %lx"
+ ", probable loop\n",
+ tc->task);
+ return total;
+ }
total++;
}
@@ -7220,7 +7227,9 @@ dump_CFS_runqueues(void)
fprintf(fp, " CFS RB_ROOT: %lx\n", (ulong)root);
+ hq_open();
tot = dump_tasks_in_cfs_rq(cfs_rq);
+ hq_close();
if (!tot) {
INDENT(5);
fprintf(fp, "[no tasks queued]\n");
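The patch relies on crash's hash queue (hq_open, hq_enter, hq_close) to notice a task that has already been displayed. The same idea can be shown with a stand-alone visited set. Everything below is a hypothetical simplification: the names are made up, and a singly linked list stands in for the runqueue tree walk.

```c
#include <stddef.h>
#include <stdint.h>

#define SEEN_SLOTS 1024     /* power of two; sketch assumes fewer entries */

struct seen_set {
    uintptr_t slot[SEEN_SLOTS];     /* 0 marks an empty slot */
};

/* Return 1 if value was newly recorded, 0 if it was already present
 * (or if the table is full, which this sketch treats the same way). */
int seen_enter(struct seen_set *s, uintptr_t value)
{
    size_t i = (size_t)(value >> 4) & (SEEN_SLOTS - 1);

    for (size_t probes = 0; probes < SEEN_SLOTS; probes++) {
        if (s->slot[i] == 0) {
            s->slot[i] = value;
            return 1;
        }
        if (s->slot[i] == value)
            return 0;
        i = (i + 1) & (SEEN_SLOTS - 1);     /* linear probing */
    }
    return 0;
}

struct node {
    struct node *next;
};

/* Count the nodes reachable from head, stopping at the first repeat.
 * *looped is set to 1 if a duplicate node stopped the walk. */
size_t walk_nodes(struct node *head, int *looped)
{
    struct seen_set seen = { { 0 } };
    size_t total = 0;

    *looped = 0;
    for (struct node *n = head; n != NULL; n = n->next) {
        if (!seen_enter(&seen, (uintptr_t)n)) {
            *looped = 1;
            break;
        }
        total++;
    }
    return total;
}
```

As in the patch, the walk reports what it has seen so far and then stops, rather than printing the same entries forever.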
12 years, 11 months
[PATCH v2 0/4] Compressed KDUMP core analysis support for PPC32
by Suzuki K. Poulose
Changes since V1:
* Introduced generic routines to parse the ELF Notes, which could be
reused for different architectures.
* Better logical split of the patches.
I have tested the build for warnings using 'make Warn' with gcc version 4.3.4.
---
Suzuki K. Poulose (4):
[ppc] Enable stack trace analysis for compressed Kdump
[netdump] Update the flags when ELF Notes are processed
[ppc] Support for compressed KDUMP
Generic routines for processing elf notes
diskdump.c | 111 ++++++++++++++++++++++++++++++++++++++++++++++--------------
netdump.c | 3 ++
ppc.c | 14 ++++++--
3 files changed, 101 insertions(+), 27 deletions(-)
--
Suzuki K. Poulose
12 years, 11 months
[PATCH] [PPC32] Fix vmalloc address translation for BookE
by Suzuki K. Poulose
This patch fixes the vmalloc address translation for BookE. This
patch is based on the PPC44x definitions and may not work for
other systems.
crash> mod
mod: cannot access vmalloc'd module memory
crash>
After the patch :
crash> mod
MODULE NAME SIZE OBJECT FILE
d1018fd8 mbcache 6023 (not loaded) [CONFIG_KALLSYMS]
d1077190 jbd 58360 (not loaded) [CONFIG_KALLSYMS]
d107ca98 llc 4525 (not loaded) [CONFIG_KALLSYMS]
d1130de4 ext3 203186 (not loaded) [CONFIG_KALLSYMS]
d114bbac squashfs 26129 (not loaded) [CONFIG_KALLSYMS]
On ppc44x, the virtual-address is split as below :
Bits |0 10|11 19|20 31|
-----------------------------------
| PGD | PMD | PAGE_OFFSET |
-----------------------------------
The PAGE_BASE_ADDR is a 64bit value(of type phys_addr_t).
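To make the split concrete, here is a small sketch that extracts the three fields from a 32-bit virtual address. It assumes the bit numbers above follow the IBM convention (bit 0 is the most significant bit), which is consistent with a PGDIR_SHIFT of 21 and 512 entries per table. The macro and function names are hypothetical.

```c
#include <stdint.h>

/* Values implied by the 11/9/12-bit split; names are illustrative only. */
#define PPC44X_PAGE_SHIFT   12
#define PPC44X_PGDIR_SHIFT  21
#define PPC44X_PTRS_PER_PTE 512

struct va_parts {
    uint32_t pgd_index;     /* top 11 bits */
    uint32_t pte_index;     /* middle 9 bits */
    uint32_t page_offset;   /* low 12 bits */
};

/* Break a 32-bit virtual address into its table indexes and page offset. */
struct va_parts split_vaddr(uint32_t vaddr)
{
    struct va_parts parts;

    parts.pgd_index = vaddr >> PPC44X_PGDIR_SHIFT;
    parts.pte_index = (vaddr >> PPC44X_PAGE_SHIFT) & (PPC44X_PTRS_PER_PTE - 1);
    parts.page_offset = vaddr & ((1u << PPC44X_PAGE_SHIFT) - 1);
    return parts;
}
```

For the module address d1018fd8 from the listing above, this gives a PGD index of 0x688, a table index of 0x18 and a page offset of 0xfd8.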
Note: I am not sure how we distinguish the different values (PGDIR_SHIFT etc.)
for different PPC32 systems. Since there are a lot of different platforms
under PPC32, we need some mechanism to dynamically determine the PGDIR and PTE
shift values. One option is to put the information in the VMCOREINFO.
Alternatively, we could hard-code these values for each platform and
compile crash for a particular platform.
Thoughts?
Signed-off-by: Suzuki K. Poulose <suzuki(a)in.ibm.com>
---
defs.h | 4 ++--
ppc.c | 20 ++++++++++++--------
2 files changed, 14 insertions(+), 10 deletions(-)
diff --git a/defs.h b/defs.h
index 82d51e5..844f369 100755
--- a/defs.h
+++ b/defs.h
@@ -2603,8 +2603,8 @@ struct load_module {
#define VTOP(X) ((unsigned long)(X)-(machdep->kvbase))
#define IS_VMALLOC_ADDR(X) (vt->vmalloc_start && (ulong)(X) >= vt->vmalloc_start)
-#define PGDIR_SHIFT (22)
-#define PTRS_PER_PTE (1024)
+#define PGDIR_SHIFT (21)
+#define PTRS_PER_PTE (512)
#define PTRS_PER_PGD (1024)
#define _PAGE_PRESENT 0x001 /* software: pte contains a translation */
diff --git a/ppc.c b/ppc.c
index 2a10fac..6a1db2a 100755
--- a/ppc.c
+++ b/ppc.c
@@ -381,8 +381,8 @@ ppc_kvtop(struct task_context *tc, ulong kvaddr, physaddr_t *paddr, int verbose)
ulong *page_dir;
ulong *page_middle;
ulong *page_table;
- ulong pgd_pte;
- ulong pte;
+ ulong pgd_pte;
+ unsigned long long pte; /* PTE is 64 bit */
if (!IS_KVADDR(kvaddr))
return FALSE;
@@ -404,9 +404,13 @@ ppc_kvtop(struct task_context *tc, ulong kvaddr, physaddr_t *paddr, int verbose)
fprintf(fp, "PAGE DIRECTORY: %lx\n", (ulong)pgd);
page_dir = pgd + (kvaddr >> PGDIR_SHIFT);
-
- FILL_PGD(PAGEBASE(pgd), KVADDR, PAGESIZE());
- pgd_pte = ULONG(machdep->pgd + PAGEOFFSET(page_dir));
+ /*
+ * The (kvaddr >> PGDIR_SHIFT) may exceed PAGESIZE().
+ * Use PAGEBASE(page_dir) to read the page containing the
+ * translation.
+ */
+ FILL_PGD(PAGEBASE(page_dir), KVADDR, PAGESIZE());
+ pgd_pte = ULONG((unsigned long)machdep->pgd + PAGEOFFSET(page_dir));
if (verbose)
fprintf(fp, " PGD: %lx => %lx\n", (ulong)page_dir, pgd_pte);
@@ -417,7 +421,7 @@ ppc_kvtop(struct task_context *tc, ulong kvaddr, physaddr_t *paddr, int verbose)
page_middle = (ulong *)pgd_pte;
if (machdep->flags & CPU_BOOKE)
- page_table = page_middle + (BTOP(kvaddr) & (PTRS_PER_PTE - 1));
+ page_table = (unsigned long long *)page_middle + (BTOP(kvaddr) & (PTRS_PER_PTE - 1));
else {
page_table = (ulong *)((pgd_pte & (ulong)machdep->pagemask) + machdep->kvbase);
page_table += ((ulong)BTOP(kvaddr) & (PTRS_PER_PTE-1));
@@ -428,10 +432,10 @@ ppc_kvtop(struct task_context *tc, ulong kvaddr, physaddr_t *paddr, int verbose)
(ulong)page_table);
FILL_PTBL(PAGEBASE(page_table), KVADDR, PAGESIZE());
- pte = ULONG(machdep->ptbl + PAGEOFFSET(page_table));
+ pte = ULONGLONG((unsigned long)machdep->ptbl + PAGEOFFSET(page_table));
if (verbose)
- fprintf(fp, " PTE: %lx => %lx\n", (ulong)page_table, pte);
+ fprintf(fp, " PTE: %lx => %llx\n", (ulong)page_table, pte);
if (!(pte & _PAGE_PRESENT)) {
if (pte && verbose) {
12 years, 11 months