[PATCH] Fix bugs in runq
by Zhang Yanfei
Hello Dave,
In the runq command, when dumping the cfs and rt runqueues,
it seems that we get the wrong nr_running values for rq
and cfs_rq.
Please refer to the attached patch.
Thanks
Zhang Yanfei
12 years, 4 months
Using FAULT_ON_ERROR in readmem calls
by Karlsson, Jan
Hi Dave
I would like to discuss the usage of FAULT_ON_ERROR in readmem calls. I have now seen a number of situations where it prevents Crash from producing appropriate results when some memory is corrupt.
The most recent problem, which I saw a few days ago, was in kernel.c, in function dumplog:
readmem(log_buf, KVADDR, buf,
log_buf_len, "log_buf contents", FAULT_ON_ERROR);
The problem was that log_buf_len contained a very large value (memory overwrite?), so the readmem failed because of the size. This of course meant that the log could not be printed, but because this function is called during Crash startup, Crash also terminated during startup. By just changing FAULT_ON_ERROR to RETURN_ON_ERROR and returning if the readmem failed, I could use Crash to investigate this vmcore file, except for printing the log.
A second place where I have patched Crash is in function arm_uvtop (arm.c). In the readmem calls in this function I have changed FAULT_ON_ERROR to RETURN_ON_ERROR and simply added a "return FALSE;" if the readmem fails. Unfortunately I do not remember the details of why I made this change, but I think there was a case where Crash terminated during startup, and with these changes it was possible to investigate the vmcore file.
Another situation I have seen is in helper functions like fill_vma_cache and fill_file_cache. When I use these functions in extensions, the commands fail and terminate immediately if a readmem call fails. In several cases I could easily handle such a failure, and the command could still produce a lot of relevant results.
In the plugins I write I use RETURN_ON_ERROR essentially everywhere and then of course handle the error situations myself. I have done this to avoid situations like the ones described above.
I am not asking you to remove most uses of FAULT_ON_ERROR, as I realize the size and risk of such a change. However, I would like to raise the question and hear your views. When working with vmcore files that have minor memory corruption, using FAULT_ON_ERROR limits the usability of Crash.
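A minimal sketch of the return-and-continue pattern described above. It is not crash source code: `mock_readmem`, `dump_log_sketch`, the fake memory image, the size limit and the numeric flag values are all hypothetical stand-ins. Only the flag names FAULT_ON_ERROR and RETURN_ON_ERROR come from the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative values only; the real flags live in crash's defs.h. */
#define SKETCH_FAULT_ON_ERROR  1
#define SKETCH_RETURN_ON_ERROR 2
#define SKETCH_MAX_READ 4096   /* hypothetical upper bound on one read */

/* Hypothetical stand-in for the memory image being analyzed. */
static const char fake_memory[] = "<6>Booting Linux\n";

/*
 * Hypothetical stand-in for readmem(): copies from the fake image and
 * returns 1 on success, 0 on failure. With SKETCH_FAULT_ON_ERROR a real
 * tool would abort the running command; the mock records that in *aborted.
 */
static int mock_readmem(size_t offset, void *buf, size_t len,
                        int error_handle, int *aborted)
{
    assert(buf != NULL && aborted != NULL);
    if (len > SKETCH_MAX_READ || offset > sizeof(fake_memory) ||
        len > sizeof(fake_memory) - offset) {
        if (error_handle == SKETCH_FAULT_ON_ERROR)
            *aborted = 1;
        return 0;
    }
    memcpy(buf, fake_memory + offset, len);
    return 1;
}

/*
 * Caller that tolerates a corrupt length: on failure it skips the log
 * and lets the session continue. Returns the number of bytes read, or
 * -1 if the log could not be read.
 */
static long dump_log_sketch(size_t log_buf_len, char *out, int *aborted)
{
    if (!mock_readmem(0, out, log_buf_len, SKETCH_RETURN_ON_ERROR, aborted))
        return -1;
    return (long)log_buf_len;
}
```

With a sane length the caller gets the data; with an absurd length it gets -1 and can carry on with the rest of the session instead of terminating.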
Jan
Jan Karlsson
Senior Software Engineer
MIB
Sony Mobile Communications
Tel: +46703062174
http://sonymobile.com/
12 years, 4 months
[PATCH] add support to "virsh dump-guest-memory"(qemu memory dump)
by qiaonuohan
Hello Dave,
I made this patch so that crash can analyze core dump files created by
"virsh dump-guest-memory" (I will call it "qemu memory dump" below). In
my tests (guest OS: RHEL6.2 x86 & x86_64), the patch works well with dump
files created by "qemu memory dump".
However, after some investigation, I think I need to discuss further
work with you.
The core dump created by qemu memory dump is similar to a kdump core dump.
The only difference is in the note sections: the former has
note sections named "QEMU".
1. Some register information stored in the "CORE" note sections, which
crash needs, is also stored in the "QEMU" note sections. I think it's not
reasonable to replace them. What do you think?
2. Other registers, which are stored only in the "QEMU" note sections, are
not directly used by crash. I will continue investigating the use of
these registers. Any suggestions you can give would be helpful.
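To illustrate how notes can be told apart by name, here is a small sketch that walks a buffer of ELF notes and counts those with a given name such as "CORE" or "QEMU". It assumes the standard ELF note layout (three 32-bit header words, then name and descriptor each padded to 4 bytes). The struct and function names are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as an ELF note header: three 32-bit words. */
struct sketch_note_hdr {
    uint32_t namesz;   /* length of name, including the trailing NUL */
    uint32_t descsz;   /* length of the descriptor payload */
    uint32_t type;     /* note type, interpreted per name */
};

/* Round up to the 4-byte alignment used between note fields. */
static size_t sketch_align4(size_t n)
{
    return (n + 3) & ~(size_t)3;
}

/* Count the notes in buf[0..len) whose name equals `name`. */
static int sketch_count_notes(const unsigned char *buf, size_t len,
                              const char *name)
{
    size_t off = 0;
    size_t want = strlen(name) + 1;   /* namesz counts the NUL */
    int count = 0;

    assert(buf != NULL && name != NULL);
    while (len - off >= sizeof(struct sketch_note_hdr)) {
        struct sketch_note_hdr h;
        size_t name_sz, desc_sz;

        memcpy(&h, buf + off, sizeof h);
        off += sizeof h;

        /* Stop on a truncated or corrupt note rather than read past the end. */
        if (h.namesz > len - off)
            break;
        name_sz = sketch_align4(h.namesz);
        if (name_sz > len - off || h.descsz > len - off - name_sz)
            break;
        desc_sz = sketch_align4(h.descsz);
        if (desc_sz > len - off - name_sz)
            break;

        if (h.namesz == want && memcmp(buf + off, name, want) == 0)
            count++;
        off += name_sz + desc_sz;
    }
    return count;
}
```

A reader of such a dump could use a check like this to decide whether "QEMU" notes are present before deciding which register source to prefer.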
--
Regards
Qiao Nuohan
12 years, 4 months
Extension modules in C++
by Petr Tesarik
Hi all,
as part of SUSE HackWeek8, David started work on a GUI extension using Qt4,
which is a C++ project. One of the early annoyances is that an extension
module must include the declarations from defs.h, and we currently use some C
identifiers which happen to be keywords in C++, namely:
- struct namespace
- struct namespace namespace (in struct symbol_table_data)
- char *typename (in struct gnu_request)
Can I rename them? But you said earlier that the existing API must never
change... Any other suggestions to make this include file parseable by a C++
compiler?
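One possible shape for such a header, sketched under assumptions: the identifiers that collide with C++ keywords are renamed, and the declarations are wrapped in an `extern "C"` guard so the same file parses as C and as C++. The struct names, member layouts and the helper function below are illustrative only and do not reproduce defs.h.

```c
#include <assert.h>
#include <stddef.h>

#ifdef __cplusplus
extern "C" {
#endif

/* "namespace" is a C++ keyword, so the tag uses a different name here. */
struct symbol_namespace_sketch {
    char *address;     /* illustrative member */
    size_t size;       /* illustrative member */
};

/* "typename" is a C++ keyword, so the member is spelled type_name here. */
struct gnu_request_sketch {
    const char *type_name;
};

/* Hypothetical accessor: returns the type name, or "(none)" if unset. */
static const char *sketch_request_type(const struct gnu_request_sketch *req)
{
    assert(req != NULL);
    return req->type_name ? req->type_name : "(none)";
}

#ifdef __cplusplus
}
#endif
```

The cost of this approach is that existing C extension modules that use the old names would need to be updated, which is the API concern raised above.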
TIA,
Petr Tesarik
SUSE Linux
12 years, 4 months
[ANNOUNCE] crash version 6.0.9 is available
by Dave Anderson
Download from http://people.redhat.com/anderson
Changelog:
- Fix for building on host machines that have glibc-2.15.90 installed,
in which case the glibc header file /usr/include/bits/siginfo.h no
longer declares a "struct siginfo", but only the "siginfo_t" typedef.
Without the patch, the build of the embedded gdb module fails with
the error message "linux-nat.h:63:18: error: field 'siginfo' has
incomplete type".
(anderson(a)redhat.com)
- Add support for reading compressed kdump dumpfiles that were
compressed by the snappy compressor. This feature is disabled by
default. To enable this feature, build the crash utility in the
following manner:
(1) Install the snappy libraries by using the host system's package
manager or by directly downloading libraries from author's
website. The packages required are:
- snappy
- snappy-devel
The author's website is: http://code.google.com/p/snappy
(2) Create a CFLAGS.extra file and an LDFLAGS.extra file in top-level
crash sources directory:
- enter -DSNAPPY in the CFLAGS.extra file
- enter -lsnappy in the LDFLAGS.extra file.
(3) Build crash with "make" as always.
(d.hatayama(a)jp.fujitsu.com)
- Prevent the "ptov" command from returning an invalid virtual address
on 32-bit architectures. Without the patch, the command may result
in an invalid virtual address if the physical address entered cannot
be accessed by a unity-mapped kernel virtual address. The patch
verifies that the calculated virtual address can be translated back
into the supplied physical address.
(Jan.Karlsson(a)sonymobile.com, anderson(a)redhat.com)
- Fix to automatically try /proc/kcore as an alternative live memory
source when the /dev/crash driver does not exist and /dev/mem is
unusable because the kernel was configured with CONFIG_STRICT_DEVMEM.
Without the patch, the automatic switch from /dev/mem to /proc/kcore
is only attempted on the X86 and X86_64 architectures.
(anderson(a)redhat.com)
- Added missing linefeeds to several error messages in makedumpfile.c.
(anderson(a)redhat.com)
- Fix for a regression introduced by a crash-5.1.1 patch that reworked
the handling of "set" commands that are put in .crashrc files, such
that only certain command options would get resolved before the crash
session is initialized. Without this patch, the "--less", "--more",
"--no_scroll" and "--CRASHPAGER" crash command line options do not
properly override conflicting "set scroll <option>" entries that
are put in a .crashrc file.
(anderson(a)redhat.com)
- Added new "--hex" and "--dec" crash command line options, which will
set the command output format to hexadecimal or decimal. These two
command line options will override any "set radix [10|16]" settings
in a .crashrc file; since decimal is the default, the "--dec" option
would only be necessary to override a "set radix 16" setting in a
.crashrc file.
(anderson(a)redhat.com)
- Fix for the "runq" and "timer" commands when running against 2.6.34
and later kernels that are not configured with CONFIG_SMP. Without
the patch, the "runq" command fails with the error message "runq:
per-cpu runqueues does not exist", and the "timer" command fails
with the error message "timer: zero-size memory allocation! (called
from <address>)".
(anderson(a)redhat.com)
- If code.google.com is not available from the host build machine, then
"make extensions" will be delayed by a 10 minute timeout of the
"git clone" command that downloads the EPPIC library and extension
module source tree. The patch pings code.google.com first in order
to determine its availability before attempting the download.
(anderson(a)redhat.com)
- For kernel versions 3.5 and later, in which the kernel log buffer has
been converted from a byte-buffer to a variable-length record buffer,
the "log -m" option will display the level in hexadecimal, and
depending upon the kernel version, the value also contains either the
facility or flags bits.
(anderson(a)redhat.com)
- Fix for accessing the per-cpu registers from ARM vmcores generated
by recent kernels in which the per-cpu data region has been moved
into mapped kernel virtual address space. Without the patch, an
incorrect physical address is calculated, resulting in bogus register
contents.
(Jan.Karlsson(a)sonymobile.com)
- Check that an s390x dumpfile is a "live dump" earlier during session
initialization so that the internal LIVE_DUMP flag will get set when
"crash --minimal" is invoked.
(holzheu(a)linux.vnet.ibm.com)
- Removed the usage of C++ keywords in structure and structure member
names declared in "defs.h" so that extension modules written in C++
will compile successfully. Accordingly, the "struct namespace" is
renamed to "struct symbol_namespace", the struct symbol_table_data's
"namespace" member is renamed to "kernel_namespace", and the struct
gnu_request's "typename" member is renamed to "type_name".
(anderson(a)redhat.com)
- Fix for the date displayed by the initial system banner and by the
"sys" command for Linux version 3.6 and later. Without the patch,
the date displayed will be that of the UNIX epoch, i.e., midnight,
Jan 1, 1970 UTC, adjusted to local time.
(anderson(a)redhat.com)
- When the eppic.so extension module is built by "make extensions", the
EPPIC source tree is downloaded from its upstream source repository
at https://code.google.com/p/eppic. However, if an EPPIC_GIT_URL
environment variable is defined, then the URL that it points to will
be used as an alternative git source repository.
(per.fransson.ml(a)gmail.com)
- Fix for a segmentation violation generated by the "struct" command
when printing a structure member using the "struct_name.member"
argument format, where the member is a "char *" that points to a
string that contains a "%" character.
(bob.montgomery(a)hp.com, adk(a)acunu.com)
- Patchset to support the most recent Xen hypervisor and Xen pvops
kernels:
(1) Always calculate max_cpus value
(2) Read only crash notes for onlined CPUs
(3) Read variables from dynamically allocated per_cpu data
(4) Get idle data from alternative source
(5) Read data correctly from dynamically allocated console ring
(6) Add support for 3 level P2M tree
(daniel.kiper(a)oracle.com)
- Fix for building a 32-bit eppic.so extension module after having
built crash with "make target=ARM" or "make target=X86" on an x86_64
host. Without the patch, the eppic.so extension module would be
built as a 64-bit binary.
(per.fransson.ml(a)gmail.com, anderson(a)redhat.com)
- For the ARM architecture, fix the determination of the kernel modules
base address when modules are not installed, and update the "mach"
command to display the "KERNEL MODULES BASE" address.
(mika.westerberg(a)iki.fi, anderson(a)redhat.com)
- Fix for the "kmem -[sS]" commands for Linux version 3.6 and later
kernels configured with CONFIG_SLUB. Without the patch, the commands
fail with the error message "kmem: invalid structure member offset:
kmem_cache_objsize".
(anderson(a)redhat.com)
- Fix for an invocation failure when running against Linux version 3.6
and later kernels that are configured with CONFIG_SLAB. Without the
patch, the crash session fails during initialization with the error
message "crash: invalid structure member offset: kmem_cache_s_next".
(anderson(a)redhat.com)
- Fix for the "kmem -[sS]" commands on kernels that are configured with
CONFIG_SLUB to prevent a silent hang if a per-node slab cache partial
list recurses back onto itself. Without the patch, it was necessary
to kill the command; with the patch an error message is displayed and
the command continues on to the next kmem slab cache.
(anderson(a)redhat.com)
- Fix for the "kmem -[sS]" and "kmem -s list" options on dumpfiles from
kernels that are configured with CONFIG_SLUB which have been filtered
by the makedumpfile facility. Without the patch, it is possible that
those commands may generate the error message "kmem: page excluded:
kernel virtual address: <address> type: kmem_cache buffer", and
would require either the "--zero_excluded" command line option or
having to execute "set zero_excluded on" during runtime in order to
complete successfully.
(anderson(a)redhat.com)
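The "ptov" entry in the changelog above describes verifying that a calculated virtual address translates back into the supplied physical address. A minimal sketch of that round-trip idea follows. It assumes a 32-bit unity mapping at a 3G/1G split and ignores the real low-memory limit. The constant and function names are hypothetical and are not crash's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_OFFSET 0xc0000000u   /* assumed start of the unity map */

/* Translate a unity-mapped virtual address back to a physical address.
 * Returns 1 on success, 0 if vaddr lies below the unity-mapped region. */
static int sketch_vtop(uint32_t vaddr, uint64_t *paddr)
{
    if (vaddr < SKETCH_PAGE_OFFSET)
        return 0;
    *paddr = (uint64_t)(vaddr - SKETCH_PAGE_OFFSET);
    return 1;
}

/* Convert physical to virtual, rejecting results that do not round-trip.
 * Returns 1 and stores the address on success, 0 otherwise. */
static int sketch_ptov(uint64_t paddr, uint32_t *vaddr)
{
    /* On a 32-bit target the sum is truncated, so it may wrap around. */
    uint32_t candidate = (uint32_t)(paddr + SKETCH_PAGE_OFFSET);
    uint64_t check;

    assert(vaddr != NULL);
    if (!sketch_vtop(candidate, &check) || check != paddr)
        return 0;
    *vaddr = candidate;
    return 1;
}
```

A low physical address such as 0x1000 maps to 0xc0001000 and passes the check, while a physical address that wraps after truncation fails the round trip and is rejected instead of being reported as a valid virtual address.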
12 years, 4 months
Crash on Linux 3.6 rc1
by Mark Tinguely
I trip over this bug on Linux 3.6 rc1. Crash runs fine on Linux 3.5.
Thanks,
--Mark Tinguely.
------
~/xfs # crash System.map vmlinux
crash 6.0.8
Copyright (C) 2002-2012 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.3.1
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
crash: invalid structure member offset: kmem_cache_s_next
FILE: memory.c LINE: 7945 FUNCTION: kmem_cache_init()
[/usr/bin/crash] error trace: 468317 => 49dbb2 => 487f28 => 5083da
5083da: OFFSET_verify+202
487f28: kmem_cache_init+312
49dbb2: vm_init+5794
468317: main_loop+215
~/xfs # cat /proc/version
Linux version 3.6.0-rc1 (root@cxfsxe12) (gcc version 4.3.4
[gcc-4_3-branch revision 152973] (SUSE Linux) ) #1 SMP Fri Aug 10
17:03:36 CDT 2012
12 years, 4 months
Re: [Crash-utility] Question about ARM module address range
by anderson@prospeed.net
> Hi Dave
>
> I have taken a short look at modules_vaddr and modules_end and I have both
> seen relevant data:
>
> crash> help -m
> modules_vaddr: bf000000
> modules_end: bfffffff
>
> and data similar to what you see. What I have also seen is that when
> modules are loaded, modules_vaddr and modules_end seem correct, and
> when no modules have been loaded, strange values are presented. I have
> looked at too few examples to be certain that this is "always" true.
>
> I assume (not checked in source code) that no vmalloc area is allocated
> for modules if no modules are loaded. Then the function
> first_vmalloc_address() will return data which is stored in modules_vaddr
> but has nothing to do with this.
>
> So I think that the question is what values should modules_vaddr and
> modules_end have if no modules are loaded. Does it matter, except that it
> might be confusing for a user? Looking at arm.c where modules_vaddr and
> modules_end are used, I think the code will behave correctly (by luck?!),
> also in the case of no modules.
>
> Jan
>
> Jan Karlsson
> Senior Software Engineer
> MIB
I don't have access to my sample ARM vmcores (on vacation), but none
of them have any modules loaded. So in those 3 cases, the vmalloc range
starts at either d0807000 or c6024000, and so the hardwired modules_end
is confusing. But it appears from your description that modules are put
in their own virtual address region from bf000000 to bfffffff, whereas
other vmalloc() calls generate virtual addresses above c0000000?
(as shown by kmem -v). In that case, you're right, the code would
work as is. Anyway, it did confuse me a bit -- perhaps arm_cmd_mach()
should show different "KERNEL VMALLOC BASE" and "KERNEL MODULES BASE"
addresses, i.e., similar to x86_64?
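As a rough illustration of the distinction being discussed, the sketch below tests whether an address falls in the separate modules region. The bounds are taken from the "help -m" output quoted above. The macro and function names are hypothetical and this is not the code in arm.c.

```c
#include <assert.h>
#include <stdint.h>

/* Bounds as shown by "help -m" in the quoted message. */
#define SKETCH_MODULES_VADDR 0xbf000000u
#define SKETCH_MODULES_END   0xbfffffffu

/* Returns 1 if vaddr lies in the dedicated modules region, else 0. */
static int sketch_is_module_vaddr(uint32_t vaddr)
{
    /* Sanity check on the constants themselves. */
    assert(SKETCH_MODULES_VADDR <= SKETCH_MODULES_END);
    return vaddr >= SKETCH_MODULES_VADDR && vaddr <= SKETCH_MODULES_END;
}
```

With fixed bounds like these, a vmalloc start address such as d0807000 or c6024000 is simply outside the range, whether or not any modules are loaded.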
Dave
12 years, 4 months
using crash for ARM
by paawan oza
Hi,
I would like to use the crash utility on ARM.
As I understand it, there are two ways to go about it:
1) Cross-compile all of crash for ARM itself, which doesn't seem to be a good option because the ARM target would need a lot of dependent packages.
2) Run crash on x86, with gdbserver or a remote server compiled for the target, a serial connection, and so on.
Please suggest instructions or any pointers on how to do this.
Regards,
Oza.
12 years, 4 months
[PATCH v2 0/6] crash: Bundle of fixes for Xen
by Daniel Kiper
Hi,
It looks like Xen support in crash has not been maintained
since 2009. I am trying to fix this. Here is a bundle of fixes:
- xen: Always calculate max_cpus value,
- xen: Read only crash notes for onlined CPUs,
- x86/xen: Read variables from dynamically allocated per_cpu data,
- xen: Get idle data from alternative source,
- xen: Read data correctly from dynamically allocated console ring, too
(fixed in this release),
- xen: Add support for 3 level P2M tree
(new patch in this release).
Daniel
12 years, 4 months