mod -S and debuginfo kernel rpm
by Michael Holzheu
Hi,
The kernel debuginfo rpms (e.g. kernel-debuginfo-xxx.s390x.rpm) provide
kernel modules that contain debug sections in order to access the data
structures with tools like crash. The kernel modules are named
"<module_name>.ko.debug".
The mod -S command in crash does not recognize the ".debug" files. Is
that intentional? What is the recommended way to load all the debug
information?
Michael
15 years, 7 months
cpu online map changes in 2.6.29
by Michael Holzheu
Hi all,
Currently crash does not work with 2.6.29 kernels, because of changes in
the cpu_online_map code:
crash 4.0-8.9
Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009 Red Hat,
Inc.
...
crash: cannot resolve "cpu_online_map"
Is anybody working on that issue?
Michael
15 years, 7 months
Re: makedumpfile question / request
by Dave Anderson
----- "Ken'ichi Ohmichi" <oomichi(a)mxs.nes.nec.co.jp> wrote:
> Hi Dave,
>
> Dave Anderson wrote:
> > Is there a reason that makedumpfile does not fill in the utsname structure
> > in the compressed dumpfile's header?
>
> Thank you for the good point.
>
> makedumpfile does not fill it because makedumpfile might not be able to
> get kernel debug information (containing symbol system_utsname/init_uts_ns).
> makedumpfile does not need kernel debug information if dump_level is 0 or 1,
> and it does not read a new_utsname structure. (check_release() is not called.)
>
>
> > The data structure does get read into a local new_utsname structure in the
> > check_release() function, but it doesn't get saved and copied into the
> > disk_dump_header in write_kdump_header().
> >
> > It would be helpful if that were in place as a quick ID for what the
> > compressed dumpfile contains.
>
> I feel that is worthwhile.
> How about saving new_utsname data into disk_dump_header only if dump_level
> is 2 or bigger ?
Well, that's certainly preferable to the way it is now.
But let me ask you this...
Given that the init_uts_ns structure is always located in:
(1) unity-mapped memory, or in a mapped kernel region for x86_64/ia64, and
(2) that your initial() function calls get_phys_base() in all cases,
can't you just strip the relevant unity-mapping from the supplied
VMCOREINFO/init_uts_ns symbol value, apply the phys_base, and then
read it from the vmcore file?
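
A minimal sketch of the translation suggested above, assuming an x86_64 kernel. The function name sketch_vtop() and the page_offset parameter are hypothetical and are not taken from makedumpfile or crash. The direct-mapping base differs between kernel versions, so it is passed in by the caller rather than hard-coded.

```c
#include <stdint.h>

/* Base of the x86_64 kernel text/data mapping (__START_KERNEL_map). */
#define SKETCH_START_KERNEL_MAP 0xffffffff80000000ULL

/*
 * Hypothetical helper: turn a kernel virtual address into a physical
 * address without walking page tables.
 *
 *   vaddr       - symbol value, e.g. the init_uts_ns address from VMCOREINFO
 *   page_offset - base of the unity (direct) mapping; kernel-version
 *                 dependent, so supplied by the caller (assumption)
 *   phys_base   - physical load offset, as returned by get_phys_base()
 */
static uint64_t sketch_vtop(uint64_t vaddr, uint64_t page_offset,
                            uint64_t phys_base)
{
    /* Addresses in the mapped kernel region are relocated by phys_base. */
    if (vaddr >= SKETCH_START_KERNEL_MAP)
        return vaddr - SKETCH_START_KERNEL_MAP + phys_base;

    /* Everything else is treated as unity-mapped memory. */
    return vaddr - page_offset;
}
```

The resulting physical address would still have to be mapped to a file offset through the vmcore's PT_LOAD segments before the new_utsname structure could be read.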
Dave
15 years, 7 months
makedumpfile question / request
by Dave Anderson
Is there a reason that makedumpfile does not fill in the utsname structure
in the compressed dumpfile's header?
The data structure does get read into a local new_utsname structure in the
check_release() function, but it doesn't get saved and copied into the
disk_dump_header in write_kdump_header().
It would be helpful if that were in place as a quick ID for what the
compressed dumpfile contains.
Dave
15 years, 7 months
crash version 4.0-8.9 is available
by Dave Anderson
- Tentatively scheduled as the baseline version for the RHEL5.4 crash
utility errata release.
- Implemented a new "bt -g" option, which will display the backtraces
of all threads in the targeted task's thread group. The thread
group leader's backtrace will be displayed first, regardless of
which task was the target of the "bt" command.
(anderson(a)redhat.com)
- Implement support for the kdump "split-dumpfile" format, which can
split /proc/vmcore into multiple dumpfiles as specified by the
"makedumpfile --split" command option. It simply requires that all
of the split dumpfile names be entered on the crash command line.
(tindoh(a)redhat.com)
- Fix for "kmem -i", "kmem -n" and "kmem -p" on x86_64 CONFIG_SPARSEMEM
and CONFIG_SPARSEMEM_EXTREME kernels that have MAX_PHYSMEM_BITS
increased from 40 to 44. Without the patch, erroneous page-related
data could be displayed depending upon the amount of physical memory
contained by the target system.
(anderson(a)redhat.com)
- For the architectures that support it, the "--machdep option=value"
command line option has been modified to allow more than one machine-
dependent argument. (anderson(a)redhat.com)
- The starting backtrace locations of active, non-crashing, xen dom0
tasks are not available in kdump dumpfiles, nor is there anything
that can be searched for in their respective stacks. Therefore, for
those tasks, the "bt" command will indicate: "bt: starting
backtrace locations of the active (non-crashing) xen tasks cannot be
determined: try -t or -T options". Without the patch, the backtrace
would either be empty, or it would show an invalid backtrace starting
at the last location where schedule() had been called.
(anderson(a)redhat.com)
- Fix for potentially empty "bt -t" output, and for "bt -T" potentially
dumping the text return addresses in the hard or soft IRQ stacks
instead of the process stack. This could occur if the targeted task
was the last task that used the hard or soft IRQ stack (x86 only).
(anderson(a)redhat.com)
Download from: http://people.redhat.com/anderson
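
As a worked example for the MAX_PHYSMEM_BITS item above, the following sketch shows how that constant feeds into the number of SPARSEMEM sections on x86_64. It assumes SECTION_SIZE_BITS is 27 (128MB sections), and all names are hypothetical. It does not reproduce the actual crash fix; it only illustrates why a tool that still assumes 40 bits computes a different section count than a kernel built with 44.

```c
#include <stdint.h>

/* x86_64 SPARSEMEM section size: 2^27 bytes = 128MB (assumed here). */
#define SKETCH_SECTION_SIZE_BITS 27

/* Number of memory sections implied by a given MAX_PHYSMEM_BITS. */
static uint64_t sketch_nr_mem_sections(unsigned max_physmem_bits)
{
    return 1ULL << (max_physmem_bits - SKETCH_SECTION_SIZE_BITS);
}

/* Section number that contains a physical address. */
static uint64_t sketch_paddr_to_section(uint64_t paddr)
{
    return paddr >> SKETCH_SECTION_SIZE_BITS;
}

/* Nonzero if paddr falls inside the section range for max_physmem_bits. */
static int sketch_section_in_range(uint64_t paddr, unsigned max_physmem_bits)
{
    return sketch_paddr_to_section(paddr) <
           sketch_nr_mem_sections(max_physmem_bits);
}
```

With 40 bits the count is 8192 sections, and with 44 bits it is 131072, so a physical address at the 1TB mark is out of range under the old value but in range under the new one.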
15 years, 7 months
Re: [RFC]: Feature to display local arguments and variables
by Dave Anderson
----- "Sharyathi Nagesh" <sharyath(a)in.ibm.com> wrote:
> >> How to Build/Requirements
> >> 1. Enable compiling extends option in main Makefile
> >
> > You mean to just enter "make extensions", right?
> Yes. We found that the Makefile under extensions is not called because
> of this. We faced this issue with crash 4.0.7.7; we didn't try with the
> latest crash, though.
It's not a new concept. If the extensions/Makefile sees a "local.c" file,
it will then look for a "local.mk" file and invoke it. I applied your
patch, and "make extensions" from the top-level directory works as expected
(although the compilation fails as expected):
# make extensions
gcc -nostartfiles -shared -rdynamic -o local.so local.c -fPIC -ldw -L ../../elfutils-0.137/libdw -I ../../elfutils-0.137/libdw -I ../../elfutils-0.137/libelf/ -DX86_64 -Wall;
local.c:8:23: error: libdw.h: No such file or directory
local.c:9:19: error: dwarf.h: No such file or directory
local.c:11:21: error: netdump.h: No such file or directory
local.c:33: error: expected ‘)’ before ‘*’ token
...
> >
> >> 2. create a symbolic link to netdump.h under extensions/
> >
> > You should be able to do that in your local.mk file.
> OK, sure, we will take care of that. Please let us know of any reference
> *.mk file that is used. It will help, as we will need to take care of
> library dependencies in this file.
The extensions/sial.mk file is the only example in the upstream tree.
Actually there are two alternatives:
(1) as you are doing now, put your local.c and local.mk and whatever other
files you need into an existing extensions subdirectory, or
(2) build a standalone package that only requires the crash-devel
subpackage.
Alternative (2) allows you to create a standalone package that
uses the crash-devel subpackage, which only consists of the "defs.h"
file. Doing it that way you don't need the complete crash source tree.
Your package is unique in that it currently needs "netdump.h"
for the vmcore_data structure. I am not averse to moving that
structure declaration into "defs.h" if you want to go that route.
Check out the IBM crash-spu-commands-1.1-1.src.rpm package as an example
of alternative (2):
http://riksun.riken.go.jp/pub/pub/Linux/cern/updates/slc5X/SRPMS/crash-sp...
In any case, that's easy enough to deal with in the future.
I'd just continue working in the extensions subdirectory for now.
> >> 3. Context switching via -r command still needs to be implemented
> >
> > BTW, the "placeholder" code in cmd_local() for -p and -r would be
> > disastrous because they modify a context structure with a totally
> > different pid. What you'll want to do is update local->tc with a
> > pointer to the context of the entered pid argument. (There are helper
> > functions to do that.)
> Yes, you are right; the initialize_local() function needs to handle it.
> For the -p option we need:
> 1. A function to provide the task context for the pid passed
That would be pid_to_context().
But more on that below...
> 2. The current register context for the process. For an active process we
> could get it from the ELF notes; if there is a way to access the register
> set for these processes, it would be helpful.
The only register sets are found in the per-cpu NT_PRSTATUS notes in
the ELF header. They don't exist for the other non-active tasks, and
in fact, the older netdump format only contains a single ELF NT_PRSTATUS
for the crashing cpu task.
Furthermore, if the kdump vmcore is compressed with "makedumpfile -c",
then the resultant dumpfile's unique non-ELF header doesn't even contain
those register sets.
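
To make the note layout concrete, here is a minimal sketch that walks a PT_NOTE segment image and counts the "CORE"/NT_PRSTATUS entries. The struct mirrors the Elf64_Nhdr layout from <elf.h> but is declared locally so the example stands alone. The function names are hypothetical and are not part of crash.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NT_PRSTATUS 1   /* same value as NT_PRSTATUS in <elf.h> */

/* Local mirror of the ELF note header (Elf64_Nhdr). */
struct sketch_nhdr {
    uint32_t n_namesz;   /* length of the name, including the NUL */
    uint32_t n_descsz;   /* length of the descriptor payload */
    uint32_t n_type;     /* note type */
};

/* Note names and descriptors are padded to 4-byte boundaries. */
static size_t sketch_align4(size_t n)
{
    return (n + 3) & ~(size_t)3;
}

/* Count the "CORE" NT_PRSTATUS notes in a buffer holding a PT_NOTE segment. */
static int sketch_count_prstatus(const unsigned char *buf, size_t len)
{
    size_t off = 0;
    int count = 0;

    while (off + sizeof(struct sketch_nhdr) <= len) {
        struct sketch_nhdr nh;
        const unsigned char *name = buf + off + sizeof(nh);
        size_t next;

        memcpy(&nh, buf + off, sizeof(nh));
        next = off + sizeof(nh) + sketch_align4(nh.n_namesz) +
               sketch_align4(nh.n_descsz);
        if (next > len)
            break;               /* truncated note: stop walking */

        if (nh.n_type == SKETCH_NT_PRSTATUS && nh.n_namesz == 5 &&
            memcmp(name, "CORE", 5) == 0)
            count++;

        off = next;
    }
    return count;
}
```

On a kdump ELF vmcore the count would be expected to match the number of cpus that saved their registers, while an older netdump would yield one.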
> -c option allows user to shift context. If user changes active CPU via
> set -c <new CPU>, local -c will allow user to move current local context
> to that particular cpu context.
>
> Please Let us know your thoughts on this implementation
I don't know why that would be necessary. When your cmd_local() function
is called without "-p <pid>", it should just default to the current context
which would have been set with the prior "set -c <cpu>", "set <task>" or
"set <pid>". For that matter, why do you even need "-p <pid>"? I'm not
clear on why you wouldn't be using the current context all of the time?
It appears that your mechanism will only work if the target task is
an active task whose register set is contained in the ELF header.
So when your command is called, it seems like it would simply need to
first determine "can I handle this context?", and if not, just bail out.
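
The "can I handle this context?" test could be sketched as below. The types and names are hypothetical stand-ins, not crash's own task_context or dumpfile flags; the format rules encode only what is stated in this thread (per-cpu register notes in ELF kdumps, a single note for the crashing task in older netdumps, none in compressed kdumps).

```c
#include <stddef.h>

/* Hypothetical dumpfile categories, following the cases in this thread. */
enum sketch_dump_format {
    SKETCH_KDUMP_ELF,         /* per-cpu NT_PRSTATUS notes */
    SKETCH_NETDUMP_ELF,       /* one NT_PRSTATUS, crashing task only */
    SKETCH_KDUMP_COMPRESSED   /* no register sets in the header */
};

/* Hypothetical, reduced view of a task context. */
struct sketch_task {
    unsigned long pid;
    int is_active;            /* was running on a cpu at crash time */
    int is_crashing_task;     /* the task that triggered the dump */
};

/* Return nonzero if a register set should exist for this task. */
static int sketch_can_handle(enum sketch_dump_format fmt,
                             const struct sketch_task *task)
{
    if (task == NULL || !task->is_active)
        return 0;             /* non-active tasks have no saved registers */

    switch (fmt) {
    case SKETCH_KDUMP_ELF:
        return 1;
    case SKETCH_NETDUMP_ELF:
        return task->is_crashing_task;
    case SKETCH_KDUMP_COMPRESSED:
        return 0;
    }
    return 0;
}
```

A command built this way would call the check once on the current context and bail out with a message when it returns zero, with no need for a separate pid option.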
Dave
15 years, 7 months
Re: [RFC]: Feature to display local arguments and variables
by Dave Anderson
----- "Sharyathi Nagesh" <sharyath(a)in.ibm.com> wrote:
> Hi
> We have implemented this piece of code to give the crash tool the
> capability to display local variables and arguments. We would kindly
> request you to provide your feedback and guidance on this code so that
> we can take it further.
When I get a chance I'll reserve a ppc64 machine to give this a go.
But for now, I've just got a few questions and suggestions.
>
> How to Build/Requirements
> 1. Enable compiling extends option in main Makefile
You mean to just enter "make extensions", right?
> 2. create a symbolic link to netdump.h under extensions/
You should be able to do that in your local.mk file.
> 3. This feature makes use of libelf and libdw libraries provided by elfutils package.
> 4. This feature is implemented as an extension; the local.so library needs
> to be loaded once the crash prompt is available
>
> Features
> 1. Currently it can display the arguments/locals of the top most
> stack frame. Example usage: local params or local locals
What function would be considered the "top most" stack frame, say if
panic() were called, i.e., when crash_kexec() gets called with a NULL
"regs" argument? It appears that it would be crash_kexec() itself.
On the other hand, most dumps are generated from die(), but the regs
have been passed through several functions, so it's not clear to me
what you would see.
For those of us who are ppc64-machine-challenged, a few examples would
have been helpful...
> 2. Currently code can analyze only ppc64 dumps
> 3. It displays the current address of the local variables (and
> sometimes direct values, when the variables are stored in a register and
> not in the stack frame)
> 4. In case of code optimization, variable information is not available;
> in that case it prints "Failed to fetch information" (this is in
> accordance with gdb output)
>
> TBD
> 1. Stack unwinding code still needs to be implemented
> 2. Support for x86 dumps is still not provided
BTW, have you explicitly left out x86_64?
> 3. Context switching via -r command still needs to be implemented
BTW, the "placeholder" code in cmd_local() for -p and -r would be
disastrous because they modify a context structure with a totally
different pid. What you'll want to do is update local->tc with a
pointer to the context of the entered pid argument. (There are helper
functions to do that.)
>
> Note: We were planning to display the variable values instead of
> addresses as we are doing currently, but it adds up to additional
> challenges in terms of typecasting the variables, in case it is array or
> structure, so for the time being we are printing only address and expect
> user to dump contents using rd command. Suggestions here will be very helpful.
Presumably the user will know what the local data types are, and can
simply display them using the "struct <datatype> <address>" command.
Using rd is fine, but may be unnecessarily primitive. But anyway,
are you saying that the local variable data types cannot be determined
from some dwarf interface?
>
> Attaching the code...but the Code is not extensively tested
> please let us know your thoughts
Another minor nit -- I'd prefer that a new function be added to
netdump.c similar to get_kdump_vmcore_data() that you can use
for "real" netdump dumpfiles -- if you are actually
planning to support netdump dumpfiles. Just create a new
get_netdump_vmcore_data() instead of making the structure global.
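
The accessor pattern being asked for might look like the sketch below. The structure contents and the "sketch_" names are hypothetical; the real vmcore_data declaration lives in netdump.h and is not reproduced here.

```c
/* Hypothetical stand-in for the vmcore_data structure in netdump.h. */
struct sketch_vmcore_data {
    unsigned long flags;
    int num_pt_load_segments;
};

/* The instance stays private to this file instead of becoming a global. */
static struct sketch_vmcore_data sketch_netdump_data;

/* Other files reach the data only through this accessor. */
struct sketch_vmcore_data *sketch_get_netdump_vmcore_data(void)
{
    return &sketch_netdump_data;
}
```

Keeping the instance static means an extension depends on one function prototype rather than on the symbol itself, which leaves netdump.c free to change how the data is stored.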
Anyway, it looks promising on paper!
Thanks,
Dave
15 years, 7 months
Ramin SHARIATIAN is out of the office.
by ramin.shariatian@ineo.com
I will be out of the office from 14/04/2009, returning on 20/04/2009.
I will reply to your message when I return.
15 years, 7 months
[RFC]: Feature to display local arguments and variables
by Sharyathi Nagesh
Hi
We have implemented this piece of code to give the crash tool the
capability to display local variables and arguments. We would kindly
request you to provide your feedback and guidance on this code so that
we can take it further.
How to Build/Requirements
1. Enable compiling extends option in main Makefile
2. create a symbolic link to netdump.h under extensions/
3. This feature makes use of libelf and libdw libraries provided by
elfutils package.
4. This feature is implemented as an extension; the local.so library needs
to be loaded once the crash prompt is available
Features
1. Currently it can display the arguments/locals of the top most
stack frame. Example usage: local params or local locals
2. Currently code can analyze only ppc64 dumps
3. It displays the current address of the local variables (and
sometimes direct values, when the variables are stored in a register and
not in the stack frame)
4. In case of code optimization, variable information is not available;
in that case it prints "Failed to fetch information" (this is in
accordance with gdb output)
TBD
1. Stack unwinding code still needs to be implemented
2. Support for x86 dumps is still not provided
3. Context switching via -r command still needs to be implemented
Note: We were planning to display the variable values instead of
addresses as we are doing currently, but it adds up to additional
challenges in terms of typecasting the variables, in case it is array or
structure, so for the time being we are printing only address and expect
user to dump contents using rd command. Suggestions here will be very
helpful.
Attaching the code...but the Code is not extensively tested
please let us know your thoughts
Thanks
Sharyathi Nagesh
15 years, 7 months
Re: [Crash-utility] what is recommended dump level (makedumpfile)
by Dave Anderson
----- "Tokuhisa Inagaki" <tokuhisa.inagaki(a)ctc-g.co.jp> wrote:
> Hi, all.
>
> What is the recommended dump level when using makedumpfile
> (kdump)? I'm looking for an explanation like this.
>
> ------------ from diskdump document --------------------
> Note that the partial dump feature has some risks. There
> are memory management lists which are scanned for a page's
> memory attribute, so if the list has been corrupted, the
> scanning process may fail. For example, when specifying a
> dump_level from 4-7 or from 12-15, the kernel's free page
> linked lists are scanned; if the list is corrupt, diskdump
> may hang. Furthermore, it is possible that a page type
> that has been skipped may be necessary to fully investigate
> the cause of some issues. Therefore, a memory collection
> level should be selected to suit each situation. The
> recommended level is 19, because it is easiest to determine
> whether a page is zero-filled or if it is a page cache page,
> and because no page lists need to be traversed.
> -------------------------------------------------------
>
> Does anyone know of a document describing these things
> when using makedumpfile?
>
>
> Thanks,
> toku.
I don't know of any document for makedumpfile other than the
/usr/share/doc/kexec-tools-<version>/kexec-kdump-howto.txt
and the output of "makedumpfile -h". Neither of those has
any caveats such as the diskdump document above, but I'm
guessing that the same kind of issues would apply when makedumpfile
walks page lists, the mem_map, etc. But I don't know what the
failure mode would be, given that it's a user application as opposed
to a kernel module.
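
For reference, a sketch of how a makedumpfile dump_level can be decoded as a bitmask. The bit assignments below follow the dump_level table in the makedumpfile man page as best understood here and should be checked against "makedumpfile -h" for the installed version; the macro and function names are hypothetical. The debuginfo rule (levels 0 and 1 need none) is the one stated by the makedumpfile maintainer elsewhere on this list.

```c
/* dump_level bits, per the makedumpfile man page (to be verified). */
#define SKETCH_DL_EXCLUDE_ZERO      0x01  /* zero-filled pages */
#define SKETCH_DL_EXCLUDE_CACHE     0x02  /* cache pages without private */
#define SKETCH_DL_EXCLUDE_CACHE_PRI 0x04  /* cache pages with private */
#define SKETCH_DL_EXCLUDE_USER_DATA 0x08  /* user process data pages */
#define SKETCH_DL_EXCLUDE_FREE      0x10  /* free pages */

/* Valid levels are 0 through 31 (all five bits). */
static int sketch_level_valid(int level)
{
    return level >= 0 && level <= 31;
}

/* Nonzero if the level asks for free pages to be excluded. */
static int sketch_excludes_free_pages(int level)
{
    return (level & SKETCH_DL_EXCLUDE_FREE) != 0;
}

/* Levels 0 and 1 do not need kernel debug information; 2 and up do. */
static int sketch_needs_debuginfo(int level)
{
    return level >= 2;
}
```

If makedumpfile shares the risk described in the diskdump document, the levels of concern would be those that exclude free pages, since those depend on the kernel's free page lists being intact.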
In any case, the makedumpfile maintainers are on this list
and hopefully can shed some light on your query.
Dave
15 years, 7 months