Maneesh Soni wrote:
Hi Dave,
Following is a list of a few proposed improvements to the crash utility,
though for most of the items there are no names associated.
Please let us know whether these look useful. If they seem appropriate,
would it be possible for you to merge them into the crash todo list?
Thanks to Badari Pulavarty, Richard Moore and Vara Prasad for the inputs.
Regards
Maneesh
--------------------------------------------------------------------------------
DESCRIPTION:
clean & correct stack back traces on all platforms, ALL the time.
- x86_64 (currently wrong and needs fixing)
- frame pointers off ? (on x86 we still don't have frame pointers on)
RESOLUTION STATUS: Work-in-progress by Rachita Kothiyal <rachita(a)in.ibm.com>
Certainly a welcome task. I suggest segregating the code in a separate
file (as done with lkcd_x86_trace.c), and the new entry point can simply
be plugged into machdep->back_trace function pointer at init time.
There should also be an "out" to allow it to be set back to use the
current x86_64_low_budget_back_trace_cmd(). Also, if it doesn't
support -fomit-frame-pointer, it's not worth doing.
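
The plug-in arrangement suggested above can be sketched roughly as follows.
This is only an illustration under stated assumptions: the struct and function
names ending in _sketch, and the two handler names, are hypothetical stand-ins
and are not crash's actual definitions. Only the idea (select the entry point
at init time, keep a way back to the existing low-budget routine) comes from
the text.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-command back-trace context. */
struct bt_info_sketch {
    unsigned long stack_ptr;
    const char *handler_used;   /* set by whichever handler runs */
};

typedef void (*back_trace_fn)(struct bt_info_sketch *);

/* Hypothetical stand-in for the machine-dependent dispatch table. */
struct machdep_sketch {
    back_trace_fn back_trace;
};

/* Placeholder for the existing, known-to-work routine. */
static void low_budget_back_trace_sketch(struct bt_info_sketch *bt)
{
    bt->handler_used = "low_budget";
}

/* Placeholder for the new routine, kept in its own file in practice. */
static void new_back_trace_sketch(struct bt_info_sketch *bt)
{
    bt->handler_used = "new";
}

/* Init time: plug in the new entry point only if it is usable. */
void machdep_init_sketch(struct machdep_sketch *m, int new_code_usable)
{
    m->back_trace = new_code_usable ? new_back_trace_sketch
                                    : low_budget_back_trace_sketch;
}

/* The "out": switch back to the current routine at any time. */
void use_low_budget_sketch(struct machdep_sketch *m)
{
    m->back_trace = low_budget_back_trace_sketch;
}
```

Callers only ever go through m->back_trace, so the choice of unwinder stays
in one place and the old routine remains reachable.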
--------------------------------------------------------------------------------
DESCRIPTION:
Code restructuring:
- move as much code for advanced commands to libraries so that
crash is at least able to open the dump image and perform minimal
set of commands like bt, dump dmesg log, disassemble etc. irrespective
of kernel version.
- code is hard to read & understand - need to re-write some of the
basic subsystems like memory mapping, pagetable management etc
RESOLUTION STATUS:
Work-in-progress by Dave Wilder <dwilder(a)us.ibm.com> and
Maneesh Soni <maneesh(a)in.ibm.com>
I don't quite understand how moving code to libraries is going to
achieve the goal here. Things in some of the various *_init() functions
could certainly be streamlined (or skipped) in order to make it more
likely to make it to the first prompt. For example, the task table initialization
could be made to simply fill in the context data for just the panic task.
(But it almost sounds like you just want to use gdb alone for the minimal
set of commands you've listed?)
As far as "re-writes" are concerned, please keep in mind the
necessity of backwards-compatibility. I'd much rather keep the current
code -- that's known to work -- in place, and if you come up with
something new, or re-shuffled, make it only callable when the kernel
is of a known kernel version or later.
The point is, let's not re-invent the wheel just for the purpose of
re-inventing the wheel.
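
One way to read the version-gating request above is sketched below. The
KVER_SKETCH macro, the 2.6.16 threshold and the function names are all
assumptions made up for illustration; they are not taken from the crash
sources. The point is only that the existing path stays the default and the
re-written path is reachable solely on a known kernel version or later.

```c
#include <stddef.h>

/* Hypothetical packing of a kernel version into one comparable integer. */
#define KVER_SKETCH(x, y, z) \
    (((unsigned)(x) << 16) + ((unsigned)(y) << 8) + (unsigned)(z))

/* Assumed cut-over point; a real patch would pick the version it was
 * actually tested against. */
#define NEW_CODE_MIN_VERSION KVER_SKETCH(2, 6, 16)

/* Placeholder for the current, known-to-work code path. */
static const char *legacy_path_sketch(void)
{
    return "legacy";
}

/* Placeholder for the re-written code path. */
static const char *new_path_sketch(void)
{
    return "new";
}

/* Older kernels always get the existing code, untouched. */
const char *run_gated_sketch(unsigned kernel_version)
{
    if (kernel_version >= NEW_CODE_MIN_VERSION)
        return new_path_sketch();
    return legacy_path_sketch();
}
```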
--------------------------------------------------------------------------------
DESCRIPTION:
Crash & kernel version independence:
kernel headers & code - reuse ? It would be nice to figure
out a way to include kernel headers and sections of kernel code
to do hard stuff (like memory mapping functions page_to_pfn,
pfn_to_page, pagetable decoding etc..).
RESOLUTION STATUS:
Work-in-progress by Dave Wilder <dwilder(a)us.ibm.com> and
Maneesh Soni <maneesh(a)in.ibm.com>
I don't particularly like this suggestion. (I thought we just went through
a problem where Ubuntu kernels don't even have kernel headers?)
As far as code reuse, we already do that in a number of places, so
I guess that's OK.
And there just never seems to be a "one-size-fits-all" set of
kernel functions/macros that covers all bases over the life of
the kernel and each processor type.
But as always, I'm open to suggestion.
--------------------------------------------------------------------------------
DESCRIPTION:
Mini report:
The goal of this is to produce a summary report of common information
that is used to track problems. The idea here is that for many problems
we probably don't need to get the whole dump shipped, and as you have
probably figured out by now, it is not easy to ship and store these huge
dump files.
RESOLUTION STATUS: TBD
Not a bad idea...
--------------------------------------------------------------------------------
DESCRIPTION:
Automatic verification of the dump:
When you get a dump to look at a problem, there are a few common tasks one
performs; the idea here is to automate those tasks and provide a simple
interface in the tool. Another possibility is automatic verification of
important data structures. For example, if the task list says there are
30 tasks, this feature automatically walks the list and counts the entries
to verify that there are 30. If 30 entries are not found, this may
give a clue of some kind of corruption.
RESOLUTION STATUS: TBD
OK. Often this gets recognized already, or if things are horribly corrupted,
the session won't even come up.
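
The task-count check described above might look something like this. It is a
minimal sketch that assumes the list has already been read into local memory
as a circular singly-linked list; the node type, the result codes and the
max_steps bound are hypothetical. A real implementation would have to read
each pointer out of the dump image and validate it before following it.

```c
#include <stddef.h>

/* Hypothetical in-memory copy of one list entry. */
struct task_node_sketch {
    struct task_node_sketch *next;
};

enum verify_result_sketch {
    VERIFY_OK,              /* walked back to head, count matches      */
    VERIFY_COUNT_MISMATCH,  /* walked back to head, count differs      */
    VERIFY_BROKEN_LIST      /* NULL link, or never returned to head    */
};

/*
 * Walk the circular list starting at head, counting entries, and compare
 * the count with what the kernel's own bookkeeping claims (expected).
 * max_steps bounds the walk so a corrupted list cannot loop forever.
 */
enum verify_result_sketch
verify_task_count_sketch(const struct task_node_sketch *head,
                         size_t expected, size_t max_steps, size_t *found)
{
    const struct task_node_sketch *p = head;
    size_t n = 0;

    if (head == NULL) {
        *found = 0;
        return expected == 0 ? VERIFY_OK : VERIFY_COUNT_MISMATCH;
    }

    do {
        n++;
        p = p->next;
        if (p == NULL || n > max_steps) {
            *found = n;
            return VERIFY_BROKEN_LIST;
        }
    } while (p != head);

    *found = n;
    return n == expected ? VERIFY_OK : VERIFY_COUNT_MISMATCH;
}
```

A mismatch or a broken list is only a hint of corruption, not proof; on a
live system the list can also change while it is being walked.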
--------------------------------------------------------------------------------
DESCRIPTION:
function arguments:
Display arguments in the stack trace. At present, we do not have support
for PPC64 and x86_64. On PPC64, the user can retrieve arguments only for
the top-level frame, from pt_regs. However, the user can dump the complete
stack frame and read the arguments from it. So it is a manual process and
requires some expertise with the stack frame layout.
RESOLUTION STATUS: TBD
Have at it... Given the x86_64 usage of registers for passing args,
good luck.
--------------------------------------------------------------------------------
DESCRIPTION:
local variables:
Facilitate possible display of local variables with stack frames.
Since we are using a debug vmlinux, we can find local variable locations
from Dwarf2.
RESOLUTION STATUS: TBD
Again, I guess it might be nice.
--------------------------------------------------------------------------------
DESCRIPTION:
better assembly & source language, line# display in disassembly
RESOLUTION STATUS: TBD
Talk to gdb -- that's where it all comes from... For any text address,
gdb has the associated line number data. It often looks confusing because
the text comes from a header macro or inline or whatever. I don't know
what you can do about that.
--------------------------------------------------------------------------------
DESCRIPTION:
per-cpu info (like stack traces)
RESOLUTION STATUS: TBD
Needs more of a description...
--------------------------------------------------------------------------------
DESCRIPTION:
User space enhancements
- show user space stack backtrace, if present in the dump file,
- ability to link user space namelist (debug object files),
RESOLUTION STATUS: TBD
I thought crash was a kernel [crash/live-system] analyzer?
You currently can add user-space debug data with "add-symbol-file",
which loads the debug data and symbols into gdb. I have done this
kind of thing, but it's been an "almost-never" kind of situation, where
I've wanted to display a user program's data structure.
But if you want to start throwing in this kind of user-space stuff,
please just keep it segregated.
--------------------------------------------------------------------------------
DESCRIPTION:
Platform specific enhancements
- Establish CPU registers at the time of exceptions in the
current context
- Ability to handle CPU registers from current context using symbols in
expressions
- Ability to format basic processor structures like LDT, GDT, task gates
for x86 arch
Not clear on what "establishing" CPU registers means. We already
dump exception frames.
I guess you mean to be able to use a register notation in certain
commands, as opposed to the address contained in the register?
That's potentially messy, because it puts processor-specific stuff
in processor-neutral code.
As far as the LDT, GDT, task gates formatting, that's fine.
RESOLUTION STATUS: TBD
--------------------------------------------------------------------------------
DESCRIPTION:
cross architecture support for crash
RESOLUTION STATUS: TBD
No way -- we've been through this before. It is essentially a complete re-write.
If you want this, make a new command entirely.
--------------------------------------------------------------------------------
I've made my personal feelings on these kinds of things known before,
which is to take a "minimalist" approach. Every new bell and whistle
is virtually guaranteed to break as the kernel churns. And they all
require an additional support burden. If I had my druthers, crash
would have less rather than more at this point.
But I understand that this has become a community project, and
with the few exceptions above, I'm open to all patch suggestions.
Thanks,
Dave