Magnus Damm wrote:
The idea is that the crash_notes contents in the Xen hypervisor space
contain registers indexed by physical cpu number.
It is possible to locate the crashing physical cpu by looking up a
global variable in hypervisor symbol, and from there it should be
possible to backtrack and find the domain pseudo-phys to virt mapping
table. I say "should" because it is probably pretty hairy.
Actually, given that the crash utility is only interested in the
specifics of the dom0 kernel, it has no interest in physical
cpus. If you're specifically interested in debugging a crash
that occurred while operating in the xen binary, you're going
to want to use gdb on the vmcore file with xen-syms-xxx
namelist file. You can still run crash on the same vmcore to
find out what was going on in the dom0 kernel, but there's
no awareness of the xen hypervisor underpinnings; you'll
just get the state of the dom0 kernel at the time of the crash.
But I would guess-timate that the majority of the crashes are
going to have occurred in the dom0 kernel, and not while
running in the hypervisor.
Now, given that the crash_notes context contains registers
that are indexed by the physical cpu number, well, that's not
helpful to crash's needs with respect to dom0. That's why you
guys must have created the additional NT_XEN_DOM0_CR3
ELF note.
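
For illustration, here is a minimal sketch of how a consumer could pull such a value out of a PT_NOTE segment. It assumes the standard ELF note layout (namesz, descsz, type, then name and descriptor each padded to 4 bytes) and little-endian encoding. The numeric note type, the "Xen" owner name, and the 8-byte descriptor are hypothetical placeholders, not the values used by the actual kexec/kdump patches.

```python
import struct

# Hypothetical value for illustration only; the real NT_XEN_DOM0_CR3
# note type is whatever the kexec/kdump code defines.
NT_XEN_DOM0_CR3_EXAMPLE = 0x10000001


def align4(n):
    """Round n up to the next multiple of 4 (ELF note padding)."""
    return (n + 3) & ~3


def make_note(name, ntype, desc):
    """Build one ELF note: header, NUL-terminated name, descriptor."""
    name_z = name + b"\0"
    return (struct.pack("<III", len(name_z), len(desc), ntype)
            + name_z.ljust(align4(len(name_z)), b"\0")
            + desc.ljust(align4(len(desc)), b"\0"))


def iter_notes(buf):
    """Yield (name, type, desc) for each note in a PT_NOTE segment."""
    off = 0
    while off + 12 <= len(buf):
        namesz, descsz, ntype = struct.unpack_from("<III", buf, off)
        off += 12
        name = buf[off:off + namesz].rstrip(b"\0")
        off += align4(namesz)
        desc = buf[off:off + descsz]
        off += align4(descsz)
        yield name, ntype, desc


def find_dom0_cr3(buf):
    """Return the dom0 cr3 stored in the (assumed) Xen note, or None."""
    for name, ntype, desc in iter_notes(buf):
        # The owner name "Xen" is an assumption made for this sketch.
        if name == b"Xen" and ntype == NT_XEN_DOM0_CR3_EXAMPLE and len(desc) >= 8:
            return struct.unpack("<Q", desc[:8])[0]
    return None


# Synthetic segment: an unrelated note followed by the example cr3 note.
segment = (make_note(b"CORE", 1, b"\0" * 16)
           + make_note(b"Xen", NT_XEN_DOM0_CR3_EXAMPLE,
                       struct.pack("<Q", 0x1234000)))
```

Calling `find_dom0_cr3(segment)` on the synthetic segment returns the cr3 value that was packed into it; a segment without the note yields `None`, which is the case where the consumer has no starting point at all.
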
I guess I understand why you feel it's a burden to continue
the maintenance of such a thing, but given that the panic can
occur either while operating in the dom0 kernel or while in
the xen hypervisor code, it makes perfect sense (to me) to
make a minimal effort by including an indication of the dom0
cr3 or dom0 pfn_to_mfn_frame_list_list value in the vmcore.
Perhaps you consider it a case of the tail wagging the dog,
but to me, it would be more a case of accommodating the needs
of the consumer... ;-)
Our internal interfaces are not particularly clean at the moment. We
have code that keeps the crash_notes in the hypervisor, but passes the
physical addresses (or machine addresses in xen lingo) for the notes all
the way down to kexec-tools in dom0 user space. These addresses are then
used to create the ELF headers. dom0 only knows about VCPU:s, but
because we are creating a system-wide crash dump we want to use physical
cpus. So down in user space we then need to create a mapping between
physical cpu:s and VCPU:s. And can we be sure that dom0 has all cpus
available as VCPU:s?
I don't care about that. All I need is a starting point for translating dom0
kernel virtual addresses. And that is either a dom0 cr3 value or the
domain's pfn_to_mfn_frame_list_list value.
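
To show why a single pfn_to_mfn_frame_list_list value is enough of a starting point, here is a rough sketch of the pseudo-physical to machine lookup it enables. It assumes the usual three-level layout: the frame-list-list page holds the MFNs of frame-list pages, each frame-list page holds the MFNs of p2m pages, and each p2m page holds the MFNs for a run of consecutive PFNs. The dict standing in for machine memory, the function name, and the tiny page size are all made up for the example.

```python
# Toy value so the example stays readable. With 4 KiB pages and 8-byte
# entries (x86_64) this would be 512; that figure is an assumption here.
ENTRIES_PER_PAGE = 4


def pfn_to_mfn(machine_pages, frame_list_list_mfn, pfn):
    """Translate a dom0 pseudo-physical frame number to a machine frame.

    machine_pages is a hypothetical stand-in for reading pages out of
    the vmcore: it maps an MFN to the list of entries in that page.
    """
    per_p2m_page = ENTRIES_PER_PAGE
    per_frame_list = ENTRIES_PER_PAGE * ENTRIES_PER_PAGE

    # Level 1: the frame-list-list page selects a frame-list page.
    top = machine_pages[frame_list_list_mfn]
    frame_list = machine_pages[top[pfn // per_frame_list]]

    # Level 2: the frame-list page selects a p2m page.
    p2m_page = machine_pages[frame_list[(pfn % per_frame_list) // per_p2m_page]]

    # Level 3: the p2m page holds the MFN for this PFN.
    return p2m_page[pfn % per_p2m_page]


# Synthetic "machine memory" with one frame-list page and two p2m pages.
machine_pages = {
    100: [200, 0, 0, 0],              # frame-list-list page
    200: [300, 301, 0, 0],            # frame-list page
    300: [7000, 7001, 7002, 7003],    # p2m entries for PFNs 0..3
    301: [7100, 7101, 7102, 7103],    # p2m entries for PFNs 4..7
}
```

With that table, `pfn_to_mfn(machine_pages, 100, 5)` walks 100 -> 200 -> 301 and picks the second entry of the last page. Once PFNs can be turned into MFNs like this, the dom0 kernel's page tables can be followed in the vmcore in the ordinary way.
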
> But again, there's no easy way for the crash utility to dig
> them out of a completely foreign binary.
No, but that's because your tool is missing knowledge about the binary,
right? =) Is there any easy way out... No! =) Or maybe there is?
No!
I hope we can find a good balance between your code and ours. Maybe a
relatively fair balance could be that we provide per-physical cpu
pointers to some virtual to physical mapping tables which should be easy
to parse for your tool, but in return your tool doesn't depend on
finding register information using the note program headers in the ELF
header...
Now we're getting complex -- I'm pretty sure I don't know what
you're talking about here... Or how it can possibly lead to a dom0
cr3 or pfn_to_mfn_frame_list_list value?
> That's good, isn't it? If I've understood things right it's possible to
> locate the data you need using the domain list symbol?
Yeah...
To clarify, it's possible for *you*, i.e., the kexec/kdump code, to locate
the data that way. The crash utility, using the vmlinux/vmcore file
pair, doesn't know anything about what the "domain_list" is, or about
the structures that it uses or links to. And even if it did, it wouldn't
know how to find it in the vmcore file.
> Yeah, I agree that navigating around those structures seems rather
> painful. But OTOH, if you want to know things that only the internals
> can tell you, you need to be able to parse them, right? But maybe you
> only want to cover the "simple" dom0 case. (Simple yeah right)
That's right -- crash is *only* interested in the dom0 case; again
it's clueless about the hypervisor, and rightly so.
It's just such a unique case. It's like trying to debug "ls" using
a "cat" binary, where the core file is usable for debugging either
one.
Thanks,
Dave