Date: Mon, 13 Sep 2010 11:59:44 -0400
From: anderson(a)redhat.com
To: crash-utility(a)redhat.com
Subject: Re: [Crash-utility] (pvops 2.6.32.21) crash: cannot read/find cr3 page
----- "tom anderson" <xentoma(a)hotmail.com > wrote:
> if I use a pvops domU kernel version 2.6.32.18 crash works fine. However if I
> use a pvops domU kernel version 2.6.32.21 I get the error messages:
>
> crash: cannot find mfn 874307 (0xd5743) in page index
> crash: cannot read/find cr3 page
>
> Any suggestions as to what is wrong?
Hi Tom,
I can't really give you specific suggestions as to what is wrong,
but I can at least tell you what the crash utility is encountering.
I suppose there's good news and bad news concerning this issue,
the good news being that it worked OK with 2.6.32.18, which is
fairly close to your failing 2.6.32.21. Since I've done very little
with Xen support since Red Hat dropped Xen development beyond our
RHEL5 2.6.18-era release, it's always good to hear that it actually
still worked with a 2.6.32.18 kernel. I imagine something will
eventually break, and at that time I will likely require outside
assistance to keep Xen support in place.
Anyway, that all being said, in your failure case, here are the issues
at hand. The header shows this:
xc_core:
header:
xch_magic: f00febed (XC_CORE_MAGIC)
xch_nr_vcpus: 7
xch_nr_pages: 521792 (0x7f640)
xch_ctxt_offset: 1896 (0x768)
xch_index_offset: 2137305088 (0x7f64b000)
xch_pages_offset: 45056 (0xb000)
elf_class: ELFCLASS64
elf_strtab_offset: 2145653760 (0x7fe41400)
format_version: 0000000000000001
shared_info_offset: 38072 (0x94b8)
The "xch_nr_pages" indicates that the domU vmlinux kernel has 521792
pseudo-physical pages assigned to it, where those pseudo-physical pages
are backed by the Xen hypervisor by machine pages, which are the "real"
physical pages. And so when the crash utility needs to access a
pseudo-physical page used by a domU kernel, that pseudo-physical page
needs to be translated to the actual machine physical page that backs it,
and then that physical page needs to be found in the dumpfile. The PFNs
(page frame numbers) of the pseudo-physical pages are called "pfns" and the
PFNs of the machine pages are called "mfns" or "gmfns".
To match a pfn with its corresponding mfn, the kdump operation dumps an
array of pfn-to-mfn pairs in the vmcore's ".xen_p2m" section. The following
description is taken from
http://www.sfr-fresh.com/unix/misc/xen-4.0.1.tar.gz:a/xen-4.0.1/docs/misc...
".xen_p2m" section
    name            ".xen_p2m"
    type            SHT_PROGBITS
    structure       array of struct xen_dumpcore_p2m
                    struct xen_dumpcore_p2m {
                        uint64_t    pfn;
                        uint64_t    gmfn;
                    };
    description
                    This element represents the frame number of the page
                    in .xen_pages section.
                        pfn:  guest-specific pseudo-physical frame number
                        gmfn: machine physical frame number
                    The size of arrays is stored in xch_nr_pages member of header
                    note descriptor in .note.Xen note section.
                    The entries are stored in pfn-ascending order.
                    This section must exist when the domain is non auto
                    translated physmap mode. Currently x86 paravirtualized domain.
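
As a rough illustration of the layout described above, here is a minimal Python sketch that decodes raw ".xen_p2m" bytes into (pfn, gmfn) pairs. It is not code from the crash utility. The little-endian byte order is an assumption based on the x86 target, and the names P2M_ENTRY and parse_p2m, as well as the sample pfn/gmfn values, are made up for the example.

```python
import struct

# struct xen_dumpcore_p2m: two uint64_t fields, pfn followed by gmfn.
# Little-endian is assumed here because the dump comes from an x86 guest.
P2M_ENTRY = struct.Struct("<QQ")


def parse_p2m(section: bytes):
    """Decode raw .xen_p2m section bytes into a list of (pfn, gmfn) tuples."""
    if len(section) % P2M_ENTRY.size:
        raise ValueError("section length is not a multiple of the entry size")
    return list(P2M_ENTRY.iter_unpack(section))


# Hypothetical sample data: two entries in pfn-ascending order.
sample = P2M_ENTRY.pack(0, 0x1234) + P2M_ENTRY.pack(1, 0x5678)
entries = parse_p2m(sample)
```

Each entry occupies 16 bytes, which is the structure size used in the arithmetic further down.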
The "pfn" value associated with the "gmfn" value is, in turn, used
as an index into the array of actual pages in the dumpfile, which
starts at the "xch_pages_offset" of 45056 (0xb000).
The start of the index array is found in the dumpfile at the "xch_index_offset"
of 2137305088 (0x7f64b000), and it ends at the "elf_strtab_offset" of 2145653760
(0x7fe41400). Accordingly, if you subtract 2137305088 from 2145653760,
the array of xen_dumpcore_p2m structures is 8348672 bytes, which, when
divided by the size of the data structure (16), equals the value of
"xch_nr_pages", or 521792.
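
The same consistency check can be written out as a few lines of Python. This is only a sketch of the arithmetic; the constant names are chosen for the example, and the values are copied from the header dump shown earlier.

```python
# Values copied from the xc_core header dump.
XCH_INDEX_OFFSET = 0x7F64B000   # 2137305088
ELF_STRTAB_OFFSET = 0x7FE41400  # 2145653760
XCH_NR_PAGES = 521792           # 0x7f640
P2M_ENTRY_SIZE = 16             # two uint64_t fields: pfn and gmfn

# The index array spans from xch_index_offset up to elf_strtab_offset.
index_bytes = ELF_STRTAB_OFFSET - XCH_INDEX_OFFSET
entry_count, remainder = divmod(index_bytes, P2M_ENTRY_SIZE)

print(index_bytes, entry_count, remainder)
```

If the header is self-consistent, the entry count matches xch_nr_pages and the remainder is zero.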
Anyway, the very first read attempt requires the crash utility to do a
one-time-only recreation of the kernel's "p2m_top" array (pvops kernels
only). In so doing, it first needs to read the page found in the
hypervisor's cr3 register, which contains a machine address:
<readmem: ffffffff81614800, KVADDR, "kernel_config_data", 32768, (ROE),
2fed090>
addr: ffffffff81614800 paddr: 1614800 cnt: 2048
GETBUF(248 -> 0)
FREEBUF(0)
MEMBER_OFFSET(vcpu_guest_context, ctrlreg): 4984
ctrlreg[0]: 80050033
ctrlreg[1]: d5742000
ctrlreg[2]: 0
ctrlreg[3]: d5743000
ctrlreg[4]: 2660
ctrlreg[5]: 0
ctrlreg[6]: 0
ctrlreg[7]: 0
crash: cannot find mfn 874307 (0xd5743) in page index
crash: cannot read/find cr3 page
It contained a machine address of d5743000, which, when shifted right by
the page shift (12 bits), equates to a PFN (or "mfn") of 874307 (0xd5743).
It then walked through the index array of xen_dumpcore_p2m structures in
the dumpfile, looking for the one that contains that "gmfn" value.
But for whatever reason, it could not find it. That being the
case, there's no way it can continue.
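
To make the failing step concrete, here is a minimal Python sketch of the lookup as described above. It is not the actual xc_core_mfn_to_page() code from xendump.c. The names cr3_to_mfn and find_mfn are hypothetical, the 12-bit shift assumes 4 KB pages, and the sample index contents are made up.

```python
PAGE_SHIFT = 12  # assumes 4 KB pages


def cr3_to_mfn(cr3: int) -> int:
    """Convert the machine address held in cr3 to a machine frame number."""
    return cr3 >> PAGE_SHIFT


def find_mfn(p2m_entries, mfn):
    """Linearly scan (pfn, gmfn) pairs and return the position of the entry
    whose gmfn equals mfn, or None if no entry matches."""
    for position, (_pfn, gmfn) in enumerate(p2m_entries):
        if gmfn == mfn:
            return position
    return None


mfn = cr3_to_mfn(0xD5743000)  # ctrlreg[3] from the debug output

# Hypothetical index that does not contain the cr3 frame.
sample_index = [(0, 0x1000), (1, 0x1001), (2, 0x1002)]
result = find_mfn(sample_index, mfn)
if result is None:
    print(f"cannot find mfn {mfn} ({mfn:#x}) in page index")
```

A None result here corresponds to the "cannot find mfn" case: the frame referenced by cr3 simply has no entry in the index.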
I can't really help much more than that. The function that
walks through the array is xc_core_mfn_to_page() in xendump.c.
It prints the "cannot find mfn ..." message, and returns back
to the x86_64_pvops_xendump_p2m_create() function in x86_64.c,
which prints the final, fatal, "cannot read/find cr3 page"
message.
If you capture the same type of debug output with the earlier
kernel, you should see it get to the point above and continue
on from there.
Dave
Dave,
Thanks for your response and for providing such detailed information relating
to the issue at hand.
-Thomas