> Date: Mon, 13 Sep 2010 11:59:44 -0400
> From: anderson@redhat.com
> To: crash-utility@redhat.com
> Subject: Re: [Crash-utility] (pvops 2.6.32.21) crash: cannot read/find cr3 page
>
> ----- "tom anderson" <xentoma@hotmail.com> wrote:
>
> > If I use a pvops domU kernel version 2.6.32.18, crash works fine. However, if I
> > use a pvops domU kernel version 2.6.32.21, I get the error messages:
> >
> > crash: cannot find mfn 874307 (0xd5743) in page index
> > crash: cannot read/find cr3 page
> >
> > Any suggestions as to what is wrong?
>
> Hi Tom,
>
> I can't really give you specific suggestions as to what is wrong,
> but I can at least tell you what the crash utility is encountering.
>
> I suppose there's good news and bad news concerning this issue,
> the good news being that it worked OK with 2.6.32.18, which is
> fairly close to your failing 2.6.32.21. Since I've done very little
> with Xen support since Red Hat dropped Xen development beyond our
> RHEL5 2.6.18-era release, it's always good to hear that it actually
> still worked with a 2.6.32.18 kernel. I imagine something will
> eventually break, and at that time I will likely require outside
> assistance to keep Xen support in place.
>
> Anyway, that all being said, in your failure case, here are the issues
> at hand. The header shows this:
>
> xc_core:
>   header:
>     xch_magic: f00febed (XC_CORE_MAGIC)
>     xch_nr_vcpus: 7
>     xch_nr_pages: 521792 (0x7f640)
>     xch_ctxt_offset: 1896 (0x768)
>     xch_index_offset: 2137305088 (0x7f64b000)
>     xch_pages_offset: 45056 (0xb000)
>   elf_class: ELFCLASS64
>   elf_strtab_offset: 2145653760 (0x7fe41400)
>   format_version: 0000000000000001
>   shared_info_offset: 38072 (0x94b8)
>
> The "xch_nr_pages" indicates that the domU vmlinux kernel has 521792
> pseudo-physical pages assigned to it, where those pseudo-physical pages
> are backed by the Xen hypervisor with machine pages, which are the "real"
> physical pages. And so when the crash utility needs to access a
> pseudo-physical page used by a domU kernel, that pseudo-physical page
> needs to be translated to the actual machine physical page that backs it,
> and then that physical page needs to be found in the dumpfile. The PFNs
> (page frame numbers) of the pseudo-physical pages are called "pfns" and the
> PFNs of the machine pages are called "mfns" or "gmfns".
>
> To match a pfn with its corresponding mfn, the kdump operation dumps an
> array of pfn-to-mfn pairs in the vmcore's ".xen_p2m" section. This is taken from
> http://www.sfr-fresh.com/unix/misc/xen-4.0.1.tar.gz:a/xen-4.0.1/docs/misc/dump-core-format.txt
>
> ".xen_p2m" section
>     name        ".xen_p2m"
>     type        SHT_PROGBITS
>     structure   array of struct xen_dumpcore_p2m
>                 struct xen_dumpcore_p2m {
>                     uint64_t pfn;
>                     uint64_t gmfn;
>                 };
>
>     description
>                 This elements represents the frame number of the page
>                 in .xen_pages section.
>                     pfn:  guest-specific pseudo-physical frame number
>                     gmfn: machine physical frame number
>                 The size of arrays is stored in xch_nr_pages member of header
>                 note descriptor in .note.Xen note section.
>                 The entryies are stored in pfn-ascending order.
>                 This section must exist when the domain is non auto
>                 translated physmap mode. Currently x86 paravirtualized domain.
>
> The "pfn" value associated with the "gmfn" value is in turn used
> as an index into an array of actual pages in the dumpfile, which
> starts at the "xch_pages_offset" of 45056 (0xb000).
>
> The start of the index array is found in the dumpfile at the "xch_index_offset"
> of 2137305088 (0x7f64b000), and it ends at the "elf_strtab_offset" of 2145653760
> (0x7fe41400). Accordingly, if you subtract 2137305088 from 2145653760,
> the array of xen_dumpcore_p2m structures is 8348672 bytes long, which, when
> divided by the size of the data structure (16 bytes), equals the value of
> "xch_nr_pages", or 521792.
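
The arithmetic in the paragraph above can be checked with a small C sketch.
The struct layout is the one quoted from dump-core-format.txt; the helper
name p2m_entry_count() is hypothetical and is not part of the crash sources.

```c
#include <stdint.h>

/* One ".xen_p2m" entry, as laid out in dump-core-format.txt. */
struct xen_dumpcore_p2m {
    uint64_t pfn;   /* guest-specific pseudo-physical frame number */
    uint64_t gmfn;  /* machine physical frame number */
};

/*
 * Hypothetical helper (not a crash function): number of index entries
 * that fit between the start of the index array and the ELF string
 * table that follows it in the dumpfile.
 */
static uint64_t p2m_entry_count(uint64_t index_offset, uint64_t strtab_offset)
{
    return (strtab_offset - index_offset) / sizeof(struct xen_dumpcore_p2m);
}
```

Calling p2m_entry_count(2137305088, 2145653760) with the two offsets from
the header should give the same value as "xch_nr_pages".
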
>
> Anyway, the very first read attempt requires the crash utility to do a
> one-time-only recreation of the kernel's "p2m_top" array (pvops kernels only),
> and in so doing needs to first read the page found in the hypervisor's cr3
> register, which contains a machine address:
>
> <readmem: ffffffff81614800, KVADDR, "kernel_config_data", 32768, (ROE), 2fed090>
> addr: ffffffff81614800 paddr: 1614800 cnt: 2048
> GETBUF(248 -> 0)
> FREEBUF(0)
> MEMBER_OFFSET(vcpu_guest_context, ctrlreg): 4984
> ctrlreg[0]: 80050033
> ctrlreg[1]: d5742000
> ctrlreg[2]: 0
> ctrlreg[3]: d5743000
> ctrlreg[4]: 2660
> ctrlreg[5]: 0
> ctrlreg[6]: 0
> ctrlreg[7]: 0
> crash: cannot find mfn 874307 (0xd5743) in page index
>
> crash: cannot read/find cr3 page
>
> It contained a machine address of d5743000, which, when shifted right by
> the page shift (12 bits), equates to a PFN (or "mfn") of 874307 (0xd5743).
> It then walked through the index array of xen_dumpcore_p2m structures in
> the dumpfile, looking for the one that contains that "gmfn" value.
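
A minimal sketch of that lookup follows, assuming 4 KiB pages and an index
array already read into memory. The names cr3_to_mfn() and find_mfn_index()
are hypothetical; this is not the code of xc_core_mfn_to_page(), only an
illustration of the search described above.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12  /* assumption: 4 KiB pages on x86_64 */

/* One ".xen_p2m" entry, as laid out in dump-core-format.txt. */
struct xen_dumpcore_p2m {
    uint64_t pfn;
    uint64_t gmfn;
};

/* Hypothetical helper: convert a cr3 machine address to its mfn. */
static uint64_t cr3_to_mfn(uint64_t cr3)
{
    return cr3 >> PAGE_SHIFT;
}

/*
 * Hypothetical helper: walk the index array looking for the entry whose
 * gmfn matches. Returns the position of that entry, or -1 if the mfn is
 * not present, which corresponds to the "cannot find mfn" case.
 */
static long find_mfn_index(const struct xen_dumpcore_p2m *index,
                           size_t nr_pages, uint64_t mfn)
{
    for (size_t i = 0; i < nr_pages; i++) {
        if (index[i].gmfn == mfn)
            return (long)i;
    }
    return -1;
}
```

The search is linear because the entries are sorted by pfn, not by gmfn, so
a lookup by machine frame number cannot rely on the ordering.
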
>
> But for whatever reason, it could not find it. That being the
> case, there's no way it can continue.
>
> I can't really help much more than that. The function that
> walks through the array is xc_core_mfn_to_page() in xendump.c.
> It prints the "cannot find mfn ..." message and returns
> to the x86_64_pvops_xendump_p2m_create() function in x86_64.c,
> which prints the final, fatal "cannot read/find cr3 page"
> message.
>
> If you capture the same type of debug output with the earlier
> kernel, you should see it get to the point above and continue
> on from there.
>
> Dave
>
 
Dave,
 
Thanks for your response and for providing such detailed information on the issue at hand.
 
-Thomas