Hi Daniel,
On 16/02/16 12:35, Daniel Kiper wrote:
 Hey Juergen,
 
 As I have seen, you are deeply involved in the p2m stuff, so
 I hope that you can enlighten me a bit in that area. 
Yes, the p2m stuff is always fun. :-)
 OVM, an Oracle product, uses Linux 3.8.13 as its dom0 kernel
 (yep, I know this is very ancient stuff) with a lot of
 backports. Among them there is commit 2c185687ab016954557aac80074f5d7f7f5d275c
 (x86/xen: delay construction of mfn_list_list). After
 an investigation I discovered that it breaks the crash tool.
 It fails with the following message:
 
 crash: read error: kernel virtual address: ffff88027ce0b700  type: "current_task (per_cpu)"
 crash: read error: kernel virtual address: ffff88027ce2b700  type: "current_task (per_cpu)"
 crash: read error: kernel virtual address: ffff88027ce4b700  type: "current_task (per_cpu)"
 crash: read error: kernel virtual address: ffff88027ce6b700  type: "current_task (per_cpu)"
 crash: read error: kernel virtual address: ffff88027ce10c64  type: "tss_struct ist array"
 
 Addresses and symbols depend on the given build.
 
 The problem is that xen_max_p2m_pfn in xen_build_mfn_list_list()
 is equal to xen_start_info->nr_pages. This means that memory
 which lies above that limit due to remapping/relocation (usually
 a small fraction) is not mapped via p2m_top_mfn and p2m_top_mfn_p.
 I should mention here that Xen is started with e.g. dom0_mem=1g,max:1g.
 If I remove the max argument then crash works, because xen_max_p2m_pfn
 is then greater than xen_start_info->nr_pages. Additionally, the issue
 can be fixed by replacing xen_max_p2m_pfn in xen_build_mfn_list_list()
 with max_pfn.
 
 After that I decided to take a look at the upstream Linux kernel.
 I saw that xen_max_p2m_pfn in xen_build_mfn_list_list() is equal to
 "the end of the last usable machine memory region available for a
 given dom0_mem argument + something", e.g.
 
 For dom0_mem=1g,max:1g:
 
 (XEN) Xen-e820 RAM map:
 (XEN)  0000000000000000 - 000000000009fc00 (usable)
 (XEN)  000000000009fc00 - 00000000000a0000 (reserved)
 (XEN)  00000000000f0000 - 0000000000100000 (reserved)
 (XEN)  0000000000100000 - 000000007ffdf000 (usable)   <--- HERE
 (XEN)  000000007ffdf000 - 0000000080000000 (reserved)
 (XEN)  00000000b0000000 - 00000000c0000000 (reserved)
 (XEN)  00000000feffc000 - 00000000ff000000 (reserved)
 (XEN)  00000000fffc0000 - 0000000100000000 (reserved)
 (XEN)  0000000100000000 - 0000000180000000 (usable)
 
 Hence xen_max_p2m_pfn == 0x80000
 
 Later I reviewed most of your p2m related commits and I realized
 that you have been playing a whack-a-mole game with p2m bugs. Sadly,
 I was not able to identify exactly one (or more) commit which would
 fix the issue above (there are some which fix similar problems, but
 not this one). So, if you explain to me why xen_max_p2m_pfn is set
 to that value and not to e.g. max_pfn, it will be much easier for me
 to write a proper fix and maybe fix the same issue in the upstream
 kernel if needed (well, the crash tool does not work with the new
 p2m layout, so first of all I must fix that; I hope that you will
 help me with that sooner or later). 
The reason for setting xen_max_p2m_pfn to nr_pages initially is its
use in __pfn_to_mfn(): this must work with the initial p2m list
supplied by the hypervisor, which has only nr_pages entries.
Later it is updated to the number of entries the linear p2m list is
able to hold. This size has to include possible hotplugged memory
in order to be able to make use of that memory later (remember: the
p2m list's size is limited by the virtual space allocated for it via
xen_vmalloc_p2m_tree()).
 Additionally, during that work I realized that p2m_top (xen_p2m_addr
 in the latest Linux kernel) and p2m_top_mfn differ. As far as I can
 see, p2m_top represents everything (memory, missing, identity, etc.)
 found in the PV guest address space. However, p2m_top_mfn is limited
 to just memory and missing entries. Taking into account that
 p2m_top_mfn is used only for migration and the crash tool, it looks
 like that is sufficient. Am I correct? Am I not missing any detail? 
Basically p2m_top and p2m_top_mfn hold the same information. p2m_top has
just some special mappings for identity pages: they translate to
"invalid" mfns just as in p2m_top_mfn, but via dedicated pages which are
identified by comparing their addresses (or pfns) in order to detect
the identity pages.
As you thought: this distinction isn't necessary for p2m_top_mfn, so it
can be omitted there.
 
 Daniel
 
 PS I am sending this to wider forum because I think that it
    is worth spreading knowledge even if it is not strictly
    related to latest Xen or Linux kernel developments. 
OTOH: what was hard to write should be hard to read. ;-)
Feel free to ask further questions.
Juergen