On 17/02/16 15:52, Daniel Kiper wrote:
> On Wed, Feb 17, 2016 at 03:27:01PM +0100, Juergen Gross wrote:
>> On 17/02/16 14:59, Daniel Kiper wrote:
>>> On Tue, Feb 16, 2016 at 01:55:33PM +0100, Juergen Gross wrote:
>>>> Hi Daniel,
>>>>
>>>> On 16/02/16 12:35, Daniel Kiper wrote:
>>>>> Hey Juergen,
>>>
>>> [...]
>>>
>>>>> After that I decided to take a look at the upstream Linux kernel. I saw
>>>>> that xen_max_p2m_pfn in xen_build_mfn_list_list() is equal to "the
>>>>> end of the last usable machine memory region available for a given
>>>>> dom0_mem argument + something", e.g.
>>>>>
>>>>> For dom0_mem=1g,max:1g:
>>>>>
>>>>> (XEN) Xen-e820 RAM map:
>>>>> (XEN) 0000000000000000 - 000000000009fc00 (usable)
>>>>> (XEN) 000000000009fc00 - 00000000000a0000 (reserved)
>>>>> (XEN) 00000000000f0000 - 0000000000100000 (reserved)
>>>>> (XEN) 0000000000100000 - 000000007ffdf000 (usable) <--- HERE
>>>>> (XEN) 000000007ffdf000 - 0000000080000000 (reserved)
>>>>> (XEN) 00000000b0000000 - 00000000c0000000 (reserved)
>>>>> (XEN) 00000000feffc000 - 00000000ff000000 (reserved)
>>>>> (XEN) 00000000fffc0000 - 0000000100000000 (reserved)
>>>>> (XEN) 0000000100000000 - 0000000180000000 (usable)
>>>>>
>>>>> Hence xen_max_p2m_pfn == 0x80000
>>>>>
>>>>> Later I reviewed most of your p2m-related commits and realized
>>>>> that you played a game of whack-a-mole with p2m bugs. Sadly, I was
>>>>> not able to identify one (or more) specific commits which would fix
>>>>> the issue (well, there are some which fix similar things, but not
>>>>> the one described above). So, if you explain to me why
>>>>> xen_max_p2m_pfn is set to that value and not to, e.g., max_pfn,
>>>>> it will be much easier for me to write a proper fix, and maybe to
>>>>> fix the same issue in the upstream kernel if needed (the crash
>>>>> tool does not work with the new p2m layout, so first of all I must
>>>>> fix that; I hope you will help me with it sooner or later).
>>>>
>>>> The reason for setting xen_max_p2m_pfn to nr_pages initially is its
>>>> usage in __pfn_to_mfn(): this must work with the initial p2m list
>>>> supplied by the hypervisor, which has only nr_pages entries.
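>>>>
>>>> For illustration, the lookup boils down to something like this
>>>> (a simplified sketch, not the exact upstream code):
>>>>
>>>> static inline unsigned long __pfn_to_mfn(unsigned long pfn)
>>>> {
>>>>         unsigned long mfn;
>>>>
>>>>         if (pfn < xen_p2m_size)                 /* inside the linear list */
>>>>                 mfn = xen_p2m_addr[pfn];
>>>>         else if (pfn < xen_max_p2m_pfn)         /* covered, but no entry yet */
>>>>                 return get_phys_to_machine(pfn);
>>>>         else                                    /* beyond the list */
>>>>                 return IDENTITY_FRAME(pfn);
>>>>
>>>>         if (mfn == INVALID_P2M_ENTRY)
>>>>                 return get_phys_to_machine(pfn);
>>>>
>>>>         return mfn;
>>>> }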
>>>
>>> That makes sense.
>>>
>>>> Later it is updated to the number of entries the linear p2m list is
>>>> able to hold. This size has to include possibly hotplugged memory
>>>> in order to be able to make use of that memory later (remember: the
>>>> p2m list's size is limited by the virtual space allocated for it via
>>>> xen_vmalloc_p2m_tree()).
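>>>>
>>>> Roughly (a sketch, not the exact upstream code; hotplug_limit_pfn is
>>>> just an illustrative name for the upper bound used there):
>>>>
>>>>         /* reserve virtual space only, no backing pages yet */
>>>>         vsize = ALIGN(hotplug_limit_pfn * sizeof(unsigned long), PMD_SIZE);
>>>>         area  = get_vm_area(vsize, VM_ALLOC);
>>>>         /* xen_max_p2m_pfn then reflects the capacity of that space */
>>>>         xen_max_p2m_pfn = vsize / sizeof(unsigned long);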
>>>
>>> However, I have memory hotplug disabled, and the kernel set xen_max_p2m_pfn
>>> to 0x80000 (2 GiB) even though dom0 memory was set to 1 GiB. Hmmm... Why?
>>> I suppose that if xen_max_p2m_pfn == max_pfn then everything should work.
>>> Am I missing something?
>>
>> The virtual p2m list's size is aligned to PMD_SIZE (2 MB). For 1 GB of dom0
>> memory, max_pfn will be a little bit above 0x40000 due to the BIOS
>> area, resulting in a 4 MB p2m list. And xen_max_p2m_pfn reflects
>> this size. You could reduce it to max_pfn without any problem, as long
>> as memory hotplug is disabled. At least I think so.
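>>
>> Spelled out for this case (64 bit, 8 bytes per p2m entry; just the
>> arithmetic, not actual code):
>>
>>         list_bytes      = max_pfn * sizeof(unsigned long); /* just over 2 MB */
>>         vsize           = ALIGN(list_bytes, PMD_SIZE);     /* -> 4 MB */
>>         xen_max_p2m_pfn = vsize / sizeof(unsigned long);   /* -> 0x80000 */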
> To be precise, PMD_SIZE * PMDS_PER_MID_PAGE, so it equals 0x80000 in
> this case. Why do we need such a huge alignment? Couldn't we use a
> smaller one, e.g. PAGE_SIZE?

The idea when implementing this was to use the same boundaries as the
3-level p2m tree would use. Doing so doesn't waste any real memory, so
that's how it was coded. Additionally, this solution supports
memory hotplug, which shouldn't be forgotten for the common case.

So to sum it up: this alignment does no harm (with the exception of the
crash tool), but it makes life easier for memory hotplug and for the
construction of the p2m tree (with the current alignment I don't need
special handling of a partially valid mid-level frame).
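
As a sketch of the invariant this buys (p2m_entries is just an
illustrative name here):

        /* The reserved virtual space is a whole multiple of the list
         * range one mid-level page covers, so a mid page is either fully
         * inside the linear list or unused -- never partially valid. */
        vsize = ALIGN(p2m_entries * sizeof(unsigned long),
                      PMD_SIZE * PMDS_PER_MID_PAGE);
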
Juergen