Hi,
Thanks for your explanation, Isaku.
> But, in any case, Itsuro, can you do what is possible with
> your patch, and re-submit it?
I understand that the first 1GB (in this example) is necessary
for looking at ordinary Linux kernel structures.
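As a rough check of the numbers quoted below, here is a small Python sketch of the frame arithmetic. It assumes 16KB pages and 8-byte p2m entries (both figures appear in the quoted discussion), so that one p2m frame covers 32MB; the function names are only illustrative and are not taken from crash.

```python
# Assumptions taken from the quoted discussion: 16KB pages, 8-byte entries,
# and one page-sized p2m frame holding the entries.
PAGE_SIZE = 16 * 1024
ENTRY_SIZE = 8
ENTRIES_PER_FRAME = PAGE_SIZE // ENTRY_SIZE       # 2048 entries per frame
BYTES_PER_FRAME = ENTRIES_PER_FRAME * PAGE_SIZE   # 32MB covered per frame


def p2m_frame_index(pseudo_phys_addr):
    """Index of the p2m frame that would translate this address."""
    return pseudo_phys_addr // BYTES_PER_FRAME


def frames_needed(mem_bytes):
    """Number of frames needed to cover mem_bytes of contiguous memory."""
    return -(-mem_bytes // BYTES_PER_FRAME)       # ceiling division


if __name__ == "__main__":
    print(frames_needed(1 << 30))                 # frames for the first 1GB
    print(p2m_frame_index(0x3f000000))            # address used with "rd -p"
    print(p2m_frame_index(0x7feda000))            # machine-memory limit address
    print(p2m_frame_index(1 << 40))               # shared_info at 1TB
```

Under these assumptions the first 1GB needs 32 frames, and the two "rd -p" addresses fall in frames 31 and 63, which matches the mfn_idx values in the debug output quoted below.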
I now feel it is better to build the p2m_frame table only for the
first contiguous physical address range that is treated as RAM,
and to add a special command for looking at the shared_info area or
I/O space if that turns out to be necessary.
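A rough sketch of that idea, written in Python only for brevity. The class, its method names, the mfn values and the I/O frame index are all hypothetical placeholders; the sketch just shows a dense table for the low RAM range plus a small map for the special regions, under the same 16KB-page and 8-byte-entry assumptions.

```python
ENTRY_SIZE = 8                          # assumed size of one stored value
BYTES_PER_FRAME = 2048 * 16 * 1024      # assumed 32MB covered per p2m frame


class SparseP2M:
    """Hypothetical lookup: dense list for low RAM, dict for special regions."""

    def __init__(self, ram_frames, special):
        self.ram_frames = list(ram_frames)   # frames for the contiguous RAM
        self.special = dict(special)         # frame index -> value

    def lookup(self, pseudo_phys_addr):
        idx = pseudo_phys_addr // BYTES_PER_FRAME
        if idx < len(self.ram_frames):
            return self.ram_frames[idx]
        return self.special.get(idx)         # None when nothing is mapped

    def table_bytes(self):
        # Counts stored 8-byte values only, like the estimate quoted below.
        return (len(self.ram_frames) + len(self.special)) * ENTRY_SIZE


if __name__ == "__main__":
    ram = range(100, 132)                    # 32 placeholder mfn values
    special = {
        (1 << 40) // BYTES_PER_FRAME: 0xAAA,  # shared_info at 1TB (placeholder mfn)
        255 * 2048: 0xBBB,                    # placeholder index for the I/O area
    }
    table = SparseP2M(ram, special)
    print(table.table_bytes())               # compact table size in bytes
    print(524288 * ENTRY_SIZE)               # flat table size in bytes
```

With 32 RAM frames and two special entries, the compact table holds 34 values (272 bytes) against 4MB for the flat 524288-entry table.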
Let me think about it for a while. (I can't access any IA64
environment right now.)
I will reply early next week.
Thanks.
Dave Anderson said:
> Isaku Yamahata wrote:
>
>> Hi Dave.
>> I think I can explain it.
>>
>> Sometimes xen needs to share pages with dom0.
>> For example shared_info, grant table pages, another domain's pages,
>> etc.
>> In such a case, Xen/IA64 puts those pages in the dom0 pseudo physical
>> address space, i.e. it updates the dom0 p2m table, so that dom0 can
>> access those pages.
>> Pseudo physical addresses are predefined or given by xen or dom0.
>> Currently shared_info is assigned at pseudo physical address
>> of 1UL << 40 = 1TB.
>> This corresponds to the following entry.
>> > f00000007d8b0080:  000000007f428000 0000000000000000   ..B.............
>>
>> Dom0 controls devices, so it needs to access the I/O area.
>> For that purpose, the dom0 p2m table has an entry which points to the
>> I/O area such that dom0 pseudo physical address = machine address.
>> I guess that the following entry corresponds to the I/O area.
>> > f00000007d8b07f0:  0000000000000000 000000007bed4000   .........@.{....
>> To confirm this, the native Linux /proc/iomem is necessary.
>>
>> thanks.
>>
>
> OK, thanks for that explanation...
>
> It *still* seems to be a huge waste of memory. Taking the
> example dump, the 1GB of "normal" memory requires 32 p2m_mfn
> values for address translation, plus -- if I understand you
> correctly -- 1 for the shared_info, plus 1 for the I/O area.
> That's a total of 34 8-byte values, or 272 bytes. Whereas
> this first patch uses 524288 entries, or 4MB of memory.
> It seems to me there should be a better way to handle it,
> even if those two particular pseudo-physical regions are
> "special-cased" for ia64.
>
> But, in any case, Itsuro, can you do what is possible with
> your patch, and re-submit it?
>
> Thanks guys,
> Dave
>
>
>>
>> On Fri, May 11, 2007 at 10:02:39AM -0400, Dave Anderson wrote:
>> > Itsuro ODA wrote:
>> >
>> > Hi Dave,
>> >
>> > > This all sounds good, and I agree that the p2m_mfn should
>> > > be added to the ia64 XEN_ELFNOTE_CRASH_INFO.
>> > >
>> > > However, there's something incorrect in your calculation of
>> > > "xkd->p2m_frames" in your ia64_xen_kdump_p2m_create() implementation.
>> > > It looks like it should be 32, but it's set to 524288. As a result
>> > > that wastes a lot of memory, and "help -n" is pretty much unusable
>> > > since it wants to dump all ~512k entries:
>> >
>> > This is because of IA64's pseudo-physical memory map (specific to
>> > domains on xen).
>> >
>> > The phys-to-machine mapping is managed as a 3-level page table.
>> > pgd looks like:
>> > -------------------------------------------------------------
>> > crash> doms
>> >    DID  DOMAIN            ST T MAXPAGE TOTPAGE VCPU SHARED_I          P2M_MFN
>> >  32753  f000000007dac080  ?? O       0       0    0 0                 ----
>> >  32754  f000000007ff0080  ?? X       0       0    0 0                 ----
>> >  32767  f000000007ff4080  ?? I       0       0    1 0                 ----
>> > >*   0  f000000007da4080  ?? 0   10000    f986    1 f000000007d90000  1f62c
>> >
>> > crash> domain f000000007da4080
>> > struct domain {
>> > domain_id = 0,
>> > shared_info = 0xf000000007d90000,
>> > ...
>> > arch = {
>> > mm = {
>> > pgd = 0xf00000007d8b0000
>> > },
>> > ...
>> > crash> rd 0xf00000007d8b0000 256
>> > f00000007d8b0000:  000000007c8d8000 0000000000000000   ...|............
>> > f00000007d8b0010:  0000000000000000 0000000000000000   ................
>> > f00000007d8b0020:  0000000000000000 0000000000000000   ................
>> > f00000007d8b0030:  0000000000000000 0000000000000000   ................
>> > f00000007d8b0040:  0000000000000000 0000000000000000   ................
>> > f00000007d8b0050:  0000000000000000 0000000000000000   ................
>> > f00000007d8b0060:  0000000000000000 0000000000000000   ................
>> > f00000007d8b0070:  0000000000000000 0000000000000000   ................
>> > f00000007d8b0080:  000000007f428000 0000000000000000   ..B.............
>> > f00000007d8b0090:  0000000000000000 0000000000000000   ................
>> > ...
>> > f00000007d8b07c0:  0000000000000000 0000000000000000   ................
>> > f00000007d8b07d0:  0000000000000000 0000000000000000   ................
>> > f00000007d8b07e0:  0000000000000000 0000000000000000   ................
>> > f00000007d8b07f0:  0000000000000000 000000007bed4000   .........@.{....
>> >
>> > -------------------------------------------------------------
>> > (256 * 2048 = 524288)
>> >
>> > It is certain that (pseudo-)physical memory "256GB-" and "-4TB" exists.
>> > These areas are shared by domain-0 and the xen hypervisor.
>> > These areas should be accessible in dom0's analysis session.
>> >
>> > (I said:)
>> > > > But this patch is a bit tricky. And the memory usage is
>> > > > large if the machine memory layout is sparse.
>> >
>> > It is wrong. This should be "the memory usage is large if
>> > pseudo-physical memory layout is sparse."
>> > And it is always sparse actually...
>> >
>> > Thanks.
>> >
>> >
>> > Hi Itsuro,
>> >
>> > I now understand the difference in the 3rd-level p2m
>> > frame contents being page table entries instead of mfn
>> > values.
>> >
>> > However, I still do not understand what you mean regarding
>> > the concept of the pseudo-physical memory being "sparse".
>> > Looking at the dumpfile again, it appears to have the same
>> > type of flat pseudo-physical memory layout just like the
>> > other architectures.
>> >
>> > Dom0 has ~1GB of pseudo-physical memory:
>> >
>> > crash> sys
>> > KERNEL: ../20070510-sample-dump-2/vmlinux-xen-ia64
>> > DUMPFILE: ../20070510-sample-dump-2/vmcore.tiger.iomem_machine
>> > CPUS: 1
>> > DATE: Mon May 7 04:07:43 2007
>> > UPTIME: 00:01:47
>> > LOAD AVERAGE: 0.11, 0.04, 0.01
>> > TASKS: 21
>> > NODENAME: (none)
>> > RELEASE: 2.6.18-xen
>> > VERSION: #3 SMP Mon May 7 13:14:41 JST 2007
>> > MACHINE: ia64 (1296 Mhz)
>> > MEMORY: 1 GB
>> > PANIC: "SysRq : Trigger a crashdump"
>> > crash>
>> >
>> > And as far as dom0's VM is concerned, its memory map only knows
>> > about the 64512 pages in DMA zone 0:
>> >
>> > crash> kmem -n
>> > NODE SIZE PGLIST_DATA BOOTMEM_DATA NODE_ZONES
>> > 0 64512 a000000100482f80 a000000100608950 a000000100482f80
>> > a000000100483500
>> > a000000100483a80
>> > a000000100484000
>> > MEM_MAP START_PADDR START_MAPNR
>> > e0000000010b0000 0 0
>> >
>> > ZONE NAME SIZE MEM_MAP START_PADDR START_MAPNR
>> > 0 DMA 64512 e0000000010b0000 0 0
>> > 1 DMA32 0 0 0 0
>> > 2 Normal 0 0 0 0
>> > 3 HighMem 0 0 0 0
>> > crash>
>> >
>> > So the "end of memory" would be just below 1GB:
>> >
>> > crash> eval 64512 * 16k
>> > hexadecimal: 3f000000 (1008MB)
>> > decimal: 1056964608
>> > octal: 7700000000
>> > binary: 0000000000000000000000000000000000111111000000000000000000000000
>> > crash>
>> >
>> > So, with respect to dom0, how would it ever go beyond 32
>> > p2m_frames? Putting a debug printf in xen_kdump_p2m, it
>> > shows this:
>> >
>> > crash> rd -p 3f000000
>> > xen_kdump_p2m: mfn_idx for 3f000000: 31
>> > 3f000000: 0000000000000000 ........
>> > crash>
>> >
>> > So that shows that there only needs to be 32 p2m_frames
>> > for accessing all of dom0 pseudo-physical memory.
>> >
>> > But it also shows that you are allowing access to memory
>> > that is *beyond* the end of dom0 pseudo-physical memory,
>> > since 3f000000 should not be readable. There is not a
>> > page structure associated with 3f000000:
>> >
>> > crash> kmem -p | tail
>> > e000000001421dd0 3efd8000 ------- ----- 1 0
>> > e000000001421e08 3efdc000 ------- ----- 1 0
>> > e000000001421e40 3efe0000 ------- ----- 1 60
>> > e000000001421e78 3efe4000 ------- ----- 1 60
>> > e000000001421eb0 3efe8000 ------- ----- 1 60
>> > e000000001421ee8 3efec000 ------- ----- 1 60
>> > e000000001421f20 3eff0000 ------- ----- 2 0
>> > e000000001421f58 3eff4000 ------- ----- 1 80
>> > e000000001421f90 3eff8000 ------- ----- 1 80
>> > e000000001421fc8 3effc000 ------- ----- 1 80
>> > crash>
>> >
>> > By doing a few other "rd -p" commands, I see that you seem
>> > to be allowing memory accesses based upon what's in the ELF
>> > header PT_LOAD segments, which are "machine" physical memory
>> > descriptors:
>> >
>> > crash> help -n | grep phys_end
>> > phys_end: 1000
>> > phys_end: 7000
>> > phys_end: 9000
>> > phys_end: 82000
>> > phys_end: 85000
>> > phys_end: a0000
>> > phys_end: 4000000
>> > phys_end: 81b3000
>> > phys_end: ffc0000
>> > phys_end: 10000000
>> > phys_end: 7ab06000
>> > phys_end: 7c8d2000
>> > phys_end: 7c92e000
>> > phys_end: 7c938000
>> > phys_end: 7c97e000
>> > phys_end: 7cdf6000
>> > phys_end: 7cdfc000
>> > phys_end: 7ce2a000
>> > phys_end: 7d001000
>> > phys_end: 7d002000
>> > phys_end: 7d044000
>> > phys_end: 7d045000
>> > phys_end: 7d37e000
>> > phys_end: 7d700000
>> > phys_end: 7d77e000
>> > phys_end: 7d8b4000
>> > phys_end: 7f980000
>> > phys_end: 7fa00000
>> > phys_end: 7feda000
>> > crash>
>> >
>> > So it appears that the physical machine running the
>> > dom0 and hypervisor has almost 2GB of "real" physical
>> > memory. And if I try to read the limit address of
>> > 7feda000, it fails:
>> >
>> > crash> rd -p 7feda000
>> > xen_kdump_p2m: mfn_idx for 7feda000: 63
>> > rd: read error: physical address: 7feda000 type: "64-bit PHYSADDR"
>> > crash>
>> >
>> > But the last page of physical memory can be read:
>> >
>> > crash> rd -p 7fed9000
>> > xen_kdump_p2m: mfn_idx for 7fed9000: 63
>> > 7fed9000: 000000007f9da0a0 ........
>> > crash>
>> >
>> > "rd -p" is supposed to read pseudo-physical memory in xen
>> > kernels, but it seems to be allowing reads based upon the
>> > PT_LOAD segment contents? In other words, it seems to
>> > be mixing dom0 pseudo-physical memory and the system's
>> > machine memory, because 7fed9000 is not a legitimate dom0
>> > pseudo-physical address.
>> >
>> > (And even with that happening, the maximum p2m_frame index
>> > is still only 63 -- how can it ever be 512k with respect
>> > to dom0's pseudo-physical memory?)
>> >
>> > So I'm sorry, but this does not make sense to me...
>> >
>> > Dave
>> >
>> >
>> >
>>
>> > --
>> > Crash-utility mailing list
>> > Crash-utility(a)redhat.com
>> >
>> > https://www.redhat.com/mailman/listinfo/crash-utility
>>
>> --
>> yamahata
>>
>
--
Itsuro ODA <oda(a)valinux.co.jp>