Dave Anderson wrote:
Itsuro ODA wrote:
Hi Dave,

The attached patch (for crash-4.0-4.1) makes it possible to analyze
dom0 Linux from a whole-memory dump on IA64.
It is just a quick hack.
(I made it at the request of IA64 Xen developers.)

On IA64, each domain manages its own machine memory through
domain.arch.mm.pgd, which is a 3-level page table.
I thought the mfn of domain.arch.mm.pgd could be regarded as
p2m_mfn.

I intended to modify as little existing code as possible,
but this patch is a bit tricky, and the memory usage is
large if the machine memory layout is sparse.
(Maybe xen_kdump_p2m should be prepared for each arch?)

Would you consider supporting dom0 analysis for IA64?

I prepared two sample dumps. Please find them at the following
URLs.

1) http://people.valinux.co.jp/~oda/20070510-sample-dump-1.tar
  contents:
  - vmcore.gz
    This was taken by a hardware-assisted dump. It is a netdump-style
    ELF vmcore, so XEN_ELFNOTE_CRASH_INFO does not exist.
  - vmcore.ka.gz
    It was converted to kdump style, and XEN_ELFNOTE_CRASH_INFO was
    added manually.
  - vmlinux.debug.gz
    for dom0 analysis
  - xen-syms-2.6.18-8.el5.gz
    for xencrash

  To get p2m_mfn, xencrash's doms command is useful.
--------------------------------------------------------------------------
# crash xen-syms-2.6.18-8.el5 vmcore
...
crash> doms
   DID       DOMAIN      ST T  MAXPAGE  TOTPAGE VCPU     SHARED_I          P2M_MFN
  32753 f000000007ac8080 RU O     0        0      0          0              ----
  32754 f000000007acc080 RU X     0        0      0          0              ----
> 32767 f000000007ff8080 RU I     0        0      4          0              ----
      0 f000000007aa4080 RU 0   10000    fc28     1  f000000007a88000       1abb7
>*    1 f000000007a78080 RU U   10603    10603    3  f000000007a5c000       1a909
crash>
----------------------------------------------------------------------------

  Then run a normal crash session with the --p2m_mfn option.
----------------------------------------------------------------------------
# crash --p2m_mfn=1abb7 vmlinux.debug vmcore
 

I'm still downloading the above, so I haven't been able
to test it yet...
...
----------------------------------------------------------------------------

  vmcore.ka has XEN_ELFNOTE_CRASH_INFO, so the --p2m_mfn option is not needed.
----------------------------------------------------------------------------
# crash vmlinux.debug vmcore.ka
...
----------------------------------------------------------------------------

  Currently the --p2m_mfn option is effective only if a vmcore has
  XEN_ELFNOTE_CRASH_INFO.
  I think specifying the --p2m_mfn option should be taken to mean that
  the vmcore is XEN_CORE_DUMPFILE(). The patch supports this.
  I think it is necessary for dumps which do not have
  XEN_ELFNOTE_CRASH_INFO, such as the above sample.

2) http://people.valinux.co.jp/~oda/20070510-sample-dump-2.tar
  contents:
  - vmcore.tiger.iomem_machine.gz
    taken by Xen kdump
  - vmlinux-xen-ia64.bz2
  - xen-syms-ia64.bz2

  I asked the Xen kdump developer (simon@valinux) to add "p2m_mfn" to
  XEN_ELFNOTE_CRASH_INFO.
  This change to Xen kdump is not public yet.
  If it is OK for crash, it will be committed.

Thanks.
--
Itsuro ODA <oda@valinux.co.jp>
 


This all sounds good, and I agree that the p2m_mfn should
be added to the ia64 XEN_ELFNOTE_CRASH_INFO.

However, there's something incorrect in your calculation of
"xkd->p2m_frames" in your ia64_xen_kdump_p2m_create() implementation.
It looks like it should be 32, but it's set to 524288.  As a result,
a lot of memory is wasted, and "help -n" is pretty much unusable
since it wants to dump all ~512k entries:

$ ./crash vmlinux-xen-ia64 vmcore.tiger.iomem_machine
...
crash> help -n
...
         xen_kdump_data:
                    flags: 5 (KDUMP_P2M_INIT|KDUMP_MFN_LIST)
                  p2m_mfn: 1f62c
                      cr3: 0
            last_mfn_read: 1fd09
                     page: 6000000000c96bd0
                 accesses: 1340
               cache_hits: 1255 (95%)
               p2m_frames: 524288
       p2m_mfn_frame_list: 200000000539c010
1efba 1f5cc 1fd09 1e185 1d984 1d183 1c982 1c181 1b980 1b17f 1a97e 1a17d 1997c 1917b 1897a 18179 17978 17177 16976 16175 15974 15173 14972 14171 13970 1316f 1296e 1216d 1196c 1116b 1096a 10169 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1efb8 1efb7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1efb6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0

If you enter "crash -d2", you'll also see all 512k, mostly useless,
entries...
 


Perhaps I am misunderstanding the reference you made with respect
to the "machine memory layout is sparse"?

In the other xen arches, the p2m_frame list reflects contiguous
pseudo-physical memory, so I don't understand why your ia64
sample dump, which has 1GB of memory, would need more
than 32 p2m frames.

Each p2m frame would contain 2048 entries (given 16k pages).
So with 32 p2m frames, that would account for 65536 pages,
which equates to 1GB of pseudo-physical memory.
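
To make that arithmetic concrete, here is a minimal sketch. It is not
taken from the crash sources: the helper names are hypothetical, and it
assumes one 8-byte entry per page, which is what 2048 entries in a 16k
frame implies.

```c
#include <stdint.h>

#define IA64_PAGE_SIZE  (16UL * 1024)      /* 16k pages, as on the sample dump */
#define P2M_ENTRY_SIZE  sizeof(uint64_t)   /* assumption: one 8-byte entry per page */

/* Hypothetical helper: entries that fit in one p2m frame (16384 / 8). */
static unsigned long p2m_entries_per_frame(void)
{
    return IA64_PAGE_SIZE / P2M_ENTRY_SIZE;
}

/*
 * Hypothetical helper: p2m frames needed to cover mem_bytes of
 * contiguous pseudo-physical memory, rounding up at each step.
 */
static unsigned long p2m_frames_needed(uint64_t mem_bytes)
{
    uint64_t pages = (mem_bytes + IA64_PAGE_SIZE - 1) / IA64_PAGE_SIZE;
    unsigned long per_frame = p2m_entries_per_frame();

    return (unsigned long)((pages + per_frame - 1) / per_frame);
}
```

Calling p2m_frames_needed(1ULL << 30) for the 1GB sample dump gives 32,
which is the value I would expect to see in xkd->p2m_frames.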

Am I missing something?

Dave