On Thu, Jun 02, 2016 at 11:37:24AM -0400, Dave Anderson wrote:
----- Original Message -----
> On Wed, Jun 01, 2016 at 11:41:04AM -0400, Dave Anderson wrote:
> >
> > ----- Original Message -----
> >
> > > > >
> > > > > case PRE_GDB:
> > > > > + /* This check is somewhat redundant */
> > > > > + if (kernel_symbol_exists("kimage_voffset"))
> > > > > + machdep->flags |= NEW_VMEMMAP;
> > > > > +
> > > >
> > > > Certainly not redundant on a live system without
CONFIG_RANDOMIZE_BASE.
> > >
> > > Yeah, but for crashdump case, it's redundant.
> >
> > I understand, but if you're going to add a comment, it should clarify
> > rather than mystify... ;-)
>
> OK, how about this:
> "Check for NEW_VMEMMAP again for a live system"
>
> Or just remove it?
Sure, that's better.
> >
> > > > > arm64_calc_VA_BITS();
> > > > > - machdep->machspec->page_offset = ARM64_PAGE_OFFSET;
> > > > > + ms = machdep->machspec;
> > > > > + ms->page_offset = ARM64_PAGE_OFFSET;
> > > > > + /* FIXME: idmap for NEW_VMEMMAP */
> > > >
> > > > What's the FIXME here?
> > >
> > > Identity-mapping no longer starts from PAGE_OFFSET if KASLR.
> > > Since machdep->identity_map_base is not used anywhere in crash,
> > > I wonder how I should handle it.
> > >
> > > > > machdep->identity_map_base = ARM64_PAGE_OFFSET;
> >
> > You're right, machdep->identity_map_base is (currently) only used in
> > architecture-specific files, although arm64.c does not use it.
> > But what you have done above is correct.
>
> So should I remove the comment?
Right -- there's nothing to fix!
> >
> > > > > @@ -631,6 +718,19 @@ arm64_calc_phys_offset(void)
> > > > > if (machdep->flags & PHYS_OFFSET) /* --machdep override
*/
> > > > > return;
> > > > >
> > > > > + if (machdep->flags & NEW_VMEMMAP) {
> > > > > + struct syment *sp;
> > > > > + ulong value;
> > > > > +
> > > > > + sp = kernel_symbol_search("memstart_addr");
> > > > > + if (sp && readmem(sp->value, KVADDR, (char
*)&value,
> > > > > + sizeof(value), "memstart_addr",
> > > > > + QUIET|RETURN_ON_ERROR)) {
> > > > > + ms->phys_offset = value;
> > > > > + return;
> > > > > + }
> > > > > + }
> > > > > +
> > > >
> > > > As we've discussed before, I cannot accept the chicken-and-egg
snippet
> > > > above.
> > > >
> > > > If machdep->machspec->kimage_voffset has not been determined,
then the > > > > arm64_VTOP()
> > > > call made by readmem() will utilize
machdep->machspec->phys_offset -- > > > > which is what
> > > > you're trying to initialize here.
> > >
> > > Right, I was a bit confused because my patch basically had an assumption
> > > that we have some sort of way to know kimage_voffset.
> > >
> > > Meanwhile, if this hunk be moved after "if ACTIVE())" like,
> > > ms->phys_offset = 0;
> > > if (ACTIVE()) {
> > > read("/proc/iomem");
> > > ...
> > > } else {
> > > if (machdep->flags & NEW_VMEMMAP) {
> > > readmem("memstart_addr")
> > > ...
> > > } else if (DISKDUMP_DUMPFILE() ...) {
> > > } else if (KDUMP_DUMPFILE() ...) {
> > > }
> > > }
> > > the crash command would still work for a live system if !KASLR.
> >
> > Again, the readmem() is not necessary.
> >
> > >
> > > > On my live system that has a phys_offset of 0x40000000000, the
readmem()
> > > > will fail quietly, but that's not a acceptable usage of the
RETURN_ON_ERROR
> > > > failure mode, because "memstart_addr" is a legitimate
virtual address that
> > > > should never fail. Also, because the actual kernel phys_offset can
be
> > > > negative, it would seem to be entirely within the realm of
possibility that
> > > > the readmem() could mistakenly return successfully, but have read the
wrong
> > > > location.
> > > >
> > > > So what it boils down to is that readmem() should NEVER be called
until ALL of the
> > > > pieces required by arm64_VTOP() have all been properly initialized,
however
> > > > that might be accomplished. Not to mention that calling it this
early sets a
> > > > dangerous precedent.
> > >
> > > If you think so, we can add a specialized function like:
> > > arm64_readmem_kernel_symbol(char *, ...)
> >
> > I don't think so -- the point is to avoid making a readmem() call this
early.
>
> I think that I fully understand your concerns here, but my point is:
> - I give you a value of kimage_voffset in VMCOREINFO
> - you know vmlinux and have vmcore
> - you can and do determine kaslr_offset
> - so why don't you identify PHYS_OFFSET by reading "memstart_addr"?
The VMCOREINFO is primarily there for the user-space kdump and makedumpfile facilities.
Crash does check it occasionally, but can't depend upon it, because there are too
many other
dumpfile formats.
That is the reason that I insist on minimizing the number of VMCOREINFO
parameters. Adding redundant ones can cause confusion in the future.
>
> Using readmem() or not is a matter of implementation.
> I think that we can use READMEM(), instead of readmem(), in
> arm64_readme_kernel_symbol(), which is solely used, at least at the moment,
> in arm64_calc_phys_offset().
I'm sure I don't understand. Calling READMEM() as opposed to readmem()
would require the virtual-to-physical address translation to be done
on the "memstart_addr" symbol.
Right, but as I said,
- we've already calculated kaslr_offset with derive_kaslr_offset() called by
symtab_init(), which is prior to machdep_init(PRE_GDB).
- we can determine the *real(runtime)* virtual address of "memstart_addr"
by
<vaddr> = <vaddr of memstart_addr in vmlinux> + kaslr_offset
- then we will identify the physical address by
<paddr> = <vaddr> - <kimage_voffset>
- so why cannot we call READMEM(pc->readmem) here(arm64_calc_phys_offset)
at the end of machdep_init(PRE_GDB)?
pc->readmem will be initialized definitely before any machdep_init(*).
See what I mean?
Logically, I don't see any breakage of existing APIs/assumptions.
(So I said, it is a matter of implementation.)
Anyway, READMEM should be hidden and
constrained to readmem() itself. (It is used in one other odd-ball place
prior to pc->readmem being initialized and a System.map file being used
as the correct source of kernel symbol values.)
> Well, we may have a different story for makedumpfile as there seems to be
> an assumption that we have only a vmcore file, neither vmlinux nor System.map.
> So I suggested to Pratyush that he could post a patch later to add any
> VMCOREINFO parameters that you wanted for makedumpfile.
Right, there are going to be kdump/makedumpfile requirements, one of them being
the capability of setting the phys_base value in the kdump_sub_header.
> > > > And in the case of kdump's ELF vmcore and compressed vmcore
formats, there
> > > > is an existing API between kdump and the crash utility to pass back
the
> > > > phys_base. In the kexec-tool's makedumpfile.c file, there is
the
> > > > get_phys_base_arm64() function that
> > >
> > > I can't find makedumpfile.c in kexec-tools.
> >
> > Hold on, clarification below...
> >
> > >
> > > > currently calculates the offset by using the PT_LOAD segments, and
presumably will
> > > > have to be modified to use new VMCOREINFO data. But regardless of
how it's one,
> > > > the architecture-neutral write_kdump_header() laster copies that
offset value into
> > > > the kdump_sub_header.phys_base field for the crash utility to access.
Trying to do
> > > > a readmem() this early in time is essentially breaking that API.
> > >
> > > You may already notice that "phys_offset" may have two different
meanings:
> > > * the start address of System RAM
> > > (PHYS_OFFSET on v4.5 or early, and __PHYS_OFFSET on v4.6 or later)
> > > * the value used in phys_to_virt()/virt_to_phys() for any address
> > > in linear-mapping
> > > (PHYS_OFFSET on v4.5 or early, and PHYS_OFFSET on v4.6 or later)
> > >
> > > I think that current crash utility needs only the latter.
> > > Please correct me if I misunderstand something.
> >
> > Right, the __PHYS_OFFSET value seems to be the kernel entry point address,
> > but that is not used. Crash needs the PHYS_OFFSET value to mimic the
kernel's
> > arm64 __virt_to_phys() function.
> >
> > Sorry about my reference to "kexec-tool's makedumpfile.c file",
as I was
> > talking about the Red Hat kexec-tools package, which also contains
> > the makedumpfile facility.
>
> I see.
>
> > Makedumpfile is upstream in a couple places, where the most recently
> > released
> > version (1.5.9) is here:
> >
> >
https://sourceforge.net/projects/makedumpfile/files/makedumpfile/
> >
> > Or the development repository is here:
> >
> >
git://git.code.sf.net/p/makedumpfile/code
> >
> > Regardless of which location, there are the generic get_phys_base() macros
> > here in "makedumpfile.h" for those architectures that need it:
> >
> > makedumpfile.h 814 #define get_phys_base() get_phys_base_arm64()
> > makedumpfile.h 826 #define get_phys_base() get_phys_base_arm()
> > makedumpfile.h 837 #define get_phys_base() stub_true()
> > makedumpfile.h 850 #define get_phys_base() get_phys_base_x86_64()
> > makedumpfile.h 861 #define get_phys_base() stub_true()
> > makedumpfile.h 871 #define get_phys_base() stub_true()
> > makedumpfile.h 882 #define get_phys_base() stub_true()
> > makedumpfile.h 894 #define get_phys_base() get_phys_base_ia64()
> >
> > And where the arm64-specific function is in "arch/arm64.c":
> >
> > int
> > get_phys_base_arm64(void)
> > {
> > unsigned long phys_base = ULONG_MAX;
> > unsigned long long phys_start;
> > int i;
> > /*
> > * We resolve phys_base from PT_LOAD segments. LMA contains
> > physical
> > * address of the segment, and we use the lowest start as
> > * phys_base.
> > */
> > for (i = 0; get_pt_load(i, &phys_start, NULL, NULL, NULL); i++)
{
> > if (phys_start < phys_base)
> > phys_base = phys_start;
> > }
> >
> > if (phys_base == ULONG_MAX) {
> > ERRMSG("Can't determine phys_base\n");
> > return FALSE;
> > }
> >
> > info->phys_base = phys_base;
> >
> > DEBUG_MSG("phys_base : %lx\n", phys_base);
> >
> > return TRUE;
> > }
> >
> > Note that it currently looks for the lowest physical address that is
> > referenced in any of the PT_LOAD segments of /proc/vmcore. As you have
> > discussed for Linux 4.6, perhaps it will be necessary that the value
> > of PHYS_OFFSET ("memstart_addr") will have to be passed in a
VMCOREINFO
> > item. It's not clear to me why PT_LOAD p_paddr values will no longer
> > be usable, but that's up to the makedumpfile/kdump maintainers. And
> > certainly a convenient VMCOREINFO item would be ideal/preferable.
> >
> > Anyway, later on, the info->phys_base value gets stored in the compressed
> > kdump header in the write_kdump_header() function in
"makedumpfile.c":
> >
> > int
> > write_kdump_header(void)
> > {
> > int ret = FALSE;
> > size_t size;
> > off_t offset_note, offset_vmcoreinfo;
> > unsigned long size_note, size_vmcoreinfo;
> > struct disk_dump_header *dh = info->dump_header;
> > struct kdump_sub_header kh;
> > char *buf = NULL;
> >
> > ... [ cut ] ...
> >
> > /*
> > * Write sub header
> > */
> > size = sizeof(struct kdump_sub_header);
> > memset(&kh, 0, size);
> > /* 64bit max_mapnr_64 */
> > kh.max_mapnr_64 = info->max_mapnr;
> > kh.phys_base = info->phys_base;
> > kh.dump_level = info->dump_level;
> > ...
> >
> > And then in the crash utility source code, it simply gets accessed
> > like so:
> >
> > int
> > diskdump_phys_base(unsigned long *phys_base)
> > {
> > if (KDUMP_CMPRS_VALID()) {
> > *phys_base = dd->sub_header_kdump->phys_base;
> > return TRUE;
> > }
> >
> > return FALSE;
> > }
> >
> > Now, for ELF vmcores, the crash utility calls arm64_kdump_phys_base()
> > which calls arm_kdump_phys_base(), which simply mimics what the
> > makedumpfile.c's get_phys_base_arm64() does. And that function may
> > need to be changed for Linux 4.6 in order to access the new VMCOREINFO
> > variable.
>
> It sounds to me that, once VOMCOREINFO("PHYS_OFFSET") is added for
> makedumpfile, crash utility wants to use it, too. Right?
It won't have to -- it has always used diskdump_phys_base(). Now, for
kdump ELF vmcores, yes, it could use pc->read_vmcoreinfo().
>
> If so, what about, say, VA_BITS which will presumably be needed as
> VMCOREINFO later for makedumpfile but is calculated for now in
> crash utility?
Makedumpfile may require VA_BITS, but it's pretty straight forward in crash
because we have the symbol values from the vmlinux file. That being
the case, if it's available in VMCOREINFO, it could be used as an
alternative for compressed kdumps or kdump ELF vmcores.
I know another example: VMCOREINFO("KERNELOFFSET"), which is actually
equal to "kaslr_offset".
It exists on x86, but is not utilized by crash util.
>
> Please don't take me wrong. I, as an author of kdump for arm64, just want
> to understand what parameters and why we want to have.
Right, although it's going to be kind of play-as-we-go. The crash utility is
going to require kdump-ELF and makedumpfile to offer the the things that
it needs, but kdump-ELF/makedumpfile may have its own requirements to generate
the things that the crash utility needs. As far as I can tell, at a minimum,
VA_BITS, kimage_voffset and PHYS_BASE would be required.
I'm not quite sure, but do you, as a maintainer of crash, expect those
parameters to be also appended for crash?
Thanks,
-Takahiro AKASHI
Thanks,
Dave
--
Crash-utility mailing list
Crash-utility(a)redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility
--
Thanks,
-Takahiro AKASHI