On Wed, Jan 08, 2014 at 03:32:01PM -0500, Dave Anderson wrote:
>
>
> ----- Original Message -----
> >
> >
> >
> > I have proposed a patch to makedumpfile to (optionally) exclude from
> > a dump the page structures representing excluded pages.
> > The idea being that a system with terabytes of system memory has
> > millions of pages of page structures. And most of them are unneeded.
> >
> > That patch thread begins here:
> >
> > http://marc.info/?l=kexec&m=138853299130125&w=2
> >
> > Dave [Anderson] raised these crash-related issues;
> > Although I'm sure you tested this, I find it amazing that
> > "kmem -[fF]" is the only command option that is affected?
> > If I'm not mistaken, this would be the first time that legitimate
> > kernel data would be excluded from the dump, and the user would
> > have no obvious way of knowing that it had been done, correct?
> > If it were encoded in the header somewhere, at least a
> > warning message could be printed during crash initialization.
> > ...
> > Right, but look at all of the other page struct offsets in addition to
> > page.lru that are used. The page.flags usage comes to mind, and for
> > example, what would "kmem -p" display for the missing pages?
> > Or "kmem <address>"? And would "kmem -i" display invalid data?
> > I'm just speculating off the top of my head, but the page structure is
> > such a fundamental data structure with several of its fields being used,
> > just enter "help -o page_" to see all of its potential member usages.
> >
> > So I am submitting two patches for your consideration, should the patch
> > to exclude unused vmemmap pages be taken into makedumpfile.
> >
> > - [PATCH 1/2] crash: initial note of excluded page structures
> > This one makes crash startup look like this:
> > This program has absolutely no warranty. Enter "help warranty" for details.
> >
> > NOTE: Unused vmemmap page structures are excluded from this dump.
> > GNU gdb (GDB) 7.6
> > Copyright (C) 2013 Free Software Foundation, Inc.
> >
> > - [PATCH 2/2] crash: kmem warnings for excluded page structures
> > This patch modifies kmem options -f, -F, -s addr, -S addr, and -i.
> > Those are the only options that I could detect looking for excluded
> > pages.
> >
> > This patch applies on top of the first, and adds some warnings to the
> > output of these kmem options. For example:
> >
> > crash> kmem -f
> > Note: kmem -f may fail because unused page structures are excluded from this dump.
> > NODE
> >   0
> > ZONE  NAME  SIZE  FREE  MEM_MAP           START_PADDR  START_MAPNR
> >   0   DMA   4095  3934  ffffea0000000038  1000         0
> > AREA  SIZE  FREE_AREA_STRUCT  BLOCKS  PAGES
> >   0   4k    ffff880000013068  2       2
> > ...
> >
> > crash> kmem -i
> >               PAGES        TOTAL      PERCENTAGE
> >  TOTAL MEM  128008147     488.3 GB         ----
> >       FREE  127599276     486.8 GB   99% of TOTAL MEM
> >       USED     408871       1.6 GB    0% of TOTAL MEM
> >     SHARED      11049      43.2 MB    0% of TOTAL MEM
> >    BUFFERS       5722      22.4 MB    0% of TOTAL MEM
> >     CACHED      44638     174.4 MB    0% of TOTAL MEM
> >       SLAB      62139     242.7 MB    0% of TOTAL MEM
> >
> > TOTAL SWAP    4893032      18.7 GB         ----
> >  SWAP USED          0            0    0% of TOTAL SWAP
> >  SWAP FREE    4893032      18.7 GB  100% of TOTAL SWAP
> >
> > Note: 1970727 free pages not found (excluded); results are incomplete.
> > Unused page structures are excluded from this dump.
> >
> > -Cliff Wickman
> > cpw(a)sgi.com
>
> Cliff,
>
> Can you make this patch far simpler? I would prefer that an
> error message *follows* the gdb banner, i.e., where a number of
> current warnings get displayed. You seem to be showing it just
> above the gdb banner, where it kind of gets lost.
Okay I'll move the message.
> Secondly, I'm not sure where/how you are determining the 1970727
> pages that are excluded. Is it possible to put that in the
> early warning message?
The 1970727 pages were counted by the second patch. That is, I
put a counter in the readmem path that counted 'excluded' errors.
So there were 1970727 such errors during the execution of the
kmem -i.
It is probably not worth it to make a place for a new number
in the dump header. Knowing how many million pages were
excluded won't help if 'your' problem page was one of them.
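
The counting approach described above (a counter in the read path that
tallies 'excluded' errors while a command runs) could look roughly like
the following. This is only a minimal C sketch: read_page(), counted_read(),
scan_pages() and READ_EXCLUDED are hypothetical names, not crash's actual
readmem interface, and the stand-in reader simply pretends that every
odd-numbered page was excluded.

```c
#include <stdio.h>

/* Hypothetical return codes; not taken from the crash sources. */
enum read_status { READ_OK = 0, READ_EXCLUDED = -1 };

/* Number of 'excluded' errors seen since the counter was last reset. */
static unsigned long excluded_reads;

/* Stand-in for a dump reader: pretends every odd-numbered page was excluded. */
static enum read_status read_page(unsigned long pfn)
{
    return (pfn & 1) ? READ_EXCLUDED : READ_OK;
}

/* Wrapper on the read path that counts reads hitting an excluded page. */
static enum read_status counted_read(unsigned long pfn)
{
    enum read_status rc = read_page(pfn);

    if (rc == READ_EXCLUDED)
        excluded_reads++;
    return rc;
}

/*
 * Walk a range of pages, as a command such as "kmem -i" might, and
 * return how many of the reads failed because the page was excluded.
 */
static unsigned long scan_pages(unsigned long first, unsigned long count)
{
    unsigned long pfn;

    excluded_reads = 0;
    for (pfn = first; pfn < first + count; pfn++)
        (void)counted_read(pfn);
    return excluded_reads;
}

/* Print the trailing note only when at least one read was excluded. */
static void report_excluded(unsigned long missing)
{
    if (missing)
        printf("Note: %lu pages not found (excluded); results are incomplete.\n",
               missing);
}
```

A caller would run report_excluded(scan_pages(first, count)) at the end of
the command. The count is only known after the scan, which is why it cannot
appear in a message printed at startup without a new field in the dump header.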
>
> Thirdly, I'm not convinced that the kmem locations where you
> are adding per-option warning messages are going to be the only
> place where problems may arise. For that reason, I would
> not even bother putting them in all those various locations,
> but rather listing the "known" commands that may fail in the
> early warning message. So do something like:
Drop the 2nd patch then. If you think it might be more misleading
than helpful, I'm fine with just the initialization warning.
OK -- I just don't want to clutter the code with a bunch of per-option
warning messages, but would rather have an ominous, as-wordy-as-you'd-like-it,
message delivered right up front.
Take a look at "ERASEINFO_DATA" in the crash sources, and do a similar thing.
It's the same concept, where in that case, sensitive kernel data may have been
erased from the vmcore.
Thanks,
Dave
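
The single up-front warning suggested above, keyed off a flag recorded
for the dump, might be sketched as follows. This is an illustration under
stated assumptions, not the crash implementation: EXCLUDED_VMEMMAP_FLAG,
struct dump_info and format_exclusion_warning() are hypothetical, and how
crash actually handles ERASEINFO_DATA should be checked in its sources.
The list of commands is the one named earlier in this thread.

```c
#include <stdio.h>

/* Hypothetical flag bit; neither the name nor the value comes from crash or makedumpfile. */
#define EXCLUDED_VMEMMAP_FLAG 0x1UL

/* Hypothetical, simplified stand-in for per-dump state filled in from the dump header. */
struct dump_info {
    unsigned long flags;
};

/*
 * Format the startup warning into buf when the dump is marked as having
 * had page structures excluded. Returns the length snprintf() reports,
 * or 0 (with buf set to an empty string) when the flag is not set.
 */
static int format_exclusion_warning(const struct dump_info *di,
                                    char *buf, size_t len)
{
    if (!(di->flags & EXCLUDED_VMEMMAP_FLAG)) {
        if (len)
            buf[0] = '\0';
        return 0;
    }
    return snprintf(buf, len,
        "WARNING: unused vmemmap page structures are excluded from this dump.\n"
        "         Commands known to be affected: kmem -f, -F, -s, -S, -i.\n"
        "         Other commands that read page structures may also fail.\n");
}
```

Keeping the check in one place, run once after the gdb banner, avoids
scattering per-option warnings through the kmem code paths.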