Re: [Crash-utility] add support for incomplete elf dump file

Friday, 24 October 2014

----- Original Message -----
...
 On 10/23/2014 08:10 PM, Dave Anderson wrote:
 >
 >
 > ----- Original Message -----
 >> Hello Dave,
 >>
 >> On 10/22/2014 09:49 PM, Dave Anderson wrote:
 >>>
 >>> Hello Wenjian,
 >>>
 >>> First -- please send patches as email attachments.  Even if I cut-and-paste
your
 >>> patch from the crash-utility archives (which normally works), there are
 >>> still no tabs in your patch.
 >>
 >> Appreciate your suggestion.
 >>
 >>> Now, with respect to the patch, it's not clear to me what would happen
if you
 >>> make no changes to check_dumpfile_size()?
 >>>
 >>> It seems to me that a read error would occur regardless whether you change
 >>> the PT_LOAD-related contents or not.
 >>>
 >>> If no changes are made:
 >>>
 >>>     If a readmem() of a truncated page is attempted, read_netdump() would
 >>>     calculate an offset based upon the original PT_LOAD contents, but then
 >>>     the subsequent read() would fail, and would return a READ_ERROR.
 >>>
 >>> If your patch is applied:
 >>>
 >>>     If a readmem() of a truncated page is attempted, read_netdump() would
not
 >>>     be able to calculate an offset, and would return a READ_ERROR.
 >>>
 >>> What's the difference?  Why even bother making the changes?
 >>>
 >> If my patch is applied:
 >> 	If a readmem() of a truncated page is attempted,
 >>
 >> 	(pls->zero_fill&&  (paddr>= pls->phys_end)&& 
(paddr<  pls->zero_fill)),
 >> 	,his will be right. So read_netdump() will fill bufptr with zero and
 >> 	return cnt.
 >>
 >> In my patch, information of PT_LOAD segment is modified, so that the segment
will
 >> not go beyond the file size. And use zero to replace the lost part.
 >>
 >> Previously, we intend to do this work in makedumpfile by modifying the PT_LOAD
 >> header of the ELF dump file. But kumagai atsushi thought it's not good to
make the
 >> irreversible change. So we think do it in crash maybe a better choice.
 >>
 >>> I also don't understand what the difference between
"truncated" and "incomplete"
 >>> is?  Why did you separate the messages into two?
 >>>
 >> At first, I thought the difference is the incomplete flag.
 >> Now, I think it's not necessary to distinguish them.
 >>
 >>
 >>> Anyway, I had already created a patch in preparation for the changes to
 >>> makedumpfile for ELF and compressed kdump vmcores.  The patch will apply
 >>> to the current github master branch.  Please apply it alone, and tell me
 >>> what happens.
 >>
 >> The patch can show the incomplete information, but will be interrupted by the
 >> READ_ERROR (we don't change the PT_LOAD headers in makedumpfile any more).
 >
 > Well, it *should* be interrupted.  What's worse, "interruption" or the
receipt
 > of completely bogus zero-filled data?
 >
 > This whole scheme is even crazier than I thought.  I'm going to have
 > to make the initialization-time warning message even more ominous than
 > I already have it.
 >
 Actually, I haven't realized it until doing the work in crash.
 I think which is better or worse depends on what people want to do.
 If incomplete flag exists, it may indicate people want to analyse the file,
 and returning zero-filled data is better. Otherwise, "interruption" may be
 expected. 
Right, this needs much more discussion... (see below)

...

 > But I still don't think it's necessary to modify the original PT_LOAD
 > contents.  If read_netdump() finds a legitimate offset, but that offset
 > is beyond the end of the incomplete/truncated file, it can just check the
 > INCOMPLETE flag and return a zero-filled buffer if it's set.   Wouldn't
 > that work?

 It does work. And now, the question turns to be which way is better,
 first:   do it in check_dumpfile_size() by modifying relevant information
 	 (not modifying the information in ELF file)

 second:  do it in read_netdump() by return zero-filled data with
           checking offset and flag
 isn't it? 
Correct, check_dumpfile_size() is meant to be informational in nature.
If zero-filled data gets returned, it should really should be done
in read_netdump().

With the patch I that proposed, you can simplify this part of your patch:

+                       unsigned int e_flag = (NULL == nd->elf64) ?
+                                                (nd->elf32)->e_flags :
(nd->elf64)->e_flags;
+                       int status = e_flag & DUMP_ELF_INCOMPLETE;
+                       if (status && read_ret >= 0) {

to just this:
                        if (is_incomplete_dump() && (read_ret >= 0)) { 

But what about compressed kdumps?  In the case of DUMP_DH_COMPRESSED_INCOMPLETE
dumpfiles, will makedumpfile manipulate the bitmaps and the page_desc.offset values
of all "missing" pages to point to the special "pd_zero" page?  Will
it be
impossible to distinguish between legitimate zero-filled pages and "missing"
pages?

I suppose that if that is true, then the behavior should be the same for
both ELF and compressed kdumps, which would be to return zero-filled data.

On the other hand...

What's bothersome about automatically returning zero-filled data is that,
currently, if an ELF vmcore was originally created correctly, but gets truncated
*after* the fact, for example, when downloading it from a customer site,
check_dumpfile_size() will report the error.  But when data from the truncated
region gets read, a read error is returned. 

If zero-filled data is silently returned, I can imagine that there will be all 
kinds of errors that will occur -- both display/content errors, but it could
easily cause SIGSEGV's in the crash utility when it tries to use zero-filled
data thinking that it's legitimate.  How would the user even know whether it's
because the page was truncated or not?

The crash utility already has a --zero_excluded flag for returning zero-filled
memory if by chance a *legitimately* excluded/filtered page is read.  It's
hardly ever used, because those pages should never be required.  But I
wonder whether it could be re-purposed in this case, where by default an error
would be returned when reading missing pages, and zero-filled only if it is
explicitly requested with the --zero_excluded command line option?  Or would
that be impossible with DUMP_DH_COMPRESSED_INCOMPLETE compressed kdumps?

Dave

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

Re: [Crash-utility] add support for incomplete elf dump file