----- Original Message -----
On 10/23/2014 08:10 PM, Dave Anderson wrote:
>
>
> ----- Original Message -----
>> Hello Dave,
>>
>> On 10/22/2014 09:49 PM, Dave Anderson wrote:
>>>
>>> Hello Wenjian,
>>>
>>> First -- please send patches as email attachments. Even if I cut-and-paste
your
>>> patch from the crash-utility archives (which normally works), there are
>>> still no tabs in your patch.
>>
>> Appreciate your suggestion.
>>
>>> Now, with respect to the patch, it's not clear to me what would happen
if you
>>> make no changes to check_dumpfile_size()?
>>>
>>> It seems to me that a read error would occur regardless whether you change
>>> the PT_LOAD-related contents or not.
>>>
>>> If no changes are made:
>>>
>>> If a readmem() of a truncated page is attempted, read_netdump() would
>>> calculate an offset based upon the original PT_LOAD contents, but then
>>> the subsequent read() would fail, and would return a READ_ERROR.
>>>
>>> If your patch is applied:
>>>
>>> If a readmem() of a truncated page is attempted, read_netdump() would
not
>>> be able to calculate an offset, and would return a READ_ERROR.
>>>
>>> What's the difference? Why even bother making the changes?
>>>
>> If my patch is applied:
>> If a readmem() of a truncated page is attempted,
>>
>> (pls->zero_fill&& (paddr>= pls->phys_end)&&
(paddr< pls->zero_fill)),
>> ,his will be right. So read_netdump() will fill bufptr with zero and
>> return cnt.
>>
>> In my patch, information of PT_LOAD segment is modified, so that the segment
will
>> not go beyond the file size. And use zero to replace the lost part.
>>
>> Previously, we intend to do this work in makedumpfile by modifying the PT_LOAD
>> header of the ELF dump file. But kumagai atsushi thought it's not good to
make the
>> irreversible change. So we think do it in crash maybe a better choice.
>>
>>> I also don't understand what the difference between
"truncated" and "incomplete"
>>> is? Why did you separate the messages into two?
>>>
>> At first, I thought the difference is the incomplete flag.
>> Now, I think it's not necessary to distinguish them.
>>
>>
>>> Anyway, I had already created a patch in preparation for the changes to
>>> makedumpfile for ELF and compressed kdump vmcores. The patch will apply
>>> to the current github master branch. Please apply it alone, and tell me
>>> what happens.
>>
>> The patch can show the incomplete information, but will be interrupted by the
>> READ_ERROR (we don't change the PT_LOAD headers in makedumpfile any more).
>
> Well, it *should* be interrupted. What's worse, "interruption" or the
receipt
> of completely bogus zero-filled data?
>
> This whole scheme is even crazier than I thought. I'm going to have
> to make the initialization-time warning message even more ominous than
> I already have it.
>
Actually, I haven't realized it until doing the work in crash.
I think which is better or worse depends on what people want to do.
If incomplete flag exists, it may indicate people want to analyse the file,
and returning zero-filled data is better. Otherwise, "interruption" may be
expected.
Right, this needs much more discussion... (see below)
> But I still don't think it's necessary to modify the original PT_LOAD
> contents. If read_netdump() finds a legitimate offset, but that offset
> is beyond the end of the incomplete/truncated file, it can just check the
> INCOMPLETE flag and return a zero-filled buffer if it's set. Wouldn't
> that work?
It does work. And now, the question turns to be which way is better,
first: do it in check_dumpfile_size() by modifying relevant information
(not modifying the information in ELF file)
second: do it in read_netdump() by return zero-filled data with
checking offset and flag
isn't it?
Correct, check_dumpfile_size() is meant to be informational in nature.
If zero-filled data gets returned, it should really should be done
in read_netdump().
With the patch I that proposed, you can simplify this part of your patch:
+ unsigned int e_flag = (NULL == nd->elf64) ?
+ (nd->elf32)->e_flags :
(nd->elf64)->e_flags;
+ int status = e_flag & DUMP_ELF_INCOMPLETE;
+ if (status && read_ret >= 0) {
to just this:
if (is_incomplete_dump() && (read_ret >= 0)) {
But what about compressed kdumps? In the case of DUMP_DH_COMPRESSED_INCOMPLETE
dumpfiles, will makedumpfile manipulate the bitmaps and the page_desc.offset values
of all "missing" pages to point to the special "pd_zero" page? Will
it be
impossible to distinguish between legitimate zero-filled pages and "missing"
pages?
I suppose that if that is true, then the behavior should be the same for
both ELF and compressed kdumps, which would be to return zero-filled data.
On the other hand...
What's bothersome about automatically returning zero-filled data is that,
currently, if an ELF vmcore was originally created correctly, but gets truncated
*after* the fact, for example, when downloading it from a customer site,
check_dumpfile_size() will report the error. But when data from the truncated
region gets read, a read error is returned.
If zero-filled data is silently returned, I can imagine that there will be all
kinds of errors that will occur -- both display/content errors, but it could
easily cause SIGSEGV's in the crash utility when it tries to use zero-filled
data thinking that it's legitimate. How would the user even know whether it's
because the page was truncated or not?
The crash utility already has a --zero_excluded flag for returning zero-filled
memory if by chance a *legitimately* excluded/filtered page is read. It's
hardly ever used, because those pages should never be required. But I
wonder whether it could be re-purposed in this case, where by default an error
would be returned when reading missing pages, and zero-filled only if it is
explicitly requested with the --zero_excluded command line option? Or would
that be impossible with DUMP_DH_COMPRESSED_INCOMPLETE compressed kdumps?
Dave