----- Original Message -----
On 10/24/2014 10:05 PM, Dave Anderson wrote:
> With the patch I that proposed, you can simplify this part of your patch:
>
> + unsigned int e_flag = (NULL == nd->elf64) ?
> + (nd->elf32)->e_flags :
(nd->elf64)->e_flags;
> + int status = e_flag& DUMP_ELF_INCOMPLETE;
> + if (status&& read_ret>= 0) {
>
> to just this:
> if (is_incomplete_dump()&& (read_ret>= 0)) {
>
> But what about compressed kdumps? In the case of DUMP_DH_COMPRESSED_INCOMPLETE
> dumpfiles, will makedumpfile manipulate the bitmaps and the page_desc.offset values
> of all "missing" pages to point to the special "pd_zero" page?
Will it be
> impossible to distinguish between legitimate zero-filled pages and
> "missing" pages?
>
> I suppose that if that is true, then the behavior should be the same for
> both ELF and compressed kdumps, which would be to return zero-filled data.
>
> On the other hand...
>
> What's bothersome about automatically returning zero-filled data is that,
> currently, if an ELF vmcore was originally created correctly, but gets truncated
> *after* the fact, for example, when downloading it from a customer site,
> check_dumpfile_size() will report the error. But when data from the truncated
> region gets read, a read error is returned.
>
> If zero-filled data is silently returned, I can imagine that there will be all
> kinds of errors that will occur -- both display/content errors, but it could
> easily cause SIGSEGV's in the crash utility when it tries to use zero-filled
> data thinking that it's legitimate. How would the user even know whether
it's
> because the page was truncated or not?
The patch has been adjusted according to your suggestions. The kdump part has been
fixed. And the warning information is added in both elf part and kdump part.
Thank you for your guiding.
> The crash utility already has a --zero_excluded flag for returning zero-filled
> memory if by chance a *legitimately* excluded/filtered page is read. It's
> hardly ever used, because those pages should never be required. But I
> wonder whether it could be re-purposed in this case, where by default an error
> would be returned when reading missing pages, and zero-filled only if it is
> explicitly requested with the --zero_excluded command line option? Or would
> that be impossible with DUMP_DH_COMPRESSED_INCOMPLETE compressed kdumps?
I think it's a good idea and is completed in patch. It can also work in kdump part.
The attachment is the latest patch.
Thanks
Zhou Wenjian
A couple notes and questions re: your latest patch:
diff --git a/diskdump.c b/diskdump.c
index 1d6f7a5..850b174 100644
--- a/diskdump.c
+++ b/diskdump.c
@@ -999,6 +999,17 @@ cache_page(physaddr_t paddr)
if (FLAT_FORMAT()) {
if (!read_flattened_format(dd->dfd, pd.offset, dd->compressed_page, pd.size))
return READ_ERROR;
What about FLAT_FORMAT()? Should the (is_incomplete_dump()/pd.offset==0) test below be
done before the FLAT_FORMAT() check?
+ } else if (is_incomplete_dump() && (0 == pd.offset)) {
Is this how the makedumpfile patch marks pages that have been truncated, i.e., by
creating a page_desc_t that has a pd.offset == 0?
+ /*
+ * if --zero_excluded is specified and incomplete flag is set in dump file,
+ * zero will be used to fill the lost part.
+ */
+ if (!(*diskdump_flags & ZERO_EXCLUDED))
+ return READ_ERROR;
+ error(WARNING, "%s: data may be lost\n"
+ " pfn:%lld\n\n"
+ ,pc->dumpfile,pfn);
When --zero_excluded is entered on the crash command line, then the user knows exactly
what he is doing, and so there should not be any error/warning messages from the read
command itself. A message should only be displayed if debug mode is turned on. For
example, this is how it is done currently, but only if debug is set to 8:
if (CRASHDEBUG(8))
fprintf(fp, "read_diskdump: zero-fill: "
"paddr/pfn: %llx/%lx\n",
(ulonglong)paddr, pfn);
+ memset(dd->compressed_page, 0, dd->block_size);
} else {
if (lseek(dd->dfd, pd.offset, SEEK_SET) == failed)
return SEEK_ERROR;
diff --git a/netdump.c b/netdump.c
index abc85e0..c8f057e 100644
--- a/netdump.c
+++ b/netdump.c
@@ -608,7 +608,21 @@ read_netdump(int fd, void *bufptr, int cnt, ulong addr, physaddr_t
paddr)
"offset: %llx\n", (ulonglong)offset);
return SEEK_ERROR;
}
- if (read(nd->ndfd, bufptr, cnt) != cnt) {
+ /*
+ * if --zero_excluded is specified and incomplete flag is set in ELF file,
+ * zero will be used to fill the lost part.
+ */
+ ssize_t read_ret = read(nd->ndfd, bufptr, cnt);
+ if (read_ret != cnt) {
+ if (is_incomplete_dump() && (read_ret >= 0 &&
+ *diskdump_flags & ZERO_EXCLUDED)) {
+ error(WARNING, "%s: data may be lost\n"
+ " offset:%llx\n\n",
+ pc->dumpfile, offset);
Same thing here... if a user enters --zero_excluded on the command line, he should
expect strange behavior as a result. Intermingling a bunch of WARNING messages
like the one above will only make it more confusing.
+ bufptr += read_ret;
+ bzero(bufptr, cnt - read_ret);
+ return cnt;
+ }
if (CRASHDEBUG(8))
fprintf(fp, "read_netdump: READ_ERROR: "
"offset: %llx\n", (ulonglong)offset);
Dave