-----Original Message-----
On Fri, Jun 18, 2021 at 06:42:51AM +0000, HAGIO KAZUHITO(萩尾 一仁)
wrote:
> -----Original Message-----
> > crash warns that an ELF dump is incomplete like this:
> >
> > WARNING: vmcore: may be truncated or incomplete
> > PT_LOAD p_offset: 218960740
> > p_filesz: 4194304
> > bytes required: 223155044
> > dumpfile size: 219960740
> >
> > The warning saves a lot of time when figuring out what went wrong, but
> > crash lacks this explicit incomplete-dump detection for compressed kdumps.
> >
> > The change adds the warning for compressed kdumps:
> >
> > WARNING: vmcore: may be truncated or incomplete
> > data_offset: 262144
> > block_size: 65536
> > total_valid_pages: 2438
> > first_empty_pd: 100
> > bytes required: 157467661
> > dumpfile size: 265982
> >
> > bytes required is computed as:
> >
> > data_offset (kdump header + bitmaps)
> > +
> > total_valid_pages * page descriptor size (page descriptors)
> > +
> > zero page (that immediately follows page descriptors)
> > +
> > compressed size of every present page in the list of valid pages
> > +
> > block_size for every incomplete page (i.e. the page descriptor has
> > zero offset)
>
> hmm, this will be good for debugging, but the "bytes required" will not
> match the size of the complete dump at all if it's compressed, and
> it will also be confusing.
>
Hi Kazu,

"bytes required" is computed using the compressed size of each page,
found by walking the valid page descriptors, so it should be quite
accurate. I tested it on complete and incomplete dumps, both handmade
and from the field. With the patch, crash clearly reports when a dump
is incomplete, and I haven't seen any false positives on complete
cores.
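To spell out the components from the quoted description as a single
expression, the estimate the patch computes is roughly:

  estimated bytes required = data_offset
                           + total_valid_pages * sizeof(page_desc_t)
                           + block_size                          (the zero page)
                           + sum of pd.size over compressed pages
                           + block_size for each incomplete page  (pd.offset == 0)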
But you're right that "bytes required" may not match the size of the
complete dump. The complete core might be a bit smaller, because some
page descriptors are missing from the incomplete core and we don't know
the compressed size of the pages they would have referenced.
Right.
On the other hand, netdump's notion of "bytes required" is not exactly
accurate either, so it's no different from what's provided in the
patch.

I can update both netdump and diskdump to report "estimated bytes
required" instead of "bytes required" to make it clear that it's only
an estimate. Would that work for you?
My understanding is that netdump's "bytes required" means the required
size of the segment corresponding to the PT_LOAD header currently being
checked, not the whole vmcore size. So it's ambiguous, but it is the
actual required size, not an estimated one.
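(For illustration with the numbers from the ELF warning quoted at the
top: bytes required = p_offset + p_filesz = 218960740 + 4194304 =
223155044 for that PT_LOAD segment, while the dumpfile is only
219960740 bytes, i.e. the last segment is short by 3194304 bytes.)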
(It's probably hard to calculate the whole size, because makedumpfile -E
writes the PT_LOAD headers and segments alternately, and the headers of
incomplete segments are filled with zeroes...)
So could you update the diskdump to add "estimated"?
And there are a few additional comments on the patch below.
> So, I would prefer the number of valid page descriptors rather than
> the "bytes required" and "dumpfile size". What do you think?
>
The number of valid descriptors is first_empty_pd - 1.
Yes. I imagined that I would want to know what percentage of the
dumpfile was truncated when kdump fails, and for that first_empty_pd
would be enough. And the estimated size might be unexpectedly
misleading.
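(For example, with the numbers from the warning above, first_empty_pd
is 100 out of total_valid_pages = 2438, so roughly 100 / 2438, or about
4%, of the valid pages have usable page descriptors in the truncated
dump.)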
I honestly think we should print an estimate of the required bytes to
make crash friendly to engineers who don't know crash/makedumpfile
internals :)
OK, there might be engineers who like the size estimation.
@@ -548,6 +550,12 @@ read_dump_header(char *file)
         ulong pfn;
         int i, j, max_sect_len;
         int is_split = 0;
+        struct stat64 stat;
+        page_desc_t pd;
+        ulong page_idx;
+        int first_empty_pd = 0;
+        int zero_page_counted = 0;
+        size_t expected_size = 0;
 
         if (block_size < 0)
                 return FALSE;
@@ -898,13 +906,72 @@ restart:
                 pfn = start;
         }
+        expected_size = dd->data_offset;
         dd->valid_pages = calloc(sizeof(ulong), max_sect_len + 1);
         dd->max_sect_len = max_sect_len;
         for (i = 1; i < max_sect_len + 1; i++) {
                 dd->valid_pages[i] = dd->valid_pages[i - 1];
                 for (j = 0; j < BITMAP_SECT_LEN; j++, pfn++)
-                        if (page_is_dumpable(pfn))
-                                dd->valid_pages[i]++;
+                        if (page_is_dumpable(pfn)) {
+                                page_idx = dd->valid_pages[i]++;
+
+                                offset = dd->data_offset +
+                                         page_idx * sizeof(pd);
You can lengthen a line up to 100 chars, so this doesn't need to be wrapped.
+
+                                if (read_pd(dd->dfd, offset, &pd)) {
+                                        /*
+                                         * Truncated page descriptor at most
+                                         * references full page.
+                                         */
+                                        expected_size += block_size;
+                                        goto next;
+                                }
+
+                                if (pd.offset == 0) {
+                                        if (!first_empty_pd)
+                                                first_empty_pd = page_idx;
+                                        /*
+                                         * Incomplete pages at most use the
+                                         * whole page.
+                                         */
+                                        expected_size += block_size;
+                                } else if (!pd.flags) {
+                                        /*
+                                         * Zero page has no compression flags.
+                                         */
A non-compressed page also has no compression flags.
So something like (pd.offset == dd->data_offset) would be needed to determine
whether the page is the zero page, although it's not exact without excluding
zero pages.
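A minimal, purely illustrative sketch of how that case could be split,
reusing the variable names from the patch; is_zero_page_offset() is a
hypothetical placeholder for whatever test ends up identifying the zero
page (e.g. the pd.offset == dd->data_offset check suggested above):

        } else if (!pd.flags) {
                /*
                 * No compression flags: either the shared zero page
                 * or an uncompressed page.
                 */
                if (is_zero_page_offset(pd.offset)) {
                        /* count the single zero page only once */
                        if (!zero_page_counted) {
                                expected_size += block_size;
                                zero_page_counted = 1;
                        }
                } else {
                        /* an uncompressed page occupies a full block */
                        expected_size += block_size;
                }
        }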
+                                        if (!zero_page_counted) {
+                                                expected_size += block_size;
+                                                zero_page_counted = 1;
+                                        }
+                                } else if (pd.flags) {
+                                        /* Regular compressed page */
+                                        expected_size += pd.size;
+                                }
+
+next:
+                                expected_size += sizeof(pd);
+                        }
+        }
+
+        if (stat64(dd->filename, &stat) < 0) {
+                error(INFO, "%s: cannot stat %s\n",
+                      DISKDUMP_VALID() ? "diskdump" : "compressed kdump",
+                      dd->filename);
+                goto err;
+        }
+
+        if (expected_size > stat.st_size || first_empty_pd) {
+                error(WARNING,
+                      "%s: may be truncated or incomplete\n"
+                      " data_offset: %lld\n"
+                      " block_size: %lld\n"
+                      " total_valid_pages: %lld\n"
+                      " first_empty_pd: %lld\n"
+                      " bytes required: %lld\n"
Please add "estimated".
Thanks,
Kazu
+ " dumpfile size: %lld\n\n",
+ dd->filename, dd->data_offset,
+ dd->block_size, dd->valid_pages[dd->max_sect_len],
+ first_empty_pd, expected_size, stat.st_size);
}
return TRUE;
--
2.32.0