On Fri, Jul 16, 2021 at 07:00:14AM +0000, HAGIO KAZUHITO(萩尾 一仁) wrote:
-----Original Message-----
> On Fri, Jul 09, 2021 at 06:39:44AM +0000, HAGIO KAZUHITO(萩尾 一仁) wrote:
> > -----Original Message-----
> > > On Mon, Jun 21, 2021 at 06:02:51AM +0000, HAGIO KAZUHITO(萩尾 一仁) wrote:
> > > > -----Original Message-----
> > > > > +
> > > > > +	if (read_pd(dd->dfd, offset, &pd)) {
> > > > > +		/*
> > > > > +		 * Truncated page descriptor at most
> > > > > +		 * references full page.
> > > > > +		 */
> > > > > +		expected_size += block_size;
> > > > > +		goto next;
> > > > > +	}
> > > > > +
> > > > > +	if (pd.offset == 0) {
> > > > > +		if (!first_empty_pd)
> > > > > +			first_empty_pd = page_idx;
> > > > > +		/*
> > > > > +		 * Incomplete pages at most use the
> > > > > +		 * whole page.
> > > > > +		 */
> > > > > +		expected_size += block_size;
> > > > > +	} else if (!pd.flags) {
> > > > > +		/*
> > > > > +		 * Zero page has no compression flags.
> > > > > +		 */
> > > >
> > > > Non-compressed page also has no compression flags.
> > > >
> > > > So something like (pd.offset == dd->data_offset) is needed to determine
> > > > whether the page is zero page? although it's not exact without excluding
> > > > zero pages.
> > > >
> > >
> > > Hi Kazu,
> > >
> > > Yes, you're right. Thanks for spotting it.
> > >
> > > I've added a code path for the case when zero pages are not excluded.
> > > It's a no-brainer. However, I've had some issues figuring out whether a
> > > page descriptor references the zero page when we start to differentiate
> > > zero/non-zero uncompressed pages and calculate the expected size accordingly.
> > >
> > > dd->data_offset points to the beginning of page descriptors and it
> > > doesn't help to find zero page:
> > >
> > >   data_offset -> pd for pfn 0
> > >                  pd for pfn 1
> > >                  ...
> > >                  pd for pfn N
> > >                  ... <- some gap ???
> > >                  zero page
> > >                  pfn 0
> >
> > Oh, you're right, I misread something.
> >
> > There should be no gap between pd for pfn N and zero page, but anyway
> > we cannot get the number of pds in advance without counting the bitmap..
> >
>
> I don't know, maybe some parts of the bitmap weren't flushed either. I'll
> investigate further why "valid_pages * descriptor size" is not
> equal to the offset from data_offset to the zero page on an incomplete dump.
Ugh, sorry, the bitmap can also be incomplete in cyclic mode. In that case,
page_is_dumpable() cannot be used to estimate the size in the first place;
maybe all we can do is get total_valid_pages..

Hi Kazu,

Thanks for clarifying this. I think a two-pass approach might still work with
this.
I have an incomplete dump where data_offset = 0x8130000,
zero_page_offset = 0x93d5c90 and total_valid_pages = 476223.

pd_count = (zero_page_offset - data_offset) / 24 (pd size) = 814726.

So we know that 338503 pages (814726 - 476223) are missing from the bitmap,
and the dump is at most 338503 pages smaller than it could be (the maximum
applies if every missing page takes a full page).
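
For reference, the arithmetic above can be sketched in C. This is only an
illustration: the helper names pd_count() and missing_pages() are made up for
this example and are not functions in crash or makedumpfile. The 24-byte
descriptor size and the offsets are the ones quoted above, with the offsets
read as hexadecimal.

```c
#include <assert.h>
#include <stdint.h>

/* Size of one on-disk page descriptor in bytes, as quoted in the message. */
#define PD_SIZE 24ULL

/*
 * Hypothetical helper: number of page descriptors that fit between
 * data_offset and the start of the zero page.
 */
static uint64_t pd_count(uint64_t data_offset, uint64_t zero_page_offset)
{
	assert(zero_page_offset >= data_offset);
	return (zero_page_offset - data_offset) / PD_SIZE;
}

/*
 * Hypothetical helper: descriptors that have no corresponding bit in the
 * (incomplete) bitmap, i.e. pd_count minus total_valid_pages.
 */
static uint64_t missing_pages(uint64_t data_offset, uint64_t zero_page_offset,
			      uint64_t total_valid_pages)
{
	uint64_t count = pd_count(data_offset, zero_page_offset);

	assert(count >= total_valid_pages);
	return count - total_valid_pages;
}
```

With data_offset = 0x8130000, zero_page_offset = 0x93d5c90 and
total_valid_pages = 476223, pd_count() gives 814726 and missing_pages()
gives 338503.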
Either way, it seems you're fine with the first four patches:
https://listman.redhat.com/archives/crash-utility/2021-June/msg00078.html
Do we want to merge them? (v3 of the series only updates the last patch.)

Regards,
Roman
Thanks,
Kazu