Hi Kazu,
On Wed, Mar 30, 2022 at 09:27:18AM +0000, HAGIO KAZUHITO(萩尾 一仁) wrote:
-----Original Message-----
> Hi Kazu,
> On Wed, Mar 30, 2022 at 08:28:19AM +0000, HAGIO KAZUHITO(萩尾 一仁) wrote:
> > -----Original Message-----
> > > 1.) The vmcore file may be very big.
> > >
> > > For example, I have a vmcore file which is over 23G,
> > > and the panic kernel had 767.6G memory,
> > > its max_sect_len is 4468736.
> > >
> > > The current code spends too much time in the following loop:
> > > ..............................................
> > > for (i = 1; i < max_sect_len + 1; i++) {
> > >         dd->valid_pages[i] = dd->valid_pages[i - 1];
> > >         for (j = 0; j < BITMAP_SECT_LEN; j++, pfn++)
> > >                 if (page_is_dumpable(pfn))
> > >                         dd->valid_pages[i]++;
> > > ..............................................
> > >
> > > In my case, it takes about 56 seconds to finish this
> > > big loop.
> > >
> > > This patch moves the hweightXX macros to defs.h,
> > > and uses hweight64 to optimize the loop.
> > >
> > > For my vmcore, the loop now takes only about one second.
> > >
> > > 2.) Test results:
> > > # cat ./commands.txt
> > > quit
> > >
> > > Before:
> > >
> > > #echo 3 > /proc/sys/vm/drop_caches;
> > > #time ./crash -i ./commands.txt /root/t/vmlinux /root/t/vmcore > /dev/null 2>&1
> > > ............................
> > > real 1m54.259s
> > > user 1m12.494s
> > > sys 0m3.857s
> > > ............................
> > >
> > > After this patch:
> > >
> > > #echo 3 > /proc/sys/vm/drop_caches;
> > > #time ./crash -i ./commands.txt /root/t/vmlinux /root/t/vmcore > /dev/null 2>&1
> > > ............................
> > > real 0m55.217s
> > > user 0m15.114s
> > > sys 0m3.560s
> > > ............................
> >
> > Thank you for the improvement!
> >
> > As far as I tested on x86_64, it did not give such a big gain, but looking at
> > the user time, it will on arm64. Lianbo, can you reproduce this on arm64?
> >
> > With a 192GB x86_64 dumpfile, it improved slightly:
> >
> > $ time echo quit | ./crash vmlinux dump >/dev/null
> >
> > real 0m5.632s
> Thanks for the testing.
>
> I am curious why it takes only 5.632s for a 192G dumpfile.
> How much memory did the panic kernel in the dumpfile have?
>
> My vmcore has 767.6G memory, and its max_sect_len is 4468736.
> I got it with makedumpfile -d 0 and tested it without dropping caches
> to measure the change in the loop cost. As for memory, which size
> do you mean? That machine has 192GB memory.
>
> $ ls -lhs dump
> 193G -rw-------. 1 root root 193G Mar 30 17:07 dump
> $ file dump
> dump: Kdump compressed dump v6, system Linux, ...
> $ ./crash vmlinux dump
> MEMORY: 191.7 GB
> crash> help -D
> ...
> block_size: 4096
> sub_hdr_size: 10
> bitmap_blocks: 3088
> max_mapnr: 50593791
> ...
> total_valid_pages: 50178690
> max_sect_len: 12352 // added
Ok, it seems your max_sect_len is much smaller than mine.
> The max_sect_len looks very small compared with yours.. but
> 12352 * 4096 = 50593792
My max_sect_len is 4468736, so:
4468736 / 12352 = 361.78
The loop over (4468736 * 4096) bits costs 56s on my machine.
Assuming our CPUs run at the same speed,
your machine would only take (56 / 361.78 = 0.1548)s.
So you cannot see a big gain. :)
Thanks
Huang Shijie