----- Original Message -----
Hi Dave,
Thank you for your comment!
On 2/13/2018 2:33 PM, Dave Anderson wrote:
>
> Hi Kazuhito,
>
> I am planning on releasing crash-7.2.1 today, so this will have to be deferred
> to 7.2.2.
>
> Because of your questions about ppc64, possible backwards-compatibility issues,
> or potential future changes to page.flags usage, this permanent change to the
> is_page_ptr() function solely for the purposes of SLUB's count_partial() function
> makes me nervous.
Sorry, my explanation was not sufficient; let me add a little more detail.
As far as I know, count_partial() is the function that would benefit most from
this patch, but is_page_ptr() can also be called many times through other
functions, so this would improve them as well. Moreover, in the worst case
is_page_ptr()'s loop can iterate NR_MEM_SECTIONS() times, which is 33554432
on x86_64 (5-level paging) if I understand correctly.
So I thought that ideally we should improve is_page_ptr() itself if possible,
rather than find the callers that could call it many times and change them.
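(For reference on the 33554432 figure: NR_MEM_SECTIONS() should be
1UL << (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS), and with 5-level paging on
x86_64 that is 1UL << (52 - 27) = 1UL << 25 = 33554432, if I am reading the
kernel headers correctly.)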
Also, I am willing to drop the definition of VMEMMAP_VADDR for ppc64 this
time.
OK, I understand your point. But what concerns me is that the function's
purpose is to definitively identify whether the incoming address is a valid
page structure address. If an invalid address gets passed into is_page_ptr(),
your patch would take that invalid address, calculate an invalid "nr", and
continue from there, right?
Dave
>
> There is really no compelling reason that count_partial() absolutely *must* use
> is_page_ptr(), and so I'm thinking that perhaps you could come up with a less
> heavy-handed method for simply testing whether a page.lru entry points to another
> vmemmap'd page. Something along the lines of adding this for
> vmemmap-enabled kernels:
>
> #define IN_VMEMMAP_RANGE(page) ((page >= VMEMMAP_VADDR) && (page <= VMEMMAP_END))
>
> and then have count_partial() replace the is_page_ptr() call with another
> slub function that does something like this for vmemmap-enabled kernels:
>
> (IN_VMEMMAP_RANGE(next) && accessible(next))
>
> Or instead of accessible(), it could read "next" as a list_head with RETURN_ON_ERROR,
> and verify that next->prev points back to the current list_head.
>
> Non-vmemmap-enabled kernels could still use is_page_ptr().
>
> What do you think of doing something like that?
Given the possible compatibility issues you mentioned, I think the way you
suggested may well be enough for now. I'll try a method like the one above.
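Just to make sure I understand the idea, a rough, untested sketch of the
helper I have in mind for vmemmap-enabled kernels would be something like the
following (the function name and details are only placeholders):

    #define IN_VMEMMAP_RANGE(page) ((page >= VMEMMAP_VADDR) && (page <= VMEMMAP_END))

    /*
     * Placeholder helper: "next" is the candidate page.lru address taken
     * from the partial list, "curr" is the address of the current list_head.
     */
    static int
    vmemmap_page_listed(ulong next, ulong curr)
    {
            ulong list[2];          /* list_head.next, list_head.prev */

            if (!IN_VMEMMAP_RANGE(next))
                    return FALSE;

            /* Read "next" as a list_head and verify that its prev member
             * points back to the current list_head. */
            if (!readmem(next, KVADDR, list, sizeof(list),
                "page.lru list_head", RETURN_ON_ERROR|QUIET))
                    return FALSE;

            return (list[1] == curr) ? TRUE : FALSE;
    }

Non-vmemmap kernels would keep using is_page_ptr() as they do now.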
Thanks,
Kazuhito Hagio
>
> Dave
>
>
>
>
> ----- Original Message -----
>> Hi,
>>
>> The "kmem -[sS]" commands can take several minutes to complete with
>> the following conditions:
>> * The system has a lot of memory sections with CONFIG_SPARSEMEM.
>> * The kernel uses SLUB and it has a very long partial slab list.
>>
>> crash> kmem -s dentry | awk '{print strftime("%T"), $0}'
>> 10:18:34 CACHE            NAME      OBJSIZE  ALLOCATED     TOTAL   SLABS  SSIZE
>> 10:19:41 ffff88017fc78a00 dentry        192    9038949  10045728  239184     8k
>> crash> kmem -S dentry | bash -c 'cat >/dev/null ; echo $SECONDS'
>> 334
>>
>> One of the causes is that is_page_ptr(), which count_partial() calls, checks
>> whether a given slub page address is a page struct by searching all memory
>> sections linearly for the one that includes it.
>>
>>         nr_mem_sections = NR_MEM_SECTIONS();
>>         for (nr = 0; nr < nr_mem_sections ; nr++) {
>>                 if ((sec_addr = valid_section_nr(nr))) {
>>                         ...
>>
>> With CONFIG_SPARSEMEM{_VMEMMAP}, we can calculate the memory section that
>> includes a page struct directly, either from its page.flags or from its
>> address and VMEMMAP_VADDR. By doing so, this patch significantly reduces the
>> amount of computation in that case.
>>
>> crash> kmem -s dentry | awk '{print strftime("%T"), $0}'
>> 10:34:55 CACHE            NAME      OBJSIZE  ALLOCATED     TOTAL   SLABS  SSIZE
>> 10:34:55 ffff88017fc78a00 dentry        192    9038949  10045728  239184     8k
>> crash> kmem -S dentry | bash -c 'cat >/dev/null ; echo $SECONDS'
>> 2
>>
>> This patch uses VMEMMAP_VADDR. It is not defined on PPC64, but it looks
>> like PPC64 supports the VMEMMAP flag and machdep->machspec->vmemmap_base
>> holds the corresponding base address, so this patch also defines it for
>> PPC64. This might need some help from PPC folks.
>>
>> Signed-off-by: Kazuhito Hagio <k-hagio(a)ab.jp.nec.com>
>> ---
>>  defs.h   |  2 ++
>>  memory.c | 15 +++++++++++++++
>>  2 files changed, 17 insertions(+)
>>
>> diff --git a/defs.h b/defs.h
>> index aa17792..84e68ca 100644
>> --- a/defs.h
>> +++ b/defs.h
>> @@ -3861,6 +3861,8 @@ struct efi_memory_desc_t {
>>  #define IS_VMALLOC_ADDR(X) machdep->machspec->is_vmaddr(X)
>>  #define KERNELBASE machdep->pageoffset
>>
>> +#define VMEMMAP_VADDR (machdep->machspec->vmemmap_base)
>> +
>>  #define PGDIR_SHIFT (machdep->pageshift + (machdep->pageshift -3) + (machdep->pageshift - 2))
>>  #define PMD_SHIFT (machdep->pageshift + (machdep->pageshift - 3))
>>
>> diff --git a/memory.c b/memory.c
>> index 0df8ecc..0696763 100644
>> --- a/memory.c
>> +++ b/memory.c
>> @@ -13348,10 +13348,25 @@ is_page_ptr(ulong addr, physaddr_t *phys)
>>          ulong nr_mem_sections;
>>          ulong coded_mem_map, mem_map, end_mem_map;
>>          physaddr_t section_paddr;
>> +#ifdef VMEMMAP_VADDR
>> +        ulong flags;
>> +#endif
>>
>>          if (IS_SPARSEMEM()) {
>>                  nr_mem_sections = NR_MEM_SECTIONS();
>> +#ifdef VMEMMAP_VADDR
>> +                nr = nr_mem_sections;
>> +                if (machdep->flags & VMEMMAP)
>> +                        nr = pfn_to_section_nr((addr - VMEMMAP_VADDR) / SIZE(page));
>> +                else if (readmem(addr + OFFSET(page_flags), KVADDR, &flags,
>> +                    sizeof(ulong), "page.flags", RETURN_ON_ERROR|QUIET))
>> +                        nr = (flags >> (SIZE(page_flags)*8 - SECTIONS_SHIFT())
>> +                            & ((1UL << SECTIONS_SHIFT()) - 1));
>> +
>> +                if (nr < nr_mem_sections) {
>> +#else
>>                  for (nr = 0; nr < nr_mem_sections ; nr++) {
>> +#endif
>>                          if ((sec_addr = valid_section_nr(nr))) {
>>                                  coded_mem_map = section_mem_map_addr(sec_addr);
>>                                  mem_map = sparse_decode_mem_map(coded_mem_map, nr);
>> --
>> 1.8.3.1
>>
--
Crash-utility mailing list
Crash-utility(a)redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility