----- Original Message -----
 
 
 ----- Original Message -----
 > Hi Dave,
 > 
 > Thank you for your comment!
 > 
 > On 2/13/2018 2:33 PM, Dave Anderson wrote:
 > > 
 > > Hi Kasuhito,
 > > 
 > > I am planning on releasing crash-7.2.1 today, so this will have to be deferred
 > > to 7.2.2.
 > > 
 > > Because of your questions about ppc64, possible backwards-compatibility
issues,
 > > or potential future changes to page.flags usage, this permanent change to the
 > > is_page_ptr() function solely for the purposes of SLUB's count_partial()
function
 > > makes me nervous.
 > 
 > Sorry, my explanation was not enough. Please let me supplement a little.
 > As far as I know, the count_partial() is the function which could receive the
 > effect of this patch most.  But is_page_ptr() could be called many times also
 > through the other functions, so this will improve them, too.  Moreover, the
 > is_page_ptr()'s loop could loop NR_MEM_SECTIONS() times, which is 33554432
 > on x86_64(5level) if I understand correctly, in the worst case.
 > 
 > So I thought that ideally we should improve the is_page_ptr() itself if possible,
 > rather than find the callers which could call it many times and change them.
 > Also, I am willing to drop the definition of VMEMMAP_VADDR for ppc64 this
 > time.
 
 OK, I understand your point.  But what concerns me is that the function's
 purpose is to absolutely identify whether the incoming page structure address
 is a correct page structure address.  But if an invalid address gets passed
 into is_page_ptr(), your patch would take the invalid address, calculate an
 invalid "nr", and continue from there, right? 
Another suggestion/question -- if is_page_ptr() is called with a NULL phys 
argument (as is done most of the time),  could it skip the "if IS_SPARSEMEM()"
section at the top, and still utilize the part at the bottom, where it walks
through the vt->node_table[x] array?  I'm not sure about the "ppend"
calculation
though -- even if there are holes in the node's address space, is it still a 
contiguous chunk of page structure addresses per-node? 
 
 Dave
 
 > 
 > > 
 > > There is really no compelling reason that count_partial() absolutely
 > > *must* use
 > > is_page_ptr(), and so I'm thinking that perhaps you could come up with a
 > > less
 > > heavy-handed method for simply testing whether a page.lru entry points to
 > > another
 > > vmemmap'd page.  Something along the lines of adding this for
 > > vmemmap-enabled kernels:
 > > 
 > >   #define IN_VMEMMAP_RANGE(page) ((page >= VMEMMAP_VADDR) && (page
<=
 > >   VMEMMAP_END))
 > > 
 > > and then have count_partial() replace the is_page_ptr() call with another
 > > slub function that does something like this for vmemmap-enabled kernels:
 > > 
 > >    (IN_VMMEMAP_RANGE(next) && accessible(next))
 > > 
 > > Or instead of accessible(), it could read "next" as a list_head with
 > > RETURN_ON_ERROR,
 > > and verify that next->prev points back to the current list_head.
 > > 
 > > Non-vmemmap-enabled kernels could still use is_page_ptr().
 > > 
 > > What do you think of doing something like that?
 > 
 > Given possible compatibility issues you said, I think that the way you
 > suggested
 > might well be enough for now.  I'll try a method like the above.
 > 
 > Thanks,
 > Kazuhito Hagio
 > 
 > > 
 > > Dave
 > > 
 > > 
 > >       
 > > 
 > > ----- Original Message -----
 > >> Hi,
 > >>
 > >> The "kmem -[sS]" commands can take several minutes to complete
with
 > >> the following conditions:
 > >>   * The system has a lot of memory sections with CONFIG_SPARSEMEM.
 > >>   * The kernel uses SLUB and it has a very long partial slab list.
 > >>
 > >>   crash> kmem -s dentry | awk '{print strftime("%T"),
$0}'
 > >>   10:18:34 CACHE            NAME                 OBJSIZE  ALLOCATED
 > >>   TOTAL
 > >>   SLABS  SSIZE
 > >>   10:19:41 ffff88017fc78a00 dentry                   192    9038949
 > >>   10045728
 > >>   239184     8k
 > >>   crash> kmem -S dentry | bash -c 'cat >/dev/null ; echo
$SECONDS'
 > >>   334
 > >>
 > >> One of the causes is that is_page_ptr() in count_partial() checks if
 > >> a given slub page address is a page struct by searching all memory
 > >> sections linearly for the one which includes it.
 > >>
 > >>         nr_mem_sections = NR_MEM_SECTIONS();
 > >>         for (nr = 0; nr < nr_mem_sections ; nr++) {
 > >>                 if ((sec_addr = valid_section_nr(nr))) {
 > >>                         ...
 > >>
 > >> With CONFIG_SPARSEMEM{_VMEMMAP}, we can calculate the memory section
 > >> which includes a page struct with its page.flags, or its address and
 > >> VMEMMAP_VADDR. With this patch doing so, the computation amount can be
 > >> significantly reduced in that case.
 > >>
 > >>   crash> kmem -s dentry | awk '{print strftime("%T"),
$0}'
 > >>   10:34:55 CACHE            NAME                 OBJSIZE  ALLOCATED
 > >>   TOTAL
 > >>   SLABS  SSIZE
 > >>   10:34:55 ffff88017fc78a00 dentry                   192    9038949
 > >>   10045728
 > >>   239184     8k
 > >>   crash> kmem -S dentry | bash -c 'cat >/dev/null ; echo
$SECONDS'
 > >>   2
 > >>
 > >> This patch uses VMEMMAP_VADDR. It is not defined on PPC64, but it looks
 > >> like PPC64 supports VMEMMAP flag and machdep->machspec->vmemmap_base
is
 > >> it, so this patch also defines it for PPC64. This might need some help
 > >> from PPC folks.
 > >>
 > >> Signed-off-by: Kazuhito Hagio <k-hagio(a)ab.jp.nec.com>
 > >> ---
 > >>  defs.h   |  2 ++
 > >>  memory.c | 15 +++++++++++++++
 > >>  2 files changed, 17 insertions(+)
 > >>
 > >> diff --git a/defs.h b/defs.h
 > >> index aa17792..84e68ca 100644
 > >> --- a/defs.h
 > >> +++ b/defs.h
 > >> @@ -3861,6 +3861,8 @@ struct efi_memory_desc_t {
 > >>  #define IS_VMALLOC_ADDR(X) machdep->machspec->is_vmaddr(X)
 > >>  #define KERNELBASE      machdep->pageoffset
 > >>  
 > >> +#define VMEMMAP_VADDR   (machdep->machspec->vmemmap_base)
 > >> +
 > >>  #define PGDIR_SHIFT     (machdep->pageshift + (machdep->pageshift
-3) +
 > >>  (machdep->pageshift - 2))
 > >>  #define PMD_SHIFT       (machdep->pageshift + (machdep->pageshift -
3))
 > >>  
 > >> diff --git a/memory.c b/memory.c
 > >> index 0df8ecc..0696763 100644
 > >> --- a/memory.c
 > >> +++ b/memory.c
 > >> @@ -13348,10 +13348,25 @@ is_page_ptr(ulong addr, physaddr_t *phys)
 > >>  	ulong nr_mem_sections;
 > >>  	ulong coded_mem_map, mem_map, end_mem_map;
 > >>  	physaddr_t section_paddr;
 > >> +#ifdef VMEMMAP_VADDR
 > >> +	ulong flags;
 > >> +#endif
 > >>  
 > >>  	if (IS_SPARSEMEM()) {
 > >>  		nr_mem_sections = NR_MEM_SECTIONS();
 > >> +#ifdef VMEMMAP_VADDR
 > >> +		nr = nr_mem_sections;
 > >> +		if (machdep->flags & VMEMMAP)
 > >> +			nr = pfn_to_section_nr((addr - VMEMMAP_VADDR) / SIZE(page));
 > >> +		else if (readmem(addr + OFFSET(page_flags), KVADDR, &flags,
 > >> +			sizeof(ulong), "page.flags", RETURN_ON_ERROR|QUIET))
 > >> +			nr = (flags >> (SIZE(page_flags)*8 - SECTIONS_SHIFT())
 > >> +				& ((1UL << SECTIONS_SHIFT()) - 1));
 > >> +
 > >> +		if (nr < nr_mem_sections) {
 > >> +#else
 > >>  	        for (nr = 0; nr < nr_mem_sections ; nr++) {
 > >> +#endif
 > >>  	                if ((sec_addr = valid_section_nr(nr))) {
 > >>  	                        coded_mem_map =
 > >>  	                        section_mem_map_addr(sec_addr);
 > >>  	                        mem_map = sparse_decode_mem_map(coded_mem_map,
 > >>  	                        nr);
 > >> --
 > >> 1.8.3.1
 > >>
 > >> --
 > >> Crash-utility mailing list
 > >> Crash-utility(a)redhat.com
 > >> 
https://www.redhat.com/mailman/listinfo/crash-utility
 > >>
 > > 
 > > --
 > > Crash-utility mailing list
 > > Crash-utility(a)redhat.com
 > > 
https://www.redhat.com/mailman/listinfo/crash-utility
 > > 
 > 
 > --
 > Crash-utility mailing list
 > Crash-utility(a)redhat.com
 > 
https://www.redhat.com/mailman/listinfo/crash-utility
 > 
 
 --
 Crash-utility mailing list
 Crash-utility(a)redhat.com
 
https://www.redhat.com/mailman/listinfo/crash-utility