Re: [Crash-utility] crash: struct command can read irrelevant pages.

Wednesday, 19 February 2014

----- Original Message -----
...
 Hello,

 I've finally found the cause of the issue I mentioned, as quoted below,
 when makedumpfile v1.5.5 was released:

 > 2. At first, the supported kernel will be updated to 3.12, but I
 > found an issue while testing for v1.5.5, which seems that the page
 > filtering works wrongly on kernel 3.12. I couldn't investigate this
 > yet and it will take some time to finish it.
 > Therefore, the latest supported kernel version is 3.11 in v1.5.5.

 This is neither a kernel issue nor a makedumpfile issue; it is a bug in crash.
 It can happen when a slab cache is stored near the end of a page.

 == Description ==

 At first, I saw the error message below when I ran crash on
 a dumpfile generated by makedumpfile -d2:

     please wait... (gathering kmem slab cache data)
     crash: page excluded: kernel virtual address: f4e87000  type: "kmem_cache buffer"

     crash: unable to initialize kmem slab cache subsystem

 This message indicates that crash failed to read a slab cache during
 kmem_cache_init(), and according to the output below, the slab cache
 it failed to read is the one stored at f4e86f40:

     crash> p kmem_cache
     kmem_cache = $1 = (struct kmem_cache *) 0xc0b1cbc0 <kmem_cache_boot>
     crash>
     crash> list kmem_cache.list -s kmem_cache.name -h 0xc0b1cbc0
     ...
     f4d37840
       name = 0xf4edf540 "uid_cache"
     f4e86f40
     list: page excluded: kernel virtual address: f4e87000  type: "gdb_readmem_callback"

 It seems that the slab cache covered two pages, [f4e86000- f4e87000] and
 [f4e87000- f4e88000]. Well, let's confirm the *real* size of it.

 Since every slab cache except kmem_cache_boot is allocated as a slab object,
 we can check the size as follows:

   crash> p kmem_cache
   kmem_cache = $2 = (struct kmem_cache *) 0xc0b1cbc0 <kmem_cache_boot>
   crash> struct kmem_cache.object_size 0xc0b1cbc0
     object_size = 104
   crash>

 In my environment, the size was 104 bytes. Therefore, the slab cache
 stored at f4e86f40 fits in a single page ([f4e86000- f4e87000]), and
 the excluded page ([f4e87000- f4e88000]) is unrelated to it.

 On the other hand, crash gets the size from vmlinux by using gdb,
 and there it was 216 bytes:

     crash> struct kmem_cache
     struct kmem_cache {
         unsigned int batchcount;
         unsigned int limit;
         ...
         struct kmem_cache_node **node;
         struct array_cache *array[33];
     }
     SIZE: 216
     crash>

 So crash took the pages backing the slab cache to be
 [f4e86000- f4e87000] and [f4e87000- f4e88000], even though the latter
 was an irrelevant page.
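
The arithmetic behind this can be checked with a small sketch. The helper
name pages_spanned() and the 4 KiB page size are assumptions made for
illustration (they are not crash code); the address and the two sizes are
the ones from the session above.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, as the f4e86000/f4e87000 boundaries suggest. */
#define PAGE_SIZE_ASSUMED 4096u

/*
 * Hypothetical helper: number of pages touched by a read of `size`
 * bytes starting at 32-bit kernel virtual address `addr`.
 */
static unsigned int pages_spanned(uint32_t addr, uint32_t size)
{
    uint32_t first, last;

    if (size == 0)
        return 0;

    first = addr / PAGE_SIZE_ASSUMED;              /* page of the first byte */
    last = (addr + size - 1) / PAGE_SIZE_ASSUMED;  /* page of the last byte */
    return last - first + 1;
}
```

With the runtime size, pages_spanned(0xf4e86f40, 104) stays within one page;
with the debuginfo size, pages_spanned(0xf4e86f40, 216) reaches into the
page starting at f4e87000, which is the excluded one.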

 This gap came from the fact that the size of slab cache is variable.

     struct kmem_cache {
     ...
             struct kmem_cache_node **node;
             struct array_cache *array[NR_CPUS + MAX_NUMNODES];
             /*
              * Do not add fields after array[]
              */
     };

 The size of "array" is the variable part of kmem_cache.
 When vmlinux is built, the size of kmem_cache is calculated from
 NR_CPUS and MAX_NUMNODES and recorded in vmlinux as debug information.
 (Sorry, I don't know gcc well. I may misunderstand this.)
 However, the actual size will be smaller than the defined size because
 it is decided at boot from the actual number of CPUs and nodes.

 void __init kmem_cache_init(void)::
 ...
         /*
          * struct kmem_cache size depends on nr_node_ids & nr_cpu_ids
          */
         create_boot_cache(kmem_cache, "kmem_cache",
                 offsetof(struct kmem_cache, array[nr_cpu_ids]) +
                                   nr_node_ids * sizeof(struct kmem_cache_node *),  // object_size
                                   SLAB_HWCACHE_ALIGN);
         list_add(&kmem_cache->list, &slab_caches);
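
As a rough sketch of how the two sizes relate, the model below reproduces
the 216 and 104 byte figures. It is a simplified stand-in, not the kernel
structure: the 84-byte head is derived as 216 - 33 * 4 under the assumption
of 4-byte pointers on this 32-bit kernel, and the split of 33 into
NR_CPUS = 32 and MAX_NUMNODES = 1 is an assumption (only the sum appears in
the gdb output). All names ending in _ASSUMED or _model are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed build-time limits; only their sum (33) is visible above. */
#define NR_CPUS_ASSUMED      32
#define MAX_NUMNODES_ASSUMED 1

/*
 * Simplified model of the 32-bit struct kmem_cache: all fields before
 * array[] collapsed into 84 bytes, followed by 4-byte pointer slots.
 */
struct kmem_cache_model {
    uint8_t  head[84];
    uint32_t array[NR_CPUS_ASSUMED + MAX_NUMNODES_ASSUMED];
};

/*
 * Mirrors the object_size expression in kmem_cache_init(): the fixed
 * part plus one pointer slot per possible CPU and per possible node.
 */
static size_t runtime_object_size(unsigned int nr_cpu_ids,
                                  unsigned int nr_node_ids)
{
    return offsetof(struct kmem_cache_model, array)
         + (size_t)(nr_cpu_ids + nr_node_ids) * sizeof(uint32_t);
}
```

In this model sizeof(struct kmem_cache_model) is 216, the value gdb reports,
while runtime_object_size(4, 1) gives 104. The observed object_size only
implies that nr_cpu_ids + nr_node_ids is 5; 4 CPUs and 1 node is one
combination that fits, not something confirmed by the dump.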

 As for kmem_cache, we can get its actual size from kmem_cache_boot,
 but I suppose that kmem_cache is not the only struct in the kernel whose
 size is variable. So I think we should discuss how to address issues
 like this.

 By the way, I described the *SLAB* case in this mail,
 but SLUB seems to have the same issue.

 Thanks
 Atsushi Kumagai 

This is a "known" issue that has been discussed on the crash-utility list in the
past, at least with respect to the kmem_cache data structure.  But for any random
data structure that has such a construct, I'm not sure what can be done.

In the case of the CONFIG_SLAB kmem_cache data structure, there is a function
that is supposed to "downsize" the size value of the kmem_cache data structure
that is returned by gdb.  It is called here in kmem_cache_init(), just
prior to cycling through all of the kmem_cache structures, where the
page excluded error shown above occurred:

        if (!(pc->flags & RUNTIME))
                kmem_cache_downsize();

        cache_buf = GETBUF(SIZE(kmem_cache_s));
        hq_open();

        do {
                cache_count++;

                if (!readmem(cache, KVADDR, cache_buf, SIZE(kmem_cache_s),
                        "kmem_cache buffer", RETURN_ON_ERROR)) {
                        FREEBUF(cache_buf);
                        vt->flags |= KMEM_CACHE_UNAVAIL;
                        error(INFO,
                          "%sunable to initialize kmem slab cache subsystem\n\n",
                                DUMPFILE() ? "\n" : "");
                        hq_close();
                        return;
                }

The SIZE(kmem_cache_s) value should have been downsized by that function,
but presumably it did not work.  If CRASHDEBUG(1) had been turned on during
initialization, you would have seen one of these two messages from
kmem_cache_downsize():

                if (CRASHDEBUG(1))
                        fprintf(fp, "kmem_cache_downsize: %ld to %ld\n",
                                STRUCT_SIZE("kmem_cache"), SIZE(kmem_cache_s));

or:

                if (CRASHDEBUG(1)) {
                        fprintf(fp,
                            "\nkmem_cache_downsize: SIZE(kmem_cache_s): %ld "
                            "cache_cache.buffer_size: %d\n",
                                STRUCT_SIZE("kmem_cache"), buffer_size);
                        fprintf(fp,
                            "kmem_cache_downsize: nr_node_ids: %ld\n",
                                vt->kmem_cache_len_nodes);
                }

The function probably failed due to some kernel change.  In fact,
I just checked a 3.13 CONFIG_SLAB kernel, and I see that kmem_cache_downsize()
no longer works for that kernel.

I see that kmem_cache_boot would be a good alternative for determining
the size on CONFIG_SLAB kernels, at least on 3.7 and later kernels where
it was introduced.  And for CONFIG_SLUB, which doesn't currently have a
"downsize" function, it looks like its "kmem_cache" cache also has size
fields that could be used.
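
One way the idea could look, purely as a sketch: prefer the size recorded
in the running kernel over the debuginfo size whenever it is plausible.
The helper choose_read_size() is hypothetical and is not an existing crash
function; how the runtime value would actually be read (for example from
kmem_cache_boot.object_size) is left out here.

```c
#include <stddef.h>

/*
 * Hypothetical helper: pick how many bytes to read for one
 * struct kmem_cache.
 *
 * debuginfo_size: size reported by gdb from vmlinux (216 above).
 * runtime_size:   size recorded by the kernel itself (104 above),
 *                 or 0 if it could not be determined.
 *
 * The runtime size is used only when it is non-zero and not larger
 * than the debuginfo size; otherwise fall back to the debuginfo size.
 */
static size_t choose_read_size(size_t debuginfo_size, size_t runtime_size)
{
    if (runtime_size == 0 || runtime_size > debuginfo_size)
        return debuginfo_size;
    return runtime_size;
}
```

For the dump discussed here, choose_read_size(216, 104) would give 104, so
the read of the cache at f4e86f40 would stay inside its own page.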

By any chance can you make the 32-bit vmlinux/vmcore pair available for
me to download?  Reply to me off-list if you can.

Thanks,
  Dave
