----- "Bob Montgomery" <bob.montgomery(a)hp.com> wrote:
Dave,
Please pardon the direct question, I'm attempting to cash in on my
"dis -l" goodwill :-)
The latest problem I'm working on:
We occasionally get dumps that wake up in crash with:
...
please wait... (gathering kmem slab cache data)
crash-4.0.9-fix: page excluded: kernel virtual address: ffff88022457a000
type: "kmem_cache_s buffer"
crash-4.0.9-fix: unable to initialize kmem slab cache subsystem
...
These are partial dumps with only kernel pages included.
This problem comes about because readmem fails to read one
of the kmem_cache structs in the list, for example:
crash-4.0.9-fix> struct kmem_cache 0xffff880224579cc0
struct kmem_cache struct: page excluded: kernel virtual address: ffff88022457a000 type: "gdb_readmem_callback"
Cannot access memory at address 0xffff880224579cc0
This struct starts toward the end of a page (0xffff880224579cc0)
and extends into the next page (0xffff88022457a000) which has
been excluded from the dump because it isn't a kernel page.
That is pretty scary if I assume some bug in the kernel is
giving pages back to user land that still hold parts of kernel
structs. But that's not what's happening.
crash-4.0.9-fix> struct -o kmem_cache
struct kmem_cache {
[0x0] struct array_cache *array[32];
...
[0x158] struct list_head next;
[0x168] struct kmem_list3 *nodelists[64];
}
SIZE: 0x368
Crash thinks the struct is 0x368 bytes long, which puts the
apparent end of the struct in the next page (...a000
instead of ...9000):
crash-4.0.9-fix> p/x 0xffff880224579cc0+0x368
$3 = 0xffff88022457a028
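The page arithmetic above can be checked with a small sketch. It assumes 4 KiB pages; `page_base` and `spans_pages` are names made up for this illustration and are not crash functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000ULL  /* assumption: 4 KiB pages, as on x86_64 */

/* Return the page-aligned base of the page containing addr. */
static uint64_t page_base(uint64_t addr)
{
    return addr & ~(PAGE_SIZE - 1);
}

/* True if the object [start, start + size) touches more than one page. */
static bool spans_pages(uint64_t start, uint64_t size)
{
    assert(size > 0);
    return page_base(start) != page_base(start + size - 1);
}
```

With the values from this dump, `spans_pages(0xffff880224579cc0ULL, 0x368)` is true, and the last byte of the declared struct falls in the page at 0xffff88022457a000, which is the excluded page that readmem complains about.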
But the clever kernel folks did this in slab.c:
/*
* We put nodelists[] at the end of kmem_cache, because we want to size
* this array to nr_node_ids slots instead of MAX_NUMNODES
* (see kmem_cache_init())
* We still use [MAX_NUMNODES] and not [1] or [0] because cache_cache
* is statically defined, so we reserve the max number of nodes.
*/
struct kmem_list3 *nodelists[MAX_NUMNODES];
So that means crash needs to curtail the read of kmem_cache
to the actual size of the nodelists array, instead of the
declared size.
I still need to find out whether the actual size is fixed
once for all instances, or set per structure.
This should affect partial dumps with kernels that use slab.c.
I never noticed that before -- the buffer_size of the global "cache_cache"
kmem_cache structure gets downsized here in kmem_cache_init() in 2.6.22
and later:
/*
* struct kmem_cache size depends on nr_node_ids, which
* can be less than MAX_NUMNODES.
*/
cache_cache.buffer_size = offsetof(struct kmem_cache, nodelists) +
nr_node_ids * sizeof(struct kmem_list3 *);
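As a rough illustration of that formula, here is the same computation done outside the kernel. The 0x168 offset is taken from the "struct -o kmem_cache" output earlier in the thread, and 8-byte pointers are assumed for this 64-bit kernel; `kmem_cache_real_size` is a hypothetical helper name, not something in crash or the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of nodelists[] in struct kmem_cache, from "struct -o kmem_cache". */
#define NODELISTS_OFFSET 0x168u
/* Assumption: pointers in the dumped 64-bit kernel are 8 bytes. */
#define KERNEL_PTR_SIZE 8u

/* Size the kernel actually allocates for a kmem_cache with nr_node_ids nodes. */
static uint64_t kmem_cache_real_size(unsigned int nr_node_ids)
{
    assert(nr_node_ids > 0);
    return NODELISTS_OFFSET + (uint64_t)nr_node_ids * KERNEL_PTR_SIZE;
}
```

Plugging in 64 nodes reproduces the declared 0x368. With a hypothetical nr_node_ids of 1 the size drops to 0x170, so the struct at 0xffff880224579cc0 would end at 0xffff880224579e30, well inside its own page. The real nr_node_ids for this dump is not given in the thread.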
So the fix would be to first determine the cache_cache.buffer_size value,
and use that to initialize the size_table.kmem_cache_s value used by the
"SIZE(kmem_cache_s)" macro. Secondly, "vt->kmem_cache_len_nodes", which
is also based upon the same MAX_NUMNODES array index value, needs to be
downsized as well. It looks like if the kernel "nr_node_ids" exists as a
symbol (instead of a #define), then it should be used.
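The two adjustments could be sketched along these lines. This is only a standalone model of the decision, not crash code: `struct dump_facts`, `struct slab_sizes` and `choose_slab_sizes` are hypothetical names, and the sanity checks (ignore zero values, never grow past the declared size) are my own assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of what was read from the dump; not crash's own types. */
struct dump_facts {
    uint64_t declared_size;       /* debuginfo size of struct kmem_cache */
    unsigned int declared_nodes;  /* MAX_NUMNODES array bound from debuginfo */
    bool have_buffer_size;        /* cache_cache.buffer_size was readable */
    uint64_t buffer_size;         /* value of cache_cache.buffer_size */
    bool have_nr_node_ids;        /* nr_node_ids exists as a symbol */
    unsigned int nr_node_ids;     /* value read from that symbol */
};

struct slab_sizes {
    uint64_t kmem_cache_size;     /* size to use when reading a kmem_cache */
    unsigned int len_nodes;       /* number of nodelists[] entries to walk */
};

/* Start from the declared values and shrink them when the dump says so. */
static struct slab_sizes choose_slab_sizes(const struct dump_facts *f)
{
    assert(f);
    struct slab_sizes s = { f->declared_size, f->declared_nodes };

    if (f->have_buffer_size && f->buffer_size > 0 &&
        f->buffer_size < s.kmem_cache_size)
        s.kmem_cache_size = f->buffer_size;

    if (f->have_nr_node_ids && f->nr_node_ids > 0 &&
        f->nr_node_ids < s.len_nodes)
        s.len_nodes = f->nr_node_ids;

    return s;
}
```

On older kernels where neither value is available, this falls back to the declared size and array length, which matches the current behavior.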
Any other structs in the kernel like this that crash already
deals with?
None that I'm aware of...
Dave