----- Original Message -----
Hi, Dave
On Wed, Aug 21, 2013 at 12:16 PM, Dave Anderson <anderson(a)redhat.com> wrote:
>
> ----- Original Message -----
>> Hi,
>>
>> It is not clear whether this is a 3.11 issue or just general memory
>> corruption, but I cannot load slab information from any of my 3.11
>> dumps. The slab info contains an incorrect pointer and "crash" just
>> drops all slab information.
>
> Did all of your 3.11 dumps fail in a similar manner?
Initially I saw the same issue in at least 3 different crashes, so I
thought that it might be 3.11 specific.
But with a new dump that I got just now I no longer see the "invalid
kernel virtual address" message. Instead, when I run "kmem -S" I
get the following message:
======================================================
kmem: invalid structure member offset: kmem_cache_s_lists
FILE: memory.c LINE: 8955 FUNCTION: do_slab_chain_percpu_v2()
[/usr/local/google/home/anatol/sources/opensource/crash/crash] error trace: 493f1d => 47eca0 => 517642 => 460b22
CACHE NAME OBJSIZE ALLOCATED TOTAL SLABS SSIZE
460b22: OFFSET_verify.part.28+71
517642: OFFSET_verify+50
47eca0: do_slab_chain_percpu_v2+96
493f1d: dump_kmem_cache_percpu_v2+2205
kmem: invalid structure member offset: kmem_cache_s_lists
FILE: memory.c LINE: 8955 FUNCTION: do_slab_chain_percpu_v2()
====================================================
I have no idea what it means. I'll try to find more time later
this week and look at the problem more deeply.
Any "invalid structure member offset" error means that
the upstream kernel structures have changed.
A quick glance at the current upstream kernel shows that
the kmem_cache.nodelists pointer has been renamed
as part of the slab/slub/slob unification rework.
In the 3.6 CONFIG_SLAB dumpfile I have, the kmem_cache
structure looks like this:
crash> kmem_cache
struct kmem_cache {
    unsigned int batchcount;
    unsigned int limit;
    unsigned int shared;
    unsigned int size;
    u32 reciprocal_buffer_size;
    unsigned int flags;
    unsigned int num;
    unsigned int gfporder;
    gfp_t allocflags;
    size_t colour;
    unsigned int colour_off;
    struct kmem_cache *slabp_cache;
    unsigned int slab_size;
    unsigned int dflags;
    void (*ctor)(void *);
    const char *name;
    struct list_head list;
    int refcount;
    int object_size;
    int align;
    struct kmem_list3 **nodelists;
    struct array_cache *array[4096];
}
SIZE: 32896
crash>
where the "nodelists" pointer points into the tail of the
array_cache array[], where the per-node kmem_list3 pointers
are stored following the per-cpu array_cache pointers.
Upstream, the kmem_list3 structure looks to have been
absorbed into the CONFIG_SLAB part of the generic
"kmem_cache_node" structure, and its pointer name above
has been changed from "nodelists" to "node":
struct kmem_cache {
... [ cut ] ...
    /* 6) per-cpu/per-node data, touched during every alloc/free */
    /*
     * We put array[] at the end of kmem_cache, because we want to size
     * this array to nr_cpu_ids slots instead of NR_CPUS
     * (see kmem_cache_init())
     * We still use [NR_CPUS] and not [1] or [0] because cache_cache
     * is statically defined, so we reserve the max number of cpus.
     *
     * We also need to guarantee that the list is able to accomodate a
     * pointer for each node since "nodelists" uses the remainder of
     * available pointers.
     */
    struct kmem_cache_node **node;
    struct array_cache *array[NR_CPUS + MAX_NUMNODES];
    /*
     * Do not add fields after array[]
     */
};
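To make the layout described above concrete, here is a minimal
standalone sketch of how the per-node pointer can alias the tail of
array[]. It is a model only, not kernel code: the struct name
kmem_cache_model, the small NR_CPUS/MAX_NUMNODES values, the stub
member types, and the helper node_slot_index() are all assumptions
made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative limits only; real kernels take these from the config. */
#define NR_CPUS      8
#define MAX_NUMNODES 2

/* Stub types standing in for the real kernel structures. */
struct array_cache     { int avail; };
struct kmem_cache_node { int free_objects; };

/* Hypothetical, cut-down model of the tail of struct kmem_cache. */
struct kmem_cache_model {
    struct kmem_cache_node **node;
    struct array_cache *array[NR_CPUS + MAX_NUMNODES];
};

/*
 * Point "node" at the first slot after the per-cpu entries, so that
 * the per-node pointers reuse the remainder of array[].
 */
static void setup_node_pointer(struct kmem_cache_model *c, int nr_cpu_ids)
{
    assert(nr_cpu_ids > 0 && nr_cpu_ids <= NR_CPUS);
    c->node = (struct kmem_cache_node **)&c->array[nr_cpu_ids];
}

/* Report which array[] slot backs the pointer for node "nid". */
static size_t node_slot_index(const struct kmem_cache_model *c, int nid)
{
    const char *slot = (const char *)&c->node[nid];
    const char *base = (const char *)c->array;

    return (size_t)(slot - base) / sizeof(c->array[0]);
}
```

With nr_cpu_ids set to 4 in this model, node 0 lands in array[4] and
node 1 in array[5], which is why a tool reading a dump has to follow
the pointer member rather than assume a fixed location.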
So hopefully a few more bait-and-switch name changes
similar to the patches you've been posting can handle
the changes.
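As a rough illustration of that kind of name change, the sketch
below tries a list of candidate member names in order and keeps the
first offset that resolves. It is not crash's actual code: the
member_info table stands in for a debuginfo lookup, and
lookup_member_offset(), first_valid_offset(), INVALID_OFFSET and the
offsets used with them are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define INVALID_OFFSET (-1L)

/* Hypothetical stand-in for one structure member found in debuginfo. */
struct member_info {
    const char *name;
    long offset;
};

/* Return the offset of "name", or INVALID_OFFSET if it is absent. */
static long lookup_member_offset(const struct member_info *members,
                                 size_t count, const char *name)
{
    assert(members != NULL && name != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(members[i].name, name) == 0)
            return members[i].offset;
    }
    return INVALID_OFFSET;
}

/*
 * Try each candidate name in order (oldest kernel name first) and
 * return the first offset that resolves.
 */
static long first_valid_offset(const struct member_info *members,
                               size_t count,
                               const char *const *candidates,
                               size_t ncandidates)
{
    for (size_t i = 0; i < ncandidates; i++) {
        long off = lookup_member_offset(members, count, candidates[i]);
        if (off != INVALID_OFFSET)
            return off;
    }
    return INVALID_OFFSET;
}
```

Passing the candidates { "nodelists", "node" } lets the same lookup
succeed against both the older and the newer structure, while a
structure with neither name still reports an invalid offset.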
Thanks,
Dave