----- "ville mattila" <ville.mattila(a)stonesoft.com> wrote:
> From: Dave Anderson <anderson(a)redhat.com>
>
...
> But your kernel shows cache_cache.buffer_size set to zero -- and the
> ASSIGN_SIZE(kmem_cache_s) above dutifully downsized the data structure
> size from 204 to zero. Later on, that size was used to allocate a
> kmem_cache buffer, which failed when a GETBUF() was called with a zero-size.
>
> I guess a check could be made above for a zero cache_cache.buffer_size,
> but why would that ever be?
>
> Try this:
>
> # crash --no_kmem_cache vmlinux vmcore
>
> which will allow you to get past the kmem_cache initialization.
>
> Then enter:
>
> crash> p cache_cache
>
> Does the "buffer_size" member really show zero?
Yes it seems so!
initialize_task_state: using old defaults
<readmem: 8067a300, KVADDR, "fill_task_struct", 868, (ROE), 86e3f78>
addr: 8067a300 paddr: 67a300 cnt: 868
STATE: TASK_RUNNING (PANIC)
crash> p cache_cache
cache_cache = GETBUF(128 -> 0)
<readmem: 8067f1c0, KVADDR, "gdb_readmem_callback", 204, (ROE), 8ac00d8>
addr: 8067f1c0 paddr: 67f1c0 cnt: 204
$3 = {
array = {0x0, 0x8067f1c4, 0x8067f1c4, 0x0, 0x0, 0x0, 0x0, 0x0,
0xf7813e00, 0xf7849400, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
batchcount = 0,
limit = 0,
shared = 0,
buffer_size = 0,
reciprocal_buffer_size = 0,
flags = 0,
num = 0,
gfporder = 0,
gfpflags = 60,
colour = 120,
colour_off = 8,
slabp_cache = 0x100,
slab_size = 16777216,
dflags = 0,
ctor = 0xf,
name = 0x0,
next = {
next = 0x0,
prev = 0x2
},
nodelists = {0x40}
}
FREEBUF(0)
That's some serious corruption!
>
> BTW, you can work around the problem by commenting out the call
> to kmem_cache_downsize() in vm_init().
This workaround works ok.
But even with the call to kmem_cache_downsize() commented out,
the kmem_cache_init() function could not have done anything useful,
because the "cache_cache.next.next" pointer, which should point to
the first of the chain of kmem_cache slab cache headers, is corrupted
with a NULL.
I'm surprised it managed to continue without running into another
roadblock -- did it display the "crash: unable to initialize kmem
slab cache subsystem" error message?
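To illustrate why a NULL "next" pointer stops the chain walk cold, here
is a minimal in-memory sketch of walking a circular list from its head.
The struct and function names (list_head_sketch, count_chain) are made
up for this example; crash itself reads these links out of the dump
rather than dereferencing them, so treat this only as an analogy:

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct list_head_sketch {
    struct list_head_sketch *next;
    struct list_head_sketch *prev;
};

/*
 * Count the entries chained off 'head' in a circular list.
 * Returns -1 if a NULL link is found (a broken chain) or if more
 * than 'max_entries' entries are seen (a likely loop or corruption).
 */
long count_chain(const struct list_head_sketch *head, long max_entries)
{
    const struct list_head_sketch *p;
    long n = 0;

    if (head == NULL)
        return -1;

    p = head->next;
    while (p != head) {
        if (p == NULL)          /* e.g. next = 0x0, as in the dump above */
            return -1;
        if (++n > max_entries)  /* never wrapped back around to head */
            return -1;
        p = p->next;
    }
    return n;
}
```

With head->next set to NULL, as cache_cache.next.next is in this dump,
the walk gives up before visiting a single cache header.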
> (And if you're using makedumpfile with excluded pages, hope that
> the problem I described above doesn't occur...)
>
We are not excluding files, so this is not a big issue. Also,
--no_kmem_cache lets me open the dump and already lets me do quite
a lot.
Like I mentioned before, I could put a check in kmem_cache_downsize()
to check for a zero buffer_size, but the odds of that happening are
absurdly small. I suppose I could check whether the value is less
than the kmem_cache.nodelists structure offset.
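Roughly, such a check might look like the following. This is a
standalone sketch, not the actual crash source: the function name
pick_kmem_cache_size() and its parameters are made up for illustration,
and the real kmem_cache_downsize() works on crash's own size and offset
tables.

```c
/*
 * Decide which size to use for the kmem_cache structure.
 *
 *   current_size      size taken from the debuginfo (204 in this thread)
 *   buffer_size       value read from cache_cache.buffer_size in the dump
 *   nodelists_offset  offset of the nodelists member in kmem_cache
 *
 * Returns buffer_size only when it looks plausible; otherwise the
 * current size is kept and no downsizing takes place.
 */
long pick_kmem_cache_size(long current_size, long buffer_size,
                          long nodelists_offset)
{
    /* A zero (or negative) size can only come from corruption. */
    if (buffer_size <= 0)
        return current_size;

    /* The structure cannot end before its nodelists member begins. */
    if (buffer_size < nodelists_offset)
        return current_size;

    /* "Downsizing" must never make the structure larger. */
    if (buffer_size >= current_size)
        return current_size;

    return buffer_size;
}
```

With the values from this dump (a current size of 204 and a buffer_size
of 0), the function returns 204, so the later buffer allocation would
not be asked for zero bytes.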
Dave