On Fri, Sep 7, 2012 at 8:57 PM, Dave Anderson <anderson(a)redhat.com> wrote:
----- Original Message -----
> On Fri, Sep 7, 2012 at 4:28 PM, Dave Anderson <anderson(a)redhat.com>
> wrote:
> >
> >
> > ----- Original Message -----
> >> Hi all,
> >>
> >> I'm wondering about the use of the kernel 'nr_node_ids' variable in
> >> memory.c. In kmem_cache_downsize(), vt->kmem_cache_len_nodes defaults
> >> to 1 when 'nr_node_ids' isn't present. But in vm_init() an error
> >> message is printed in the same case. The reason I'm asking is that I'm
> >> getting that error
> >>
> >> "unable to initialize kmem slab cache subsystem"
> >>
> >> on a 3.4 kernel. Having vm_init() default to
> >>
> >> vt->kmem_cache_len_nodes=1
> >>
> >> as well seems to bring up the slab subsystem, although I'm getting a
> >> couple of
> >>
> >> "kmem: vm_area_struct: full list: slab: <nn1> bad next pointer: <nn2>"
> >>
> >> mixed into my kmem -S output. I have no idea if it's related.
> >
> > Hi Per,
>
> Hello =o)
>
> >
> > I don't have any recent sample kernels that have the configuration that your
> > kernel is running, so I can't confidently answer/test this. I presume that
> > your kernel does not configure CONFIG_NODES_SHIFT (or set it to 0), so
> > that nr_node_ids becomes a #define instead of a variable. And to get it
>
> Indeed, that's exactly what happened.
>
> > to work, I'm also presuming that you changed the "else" clause in vm_init()
> > to something like this:
> >
> >         if (MEMBER_TYPE("kmem_cache", "nodelists") == TYPE_CODE_PTR) {
> >                 int nr_node_ids;
> >                 /*
> >                  * nodelists now a pointer to an outside array
> >                  */
> >                 vt->flags |= NODELISTS_IS_PTR;
> >                 if (kernel_symbol_exists("nr_node_ids")) {
> >                         get_symbol_data("nr_node_ids", sizeof(int),
> >                                 &nr_node_ids);
> >                         vt->kmem_cache_len_nodes = nr_node_ids;
> >                 } else {
> > -                       error(INFO, "nr_node_ids: symbol does not exist\n");
> > -                       error(INFO, "unable to initialize kmem slab cache subsystem\n\n");
> > -                       vt->flags |= KMEM_CACHE_UNAVAIL;
> > +                       vt->kmem_cache_len_nodes = 1;
> >                 }
>
> Again, indeed, that's more or less to the character what I changed it to.
>
> >
> > That looks reasonable to me.
> >
>
> Ok, because that was the main purpose of my first mail, understanding
> whether there was a reason why the 'nr_node_ids'-has-been-turned-into-a-macro-case
> was treated as an error in this context. So, you agree we could change it?
Yep -- it's queued for crash-6.1.0.
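
For readers following along, the fallback agreed on above can be sketched in isolation. This is only an illustration, not crash's code: `struct sym_entry`, `find_symbol()` and `node_count_or_default()` are hypothetical stand-ins for the symbol lookup that crash performs with kernel_symbol_exists() and get_symbol_data(). The assumption is that a missing 'nr_node_ids' symbol means the kernel was built for a single node, so the count defaults to 1.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry of a debugger's symbol table. */
struct sym_entry {
    const char *name;
    int value;
};

/* Return the entry called `name`, or NULL if the table lacks it. */
static const struct sym_entry *
find_symbol(const struct sym_entry *tab, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0)
            return &tab[i];
    return NULL;
}

/*
 * Number of per-node list entries to expect: the value of the
 * nr_node_ids symbol when it exists, otherwise 1 (assumed
 * single-node build where nr_node_ids is a macro, not a variable).
 */
static int
node_count_or_default(const struct sym_entry *tab, size_t n)
{
    const struct sym_entry *s = find_symbol(tab, n, "nr_node_ids");

    return s ? s->value : 1;
}
```

The point of the design is that a missing symbol is treated as information about the kernel configuration rather than as a fatal condition.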
>
> > As far as the "kmem -S" output, are you running it on a live system?
> >
>
> Nope, dead as a doornail. Are these messages to be expected then?
Not really. You could follow the vm_area_struct's full-list in question
and verify that something's out of whack, starting from the (single)
kmem_cache->nodelists.slab_full linked list. The list should either
point back to itself (empty) or be a simple list_head linked list,
that leads to a slab with a next value of "nn2". It would also be
interesting to know what the "nn2" value was. In other words, was it
a bogus address entirely, or maybe an address in a page that wasn't
captured in the dump (which shouldn't happen...)?
It's here in verify_slab_v2():
        list_head = (struct kernel_list_head *)(slab_buf + OFFSET(slab_list));
        if (!IS_KVADDR((ulong)list_head->next) ||
            !accessible((ulong)list_head->next)) {
                error(INFO, "%s: %s list: slab: %lx bad next pointer: %lx\n",
                        si->curname, list, si->slab, (ulong)list_head->next);
                errcnt++;
        }
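
The manual check described above (follow the list from its head and confirm that each next pointer is sane) can be sketched as a standalone walk. This is a rough illustration under stated assumptions, not what verify_slab_v2() does: `struct list_node`, `struct pool`, `in_pool()` and `count_bad_next()` are hypothetical names, and the "is this address valid" test is reduced to "does it fall inside a known array of nodes", standing in for IS_KVADDR() and accessible().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal circular doubly linked list node, in the style of list_head. */
struct list_node {
    struct list_node *next, *prev;
};

/* Hypothetical description of the memory known to hold valid nodes. */
struct pool {
    struct list_node *base;
    size_t count;
};

/* Stand-in validity test: does the address lie inside the pool? */
static bool in_pool(const struct pool *p, const struct list_node *n)
{
    uintptr_t a = (uintptr_t)n;
    uintptr_t lo = (uintptr_t)p->base;
    uintptr_t hi = (uintptr_t)(p->base + p->count);

    return a >= lo && a < hi;
}

/*
 * Walk the list from `head` (which must itself be in the pool) and
 * return 1 if a next pointer leaves the pool, else 0. The walk stops
 * at the first bad pointer, when it returns to the head, or after
 * p->count steps, so a corrupted cycle cannot loop forever.
 */
static int count_bad_next(const struct pool *p, const struct list_node *head)
{
    const struct list_node *cur = head;
    size_t steps = 0;
    int errcnt = 0;

    do {
        if (!in_pool(p, cur->next)) {
            errcnt++;       /* bad next pointer: cannot follow it */
            break;
        }
        cur = cur->next;
    } while (cur != head && ++steps < p->count);

    return errcnt;
}
```

An empty list, where the head points back to itself, passes with zero errors, which matches the "point back to itself (empty)" case mentioned above.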
It certainly seems completely unrelated to the nr_node_ids question.
I'm guessing it's to do with the state of my dump, which isn't
accessible to me until after the weekend. In the unlikely event that
the fault's in Crash (see what I did there?) I'm sure I'll be
back.
/Per
> Oh, and sorry for putting "[PATCH]" in the title when there wasn't
> one. It was by accident.
>
> /Per
No problem...
Thanks,
Dave