Sharyathi Nagesh wrote:
This is the patch after making the following changes:
o Created a new function gather_cpudata_list_v2_nodes().
o gather_cpudata_list_v2_nodes() will be called for each node, and it will update avail with the corresponding node's shared memory value.
o gather_cpudata_list_v2_nodes() is called inside do_slab_chain_percpu_v2_nodes() during SLAB_WALKTHROUGH instead of outside, as before.
o Removed the commented-out section of SLAB_WALKTHROUGH (specified slab).
o Added FREEBUF calls at possible exit points.
o Updated dump_vm_table() to dump vt->kmem_cache_len_nodes.

Open issues
o The si->found field was not getting set for the dump I analysed, so the if (si->found) part of the code was not getting executed.
  This case needs to be checked.

Please go through the patch and let me know of your opinions.
Thanks
Sharyathi Nagesh

Hi Sharyathi,

OK -- it's getting there, but there are a couple of problems with
this second patch.

The purpose of gather_cpudata_list_v2_nodes() is to fill
several arrays of cached objects:

(1) the list of per-cpu objects attached to the kmem_cache,
    referenced by the "array_cache[NR_CPUS]" member:

      struct kmem_cache {
          ...
          struct array_cache *array[NR_CPUS];
          ...

    So, for each cpu, its cached per-cpu objects from the
    above get stored in crash in each dynamically-allocated,
    per-cpu "cpudata" array:

      struct meminfo {
          ...
          ulong *cpudata[NR_CPUS];

    Now that gather_cpudata_list_v2_nodes() is being called
    multiple times instead of just once, the loading of each
    of the cpudata[NR_CPUS] arrays is being done redundantly.
    This is harmless, although you could probably just look
    at the "index" argument you now pass in, and if it's
    non-zero, skip the redundant re-reading of the cached
    per-cpu objects above.

(2) The second function of the current gather_cpudata_list_v2()
    is to gather the "shared" list from the singular kmem_list3
    structure:

      struct kmem_list3 {
          ...
          struct array_cache      *shared;

    Since this has (until now) been a single list per kmem_cache,
    crash has been storing the list of shared objects in a singular
    dynamically-allocated array:

      struct meminfo {
          ...
          ulong *shared_array_cache;
 
and that's the problem with your new gather_cpudata_list_v2_nodes().
It's using the singular "shared_array_cache" array multiple times,
once for each "index" of the kmem_list3 structures.  So it ends up
overwriting the array each time, and the si->shared_array_cache
ends up containing the list of objects from the *last* "index"
only.

> o The si->found field was not getting set for the dump I analysed.
>   so if(si->found) part of the code was not getting executed.

Given the above, this makes sense, since later on, when the
shared_array_cache list is searched for a particular object, it
most likely won't find it unless it happened to be contained in
the last "index'd" set of objects.

Now this can be solved in a couple of ways:

The dynamic allocation of the singular shared_array_cache array
can be increased by multiplying its currently predetermined
max-size *times* the vt->kmem_cache_len_nodes.  And then each
time gather_cpudata_list_v2_nodes() is called, it could read
the next set of objects into the location in the array where
it left off the last time, i.e., the first zero entry.

Alternatively, a different array could be used, although
it couldn't use NR_CPUS as its dimension, but would have
to be aware of a maximum number of NUMA nodes, which
would be kind of ugly.  And if you use a new array, then
the check_shared_list() function would have to be updated
to check all of the lists instead of the current single
list.

BTW, there's a bug in check_shared_list() that I just
noticed now -- it's harmless, but dumb:

static int
check_shared_list(struct meminfo *si, ulong obj)
{
        int i;

        if (INVALID_MEMBER(kmem_list3_shared) ||
            !si->shared_array_cache)
                return FALSE;

        for (i = 0; i < si->shared_array_cache[i]; i++) {
                if (si->shared_array_cache[i] == obj)
                        return TRUE;
        }

        return FALSE;
}

The for loop should be:

        for (i = 0; si->shared_array_cache[i]; i++) {

It works now by dumb luck, stopping at the first zero
entry (the end of the list of object addresses), which
is what it's supposed to do.

Finally, there's one other "gotcha" with this scheme.
During initialization, the value of vt->kmem_max_limit
is determined in max_cpudata_limit(), and later on it's
used to allocate the object array:

  si->shared_array_cache = (ulong *)
      GETBUF(vt->kmem_max_limit * sizeof(ulong));

vt->kmem_max_limit is calculated during initialization
by determining the maximum cached object list size amongst
both the kmem_cache->array[NR_CPUS] lists and the singular
kmem_list3->shared list.  Therefore, the max_cpudata_limit()
function is only checking the first node's shared list:

        /*
         *  If the shared list can be accessed, check its size as well.
         */
        if (VALID_MEMBER(kmem_list3_shared) &&
            VALID_MEMBER(kmem_cache_s_lists) &&
            readmem(cache+OFFSET(kmem_cache_s_lists)+OFFSET(kmem_list3_shared),
            KVADDR, &shared, sizeof(void *), "kmem_list3 shared",
            RETURN_ON_ERROR|QUIET) &&
            readmem(shared+OFFSET(array_cache_limit),
            KVADDR, &limit, sizeof(int), "shared array_cache limit",
            RETURN_ON_ERROR|QUIET)) {
                if (limit > max_limit)
                        max_limit = limit;
        }

Like in your other new functions, if there's more than one list,
they should all be checked to find the largest limit.

Clear as mud?

Dave