----- Original Message -----
Initial version of a crash module which can be used to show which
cgroups
is the currently active process member of.
---
Hello this is a simple crash extension that I hacked up over the weekend, in my
case when I look at kernel crash dump I want to quickly understand which cgroup
is the current process member of. Currently it uses the process from the current
context but this might change in the future. Here is an example output:
crash> show_cgroups
subsys: cpuset cgroup: c6666
subsys: cpu cgroup: c6666
subsys: cpuacct cgroup: c6666
subsys: io cgroup: c6666
subsys: memory cgroup: c6666
subsys: devices cgroup: c6666
subsys: freezer cgroup: c6666
subsys: perf_event cgroup: c6666
subsys: pids cgroup: c6666
I have tested this on 4.6-rc2 with and without cgroup support enabled.
I'm just sending this to get an initial idea whether I have used crash's
facilities correctly and canvas for future ideas. I'm aware there are already 2
cgroup modules but when I tried running either they complained of no cgroup
support or the command did nothing. In any case provided that the code is ok I
guess this can be used as a good example of how to traverse structures with crash
TODO:
* Make the command understand either task_struct pointer or pid being passed.
* Add support for pre-3.15 kernels (the cgroup name struct changed to kernfs at that
point)
* Whatever people think might be useful
Hello Nikolay,
I have a few comments and suggestions:
+#include "defs.h"
+
+#define CGROUP_PATH_MAX
What is the purpose of CGROUP_PATH_MAX?
+void proccgroup_init(void);
+void proccgroup_fini(void);
+
... [ cut ] ...
+
+void __attribute__((constructor))
+echo_init(void)
+{
+ register_extension(command_table);
+}
+
+void __attribute__((destructor))
+echo_fini(void) { }
+
You've got forward declarations of proccgroup_init() and proccgroup_fini(),
but the two functions don't exist. The "echo_xxx()" functions in the
echo.c sample module are meant to be replaced by names reflecting your
module.
If I run this module on a 3.10 kernel, I get this:
crash> extend proccgroup.so
./extensions/proccgroup.so: shared object loaded
crash> show_cgroups
Unsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported
kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel
versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel
versionshow_cgroups: invalid kernel virtual address: 0 type:
"cgroup_subsys_state->cgroup"
Unsupported kernel version
crash>
because of this:
+ /* Handle the 2 cases of cgroup_name and the kernfs one */
+ if (MEMBER_EXISTS("cgroup", "kn")) {
+ get_kn_cgroup_name(cgroup, subsys_ptr);
+ } else if (MEMBER_EXISTS("cgroup", "name")) {
+ fprintf(fp, "Unsupported kernel version");
+ }
You should replace the "fprintf(fp, "Unsupported kernel version")"
error message
above with an "error(FATAL, ...)" construct (and add a "\n" to the
end), so that
the command will fail immediately and return to the "crash> " prompt.
Alternatively, you can have your module purposely fail to load if any required
structure members do not exist. Typically modules will have their xxxx_init()
constructor function do some pre-checks of the kernel being analyzed, and if the
command will not be able to run correctly, return before calling register_extension():
void __attribute__((constructor))
xxxx_init(void)
{
if (!MEMBER_EXISTS(...) || !MEMBER_EXISTS(...)) {
fprintf(fp, "some error message\n");
return;
}
register_extension(command_table);
}
I note that all of your readmem() calls use RETURN_ON_ERROR, such as these examples:
+static void get_cgroup_name(ulong cgroup, char *buf, size_t buflen)
+{
+
+ ulong kernfs_node;
+ ulong cgroup_name_ptr;
+ ulong kernfs_parent;
+
+ /* Get cgroup->kn */
+ readmem(cgroup + MEMBER_OFFSET("cgroup", "kn"), KVADDR,
&kernfs_node, sizeof(void *),
+ "cgroup->kn", RETURN_ON_ERROR);
+
+ readmem(kernfs_node + MEMBER_OFFSET("kernfs_node", "parent"),
KVADDR, &kernfs_parent, sizeof(void *),
+ "kernfs_node->parent", RETURN_ON_ERROR);
+
+ if (kernfs_parent == 0) {
+ sprintf(buf, "/");
+ return;
+ }
+
+ /* Get kn->name */
+ readmem(kernfs_node + MEMBER_OFFSET("kernfs_node", "name"),
KVADDR, &cgroup_name_ptr, sizeof(void *),
+ "kernfs_node->name", RETURN_ON_ERROR);
+
+ read_string(cgroup_name_ptr, buf, buflen-1);
+}
Unless you can do something useful in the case of a readmem() error, it's best to
use FAULT_ON_ERROR. You will get an error message automatically generated by readmem(),
but similar to "error(FATAL, ...)" it will return immediately to the
"crash> " prompt.
In your module above, you would just get a stream of confusing error messages.
You might consider changing the name of the command to something that
flows off the fingertips a bit easier:
+static struct command_table_entry command_table[] = {
+ { "show_cgroups", show_proc_cgroups, help_proc_cgroups, 0},
+ { NULL },
+};
Maybe something like "cgrp", or "showcg", or anything short but to
the point.
Thanks,
Dave