On Tue, Oct 29, 2013 at 1:36 AM, Dave Anderson <anderson(a)redhat.com> wrote:
Hi Vinayak,
Good catch -- it certainly should be fixed. The devil is in the details of how
best to accomplish it...
On the face of it, it would seem that the task group list gathering via do_list()
would be the best way. The problem is that it could potentially cause the
"ps"
command to fail on live systems if the list becomes invalid, or is invalid with
respect to the pre-gathered set of tasks that get captured prior to execution
of the command.
Prior to each crash command whose command_table_entry structure contains the
REFRESH_TASK_TABLE flag (as does "ps"), the tt->refresh_task_table()
function
is called to scan the kernel's pid_hash[] for a stable set of tasks. And only
those tasks are accessed or used by that crash command instance.
By calling do_list(), it's conceivable (although fairly unlikely) that the command
could go either go off into the weeds following a list that's being modified, or
if the list has changed since the "stable" list was gathered prior to the
command.
So preferably, it would be best if the new functionality could be constrained to
the stable set of pre-gathered tasks -- again if for no other reason than the fact
that the crash utility only uses the pre-gathered task-set whenever it parses tasks.
There's also another potential problem in having get_task_mem_usage() use
open_tmpfile(). Since it's a function that gets called by other functions
or macros (IN_TASK_VMA() for example), it's possible that the caller may
also be operating inside an open_tmpfile() area, and open_tmpfile() is not
recursive. In those internal cases, open_tmpfile2() could be used, but again,
it's probably best to avoid the do_list() call.
So I tinkered with an optional manner of checking the thread group list, i.e.,
like this:
if (VALID_MEMBER(task_struct_rss_stat)) {
int i, cnt;
int sync_rss;
ulong tgid;
struct task_context *tc1;
tgid = task_tgid(task);
tc1 = FIRST_CONTEXT();
for (i = cnt = 0; i < RUNNING_TASKS(); i++, tc1++) {
if (task_tgid(tc1->task) != tgid)
continue;
/* count 0 -> filepages */
if (!readmem(tc1->task +
OFFSET(task_struct_rss_stat) +
OFFSET(task_rss_stat_count), KVADDR,
&sync_rss,
sizeof(int), "task_struct rss_stat 0",
RETURN_ON_ERROR))
continue;
rss += sync_rss;
/* count 1 -> anonpages */
if (!readmem(tc1->task +
OFFSET(task_struct_rss_stat) +
OFFSET(task_rss_stat_count) + sizeof(int),
KVADDR, &sync_rss,
sizeof(int), "task_struct rss_stat 0",
RETURN_ON_ERROR))
continue;
rss += sync_rss;
}
}
It's inefficient in comparison to using do_list(), but it does guarantee
safety on live systems.
A couple other minor issues -- the two readmem() type strings above both
indicate "rss_stat 0", so they should be differentiated with strings
indicating
"task_struct rss_stat MM_FILEPAGES" and "task_struct rss_stat
MM_ANONPAGES".
And lastly, since MEMBER_EXISTS() and MEMBER_OFFSET_INIT() do essentially
the same thing, it simpler in this case to just call MEMBER_OFFSET_INIT()
the 3 times without any MEMBER_EXISTS() conditionalization:
+ if (MEMBER_EXISTS("task_struct", "thread_group"))
+ MEMBER_OFFSET_INIT(task_struct_thread_group, "task_struct",
+ "thread_group");
+
+ if (MEMBER_EXISTS("task_struct", "rss_stat")) {
+ MEMBER_OFFSET_INIT(task_struct_rss_stat, "task_struct",
+ "rss_stat");
+ MEMBER_OFFSET_INIT(task_rss_stat_count, "task_rss_stat",
+ "count");
+ }
+
If the members don't exist, they will just get set to INVALID_OFFSET (-1).
What do you think?
Dave
Hi Dave,
I have attached an updated patch with changes as per your suggestion.
Vinayak