On 2023/08/24 17:29, lijiang wrote:
On Thu, Aug 24, 2023 at 10:01 AM HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab(a)nec.com> wrote:
> On 2023/08/23 14:44, Lianbo Jiang wrote:
>> When a task is exiting, usually kernel marks its flags as 'PF_EXITING',
>> but even so, sometimes the mm_struct has not been freed, it might still
>> be valid. For such tasks, the "ps/vm" commands won't display the memory
>> usage. For example:
>>
>> crash> ps 47070
>> PID PPID CPU TASK ST %MEM VSZ RSS COMM
>> 47070 1 0 ffff9ba7c4910000 UN 0.0 0 0 ra_ris.parse
>> crash> vm 47070
>> PID: 47070 TASK: ffff9ba7c4910000 CPU: 0 COMMAND: "ra_ris.parse"
>> MM PGD RSS TOTAL_VM
>> 0 0 0k 0k
>>
>> To be honest, this is a corner case, but it has already occurred in
>> actual production environments. Given that, let's allow the "ps/vm"
>> commands to try to display the memory usage for this case, but it does
>> not guarantee that it can work well at any time, which still depends on
>> how far the mm_struct deconstruction has proceeded.
>
> I agree with displaying it. It looks like the deconstruction is done after
> task->mm is set to NULL, so it looks fine to me.
>
>
Thank you for the comments, Kazu.
> void __noreturn do_exit(long code)
> {
> ...
> exit_signals(tsk); /* sets PF_EXITING */
> ...
> exit_mm();
>
> static void exit_mm(void)
> {
> struct mm_struct *mm = current->mm;
> ...
> current->mm = NULL; ## task->mm is set to NULL here
> ...
> mmput(mm); ## release the resources actually
>
>
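The ordering quoted above leaves a window in which a task is already flagged PF_EXITING while task->mm still points at a live mm_struct. A minimal sketch of that window, using toy types rather than the kernel's (every `toy_` name is hypothetical, and the PF_EXITING value is given only for illustration):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag value; the toy model only needs a single bit. */
#define PF_EXITING 0x00000004

/* Hypothetical stand-ins for mm_struct and task_struct. */
struct toy_mm { int placeholder; };
struct toy_task {
	unsigned int flags;
	struct toy_mm *mm;
};

/* Step 1 of the exit path: mark the task as exiting. */
static void toy_exit_signals(struct toy_task *t)
{
	assert(t != NULL);
	t->flags |= PF_EXITING;
}

/* Step 2: detach the mm from the task and hand it back to the caller,
 * who would release it afterwards. */
static struct toy_mm *toy_exit_mm_detach(struct toy_task *t)
{
	struct toy_mm *mm;

	assert(t != NULL);
	mm = t->mm;
	t->mm = NULL;
	return mm;
}

/* True while the task is flagged as exiting but still has an mm attached. */
static bool toy_in_exit_window(const struct toy_task *t)
{
	return (t->flags & PF_EXITING) && t->mm != NULL;
}
```

A dump captured between the two steps would have toy_in_exit_window() return true, which corresponds to the case the patch tries to display.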
> On the other hand, mm->mm_count is decremented on the mmput() path; is there
> any need to check it?
>
>
Good question. I had the same thought, but in the end I chose to double-check
with mm_count. I am not sure whether memory synchronization is guaranteed to
have completed when the kernel is panicking.
Ok, it's fine to double check just in case, then ...
If the mm pointer is valid and the mm_struct members are always
legitimate, we won't need to double-check.
But anyway, these are just my thoughts, and they may not be completely correct. If
you do not want the check, I can post v2 and simply remove
IS_EXITING(task) from get_task_mem_usage().
Thanks.
Lianbo
> void mmput(struct mm_struct *mm)
> {
> might_sleep();
>
> if (atomic_dec_and_test(&mm->mm_users))
> __mmput(mm);
> }
>
> static inline void __mmput(struct mm_struct *mm)
> {
> ...
> exit_mmap(mm);
> ...
> mmdrop(mm);
> }
>
> static inline void mmdrop(struct mm_struct *mm)
> {
> ...
> if (unlikely(atomic_dec_and_test(&mm->mm_count)))
> __mmdrop(mm);
> }
>
> Thanks,
> Kazu
>
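The mmput()/mmdrop() chain quoted above uses two counters: mm_users for users of the address space and mm_count for references to the structure itself, with all users together holding one mm_count reference. A toy sketch of that scheme (plain ints instead of atomics; all `toy_` names and the two bool fields are hypothetical, added only to make the state observable):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two counters in mm_struct. */
struct toy_mm {
	int mm_users;            /* users of the address space */
	int mm_count;            /* references to the struct itself */
	bool mappings_torn_down; /* stands in for the effect of exit_mmap() */
	bool freed;              /* stands in for the effect of __mmdrop() */
};

/* Drop one structure reference; the last one frees the struct. */
static void toy_mmdrop(struct toy_mm *mm)
{
	assert(mm->mm_count > 0);
	if (--mm->mm_count == 0)
		mm->freed = true;
}

/* Drop one address-space user; the last one tears down the mappings
 * and then drops the single reference held on behalf of all users. */
static void toy_mmput(struct toy_mm *mm)
{
	assert(mm->mm_users > 0);
	if (--mm->mm_users == 0) {
		mm->mappings_torn_down = true;
		toy_mmdrop(mm);
	}
}
```

In this model, an mm with mm_users = 1 and mm_count = 2 ends up after one toy_mmput() with its mappings torn down but mm_count still at 1, so a nonzero mm_count alone does not show that the mappings are intact.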
>>
>> With the patch:
>> crash> ps 47070
>> PID PPID CPU TASK ST %MEM VSZ RSS COMM
>> 47070 1 0 ffff9ba7c4910000 UN 90.8 38461228 31426444 ra_ris.parse
>> crash> vm 47070
>> PID: 47070 TASK: ffff9ba7c4910000 CPU: 0 COMMAND: "ra_ris.parse"
>> MM PGD RSS TOTAL_VM
>> ffff9bad6e873840 ffff9baee0544000 31426444k 38461228k
>> VMA START END FLAGS FILE
>> ffff9bafdbe1d6c8 400000 8c5000 8000875 /data1/rishome/ra_cu_cn_412/sbin/ra_ris.parse
>> ...
>>
>> Reported-by: Buland Kumar Singh <bsingh(a)redhat.com>
>> Signed-off-by: Lianbo Jiang <lijiang(a)redhat.com>
>> ---
>> memory.c | 13 +++++++++++--
>> 1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/memory.c b/memory.c
>> index 5d76c5d7fe6f..7d59c0555a0e 100644
>> --- a/memory.c
>> +++ b/memory.c
>> @@ -4792,10 +4792,12 @@ get_task_mem_usage(ulong task, struct task_mem_usage *tm)
>> {
>> struct task_context *tc;
>> long rss = 0, rss_cache = 0;
>> + int mm_count = 0;
>> + ulong addr;
>>
>> BZERO(tm, sizeof(struct task_mem_usage));
>>
>> - if (IS_ZOMBIE(task) || IS_EXITING(task))
>> + if (IS_ZOMBIE(task))
>> return;
>>
>> tc = task_to_context(task);
>> @@ -4805,7 +4807,14 @@ get_task_mem_usage(ulong task, struct task_mem_usage *tm)
>>
>> tm->mm_struct_addr = tc->mm_struct;
>>
>> - if (!task_mm(task, TRUE))
>> + if (!(addr = task_mm(task, TRUE)))
>> + return;
>> +
>> + if (!readmem(addr + OFFSET(mm_struct_mm_count), KVADDR, &mm_count,
>> + sizeof(int), "mm_struct mm_count", RETURN_ON_ERROR))
>> + return;
tt->mm_struct is filled in task_mm(), so I think we can use this:
mm_count = INT(tt->mm_struct + OFFSET(mm_struct_mm_count));
The following code in get_task_mem_usage() also uses this.
Thanks,
Kazu
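The suggestion above relies on the structure image having already been copied into a local buffer, so a member can be picked out at its offset without a second read from the dump. A minimal sketch of that general idea, with hypothetical names (`toy_cache`, `toy_int_at`) that are not crash's own macros or data structures:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a struct image already copied out of the dump. */
struct toy_cache {
	char buf[64];
	size_t size;   /* number of valid bytes in buf */
};

/* Return the int stored at the given byte offset in the cached image.
 * memcpy() is used so the read is safe regardless of alignment. */
static int toy_int_at(const struct toy_cache *c, size_t offset)
{
	int value;

	assert(offset + sizeof(value) <= c->size);
	memcpy(&value, c->buf + offset, sizeof(value));
	return value;
}
```

With a member offset known in advance, toy_int_at(&cache, offset) plays the role that INT(tt->mm_struct + OFFSET(mm_struct_mm_count)) plays in the suggestion, assuming the buffer was filled by an earlier read.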