Hi Daisuke,

On Tue, Oct 23, 2012 at 4:49 PM, HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> wrote:
From: Lei Wen <adrian.wenl@gmail.com>
Subject: Re: GCORE: add directly show backtrace function in crash
Date: Tue, 23 Oct 2012 15:06:30 +0800

> Hi Daisuke,
>
> On Tue, Oct 23, 2012 at 2:17 PM, HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> wrote:
>
>> From: Lei Wen <adrian.wenl@gmail.com>
>> Subject: Re: GCORE: add directly show backtrace function in crash
>> Date: Mon, 22 Oct 2012 15:36:47 +0800
>>
>> > Hi Daisuke,
>> >
>> > On Mon, Oct 22, 2012 at 3:29 PM, HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> wrote:
>> >
>> >> From: Lei Wen <adrian.wenl@gmail.com>
>> >> Subject: GCORE: add directly show backtrace function in crash
>> >> Date: Mon, 22 Oct 2012 12:21:49 +0800
>> >>
>> >> > Hi Daisuke,
>> >> >
>> >> > I created a new option "-tT" for gcore, so that it can display the
>> >> > user-space backtrace directly inside crash itself, without needing
>> >> > to dump a separate core file image and analyze it in a different
>> >> > gdb environment.
>> >> >
>> >> > The attached patch is based directly on the patch below, and was
>> >> > verified on the ARM platform.
>> >> > http://osdir.com/ml/general/2012-10/msg32677.html
>> >> >
>> >> > Before using the corresponding gcore command, we need to set up the
>> >> > environment in crash:
>> >> >
>> >> > crash>> gdb set solib-search-path [system lib path]
>> >> >
>> >> > crash>> gdb file   [user space program symbol file]
>> >> >
>> >> > crash>> gcore -t [user space thread id]
>> >> >
>> >> > Thanks,
>> >> > Lei
>> >>
>> >> Hello Lei,
>> >>
>> >> That must be a useful feature, but I think it's very orthogonal to
>> >> the gcore command...
>> >>
>> >> Why not release your own extension module separately from gcore?
>> >>
>> >
>> > I put this function in gcore because gcore already provides the function
>> > to get the general registers for a user-space thread. If I add another
>> > module, that part of the functionality seems a little duplicated...
>> >
>>
>> I now understand why you want to add it in gcore.
>>
>> > Also, giving gcore the capability to either dump into a core file
>> > or display directly gives users more choice. :)
>> >
>>
>> But users then need to do more work, such as loading the application's
>> symbol files. There seems to be not much difference.
>>
>> gdb has only one symbol space. If you load an application's symbols, they
>> can override kernel symbols. Then gcore might behave abnormally. Does
>> gdb have any feature that lets users undo the loading of an
>> application's symbols?
>>
>
> Good question!
> However I am not a gdb expert... Hope someone here could give a solution...

If there's no such feature, users are forced to restart the crash
utility. The symbol space is dirty, and the later behaviour of crash
and gcore can no longer be trusted. Not all users have a powerful
enough machine. Restarting should be considered costly for such users,
in particular on large dump files.

I can easily create a situation where gcore doesn't work well. I
wrote the program below, in which a task_struct structure is defined.

#include <stdio.h>

struct task_struct
{
        int x;
};

struct task_struct ts;

int main(void)
{
        printf("%p\n", &ts);

        return 0;
}

After loading this binary, built with -g, into crash, I loaded the gcore
module. Then I saw the following gcore failure.

crash> gcore 1904

gcore: invalid structure member offset: mm_struct_map_count
       FILE: libgcore/gcore_coredump.c  LINE: 75  FUNCTION: gcore_coredump()

[/pub/repos/crash-6.1.0/crash] error trace: 7fe1cffe7149 => 7fe1cffe740a => 7fe1cffdd3a7 => 54bc43

  54bc43: OFFSET_verify+164

gcore: invalid structure member offset: mm_struct_map_count
       FILE: libgcore/gcore_coredump.c  LINE: 75  FUNCTION: gcore_coredump()

Failed.

You are right, with this test case, I saw the same issue...
 

> Also a silly question: since the kernel runs well with user symbols, why
> can't gdb live with the chaos?

The premise would be wrong. crash can misbehave if user symbols are
loaded. crash and gcore memoize the symbols they frequently refer to
in memory, for performance. This is done in the early start-up phase,
before reaching crash's prompt. Such memoized symbols are not affected.
But there are symbols that crash and gcore do not memoize, and those
are of course affected.
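
To make the distinction between memoized and non-memoized lookups
concrete, here is a toy model in C. It is only a sketch under stated
assumptions: the names (symbol_space, load_type, lookup_size,
memoize_at_startup, demo_override) and the kernel-side size of 1000 are
made up for illustration, it tracks structure sizes rather than member
offsets, and it is not how gdb or crash actually implement symbol
lookup. It only shows why a value cached before the user binary is
loaded survives, while a later live lookup sees the user program's
definition.

```c
#include <assert.h>
#include <string.h>

/* Toy model of a single, shared symbol space. Hypothetical: this is
 * not gdb's or crash's real data structure. */
struct type_entry {
    const char *name;  /* structure name, e.g. "task_struct" */
    long size;         /* size currently recorded for that name */
};

#define MAX_TYPES 8
struct type_entry symbol_space[MAX_TYPES];
int ntypes = 0;

/* Value memoized once at start-up, before any user symbols are loaded. */
long cached_task_struct_size = -1;

/* Model of loading debug info: a later definition with the same name
 * replaces the earlier one (an assumption of this sketch). */
void load_type(const char *name, long size)
{
    for (int i = 0; i < ntypes; i++) {
        if (strcmp(symbol_space[i].name, name) == 0) {
            symbol_space[i].size = size;
            return;
        }
    }
    if (ntypes < MAX_TYPES) {
        symbol_space[ntypes].name = name;
        symbol_space[ntypes].size = size;
        ntypes++;
    }
}

/* Live lookup against the shared symbol space; returns -1 if unknown. */
long lookup_size(const char *name)
{
    for (int i = 0; i < ntypes; i++) {
        if (strcmp(symbol_space[i].name, name) == 0)
            return symbol_space[i].size;
    }
    return -1;
}

/* Cache the value once, as done before the prompt is reached. */
void memoize_at_startup(void)
{
    cached_task_struct_size = lookup_size("task_struct");
}

/* Walk through the scenario; returns what a live lookup reports
 * after the user binary has been loaded. */
long demo_override(void)
{
    ntypes = 0;
    load_type("task_struct", 1000);  /* made-up kernel-side size */
    memoize_at_startup();
    load_type("task_struct", 4);     /* user program's definition */
    assert(cached_task_struct_size == 1000);  /* cached value survives */
    return lookup_size("task_struct");        /* live lookup is overridden */
}
```

Calling demo_override() from a main() returns the user program's size
in this model, while cached_task_struct_size keeps the value recorded
at start-up.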


Now I fully understand your concern: a symbol with the same name would
clobber the kernel's original cached one... Is there any way to let crash
use only the symbols from the kernel, and let gcore use those from user
space when it tries to do the backtrace?

Thanks,
Lei