----- Original Message -----
On Mon, Jun 13, 2016 at 11:30:24AM -0400, Dave Anderson wrote:
>
> > Dave,
> >
> > On Fri, Jun 10, 2016 at 04:37:42PM -0400, Dave Anderson wrote:
> > > Hi Takahiro,
> > >
> > > To address my concerns about your patch, I added a few additional
> > > changes and attached
> > > it to this email. The changes are:
> > >
> > > (1) Prevent the stack dump "below" the #0 level. Yes, the stack
data
> > > region is contained within
> > > the incoming frame parameters, but it's ugly and we really
don't
> > > care to see what's before
> > > the #0 crash_kexec and crash_save_cpu #0 frames.
> > > (2) Fill in the missing stack dump at the top of the process stack, up
> > > to, but not including
> > > the user-space exception frame.
> > > (3) Instead of showing the fp of 0 in the top-most frame's stack
> > > address, fill it in with the
> > > address of the user-space exception frame.
> > >
> > > Note that there is no dump of the stack containing the user-space
> > > exception frame, but the
> > > register dump itself should suffice.
> >
> > Well, the essential problem with my patch is that the output from "bt
-f"
> > looks like:
> > #XX ['fp'] 'function' at 'pc' --- (1)
> > <function's stack dump> --- (2)
> > but that (1) and (2) are not printed as a single stack frame in the same
> > iteration of while loop in arm64_back_trace_cmd().
> > (I hope you understand what I mean :)
>
> Actually I prefer your first approach. I find this new one confusing, not
> to mention unlike any of the other architectures in that the "frame
level"
> #X address value is not contiguous with the stack addresses that get filled
> in by -f.
Can you please elaborate a bit here about "is not contiguous"?
I mean that the #X [address] is not contiguous with the stack addresses
above and below it. For example:
ffff8003dc103d60: ffff8003dc103dc0 ffff80000028041c
ffff8003dc103d70: 0000000000000000 0000000000000022
ffff8003dc103d80: ffff8003db846b00 ffff8003db846b00
#3 [ffff8003dc103cf0] schedule_hrtimeout_range_clock at ffff8000007786f0
ffff8003dc103d90: ffff8003dc103dc0 ffff80000028052c
ffff8003dc103da0: 0000000000000000 0000000000000022
ffff8003dc103db0: 0000000000000000 0000000000000000
> Taking your picture into account:
>
> stack grows to lower addresses.
> /|\
> |
> | |
> new sp +------+ <---
> |dyn | |
> | vars | |
> new fp +- - - + |
> |old fp| | a function's stack frame
> |old lr| |
> |static| |
> | vars| |
> old sp +------+ <---
> |dyn |
> | vars |
> old fp +------+
> | |
>
> Your first patch seemed natural to me because for any "#X" line containing
a function
> name, that function's dynamic variables, the "old fp/old lr" pair, and
the function's
> static variables were dumped below it (i.e., at higher stack addresses).
>
>
> > To be consistent with the out format of x86, the output should be
> > <function's stack dump>
> > #XX ['fp'] 'function' at 'pc'
> >
> > Unfortunately, this requires that arm64_back_trace_cmd() and other functions
should be overhauled.
> > Please take a look at my next patch though it is uncompleted and still has room
for improvement.
>
> I don't know what you mean by "consistent with the out format of x86"?
With x86_64,
> each #<level> line is simply the stack address where the function pushed its
return
> address as a result of its making a "callq" to the next function. Any
local variables of
> the calling function would be at the next higher stack addresses:
>
> ...
> #X [stack address] function2 at 'return address'
> <function2's local variables>
> #Y [stack address] function1 at 'return address'
> <functions1's local variables>
> ...
>
> So for digging out local stack variables associated with a function, it's a
simple
> matter of looking "below" it in the "bt -f" output.
That is exactly what I meant by "consistent with x86."
On x86, the output looks like:
<function2's local variables>
#X [stack address] function2 at 'return address'
<functions1's local variables>
#Y [stack address] function1 at 'return address'
...
No, that's not true -- look at my #X and #Y description above -- funcion2's local
variables are at higher stack addresses than the #X "stack entry" address.
They
have to be -- the callq that pushes the return address at the #X stack location is
the last stack manipulation that the function does. Expressed otherwise, a
function's
local variables are displayed "below" or "underneath" its #X line in
the "bt -f"
output.
So users who are familiar with this format may get confused.
(Or do I misunderstand anything?)
In addition, my previous patch displays
<function2's local variables>
#Y [stack address] function1 at 'return address'
in arm64_print_stackframe_entry(), and it sounds odd to me.
BTW, the order in which it is done is based upon the kernel's dump_backtrace()
function, although I'll grant you that the kernel dump is only interested in
the pc.
But, anyhow, it's up to you.
OK! Thanks for giving in... ;-)
Thanks,
Dave