Re: [Crash-utility] arm64: "bt -f" output

Tuesday, 14 June 2016

----- Original Message -----
...
 On Mon, Jun 13, 2016 at 11:30:24AM -0400, Dave Anderson wrote:
 > 
 > > Dave,
 > > 
 > > On Fri, Jun 10, 2016 at 04:37:42PM -0400, Dave Anderson wrote:
 > > > Hi Takahiro,
 > > > 
 > > > To address my concerns about your patch, I added a few additional
 > > > changes and attached
 > > > it to this email.  The changes are:
 > > > 
 > > > (1) Prevent the stack dump "below" the #0 level.  Yes, the stack data
 > > >     region is contained within the incoming frame parameters, but it's
 > > >     ugly and we really don't care to see what's before
 > > >     the #0 crash_kexec and crash_save_cpu #0 frames.
 > > > (2) Fill in the missing stack dump at the top of the process stack, up
 > > > to, but not including
 > > >     the user-space exception frame.
 > > > (3) Instead of showing the fp of 0 in the top-most frame's stack
 > > > address, fill it in with the
 > > >     address of the user-space exception frame.
 > > > 
 > > > Note that there is no dump of the stack containing the user-space
 > > > exception frame, but the
 > > > register dump itself should suffice.
 > > 
 > > Well, the essential problem with my patch is that the output from
 > > "bt -f" looks like:
 > >      #XX ['fp'] 'function' at 'pc'  --- (1)
 > >      <function's stack dump>        --- (2)
 > > but that (1) and (2) are not printed as a single stack frame in the same
 > > iteration of while loop in arm64_back_trace_cmd().
 > > (I hope you understand what I mean :)
 > 
 > Actually I prefer your first approach.  I find this new one confusing, not
 > to mention unlike any of the other architectures in that the "frame level"
 > #X address value is not contiguous with the stack addresses that get filled
 > in by -f.

 Can you please elaborate a bit here about "is not contiguous"?

I mean that the #X [address] is not contiguous with the stack addresses
above and below it.  For example:

    ffff8003dc103d60: ffff8003dc103dc0 ffff80000028041c 
    ffff8003dc103d70: 0000000000000000 0000000000000022 
    ffff8003dc103d80: ffff8003db846b00 ffff8003db846b00 
 #3 [ffff8003dc103cf0] schedule_hrtimeout_range_clock at ffff8000007786f0
    ffff8003dc103d90: ffff8003dc103dc0 ffff80000028052c 
    ffff8003dc103da0: 0000000000000000 0000000000000022 
    ffff8003dc103db0: 0000000000000000 0000000000000000
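
The mismatch in the example above can also be checked mechanically. The following is a minimal Python sketch, not part of crash: the function name and both regular expressions are assumptions modelled only on the sample output shown here. It flags any "#N [address]" header whose address does not lie between the dump addresses printed just above and below it.

```python
import re

# Both patterns are assumptions based on the sample output above,
# not a specification of crash's "bt -f" format.
FRAME_RE = re.compile(r"^\s*#(\d+)\s+\[([0-9a-f]+)\]")
DUMP_RE = re.compile(r"^\s*([0-9a-f]{16}):")


def noncontiguous_frames(lines):
    """Return (level, address) for every '#N [address]' header whose address
    is not between the nearest dump addresses printed above and below it."""
    parsed = []
    for line in lines:
        m = FRAME_RE.match(line)
        if m:
            parsed.append(("frame", int(m.group(1)), int(m.group(2), 16)))
            continue
        m = DUMP_RE.match(line)
        if m:
            parsed.append(("dump", None, int(m.group(1), 16)))

    bad = []
    for i, (kind, level, addr) in enumerate(parsed):
        if kind != "frame":
            continue
        # Nearest dump line before and after this header, if any.
        prev_dump = next((a for k, _, a in reversed(parsed[:i]) if k == "dump"), None)
        next_dump = next((a for k, _, a in parsed[i + 1:] if k == "dump"), None)
        lo = prev_dump if prev_dump is not None else addr
        hi = next_dump if next_dump is not None else addr
        if not (lo <= addr <= hi):
            bad.append((level, addr))
    return bad


SAMPLE = [
    "    ffff8003dc103d80: ffff8003db846b00 ffff8003db846b00",
    " #3 [ffff8003dc103cf0] schedule_hrtimeout_range_clock at ffff8000007786f0",
    "    ffff8003dc103d90: ffff8003dc103dc0 ffff80000028052c",
]
print(noncontiguous_frames(SAMPLE))
```

For the three sample lines, frame #3 is reported, because ffff8003dc103cf0 is lower than both neighbouring dump addresses.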

...

 > Taking your picture into account:
 > 
 >          stack grows to lower addresses.
 >            /|\
 >             |
 >          |      |
 > new sp   +------+ <---
 >          |dyn   |   |
 >          | vars |   |
 > new fp   +- - - +   |
 >          |old fp|   | a function's stack frame
 >          |old lr|   |
 >          |static|   |
 >          |  vars|   |
 > old sp   +------+ <---
 >          |dyn   |
 >          | vars |
 > old fp   +------+
 >          |      |
 > 
 > Your first patch seemed natural to me because for any "#X" line containing
 > a function name, that function's dynamic variables, the "old fp/old lr"
 > pair, and the function's static variables were dumped below it (i.e., at
 > higher stack addresses).
 > 
 > 
 > > To be consistent with the out format of x86, the output should be
 > >      <function's stack dump>
 > >      #XX ['fp'] 'function' at 'pc'
 > > 
 > > Unfortunately, this requires that arm64_back_trace_cmd() and other
 > > functions should be overhauled.
 > > Please take a look at my next patch though it is uncompleted and still
 > > has room for improvement.
 > 
 > I don't know what you mean by "consistent with the out format of x86"?
 > With x86_64, each #<level> line is simply the stack address where the
 > function pushed its return address as a result of its making a "callq" to
 > the next function.  Any local variables of the calling function would be
 > at the next higher stack addresses:
 > 
 >   ...
 >   #X [stack address] function2 at 'return address'
 >   <function2's local variables>
 >   #Y [stack address] function1 at 'return address'
 >   <function1's local variables>
 >   ...
 > 
 > So for digging out local stack variables associated with a function, it's
 > a simple matter of looking "below" it in the "bt -f" output.

 That is exactly what I meant by "consistent with x86."
 On x86, the output looks like:

    <function2's local variables>
    #X [stack address] function2 at 'return address'
    <function1's local variables>
    #Y [stack address] function1 at 'return address'
    ...

No, that's not true -- look at my #X and #Y description above -- function2's
local variables are at higher stack addresses than the #X "stack entry"
address.  They have to be -- the callq that pushes the return address at the
#X stack location is the last stack manipulation that the function does.
Expressed otherwise, a function's local variables are displayed "below" or
"underneath" its #X line in the "bt -f" output.
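
The x86_64 ordering described here can be sketched with a toy model. The Python below is only an illustration under stated assumptions, not how crash formats its output: the function name, the dictionary used as stack memory, and the choice to print the words strictly between one return-address slot and the next are all hypothetical simplifications.

```python
def render_bt_f(stack, frames, stack_top):
    """Render a simplified x86_64-style "bt -f" listing.

    stack     -- dict mapping 8-byte-aligned addresses to words (toy memory)
    frames    -- list of (slot_addr, name), innermost first, where slot_addr
                 is the address at which the function's callq pushed its
                 return address
    stack_top -- address just past the last word to display

    Each header line is followed by the words at higher addresses, up to
    but not including the next frame's slot (an assumption of this sketch).
    """
    lines = []
    for level, (slot, name) in enumerate(frames):
        lines.append(f"#{level} [{slot:x}] {name}")
        end = frames[level + 1][0] if level + 1 < len(frames) else stack_top
        for addr in range(slot + 8, end, 8):
            if addr in stack:
                lines.append(f"    {addr:x}: {stack[addr]:016x}")
    return lines


# Made-up stack: function1 called function2; all values are hypothetical.
toy_stack = {0x100: 0xF2, 0x108: 0x1, 0x110: 0x2, 0x118: 0xF1, 0x120: 0x3}
toy_frames = [(0x100, "function2"), (0x118, "function1")]
print("\n".join(render_bt_f(toy_stack, toy_frames, 0x128)))
```

In this model every word listed under a header has a higher address than the header's own slot, which is the sense in which a function's locals appear "below" its #X line.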

...

 So users who are familiar with this format may get confused.
 (Or do I misunderstand anything?)

 In addition, my previous patch displays
    <function2's local variables>
    #Y [stack address] function1 at 'return address'
 in arm64_print_stackframe_entry(), and it sounds odd to me.

BTW, the order in which it is done is based upon the kernel's dump_backtrace()
function, although I'll grant you that the kernel dump is only interested in
the pc.

...

 But, anyhow, it's up to you.

OK!  Thanks for giving in...  ;-)  

Thanks,
  Dave
