Re: [Crash-utility] Re:[RFC] Crash patch for DWARF CFI based unwind support

Monday, 23 October 2006

Rachita Kothiyal wrote:

...
 On Thu, Oct 19, 2006 at 05:15:32PM -0400, Dave Anderson wrote:
 >
 > > There still are a couple of things which need to be done, viz.
 > > 1. Extend to obtaining unwind info from modules as well (currently
 > >    doing only for the kernel)
 > > 2. Currently reading the unwind info from eh_frame section only (i.e.
 > >    __start_unwind to __end_unwind). Need to add facility to read from
 > >    the .debug_frame (if .debug_frame is present in cases where .eh_frame
 > >    is absent. Will have to read from the vmlinux if we want to read the
 > >    .debug_frame info)
 >
 > Hi Rachita,
 >
 > I hope to be able to come up with a new crash version
 > for you to continue working with by tomorrow, Monday at
 > the latest.
 >
 > Off the top of my head, here's what I've done with your
 > initial patch:
 >
 > 1. As Ben mentioned, it needs to be made compilable for
 >    other architectures.
 > 2. Renamed unwind_x86_64.c into unwind_x86_32_64.c,
 >    because the unwind code should be architecture
 >    neutral with respect to x86 and x86_64.  It's currently
 >    #ifdef'd to only be compiled if X86_64, but when a
 >    new "unwind_x86.h" file is ready to go, it can be
 >    made usable by both arches.
 > 3. Made it capable of reading .eh_frame data from the
 >    vmlinux file if it is not in memory.
 > 4. Made it capable of reading all of the module's unwind
 >    tables.
 > 5. Restored the unwind() function to reflect the kernel
 >    version in that it now uses a new find_table() routine,
 >    which returns a pointer to the local copy of the unwind
 >    table that contains the incoming pc.
 > 6. Cleaned up a bunch of cruft...
 >

 Hi Dave

 On the panic task, when we do the following:

    set unwind on
    bt
    set unwind off
    bt

 This last bt does not give us the same backtrace as what we get when crash
 first starts up (i.e. unwind is off by default). What is happening here is, when
 unwind is set to on, and we do a 'bt', we go to get_netdump_regs_x86_64() to get
 rsp and rip, where ASSIGN_SIZE(user_regs_struct) happens, thereby setting
 VALID_STRUCT(user_regs_struct) to 1. Now when we next do 'set unwind off' and
 'bt', we satisfy the following if condition in get_netdump_regs_x86_64() as
 VALID_STRUCT(user_regs_struct) is set:

    if (((NETDUMP_DUMPFILE() || KDUMP_DUMPFILE()) &&
          VALID_STRUCT(user_regs_struct) && (bt->task == tt->panic_task)) ||
        (KDUMP_DUMPFILE() && (kt->flags & DWARF_UNWIND) &&
          (bt->flags & BT_DUMPFILE_SEARCH))) {

 So this results in it reading the register values from the NT_PRSTATUS.
 Hence the backtrace looks different from what we get from the existing
 non-dwarf mechanism.
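
 The sequence above can be modelled with a small toy program. This is only a
 sketch: every name prefixed with toy_ is hypothetical, the size value is a
 placeholder, and none of it is crash's actual code. It just shows how a size
 that is assigned once at global scope keeps the VALID_STRUCT-style test true
 after unwind has been turned off again.

```c
#include <stdbool.h>

/* Toy stand-ins (assumptions), not crash's real tables or macros. */
long toy_user_regs_struct_size = -1;  /* -1 models "size never assigned" */
bool toy_dwarf_unwind = false;        /* models 'set unwind on/off' */

/* Models VALID_STRUCT(user_regs_struct): true once a size was assigned. */
bool toy_valid_struct(void)
{
    return toy_user_regs_struct_size >= 0;
}

/*
 * Models one 'bt' on the panic task of a kdump/netdump dumpfile.
 * Returns true when the registers would be taken from NT_PRSTATUS.
 */
bool toy_bt_panic_task(void)
{
    /* With unwind on, the size is assigned globally (ASSIGN_SIZE-like). */
    if (toy_dwarf_unwind && !toy_valid_struct())
        toy_user_regs_struct_size = 216;  /* placeholder size */

    /* First half of the if condition: valid struct and panic task. */
    return toy_valid_struct();
}
```

 Calling toy_bt_panic_task() before, during and after toy_dwarf_unwind is set
 gives false, true, true: the third call models the final 'bt' that no longer
 matches the startup backtrace.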

 To avoid this, we could use a local variable for the user_regs_struct size
 instead of changing things at the global scope with ASSIGN_SIZE(). Or
 invalidate the user_regs_struct before we leave from get_netdump_regs_x86_64().
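
 The two options can be sketched roughly as follows. Again this is a toy
 model under stated assumptions: the toy_ names, the lookup helper and the
 size it returns are all hypothetical and do not correspond to functions in
 crash. The point is only that neither variant leaves the global entry set
 when the function returns.

```c
#include <stdbool.h>

#define TOY_UNSET (-1L)

/* Stands in for the global size-table entry (assumption). */
long toy_global_size = TOY_UNSET;

/* Hypothetical lookup; a real tool would ask the debuginfo for the size. */
long toy_lookup_size(void)
{
    return 216;  /* placeholder value */
}

/* Models VALID_STRUCT(): true only while the global entry is set. */
bool toy_valid(void)
{
    return toy_global_size != TOY_UNSET;
}

/* Option 1: keep the size in a local variable, never touch the global. */
long toy_get_regs_local(void)
{
    long size = toy_lookup_size();
    return size;  /* caller would use this to size the register buffer */
}

/* Option 2: assign globally, then restore the old state before leaving. */
long toy_get_regs_invalidate(void)
{
    long saved = toy_global_size;
    long used;

    toy_global_size = toy_lookup_size();  /* ASSIGN_SIZE-like side effect */
    used = toy_global_size;
    toy_global_size = saved;              /* invalidate on the way out */
    return used;
}
```

 After either call toy_valid() is still false, so a later 'bt' with unwind
 off would take the same path as it does at startup.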

 Or, if it is desired that registers be read for the panic task from the
 NT_PRSTATUS section in the normal non-dwarf backtrace mechanism (which
 currently does not work as expected because of the user_regs_struct
 initialisation problem in x86_64), then probably it will have to be fixed
 some other way.

Hmmm, yeah, good catch...

But what happens the second time around, anyway?  Are the RSP/RIP
starting points so different that the low_budget tracer's output
is drastically different?  Or does it go off into the weeds because
the other user_regs_struct register offsets (that don't get initialized)
cause an OFFSET() failure?

Dave
