> Hi Dave
>
> The following patch adds support for DWARF CFI based stack unwinding
> for crash. Since this method uses the call frame instructions for
> unwinding, it generates better backtraces than the existing backtrace
> mechanism. So when we have the unwind info available, this new method
> will be called, else we fall back to the existing mechanism.
>
> ... <this section moved below>
>
> Please provide your suggestions and comments.
>
> Thanks
> Rachita
 

Hi Rachita,

I've only been able to test this on a live system that has __start_unwind
and __end_unwind symbols, so I don't know what a backtrace with an
in-kernel exception frame, or a backtrace with a transition to the x86_64
IRQ stack or x86_64 exception stacks, would look like.  If you have
an example, I'd be interested in seeing how they get handled.

But what's there, i.e., with only user-mode-to-kernel-mode exception
frames being displayed, looks pretty good!

I'd like to tinker with it a bit, and fold most of what you've
done into 4.0-3.8.  Then you can use that as a source base to
continue.

I do have a few questions/comments.

This netdump.c patch doesn't make sense to me -- we really don't
want to be doing any ELFSTORE operations in the debug-only
netdump_memory_dump() debug function:

@@ -771,8 +772,11 @@ netdump_memory_dump(FILE *fp)
                        dump_Elf64_Phdr(nd->load64 + i, ELFREAD);
                offset64 = nd->notes64->p_offset;
                for (tot = 0; tot < nd->notes64->p_filesz; tot += len) {
+                  if (has_unwind_info)
+                       len = dump_Elf64_Nhdr(offset64, ELFSTORE);
+                  else
                        len = dump_Elf64_Nhdr(offset64, ELFREAD);
-                       offset64 += len;
+                   offset64 += len;
                }
                break;
        }

This patch below looks to only be necessary in dumpfiles, but it seems
like, given that the x86_64 user_regs_struct is unavailable in 2.6
vmlinux files, that the initializations in get_netdump_regs_x86_64()
would never get done -- because VALID_STRUCT(user_regs_struct) would
fail, right?

@@ -1562,8 +1566,10 @@ get_netdump_regs_x86_64(struct bt_info *
         if (is_task_active(bt->task))
                 bt->flags |= BT_DUMPFILE_SEARCH;

-       if ((NETDUMP_DUMPFILE() || KDUMP_DUMPFILE()) &&
-            VALID_STRUCT(user_regs_struct) && (bt->task == tt->panic_task)) {
+       if (((NETDUMP_DUMPFILE() || KDUMP_DUMPFILE()) &&
+             VALID_STRUCT(user_regs_struct) && (bt->task == tt->panic_task)) ||
+             (KDUMP_DUMPFILE() && has_unwind_info && (bt->flags &
+                       BT_DUMPFILE_SEARCH))) {
                if (nd->num_prstatus_notes > 1)
                        note = (Elf64_Nhdr *)
                                nd->nt_prstatus_percpu[bt->tc->processor];
@@ -1574,9 +1580,21 @@ get_netdump_regs_x86_64(struct bt_info *
                 len = roundup(len + note->n_namesz, 4);
                 len = roundup(len + note->n_descsz, 4);

+               if KDUMP_DUMPFILE() {
+                       ASSIGN_SIZE(user_regs_struct) = 27 * sizeof(unsigned long);
+                       ASSIGN_OFFSET(user_regs_struct_rsp) = 19 * sizeof(unsigned long);
+                       ASSIGN_OFFSET(user_regs_struct_rip) = 16 * sizeof(unsigned long);
+               }
                 user_regs = ((char *)note + len)
                         - SIZE(user_regs_struct) - sizeof(long);

+               if KDUMP_DUMPFILE() {
+                       *rspp = *(ulong *)(user_regs + OFFSET(user_regs_struct_rsp));
+                       *ripp = *(ulong *)(user_regs + OFFSET(user_regs_struct_rip));
+                       if (*ripp && *rspp)
+                               return;
+               }
+

But then again, perhaps you never needed the user_regs_struct_rsp and
user_regs_struct_rip offsets in your test scenario?

There does seem to be some unnecessary "kernel-port" left-overs that
should be pruned.  Like the __get_user_nocheck(), __get_user_size()
and __get_user_asm() definitions are superfluous, since they're only
needed by __get_user(), which is not used.

I'll also make it compilable in other than x86_64 environments,
because the unwind_x86_64.c code should be #ifdef'd X86_64.

> There still are a couple of things which need to be done, viz
> 1. Extend to obtaining unwind info from modules as well(currently
>    doing only for the kernel)

Shouldn't pose a major problem -- just requires following the links
from the kernel table AFAICT.  But that leads to another question...

What happens, in dwarf_backtrace(), when it encounters a module
frame?  My guess is that the call to unwind() will fail, and then
the loop will bail out prematurely, and end up truncating the
trace output?  We'll need some type of error-handling there I
would think.

> 2. Currently reading the unwind info from eh_frame section only(ie
>    __start_unwind to __end_unwind). Need to add facility to read from
>    the .debug_frame(if .debug_frame is present in cases where .eh_frame
>    is absent. Will have to read from the vmlinux if we want to read the
>    .debug_frame info)

Definitely -- we've got a plethora of kernels that have the CFI stuff
in the vmlinux file, but not in the kernel.

> 3. Add FRAME_POINTER support.

Personally, I don't much care about this...

And don't forget about x86 support!

And we should probably -- for now anyway -- make it possible to
turn this capability on and off at will.  Also, for now, it should
probably default to off until we pound on it a bit.  For example,
things like "bt -l" is not supported in these scheme.

But, I can't emphasize this enough -- this is a nice piece of work
that you've done here.

I'll try to get something out in the next couple of days.

Thanks,
  Dave