Isaku Yamahata wrote:
On Wed, Dec 20, 2006 at 10:15:32AM -0500, Dave Anderson wrote:

> - Introduced support for xendumps of para-virtualized ia64 kernels.
>   It should be noted that currently the ia64 Xen kernel does not
>   lay down a switch_stack for the panic task, so only raw "bt -t"
>   backtraces can be done on the panic task.  (anderson@redhat.com)

Hi Dave.

The current "xm dump-core" on ia64 loses some register information
that is saved on the Xen register stack,
e.g. r33, ... aren't saved in the domU xendump file.
Probably ia64-specific code would be necessary for it.
This will be addressed as a post-3.0.4 effort and the format will be changed.

--
yamahata

I'm not sure exactly what the ramifications are of an ia64 "xm dump-core"
on a paravirtualized kernel.  It would seem to depend upon what, if anything,
was "active" at the time.

My reference to the switch_stack above was for an ia64 kernel that panicked
on its own account; the test dump I used was killed with a write to
/proc/sysrq-trigger:

crash> bt
PID: 1554   TASK: e000000000988000  CPU: 0   COMMAND: "bash"
bt: xendump: switch_stack possibly not saved -- try "bt -t"
 #0 [BSP:e000000000988f00] schedule at a0000001005e0420
crash>

Without a fresh switch_stack, crash falls back on the stale information
saved the last time the task called schedule(), so the backtrace fails.

Using "bt -t" walks the process stack for kernel return addresses,
and the "reverse" BSP information just above the task_struct shows
the path taken:

crash> bt -t
PID: 1554   TASK: e000000000988000  CPU: 0   COMMAND: "bash"
              START: schedule at a0000001005e0420
  [e000000000989238] xen_trace_syscall at a000000100065020
  [e000000000989288] sys_write at a000000100155d30
  [e0000000009892b8] vfs_write at a0000001001551e0
  [e000000000989308] write_sysrq_trigger at a0000001001e3250
  [e000000000989320] __handle_sysrq at a00000010039bca0
  [e000000000989380] sysrq_handle_crashdump at a00000010039c460
  [e0000000009893e8] do_wait at a00000010008b080
  [e000000000989438] schedule_timeout at a0000001005e22a0
  [e000000000989450] ext3_lookup at a0000002001eb7d0
  [e000000000989460] cleanup_module at a000000200202a10
  [e000000000989470] ext3_find_entry at a0000002001e77c0
  [e0000000009894b0] __wait_on_buffer at a00000010015b580
  [e0000000009894c0] ll_rw_block at a00000010015c2b0
  [e000000000989500] out_of_line_wait_on_bit at a0000001005e2940
  [e000000000989520] __wait_on_bit at a0000001005e27d0
  [e000000000989540] sync_buffer at a00000010015b7c0
  [e000000000989558] io_schedule at a0000001005e2170
  [e000000000989588] __delayacct_blkio_start at a0000001000fa4b0
  [e000000000989608] io_schedule at a0000001005e21a0
  [e000000000989630] __do_IRQ at a0000001000f2450
  [e000000000989660] do_softirq at a000000100093100
  [e000000000989698] blkif_int at a000000200152070
  [e000000000989710] end_that_request_first at a00000010027ae30
  [e000000000989748] __end_that_request_first at a00000010027a460
  [e000000000989778] bio_endio at a0000001001622b0
  [e00000000098fca0] schedule at a0000001005e0420
  [e00000000098fd10] vhpt_miss at a000000100000002
  [e00000000098fd60] vhpt_miss at a000000100000002
  [e00000000098fdc8] dummycon_dummy at a0000001002dd380
  [e00000000098fdd0] vhpt_miss at a000000100000003
crash>

Well, it at least shows it going as far as sysrq_handle_crashdump(),
and any further addresses of function calls were never pushed into
the BSP.  (?)
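
For what it's worth, the kind of scan that "bt -t" performs can be
sketched in a few lines of user-space C.  This is not the crash utility's
code: the text range bounds, the function name scan_stack_for_text() and
the idea of passing the stack in as an array are all made up for the
example.  It just walks a buffer of 64-bit words and reports every word
that falls inside an assumed kernel text range, which is also why stale
return addresses from earlier calls show up in the listing.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical kernel text bounds, chosen only for this example. */
#define TEXT_START 0xa000000100000000ULL
#define TEXT_END   0xa000000100800000ULL

/*
 * Walk nwords 64-bit stack words and print each one that looks like a
 * kernel text address.  base_addr is the address the first word had in
 * the dumped stack; it is used only for display.  Returns the hit count.
 * No unwind information is consulted, so leftover return addresses from
 * earlier, unrelated calls are reported as well.
 */
static size_t scan_stack_for_text(const uint64_t *words, size_t nwords,
                                  uint64_t base_addr)
{
        size_t hits = 0;

        for (size_t i = 0; i < nwords; i++) {
                if (words[i] >= TEXT_START && words[i] < TEXT_END) {
                        printf("  [%016llx] text address %016llx\n",
                               (unsigned long long)(base_addr + i * sizeof(uint64_t)),
                               (unsigned long long)words[i]);
                        hits++;
                }
        }
        return hits;
}
```

A real implementation would also have to cover module text and resolve
each hit to a symbol name; the single fixed range here is a simplification.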

Anyway, the problem is that the ia64 shutdown path in a para-virtualized ia64
kernel does not lay down a switch_stack -- as is done by the netdump,
diskdump and kdump facilities.  Without a switch_stack register dump,
a backtrace is impossible.

It's a simple thing to do -- at some point during the shutdown
path, presumably xen_panic_event(), the panicking process would
need to make a call to the unw_init_running() function, which lays
down a switch_stack on the kernel stack, and then continues on to
the next function in the shutdown path.  For example, the kdump
facility for ia64 does this:

[ system crashes ]
  crash_kexec()
    machine_kexec()
       ...

The ia64 version of machine_kexec() does this:

void machine_kexec(struct kimage *image)
{
        unw_init_running(ia64_machine_kexec, image);
        for(;;);
}

The call to unw_init_running() never returns, but rather
it lays down a switch_stack on the kernel stack, and then
calls the ia64_machine_kexec() function:

extern void *efi_get_pal_addr(void);
static void ia64_machine_kexec(struct unw_frame_info *info, void *arg)
{
        struct kimage *image = arg;
        relocate_new_kernel_t rnk;
        void *pal_addr = efi_get_pal_addr();
        unsigned long code_addr = (unsigned long)page_address(image->control_code_page);
        unsigned long vector;
        int ii;

        if (image->type == KEXEC_TYPE_CRASH) {
                crash_save_this_cpu();
                current->thread.ksp = (__u64)info->sw - 16;
        }

        ... (continue shutdown path)

The address of the switch_stack is found in the unw_frame_info
structure passed in, and gets stored in the current->thread.ksp of
the panicking task.  With that simple procedure, the crash utility
will then have all that it needs to do a backtrace of the panicking task.
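
The relationship between thread.ksp and the switch_stack can be shown
with a small stand-alone sketch.  The structures below are stripped-down
stand-ins rather than the kernel's definitions, and the helper names
save_ksp() and switch_stack_from_ksp() are invented for the example.  The
16-byte offset mirrors the "- 16" in ia64_machine_kexec() above; that a
reader of the dump adds the same 16 bytes back to locate the switch_stack
is an assumption on my part.

```c
#include <stdint.h>

/* Stand-ins for the kernel structures; the real ones are much larger. */
struct switch_stack { uint64_t regs[4]; };
struct unw_frame_info { struct switch_stack *sw; };
struct thread_struct { uint64_t ksp; };

/* Record where the switch_stack lives, as ia64_machine_kexec() does. */
static void save_ksp(struct thread_struct *thread,
                     const struct unw_frame_info *info)
{
        thread->ksp = (uint64_t)(uintptr_t)info->sw - 16;
}

/* Reverse step: recover the switch_stack address from the saved ksp. */
static struct switch_stack *
switch_stack_from_ksp(const struct thread_struct *thread)
{
        return (struct switch_stack *)(uintptr_t)(thread->ksp + 16);
}
```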

The para-virtualized ia64 kernel shuts down when panic()
calls atomic_notifier_call_chain(), which in turn goes through the
panic_notifier list -- which leads to the ia64 version of xen_panic_event():

static int
xen_panic_event(struct notifier_block *this, unsigned long event, void *ptr)
{
        HYPERVISOR_shutdown(SHUTDOWN_crash);
        /* we're never actually going to get here... */
        return NOTIFY_DONE;
}

The ia64 version would need to "jump through the hoop" of a call to
unw_init_running() before it calls HYPERVISOR_shutdown().
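
To make the suggested control flow concrete, here is a rough sketch that
can be compiled and run in user space.  Everything marked "mock" below is
a stand-in I made up so the example is self-contained: the real
unw_init_running() saves the registers into a switch_stack on the kernel
stack, the real HYPERVISOR_shutdown() does not return, and the real code
would store into current->thread.ksp.  The callback name
xen_panic_save_and_shutdown() is hypothetical as well.  Only the ordering
is the point: lay down the switch_stack, record its address, then shut down.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ---- mock stand-ins, not the kernel or Xen definitions ---- */
#define NOTIFY_DONE    0
#define SHUTDOWN_crash 3        /* value is arbitrary in this mock */

struct switch_stack { uint64_t regs[4]; };
struct unw_frame_info { struct switch_stack *sw; };
struct notifier_block { int unused; };

static uint64_t mock_ksp;       /* stands in for current->thread.ksp */
static uint64_t mock_sw_addr;   /* where the mock switch_stack was placed */
static int mock_shutdown_reason = -1;

/* mock: the real hypercall never returns */
static void HYPERVISOR_shutdown(int reason)
{
        mock_shutdown_reason = reason;
}

/* mock: put a switch_stack on the stack, then call the callback */
static void unw_init_running(void (*callback)(struct unw_frame_info *, void *),
                             void *arg)
{
        struct switch_stack sw = { { 0 } };
        struct unw_frame_info info = { &sw };

        mock_sw_addr = (uint64_t)(uintptr_t)&sw;
        callback(&info, arg);
}

/* hypothetical callback: save the switch_stack address, then shut down */
static void xen_panic_save_and_shutdown(struct unw_frame_info *info, void *arg)
{
        (void)arg;
        assert(info != NULL && info->sw != NULL);
        mock_ksp = (uint64_t)(uintptr_t)info->sw - 16;
        HYPERVISOR_shutdown(SHUTDOWN_crash);
}

static int
xen_panic_event(struct notifier_block *this, unsigned long event, void *ptr)
{
        (void)this; (void)event; (void)ptr;
        unw_init_running(xen_panic_save_and_shutdown, NULL);
        /* reached only because the mock shutdown returns */
        return NOTIFY_DONE;
}
```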

Thanks,
  Dave