Badari Pulavarty wrote:
On Tue, 2006-04-25 at 14:42 -0400, Dave Anderson wrote:
> Badari Pulavarty wrote:

> As far as the determination of the panic task, I'm presuming
> that this was generated from a kdump dumpfile.  The netdump.c
> get_netdump_panic_task() function, which has a bunch of
> kdump-specific code, is failing to find the panic task from the
> data in the ELF header notes.  Running "crash -d1 ..." will indicate
> how crash is trying to determine the panic task.  I don't know
> whether the idle task was even the one that took the sysrq,
> or whether it just defaulted to that task because it couldn't find
> any other likely suspects.  You'll have to debug it from your
> end, starting from get_netdump_panic_task().
>

I did a little more digging around :)

crash -d1 shows me:

crash: get_netdump_panic_task: crashing_cpu: 1
crash: get_netdump_panic_task: failed
crash: get_active_set_panic_task: failed

Looking at get_netdump_panic_task(), there is no code to handle
EM_X86_64?  I see checks for:

                if (nd->elf64->e_machine == EM_386) {
                ..
                }
                if (nd->elf64->e_machine == EM_PPC64) {
                ...
                }
 
 

Hey -- you're right -- there's nothing EM_X86_64-specific
done in that function.  I don't see why something couldn't
be put in place, but it's been unnecessary since the panic
task would normally be found by one of the follow-up functions,
i.e., either get_active_set_panic_task() or, if that fails,
the lowest-common-denominator function, panic_search().
Now, since there are no more error messages after the
"crash: get_active_set_panic_task: failed" message, panic_search()
must have determined that the idle thread (swapper) on cpu 0 was
in fact the running task that took the sysrq.  But the stack
trace shows just a single "schedule()" line...
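
To make that fallback order concrete, here is a rough sketch of the
chain.  This is not the actual crash source: the struct, the stub_*()
functions, the 0x1000 task value and find_panic_task() are all
hypothetical stand-ins, and only the three function names in the
table come from the discussion above.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumption: 0 means "this method could not identify a panic task". */
#define NO_TASK 0UL

/* Hypothetical table entry: a method name plus the function to try. */
struct panic_finder {
    const char *name;
    unsigned long (*find)(void);
};

/* Hypothetical stubs that mimic the outcome seen with "crash -d1":
 * the first two methods fail, the last-resort search succeeds. */
static unsigned long stub_netdump(void)    { return NO_TASK; }
static unsigned long stub_active_set(void) { return NO_TASK; }
static unsigned long stub_search(void)     { return 0x1000UL; }

static const struct panic_finder chain[] = {
    { "get_netdump_panic_task",    stub_netdump },
    { "get_active_set_panic_task", stub_active_set },
    { "panic_search",              stub_search },
};

/* Try each method in order and return the first task found.
 * Each failure is reported on stderr before moving on. */
static unsigned long find_panic_task(const struct panic_finder *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        unsigned long task = c[i].find();
        if (task != NO_TASK)
            return task;
        fprintf(stderr, "crash: %s: failed\n", c[i].name);
    }
    return NO_TASK;
}
```

Calling find_panic_task(chain, 3) with these stubs reports the first
two methods as failed and then returns whatever the last one picked,
which matches the pattern in the debug output: two "failed" lines and
then silence.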

So, can you do the following, while staying in that same context:

crash> bt -T

...and see if the ".crash_kexec" return address shows up
anywhere in the process stack?  Since that task was selected
as the panic task, it should be there somewhere.  And if
it's there, it should be used as the starting point of the
back trace.  Then do this:

crash> set debug 1
crash> bt
...

What should be displayed for the panic task, prior to the back
trace data, is this message from get_netdump_regs_x86_64():

        if (CRASHDEBUG(1)) {
                rsp = ULONG(user_regs + OFFSET(user_regs_struct_rsp));
                rip = ULONG(user_regs + OFFSET(user_regs_struct_rip));
                netdump_print("ELF prstatus rsp: %lx rip: %lx\n",
                        rsp, rip);
        }

Do you see that message?

Dave