----- Original Message -----
Thanks for the comments, Dave.
From: Dave Anderson <anderson(a)redhat.com>
Subject: Re: [Crash-utility] [PATCH] netdump: Add a helper function to
check if registers is available for a given active task in ELF notes
Date: Fri, 1 Apr 2011 14:22:38 -0400 (EDT)
>
>
> ----- Original Message -----
>> Add a helper function, exist_regs_in_elf_notes(tc), which checks
>> whether or not register values for a given active task is available in
>> ELF notes.
>>
>> I intend to use the helper function in gcore extension module. vmcore
>> generated by diskdump has NT_PRSTATUS for a panic task only, and so
>> specifying get_regs_from_elf_notes() directly to non-panic active
>> tasks leads to a fatal action. So, it's necessary to check, in
>> advance, that an active task can get registers from ELF notes, but the
>> variable holding vmcore's data including ELF notes', nd, is defined as
>> a static global variable in netdump.c and thus the new helper function
>> needs to be introduced.
>
> Why not make the "nd" pointer available to extension modules?
>
I agree. It's a reasonable solution.
> A get_kdump_vmcore_data() function already exists. If you
> create a new get_netdump_vmcore_data() function, it seems that
> most (if not all) of this netdump.c code that is *only* used by
> your extension module could be moved into your extension module
> source code, where it really belongs. The vmcore_data structure
> declaration could be moved into defs.h so that you would not have
> to #include netdump.h.
It's no problem for gcore extension module. But the comment below,
located just above the relevant functions, appears to indicate there's
a potencial user who develops some extension modules but doesn't
distribute them at the website. I guess it will affect at least him,
and other similar users.
/*
* The following set of functions are not used by the crash
* source code, but are available to extension modules for
* gathering register sets from ELF NT_PRSTATUS note sections.
*
* Contributed by: Sharyathi Nagesh (sharyath(a)in.ibm.com)
*/
Ah yes, you are correct. It was introduced back in May 2009,
for an extension module IBM was working on at the time, but
there were some TODO issues that they were working on so it
was never advertised on the extensions web page.
>
> BTW, you seemed to have sent a duplicate patch-set, i.e.,
> 0001-Check-non-support-machine-check-first-01.patch and
> 0001-Check-non-support-machine-check-first.patch, etc.
?? Sorry, I don't know the first patch. Four patchs are attached to
the mail I sent and appear as I intended.
From: Dave Anderson <anderson(a)redhat.com>
Subject: Re: [Crash-utility] [PATCH] netdump: Add a helper function to
check if registers is available for a given active task in ELF notes
Date: Fri, 1 Apr 2011 16:08:42 -0400 (EDT)
Hmmm -- this probably must have something to do with the Zimbra email
client we use? From Zimbra, your email shows 8 attachments, as
cut-and-pasted below:
0004-Introduce-...n_elf_notes.patch (1.9 KB) Download | Briefcase | Remove
0003-Unify-erro...ion-printed.patch (1.8 KB) Download | Briefcase | Remove
0002-Move-all-c...arch_regs_f.patch (6.7 KB) Download | Briefcase | Remove
0001-Check-non-...check-first.patch (1.4 KB) Download | Briefcase | Remove
0004-Introduce-...n_elf_notes.patch (1.9 KB) Download | Briefcase | Remove
0003-Unify-erro...ion-printed.patch (1.8 KB) Download | Briefcase | Remove
0002-Move-all-c...arch_regs_f.patch (6.7 KB) Download | Briefcase | Remove
0001-Check-non-...check-first.patch (1.4 KB) Download | Briefcase | Remove
I can click on any of the 8, but since the duplicates have the
same name, perhaps Zimbra adds the "-01"?
But if I click on "Download all attachments", it says that it's a
single .zip format file, and when I select "save" (instead of "open
with archive manager") it creates a .zip file with the (long) name:
[Crash-utility] [PATCH] netdump Add a helper function to check if registers is available
for a given active task in ELF notes.zip
I don't understand where the .zip part was introduced?
But if I bring up Thunderbird, I just see the 4 attachments.
Anyway, my problem, not yours.
>
> A couple other questions...
>
> As I understand it, if you attempt to do a gcore on an active
> task from a netdump vmcore, but it is not the panic task, the
> registers are not saved, and the command will abort due to the
> fatal error here in get_regs_from_elf_notes():
>
> if (!exist_regs_in_elf_notes(tc))
> error(FATAL, "cannot determine register set "
> "for %s task: %lx comm: \"%s\"\n",
> (tc->task == tt->panic_task) ? "panic" :
> "active",
> tc->task, tc->comm);
>
> In that case, why not fall through and get the registers from
> the kernel-entry exception frame in gcore_get_regs_from_eframe()?
Sorry for confusing you. I should have attached a patch for gcore
extension module together, like below. So, I've considered originally
as you've suggested.
diff --git a/src/libgcore/gcore_x86.c b/src/libgcore/gcore_x86.c
index 340531c..e650053 100644
--- a/src/libgcore/gcore_x86.c
+++ b/src/libgcore/gcore_x86.c
@@ -1471,8 +1471,9 @@ static int get_active_regs(struct task_context
*target,
return TRUE;
}
- if (get_netdump_arch() != EM_NONE) {
- struct user_regs_struct *note = get_regs_from_elf_notes(target);
+ if (exist_regs_in_elf_notes(target)) {
+ struct user_regs_struct *note =
+ get_x86_64_regs_from_elf_notes(target);
memcpy(regs, note, sizeof(struct user_regs_struct));
return TRUE;
}
OK, that makes sense...
>
> But I admit I still don't understand why you don't get
> the registers from the kernel-entry exception frame
> for *all* tasks -- even if the task was the panic task
> or one of the active tasks?
>
Saved at the kenrel-entry exception frame are not all the registers:
note information contains but exception frame doesn't contain some
segment registers; My understanding is: strictly, they could be
changed at runtime and so impossible to resotre from kernel data
only. If note information is available, the problem is easy; we get
all the registers with no difficulty.
The other reason is for dwarf unwinder receiving RSP and RIP as a
starting point of back trace operation. Restoring callee-saved
registers fully relies on dwarf unwinder now. So, it's neccesary to
pass active RSP and RIP to active tasks.
> And related to the above, the fatal error message above
> prints either "panic" or "active". How would it be possible
> for the "panic" task to fail? The exist_regs_in_elf_notes()
> always returns TRUE for the panic task.
I think you are correct.
OK, so where do we stand here? Should I take your defs.h and netdump.c
patches as they are? (perhaps with the error() panic/active argument fix)
Or do you want to export "nd" and move some of the changes back into
the gcore module? (if that makes sense)
Dave