----- "Joe Porter" <joe.porter(a)ccur.com> wrote:
> Sorry -- I didn't make myself clear enough in my question.
Hi Dave,
Don't sweat it.
If I had time to really look and think everything would make much more
sense. ;)
>
> What I meant was: did the original code in the crash x86_init()
> fall into the "if" clause here:
>
>         if (!VALID_STRUCT(user_regs_struct)) {
>                 /*  Use this hardwired version -- sometimes the
>                  *  debuginfo doesn't pick this up even though
>                  *  it exists in the kernel; it shouldn't change.
>                  */
>
> Since the offset values and structure size required shouldn't have changed
> (even though the names did),
I think I was ending up in the if statement ... trying to remember back
to a couple of weeks ago.
I never looked at the VALID_STRUCT code, but I wouldn't be surprised if
it is failing ... see the gdb vmlinux output below. The data types are
different and several members got joined up into the long unsigned ints.
The VALID_STRUCT() simply returns TRUE if the debuginfo is aware
of the user_regs_struct:
#define VALID_STRUCT(X) (size_table.X >= 0)
AIUI, if nothing in the kernel actually instantiates a declared data
structure, then it won't be placed in the debuginfo data of the vmlinux
file. At least that is what has happened over the years, hence the
addition of the "if (!VALID_STRUCT())" kludge in the crash utility.
When that happens, the structure size and the two offset values are
hardwired. And if they were hardwired, then you would not have seen
the error with the kdump.
> I'm presuming that x86_init() did *not* fall
> into that code, because if it did, the offsets and size values would have
> been assigned, and you wouldn't have seen the ultimate error.
OK
> So my
> guess is that the user_regs_struct *is* in the debuginfo of the new
> kernel.
I assumed that gdb knew about the structure.
Here's what happens on a RHEL5 2.6.18-based kernel:
# gdb /usr/lib/debug/lib/modules/2.6.18-92.el5/vmlinux
GNU gdb Red Hat Linux (6.5-37.el5rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...
Using host libthread_db library "/lib64/libthread_db.so.1".
(gdb) ptype struct user_regs_struct
No struct type named user_regs_struct.
(gdb)
The above is an x86_64, but the same thing happens with an x86.
So with RHEL5 kernels, the "if" statement will be taken, and
the size/offset values get hardwired.
On your kdump, especially since your gdb attempts show that the
user_regs_struct debug data exists, the values apparently were
not hardwired, because you would *not* have entered the "if" clause.
In your case, the structure size would have been determined OK,
and therefore the VALID_STRUCT() would have subsequently worked.
But the two MEMBER_OFFSET_INIT() attempts would have failed quietly
because of the name change -- and later on the kdump attempt would
fail when it tried to use them.
> That's what I'm trying to confirm here. In other words, if
> you do this:
>
> # gdb vmlinux
> ...
> (gdb) ptype struct user_regs_struct
>
> does it know about the structure?
Here you go (git8 then git7):
# Linux kernel version: 2.6.24-git8
[root@beebo 11-11-08.1814.44]# gdb vmlinux
GNU gdb Red Hat Linux (6.5-37.el5_2.2rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) ptype struct user_regs_struct
type = struct user_regs_struct {
long unsigned int bx;
long unsigned int cx;
long unsigned int dx;
long unsigned int si;
long unsigned int di;
long unsigned int bp;
long unsigned int ax;
long unsigned int ds;
long unsigned int es;
long unsigned int fs;
long unsigned int gs;
long unsigned int orig_ax;
long unsigned int ip;
long unsigned int cs;
long unsigned int flags;
long unsigned int sp;
long unsigned int ss;
}
(gdb)
# Linux kernel version: 2.6.24-git7
[root@beebo 11-11-08.1750.38]# gdb vmlinux
GNU gdb Red Hat Linux (6.5-37.el5_2.2rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) ptype struct user_regs_struct
type = struct user_regs_struct {
long int ebx;
long int ecx;
long int edx;
long int esi;
long int edi;
long int ebp;
long int eax;
short unsigned int ds;
short unsigned int __ds;
short unsigned int es;
short unsigned int __es;
short unsigned int fs;
short unsigned int __fs;
short unsigned int gs;
short unsigned int __gs;
long int orig_eax;
long int eip;
short unsigned int cs;
short unsigned int __cs;
long int eflags;
long int esp;
short unsigned int ss;
short unsigned int __ss;
}
(gdb)
> If it does, then all the changes
> you made in the "if" part of the patch are not required.
Gotcha.
What is kind of strange is that nobody has reported this before.
By any chance does the Concurrent kernel add something that
actually instantiates a user_regs_struct?
> The x86_64 would use x86_64_init() instead of x86_init(), so it's
> irrelevant. And the x86_64 code doesn't care about those fields.
>
Understood.
> I appreciate your time -- sorry to drag you down into my world.
> (Please forgive an old "ccur.com" guy...)
Original Concurrent or from the old Harris people?
From the Westford MA branch of the "original" Concurrent (which was
actually created when Masscomp and Concurrent/New-Jersey merged).
Not long after the Harris/Concurrent merger, the Concurrent/Westford
office closed. (And I moved on to Digital)
No forgiveness necessary ... crash is the best tool out there ... we
should all thank you loudly.
I'm the kdump/crash guy here, but I'm not allowed to spend much time on
it ... I'm grateful it works as well as it does.
I'm very busy on another project right now that has too much visibility
at the top for me to appear to be doing much else.
If you like, I can help out testing and providing feedback ... more when
things calm down.
I'll post a patch that you can run a quick test with.
But just for my own sanity, can you verify that the failing crash
utility did *not* pass through the "if" kludge?
Thanks,
Dave