Re: [PATCH] IA-64 4-level page table support
by Dave Anderson
Hi Troy,
While running a few tests on a RHEL3 ia64 box, I noticed
that the new "ps -a" function was failing to access user
space stack addresses.
As it turns out, it was a regression in the ia64 4-level page
table support put into 4.0-3.1. The 3-level ia64_vtop()
function was working for most user addresses, and for kernel
vmalloc addresses, by dumb luck, but user stack addresses
failed due to the use of an incorrect PGDIR_SHIFT value.
In your patch, you indicate this:
#define PGDIR_SHIFT_4L (PUD_SHIFT + (PTRS_PER_PTD_SHIFT))
#define PGDIR_SHIFT_3L (PMD_SHIFT + (PTRS_PER_PTD_SHIFT))
/* Turns out 4L & 3L PGDIR_SHIFT are the same (for now) */
#define PGDIR_SHIFT PGDIR_SHIFT_4L
But PGDIR_SHIFT_3L is 36 and PGDIR_SHIFT_4L is 47. And in
the 3-level ia64_vtop() function, PGDIR_SHIFT was being
used. By changing ia64_vtop() to use PGDIR_SHIFT_3L instead,
all translations work.
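The two values quoted above follow from ordinary page-table arithmetic. Here is a minimal sketch, assuming a 16KB page size (a PAGE_SHIFT of 14) and 8-byte table entries. Neither is stated in this thread, but together they reproduce 36 and 47. pgdir_shift() is a hypothetical helper for illustration, not a crash function:

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of crash): compute the shift that selects
 * the top-level page directory index for a 3- or 4-level layout.
 * Assumes every table occupies one page of 8-byte entries, so each level
 * resolves (page_shift - 3) address bits.
 */
static int pgdir_shift(int page_shift, int levels)
{
    int bits_per_level = page_shift - 3;    /* plays the role of PTRS_PER_PTD_SHIFT */

    assert(levels == 3 || levels == 4);

    /* The page offset, plus one field for every level below the top one. */
    return page_shift + (levels - 1) * bits_per_level;
}
```

Under those assumptions, pgdir_shift(14, 3) is 36 and pgdir_shift(14, 4) is 47, so the 3-level and 4-level shifts differ by one level's worth of index bits.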
FWIW, the problem only manifested itself with user space
addresses in the last two vm areas, which have bits set
in the address space affected by the shift value:
crash> vm
PID: 1  TASK: e000004040200000  CPU: 0  COMMAND: "init"
MM                PGD               RSS    TOTAL_VM
e00000409ef47b00  e00000404579c000  1168k  2960k
VMA               START             END               FLAGS   FILE
e00000409ef3f8b8  0                 4000              84011
e00000409ef3fe08  2000000000000000  2000000000038000  875     /lib/ld-2.3.2.so
e00000409ef3fe90  2000000000038000  2000000000040000  100877  /lib/ld-2.3.2.so
e00000409ef3f830  200000000005c000  20000000002a4000  75      /lib/tls/libc-2.3.2.so
e00000409ef3fad8  20000000002a4000  20000000002ac000  70      /lib/tls/libc-2.3.2.so
e00000409ef3f9c8  20000000002ac000  20000000002b8000  100077  /lib/tls/libc-2.3.2.so
e00000409ef3f940  20000000002b8000  20000000002c4000  100077
e00000409ef3fd80  4000000000000000  4000000000010000  1875    /sbin/init
e00000409ef3fc70  600000000000c000  6000000000010000  101877  /sbin/init
e00000409ef3fa50  6000000000010000  6000000000034000  100073
e00000409ef3fcf8  60000fff80000000  60000fff80004000  233
e00000409ef3ff18  60000fffffff8000  60000fffffffc000  100177
...so the incorrect use of 47 worked by mistake for the
other user space and vmalloc addresses.
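To see why only the last two areas were affected, compare the index field that each shift extracts from an address. The sketch below is a simplification under stated assumptions: it uses an 11-bit index mask and ignores the special handling of the ia64 region bits, so it is not the actual ia64_vtop() computation. pgd_field() and shifts_agree() are illustrative names only:

```c
#include <assert.h>
#include <stdint.h>

#define INDEX_MASK 0x7ffULL    /* assumed: 11 index bits per level */

/* Extract the index field that a given directory shift would select. */
static uint64_t pgd_field(uint64_t vaddr, int shift)
{
    assert(shift >= 0 && shift < 64);    /* larger shifts are undefined in C */
    return (vaddr >> shift) & INDEX_MASK;
}

/* Nonzero when shifting by 36 and by 47 select the same index value. */
static int shifts_agree(uint64_t vaddr)
{
    return pgd_field(vaddr, 36) == pgd_field(vaddr, 47);
}
```

For 2000000000038000 and 600000000000c000 both shifts select index 0, so the wrong shift went unnoticed. For 60000fff80000000 and 60000fffffff8000, bits 36 through 43 are set, so a shift of 36 selects 0xff while a shift of 47 still selects 0.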
It's always something... ;-)
Thanks,
Dave
18 years, 2 months
Update of list to accept several options -s x.a -s x.b -s x.c
by Olivier Daudel
Hi Dave,
I hope this does not break anything.
It is very useful for me to be able to view a chosen list of fields.
Later I shall implement:
-s x.a,b,c (as a shortcut for -s x.a -s x.b -s x.c)
Olivier
[root@fedora4 crash-4.0-3.6-patch]# ./crash -s
crash> list file_lock.fl_link -s file_lock.fl_pid -s file_lock.fl_wait -H
c036acf4
f6f795e0
  fl_pid = 2962,
  fl_wait = {
    lock = {
      slock = 1,
      magic = 3735899821,
      break_lock = 0
    },
    task_list = {
      next = 0xf6f79608,
      prev = 0xf6f79608
    }
  },
f6f79b58
  fl_pid = 2977,
  fl_wait = {
    lock = {
      slock = 1,
      magic = 3735899821,
      break_lock = 0
    },
    task_list = {
      next = 0xf6f79b80,
      prev = 0xf6f79b80
    }
  },
[...]
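The planned "-s x.a,b,c" shorthand mentioned in this message amounts to splitting the member list on commas and re-attaching the structure name to each piece. Here is a minimal stand-alone sketch of that expansion. expand_member_list() and MAX_NAME are hypothetical names, and this is not taken from the patch itself:

```c
#include <assert.h>
#include <string.h>

#define MAX_NAME 128    /* hypothetical limit on one "struct.member" string */

/*
 * Expand "struct.a,b,c" into "struct.a", "struct.b", "struct.c".
 * Returns the number of entries written to out[], or -1 if the argument
 * is malformed, too long, or has more members than max_out.
 */
static int expand_member_list(const char *arg, char out[][MAX_NAME], int max_out)
{
    const char *dot, *p;
    size_t prefix_len;
    int count = 0;

    assert(arg != NULL && out != NULL);

    dot = strchr(arg, '.');
    if (dot == NULL || dot == arg || dot[1] == '\0')
        return -1;                          /* need "name.member" at least */
    prefix_len = (size_t)(dot - arg) + 1;   /* structure name plus the '.' */

    p = dot + 1;
    while (*p != '\0') {
        size_t len = strcspn(p, ",");       /* length of the next member name */

        if (len == 0 || count == max_out || prefix_len + len >= MAX_NAME)
            return -1;
        memcpy(out[count], arg, prefix_len);
        memcpy(out[count] + prefix_len, p, len);
        out[count][prefix_len + len] = '\0';
        count++;

        p += len;
        if (*p == ',' && *++p == '\0')
            return -1;                      /* trailing comma */
    }
    return count;
}
```

Given "file_lock.fl_pid,fl_wait", this yields the two strings "file_lock.fl_pid" and "file_lock.fl_wait", which could then be handled exactly like two separate -s arguments.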
18 years, 2 months
Re: [Crash-utility] sig -g and foreach sig -g
by Dave Anderson
> May be there is a difficulty with free_all_bufs() ?
> If i understand well the foreach() function :
Hi Olivier,
As it turns out, the "foreach sig -g" can be accomplished in a much easier
manner by using crash's built-in hq_xxx() hash queue functionality.
So there's no need to (1) pre-allocate anything, or (2) re-invent the wheel
by creating a new hashing function.
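The hq_xxx() interface itself is not reproduced here. As a stand-alone illustration of the underlying idea (remember which tgid values have already been displayed and skip repeats), here is a small fixed-size set with linear probing. All names and the capacity are hypothetical, and this is exactly the sort of code that the built-in hash queue makes unnecessary inside crash:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SEEN_SLOTS 1024    /* hypothetical capacity; must be a power of two */

struct seen_set {
    unsigned long keys[SEEN_SLOTS];
    unsigned char used[SEEN_SLOTS];
    size_t count;
};

static void seen_init(struct seen_set *s)
{
    assert(s != NULL);
    memset(s, 0, sizeof(*s));
}

/* Returns 1 if key was newly added, 0 if already present, -1 if the set is full. */
static int seen_add(struct seen_set *s, unsigned long key)
{
    size_t i = (size_t)(key * 2654435761UL) & (SEEN_SLOTS - 1);
    size_t probes;

    assert(s != NULL);
    for (probes = 0; probes < SEEN_SLOTS; probes++) {
        if (!s->used[i]) {
            s->used[i] = 1;
            s->keys[i] = key;
            s->count++;
            return 1;
        }
        if (s->keys[i] == key)
            return 0;
        i = (i + 1) & (SEEN_SLOTS - 1);    /* linear probing */
    }
    return -1;
}
```

A caller walking every task would display a thread group only when seen_add() returns 1 for its tgid.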
Anyway, I've got the "foreach sig -g" piece pretty much complete -- so
I'll take it from here.
Thanks again for your contribution,
Dave
18 years, 2 months
sig -g and foreach sig -g
by Olivier Daudel
Hello Dave,
I think sig -g is quite OK.
I have a problem with foreach sig -g.
If I use malloc() and free(), it seems to work (so the algorithm may be
OK).
I have tried to use hashing on the tgid to check whether we have already
displayed it.
If I use GETBUF() and FREEBUF(), it crashes.
Maybe I don't understand some conditions for using GETBUF() and FREEBUF()?
Thanks for any suggestions.
18 years, 2 months
New option -g with sig
by Olivier Daudel
Hello Dave,
This patch tries to apply your new function show_tgid_list() in the signal
context.
The signal information at the task group level is shown first, followed by
the relevant part for each task in the task group.
Once it is OK with you, I'll update the help file.
crash> sig -g 27491
SIGNAL_STRUCT: f78f8380 COUNT: 3
SIG SIGACTION HANDLER MASK FLAGS
[1] c1f3f104 SIG_DFL 0000000000000000 0
[2] c1f3f118 SIG_DFL 0000000000000000 0
[3] c1f3f12c SIG_DFL 0000000000000000 0
[4] c1f3f140 SIG_DFL 0000000000000000 0
[5] c1f3f154 SIG_DFL 0000000000000000 0
[6] c1f3f168 SIG_DFL 0000000000000000 0
[7] c1f3f17c SIG_DFL 0000000000000000 0
[8] c1f3f190 SIG_DFL 0000000000000000 0
[9] c1f3f1a4 SIG_DFL 0000000000000000 0
[10] c1f3f1b8 804877e 0000000000000000 4 (SA_SIGINFO)
[11] c1f3f1cc SIG_DFL 0000000000000000 0
...
[62] c1f3f5c8 SIG_DFL 0000000000000000 0
[63] c1f3f5dc SIG_DFL 0000000000000000 0
[64] c1f3f5f0 SIG_DFL 0000000000000000 0
SHARED_PENDING
SIGNAL: 0000000200000200
SIGQUEUE: SIG SIGINFO
10 f56f3c84
34 f56f390c
34 f56f3878
34 f56f37e4
34 f56f3060
PID: 27489 TASK: f606b560 CPU: 0 COMMAND: "sig_procthread"
SIGPENDING: no
BLOCKED: 0000080200000a00
PRIVATE_PENDING
SIGNAL: 0000080000000800
SIGQUEUE: SIG SIGINFO
12 f56f3f68
44 f56f3ed4
44 f56f3e40
44 f56f3dac
44 f56f3d18
PID: 27490 TASK: f7d09020 CPU: 0 COMMAND: "sig_procthread"
SIGPENDING: no
BLOCKED: 0000000200000200
PRIVATE_PENDING
SIGNAL: 0000000000000000
SIGQUEUE: (empty)
PID: 27491 TASK: f544f020 CPU: 1 COMMAND: "sig_procthread"
SIGPENDING: no
BLOCKED: 0000000200000200
PRIVATE_PENDING
SIGNAL: 0000000000000000
SIGQUEUE: (empty)
18 years, 2 months
sig -g and foreach sig -g
by Olivier Daudel
Sorry, it would be better to see this new attached file.
18 years, 2 months
Crash in student context
by Olivier Daudel
Just for information, we use crash with students.
They don't have root privileges, so we have relaxed some of the security
checks in the mem.c driver and allowed /dev/mem to be read.
Our purpose is just to prevent big errors.
It's fascinating to have 15 or more crash sessions running in parallel on
the same computer.
Regards.
18 years, 2 months
crash version 4.0-3.6 is available
by Dave Anderson
- Workaround for pre-2.6.17 kernels whose vmlinux file does not
contain debug information for the "pid_hash" array. Without this
patch, the crash session would fail during initialization with the
error message: "crash: cannot determine pid_hash array dimensions".
This problem appears to be limited to kernels built with gcc
version 4.0.0, which had a known regression that omitted debug
information for uninitialized variables. (anderson(a)redhat.com)
Download from: http://people.redhat.com/anderson
18 years, 2 months
Problem with "xm save" x86-64 cores - crash.4.0-3.5
by Tejasvi Aswathanarayana
Crash exits with the error "cannot determine pid_hash array
dimensions". Looking at the crash changelog, it appears that this was
fixed in 4.0-2.24 for the 2.6.17 kernel. The core I have is from a
2.6.16.13 xenified kernel. Is the fix even relevant?
<output>
$ ./crash vmlinux-2.6.16.13-xen test.core
crash 4.0-3.5
Copyright (C) 2002, 2003, 2004, 2005, 2006 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005 Fujitsu Limited
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
please wait... (gathering task table data)
crash: cannot determine pid_hash array dimensions
</output>
Is there a way I can get crash's source for releases 4.0-2.24 and
4.0-2.23, so that I can try a fix for this kernel in case it is a
kernel-specific fix?
Thanks
-Tejasvi
18 years, 2 months
Re: [Crash-utility] Problem with "xm save" x86-64 cores -crash.4.0-3.5
by Dave Anderson
Looking at the vmlinux files with dwarfdump and gdb 6.3 alone, I can
see now that in the target 2.6.16 xen kernel (built with gcc 4.0.0,
if that matters), what is happening is that for static kernel data that is
declared without being initialized, its data type information is being
completely left out of the debug info section.
So, taking kernel/pid.c as an example, symbols such as "pid_hash",
"pidhash_shift" and "last_pid" are declared uninitialized. Looking
at the 2.6.16 xen kernel built with gcc 4.0.0:
# gdb vmlinux
GNU gdb Red Hat Linux (6.3.0.0-1.63rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...
Using host libthread_db library "/lib64/tls/libthread_db.so.1".
(gdb) whatis last_pid
type = <data variable, no debug info>
(gdb) whatis pid_hash
type = <data variable, no debug info>
(gdb) whatis pidhash_shift
type = <data variable, no debug info>
(gdb)
Then looking at a non-xen 2.6.16-era kernel built with gcc 4.1.0:
# gdb /usr/dumps/kdump/vmlinux
GNU gdb Red Hat Linux (6.3.0.0-1.63rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...
Using host libthread_db library "/lib64/tls/libthread_db.so.1".
(gdb) whatis last_pid
type = int
(gdb) whatis pid_hash
type = struct hlist_head *[4]
(gdb) whatis pidhash_shift
type = int
(gdb)
I don't believe the "xen" angle makes a difference. I only have
2.6.17-era xen kernels as a reference -- built with gcc 4.1.1 -- and
they do not exhibit this behavior.
I presume that this affects files other than kernel/pid.c, but haven't
checked any further.
So it would be interesting to know whether an upgrade in compilers
would make a difference.
Regardless of that, I'll still come up with a fix and release for this,
since there's really nothing I can do with kernels of this ilk except
to work around their shortcomings. If the debug data's not there,
it's not there...
Thanks,
Dave
18 years, 2 months