October 2006 - Crash-utility - Crash Utility List Archives

Re: [PATCH] IA-64 4-level page table support

by Dave Anderson

Hi Troy,

While running a few tests on a RHEL3 ia64 box, I noticed that the new "ps -a" function was failing to access user space stack addresses. As it turns out, it was a regression in the ia64 4-level page table support put into 4.0-3.1. The 3-level ia64_vtop() function was working for most user addresses, and for kernel vmalloc addresses, by dumb luck, but user stack addresses failed due to the use of an incorrect PGDIR_SHIFT value.

In your patch, you indicate this:

    #define PGDIR_SHIFT_4L (PUD_SHIFT + (PTRS_PER_PTD_SHIFT))
    #define PGDIR_SHIFT_3L (PMD_SHIFT + (PTRS_PER_PTD_SHIFT))
    /* Turns out 4L & 3L PGDIR_SHIFT are the same (for now) */
    #define PGDIR_SHIFT PGDIR_SHIFT_4L

But PGDIR_SHIFT_3L is 36 and PGDIR_SHIFT_4L is 47. And in the 3-level ia64_vtop() function, PGDIR_SHIFT was being used. By changing ia64_vtop() to use PGDIR_SHIFT_3L instead, all translations work.

FWIW, the problem only manifested itself with user space addresses in the last two vm areas, which have bits set in the address space affected by the shift value:

    crash> vm
    PID: 1  TASK: e000004040200000  CPU: 0  COMMAND: "init"
           MM               PGD          RSS    TOTAL_VM
    e00000409ef47b00  e00000404579c000  1168k    2960k
          VMA              START              END         FLAGS  FILE
    e00000409ef3f8b8                 0              4000   84011
    e00000409ef3fe08  2000000000000000  2000000000038000     875  /lib/ld-2.3.2.so
    e00000409ef3fe90  2000000000038000  2000000000040000  100877  /lib/ld-2.3.2.so
    e00000409ef3f830  200000000005c000  20000000002a4000      75  /lib/tls/libc-2.3.2.so
    e00000409ef3fad8  20000000002a4000  20000000002ac000      70  /lib/tls/libc-2.3.2.so
    e00000409ef3f9c8  20000000002ac000  20000000002b8000  100077  /lib/tls/libc-2.3.2.so
    e00000409ef3f940  20000000002b8000  20000000002c4000  100077
    e00000409ef3fd80  4000000000000000  4000000000010000    1875  /sbin/init
    e00000409ef3fc70  600000000000c000  6000000000010000  101877  /sbin/init
    e00000409ef3fa50  6000000000010000  6000000000034000  100073
    e00000409ef3fcf8  60000fff80000000  60000fff80004000     233
    e00000409ef3ff18  60000fffffff8000  60000fffffffc000  100177

...so the incorrect use of 47 worked by mistake for the other user space and vmalloc addresses.

It's always something... ;-)

Thanks,
Dave
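To make the arithmetic concrete, here is a minimal sketch of why only the last two vm areas were affected. It is not crash's ia64_vtop(). The helper dir_index() is hypothetical, and PAGE_SHIFT = 14 (16KB pages) is an assumption chosen because it reproduces the 36 and 47 values quoted above. The real ia64 top-level index also folds in the region bits, which this sketch simply strips.

```c
#include <stdint.h>

/* Assumption: 16KB pages and 8-byte table entries, which yields 36 and 47. */
#define PAGE_SHIFT          14
#define PTRS_PER_PTD_SHIFT  (PAGE_SHIFT - 3)
#define PMD_SHIFT           (PAGE_SHIFT + PTRS_PER_PTD_SHIFT)
#define PUD_SHIFT           (PMD_SHIFT + PTRS_PER_PTD_SHIFT)
#define PGDIR_SHIFT_3L      (PMD_SHIFT + (PTRS_PER_PTD_SHIFT))
#define PGDIR_SHIFT_4L      (PUD_SHIFT + (PTRS_PER_PTD_SHIFT))

/*
 * Hypothetical, simplified directory index: drop the three region bits
 * (63..61), shift, and mask to the number of entries in one table page.
 */
static uint64_t dir_index(uint64_t vaddr, unsigned int shift)
{
    uint64_t offset = vaddr & ((1ULL << 61) - 1);

    return (offset >> shift) & ((1ULL << PTRS_PER_PTD_SHIFT) - 1);
}
```

Under these assumptions, an address such as 2000000000038000 has no bits set between bit 36 and the region bits, so both shifts select index 0 and the wrong shift goes unnoticed. The stack-area address 60000fff80000000 does have bits set in that range, so a shift of 36 selects a non-zero index while a shift of 47 still selects 0, and the walk goes wrong.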

Update of list to accept several options -s x.a -s x.b -s x.c

by Olivier Daudel

Hi Dave,

I hope this does not break anything. It is very useful for me to be able to view a chosen list of fields.

Later I shall implement: -s x.a,b,c (as a shortcut for -s x.a -s x.b -s x.c)

Olivier

    [root@fedora4 crash-4.0-3.6-patch]# ./crash -s
    crash> list file_lock.fl_link -s file_lock.fl_pid -s file_lock.fl_wait -H c036acf4
    f6f795e0
      fl_pid = 2962,
      fl_wait = {
        lock = {
          slock = 1,
          magic = 3735899821,
          break_lock = 0
        },
        task_list = {
          next = 0xf6f79608,
          prev = 0xf6f79608
        }
      },
    f6f79b58
      fl_pid = 2977,
      fl_wait = {
        lock = {
          slock = 1,
          magic = 3735899821,
          break_lock = 0
        },
        task_list = {
          next = 0xf6f79b80,
          prev = 0xf6f79b80
        }
      },
    [...]
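The planned "-s x.a,b,c" shortcut could be handled by expanding the argument into separate "struct.member" strings before they reach the existing -s handling. The following is only a sketch of that idea: expand_member_list() and MAX_MEMBER_LEN are hypothetical names, not part of crash, and the patch itself may do this differently.

```c
#include <string.h>

#define MAX_MEMBER_LEN 128   /* hypothetical limit for one "struct.member" string */

/*
 * Hypothetical helper: expand "x.a,b,c" into "x.a", "x.b", "x.c".
 * Returns the number of strings written to out, or -1 if the spec has no
 * "struct." prefix, contains an empty member, or does not fit.
 */
static int expand_member_list(const char *spec,
                              char out[][MAX_MEMBER_LEN], int max_out)
{
    const char *dot = strchr(spec, '.');
    const char *p;
    size_t prefix_len;
    int count = 0;

    if (dot == NULL || dot == spec)
        return -1;
    prefix_len = (size_t)(dot - spec) + 1;      /* "x." including the dot */

    p = dot + 1;
    for (;;) {
        size_t len = strcspn(p, ",");           /* length of the next member */

        if (len == 0 || count >= max_out ||
            prefix_len + len >= MAX_MEMBER_LEN)
            return -1;
        memcpy(out[count], spec, prefix_len);
        memcpy(out[count] + prefix_len, p, len);
        out[count][prefix_len + len] = '\0';
        count++;

        if (p[len] == '\0')
            break;
        p += len + 1;                           /* skip the comma */
    }
    return count;
}
```

With this approach, "file_lock.fl_pid,fl_wait" would expand to the same two arguments used in the session above, so the rest of the list command would not need to know about the shortcut.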

Re: [Crash-utility] sig -g and foreach sig -g

by Dave Anderson

> May be there is a difficulty with free_all_bufs() ?
> If i understand well the foreach() function :

Hi Olivier,

As it turns out, the "foreach sig -g" can be accomplished in a much easier manner by using crash's built-in hq_xxx() hash queue functionality. So there's no need to (1) pre-allocate anything, or (2) re-invent the wheel by creating a new hashing function.

Anyway, I've got the "foreach sig -g" piece pretty much complete -- so I'll take it from here.

Thanks again for your contribution,
Dave

sig -g and foreach sig -g

by Olivier Daudel

Hello Dave,

I think sig -g is quite OK. I have a problem with foreach sig -g. If I use malloc() and free(), it seems to work (so maybe the algorithm is OK). I have tried to use hashing on tgid to check whether we have already displayed it. If I use GETBUF() and FREEBUF(), it crashes. Maybe I don't understand some conditions for using GETBUF() and FREEBUF()?

Thanks for any suggestions.
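For readers following the thread, the "hash on tgid to avoid displaying a group twice" idea can be sketched as a small chained hash set. This is an illustration only: it uses plain malloc() and free(), the variant reported above as working, and the names tgid_set, tgid_set_add() and tgid_set_free() are hypothetical. It is neither the attached patch nor crash's own hash queue facility.

```c
#include <stdlib.h>

#define TGID_BUCKETS 64   /* hypothetical table size */

struct tgid_node {
    unsigned long tgid;
    struct tgid_node *next;
};

struct tgid_set {
    struct tgid_node *bucket[TGID_BUCKETS];
};

/*
 * Record a tgid. Returns 1 if it was newly added (the group still has to
 * be displayed), 0 if it was already present, -1 on allocation failure.
 */
static int tgid_set_add(struct tgid_set *set, unsigned long tgid)
{
    struct tgid_node **head = &set->bucket[tgid % TGID_BUCKETS];
    struct tgid_node *n;

    for (n = *head; n != NULL; n = n->next)
        if (n->tgid == tgid)
            return 0;

    n = malloc(sizeof(*n));
    if (n == NULL)
        return -1;
    n->tgid = tgid;
    n->next = *head;
    *head = n;
    return 1;
}

/* Release every node and leave the set empty. */
static void tgid_set_free(struct tgid_set *set)
{
    for (int i = 0; i < TGID_BUCKETS; i++) {
        struct tgid_node *n = set->bucket[i];

        while (n != NULL) {
            struct tgid_node *next = n->next;
            free(n);
            n = next;
        }
        set->bucket[i] = NULL;
    }
}
```

A foreach-style loop would call tgid_set_add() once per task and print the group only when it returns 1, then call tgid_set_free() when the command completes.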

New option -g with sig

by Olivier Daudel

Hello Dave,

This patch tries to apply your new function show_tgid_list() in the signal context. The signal information at the task group level is shown first, then the relevant part is shown for each task in the task group. Once it is OK with you, I'll update the help file.

    crash> sig -g 27491
    SIGNAL_STRUCT: f78f8380  COUNT: 3
     SIG SIGACTION  HANDLER       MASK       FLAGS
     [1]  c1f3f104  SIG_DFL 0000000000000000 0
     [2]  c1f3f118  SIG_DFL 0000000000000000 0
     [3]  c1f3f12c  SIG_DFL 0000000000000000 0
     [4]  c1f3f140  SIG_DFL 0000000000000000 0
     [5]  c1f3f154  SIG_DFL 0000000000000000 0
     [6]  c1f3f168  SIG_DFL 0000000000000000 0
     [7]  c1f3f17c  SIG_DFL 0000000000000000 0
     [8]  c1f3f190  SIG_DFL 0000000000000000 0
     [9]  c1f3f1a4  SIG_DFL 0000000000000000 0
    [10]  c1f3f1b8  804877e 0000000000000000 4 (SA_SIGINFO)
    [11]  c1f3f1cc  SIG_DFL 0000000000000000 0
    ...
    [62]  c1f3f5c8  SIG_DFL 0000000000000000 0
    [63]  c1f3f5dc  SIG_DFL 0000000000000000 0
    [64]  c1f3f5f0  SIG_DFL 0000000000000000 0
    SHARED_PENDING
        SIGNAL: 0000000200000200
      SIGQUEUE:  SIG  SIGINFO
                  10  f56f3c84
                  34  f56f390c
                  34  f56f3878
                  34  f56f37e4
                  34  f56f3060

    PID: 27489  TASK: f606b560  CPU: 0  COMMAND: "sig_procthread"
    SIGPENDING: no
       BLOCKED: 0000080200000a00
    PRIVATE_PENDING
        SIGNAL: 0000080000000800
      SIGQUEUE:  SIG  SIGINFO
                  12  f56f3f68
                  44  f56f3ed4
                  44  f56f3e40
                  44  f56f3dac
                  44  f56f3d18

    PID: 27490  TASK: f7d09020  CPU: 0  COMMAND: "sig_procthread"
    SIGPENDING: no
       BLOCKED: 0000000200000200
    PRIVATE_PENDING
        SIGNAL: 0000000000000000
      SIGQUEUE: (empty)

    PID: 27491  TASK: f544f020  CPU: 1  COMMAND: "sig_procthread"
    SIGPENDING: no
       BLOCKED: 0000000200000200
    PRIVATE_PENDING
        SIGNAL: 0000000000000000
      SIGQUEUE: (empty)
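The SIGNAL and BLOCKED values in this output can be read as bit masks. The sketch below assumes the usual Linux convention that signal n corresponds to bit n-1 of a 64-bit mask printed as one hex value; sigmask_to_signals() is a hypothetical helper for illustration, not a crash function.

```c
#include <stdint.h>

/*
 * Hypothetical helper: list the signal numbers set in a 64-bit mask,
 * assuming signal n is stored in bit n-1. Writes them in ascending
 * order to sigs[] and returns how many were found.
 */
static int sigmask_to_signals(uint64_t mask, int sigs[64])
{
    int count = 0;

    for (int bit = 0; bit < 64; bit++)
        if (mask & (1ULL << bit))
            sigs[count++] = bit + 1;
    return count;
}
```

Under that assumption, the shared pending mask 0000000200000200 decodes to signals 10 and 34, and the private pending mask 0000080000000800 of PID 27489 decodes to signals 12 and 44, which agrees with the signal numbers listed in the corresponding SIGQUEUE entries.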

sig -g and foreach sig -g

by Olivier Daudel

Sorry, it would be better to see this new attached file.

Hello Dave,

I think sig -g is quite OK. I have a problem with foreach sig -g. If I use malloc() and free(), it seems to work (so maybe the algorithm is OK). I have tried to use hashing on tgid to check whether we have already displayed it. If I use GETBUF() and FREEBUF(), it crashes. Maybe I don't understand some conditions for using GETBUF() and FREEBUF()?

Thanks for any suggestions.

Crash in student context

by Olivier Daudel

Just for information, we use crash with students. They don't have root privileges, so we have relaxed some of the security checks in the mem.c driver to allow /dev/mem to be read. Our purpose is just to prevent big errors. It's fascinating to have 15 or more crash sessions running in parallel on the same computer.

Regards.

crash version 4.0-3.6 is available

by Dave Anderson

- Workaround for pre-2.6.17 kernels whose vmlinux file does not contain debug information for the "pid_hash" array. Without this patch, the crash session would fail during initialization with the error message: "crash: cannot determine pid_hash array dimensions". This problem appears to be limited to kernels built with gcc version 4.0.0, which had a known regression that omitted debug information for uninitialized variables. (anderson(a)redhat.com) Download from: http://people.redhat.com/anderson

Problem with "xm save" x86-64 cores - crash.4.0-3.5

by Tejasvi Aswathanarayana

Crash exits with an error "cannot determine pid_hash array dimensions". Looking at the crash change log, it appears that it was fixed in 4.0-2.24 for the 2.6.17 kernel. The core I have is of a 2.6.16.13 xenified kernel. Is the fix even relevant?

<output>
$ ./crash vmlinux-2.6.16.13-xen test.core

crash 4.0-3.5
Copyright (C) 2002, 2003, 2004, 2005, 2006 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005 Fujitsu Limited
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

please wait... (gathering task table data)
crash: cannot determine pid_hash array dimensions
</output>

Is there a way I can get crash's source at releases 4.0-2.24 and 4.0-2.23 so that I can try a fix for this kernel in case it is a specific kernel fix?

Thanks
-Tejasvi

Re: [Crash-utility] Problem with "xm save" x86-64 cores -crash.4.0-3.5

by Dave Anderson

Looking at the vmlinux files with dwarfdump and gdb 6.3 alone, I can see now that in the target 2.6.16 xen kernel (built with gcc 4.0.0, if that matters), what is happening is that for static kernel data that is declared without being initialized, its data type information is being completely left out of the debug info section.

So, taking kernel/pid.c as an example, symbols such as "pid_hash", "pidhash_shift" and "last_pid" are declared uninitialized. Looking at the 2.6.16 xen kernel built with gcc 4.0.0:

    # gdb vmlinux
    GNU gdb Red Hat Linux (6.3.0.0-1.63rh)
    Copyright 2004 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "x86_64-redhat-linux-gnu"...Using host libthread_db library "/lib64/tls/libthread_db.so.1".

    (gdb) whatis last_pid
    type = <data variable, no debug info>
    (gdb) whatis pid_hash
    type = <data variable, no debug info>
    (gdb) whatis pidhash_shift
    type = <data variable, no debug info>
    (gdb)

Then looking at a non-xen 2.6.16-era kernel built with gcc 4.1.0:

    # gdb /usr/dumps/kdump/vmlinux
    GNU gdb Red Hat Linux (6.3.0.0-1.63rh)
    Copyright 2004 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "x86_64-redhat-linux-gnu"...Using host libthread_db library "/lib64/tls/libthread_db.so.1".

    (gdb) whatis last_pid
    type = int
    (gdb) whatis pid_hash
    type = struct hlist_head *[4]
    (gdb) whatis pidhash_shift
    type = int
    (gdb)

I don't believe the "xen" angle makes a difference. I only have 2.6.17-era xen kernels as a reference -- built with gcc 4.1.1 -- and they do not exhibit this behavior. I presume that this affects files other than kernel/pid.c, but haven't checked any further. So it would be interesting to know whether an upgrade in compilers would make a difference.

Regardless of that, I'll still come up with a fix and release for this, since there's really nothing I can do with kernels of this ilk except to work around their shortcomings. If the debug data's not there, it's not there...

Thanks,
Dave
