Eugene Teo wrote:
Hi Dave,
I tried to run crash on Fedora 8's kernel 2.6.24.3-12.fc8 x86_64, and
it has errors that look like the following:
[...]
crash: duplicate task in pid_hash: ffff81012f0811d0
crash: duplicate task in pid_hash: ffff81012f0811d0
crash: duplicate task in pid_hash: ffff81012f0811d0
crash: duplicate task in pid_hash: ffff81012f0811d0
crash: duplicate task in pid_hash: ffff81012f0811d0
crash: cannot gather a stable task list via pid_hash (500 retries)
I ran crash with -d7, and uploaded the log for debugging:
http://hera.kernel.org/~eugeneteo/crash.log
Thanks,
Eugene
Hi Eugene,
I can't reproduce this one -- here on a freshly-installed x86_64
running 2.6.24.3-12.fc8:
# crash
crash 4.0-6.1
Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
KERNEL: /usr/lib/debug/lib/modules/2.6.24.3-12.fc8/vmlinux
DUMPFILE: /dev/crash
CPUS: 8
DATE: Mon Mar 17 11:40:59 2008
UPTIME: 00:17:01
LOAD AVERAGE: 0.50, 0.18, 0.06
TASKS: 169
NODENAME: hp-dl585g2-01.rhts.boston.redhat.com
RELEASE: 2.6.24.3-12.fc8
VERSION: #1 SMP Tue Feb 26 14:21:30 EST 2008
MACHINE: x86_64 (2812 Mhz)
MEMORY: 8 GB
PID: 2938
COMMAND: "crash"
TASK: ffff8102440f48b0 [THREAD_INFO: ffff81027d912000]
CPU: 1
STATE: TASK_RUNNING (ACTIVE)
crash>
From the debug output, there's no useful info re: task ffff81012f0811d0
other than the fact that it's being seen as a duplicate in a pid_hash
chain, and that it doesn't appear to be a temporary "shifting-sands"
condition, because 500 retries do not clear it.
But you can probably tinker with the task refresh function in the
crash utility to skip it, and then investigate why it's there twice.
The crash function should be refresh_hlist_task_table_v3():
crash> help -t | grep refresh
refresh_task_table: refresh_hlist_task_table_v3()
crash>
This piece here in refresh_hlist_task_table_v3() is what gets
retried a maximum of 500 times:
        if (!is_idle_thread(next) && !hq_enter(next)) {
                error(INFO, "%sduplicate task in pid_hash: %lx\n",
                        DUMPFILE() ? "\n" : "", next);
                if (DUMPFILE())
                        break;
                hq_close();
                retries++;
                goto retry_pid_hash;
        }
Try modifying it so that, instead of doing the hq_close()/retry on a
live system, it does either a "continue", or perhaps just a "break".
If that's the only irregularity found, you should be able to
get a "crash>" prompt; and then you can look at task ffff81012f0811d0,
because the first instance found in the pid_hash chain should be
listed by "ps" as a task.
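
To illustrate the idea of skipping a duplicate rather than restarting
the whole walk, here is a minimal standalone sketch. It is not the
crash utility's code: seen_set, seen_enter() and gather_tasks() are
hypothetical names made up for this example, and the "chain" is just
an array of task addresses. seen_enter() only mimics the way the
snippet above treats a zero return from hq_enter() as "already seen".

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SEEN 64

/* Hypothetical stand-in for a hash queue: a small linear "seen" set. */
struct seen_set {
        unsigned long addr[MAX_SEEN];
        size_t count;
};

/*
 * Returns 1 if addr was newly entered, 0 if it was already present.
 * (Assumption: this mirrors how the quoted code reacts to hq_enter().)
 */
int seen_enter(struct seen_set *s, unsigned long addr)
{
        size_t i;

        for (i = 0; i < s->count; i++)
                if (s->addr[i] == addr)
                        return 0;

        assert(s->count < MAX_SEEN);
        s->addr[s->count++] = addr;
        return 1;
}

/*
 * Walk a chain of task addresses. A duplicate is counted and skipped
 * with "continue" instead of throwing the list away and retrying.
 * Unique addresses are stored in out[] in first-seen order; the
 * number stored is returned, and *dups gets the duplicate count.
 */
size_t gather_tasks(const unsigned long *chain, size_t len,
                    unsigned long *out, size_t *dups)
{
        struct seen_set seen = { .count = 0 };
        size_t i, n = 0;

        *dups = 0;
        for (i = 0; i < len; i++) {
                if (!seen_enter(&seen, chain[i])) {
                        (*dups)++;
                        continue;       /* skip it, keep walking */
                }
                out[n++] = chain[i];
        }
        return n;
}
```

With a chain that contains the same address twice, gather_tasks()
keeps the first instance and reports one duplicate, which is the
behavior the "continue" change is meant to give you. Using "break"
instead would stop walking that chain at the first duplicate.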
Dave