----- Original Message -----
Hi Dave,
I got a dump where a process "gmain" was incorrectly marked as running:
crash> ps | grep gmain
> 217 1 5 8bec23420 IN 0.0 463276 18240 gmain
The reason was that the "brute force" way of parsing the "bt -t -o"
output in panic_search() found the symbol "panic" on the stack:
crash> bt -t -o 8bec23420
PID: 217 TASK: 8bec23420 CPU: 5 COMMAND: "gmain"
START: __schedule at 83f650
[ 8b662b900] (null) at 0
[ 8b662b950] (null) at 0
[ 8b662b978] __schedule at 83f650
[ 8b662b990] (null) at 0
...
[ 8b662bb18] (null) at 0
[ 8b662bb40] panic at 83679a <<<<<--------------
[ 8b662bb58] _ehead at 280da
I guess the obvious question is why "panic" was on the stack?
The real stack trace was as follows:
crash> bt 8bec23420
Detaching after fork from child process 15508.
PID: 217 TASK: 8bec23420 CPU: 5 COMMAND: "gmain"
#0 [8b662b8f0] __schedule at 83f650
#1 [8b662b958] schedule at 83fade
#2 [8b662b970] schedule_hrtimeout_range_clock at 842fc8
#3 [8b662ba10] poll_schedule_timeout at 2c6e8a
#4 [8b662ba30] do_sys_poll at 2c8604
#5 [8b662be40] sys_poll at 2c8852
#6 [8b662bea8] system_call at 843a66
IMHO the "-t" method is quite risky (at least on s390). What about using
the "normal" stack backtrace without the "-t" bt option?
That really worries me -- introducing the usage of normal backtrace on all tasks
instead of simply walking the stack memory looking for text addresses is a huge
change.
Dave
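
To illustrate the difference between the two approaches discussed above, here
is a minimal sketch in C. It is not crash's code: the symbol table, its address
ranges, and the names toy_syms, lookup_text, scan_finds, toy_stack_words and
toy_frame_pcs are all hypothetical, built around the addresses in the dump. It
only shows why scanning every stack word for text addresses can report a stale
"panic" entry that a frame-by-frame unwind would never visit.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical symbol entry: text range [start, end) and its name. */
struct text_sym {
    unsigned long start;
    unsigned long end;
    const char *name;
};

/* Toy table; the ranges are invented to cover the addresses in the dump. */
static const struct text_sym toy_syms[] = {
    { 0x836700UL, 0x836900UL, "panic" },
    { 0x83f600UL, 0x83fa00UL, "__schedule" },
    { 0x83fa00UL, 0x83fc00UL, "schedule" },
};

/* Raw stack words, as a "bt -t" style scan would see them.
 * 0x83679a is a leftover value, not a live return address. */
static const unsigned long toy_stack_words[] = {
    0x0UL, 0x0UL, 0x83f650UL, 0x0UL, 0x0UL, 0x83679aUL, 0x280daUL,
};

/* Return addresses of the frames a real unwind would walk (first two
 * frames of the "bt" output only, to keep the example short). */
static const unsigned long toy_frame_pcs[] = {
    0x83f650UL, 0x83fadeUL,
};

/* Resolve an address to a symbol name, or NULL if it is not known text. */
static const char *lookup_text(unsigned long addr)
{
    size_t i;

    for (i = 0; i < sizeof(toy_syms) / sizeof(toy_syms[0]); i++)
        if (addr >= toy_syms[i].start && addr < toy_syms[i].end)
            return toy_syms[i].name;
    return NULL;
}

/* Return 1 if any word in the array resolves to the symbol `name`. */
static int scan_finds(const unsigned long *words, size_t nwords,
                      const char *name)
{
    size_t i;

    for (i = 0; i < nwords; i++) {
        const char *sym = lookup_text(words[i]);

        if (sym && strcmp(sym, name) == 0)
            return 1;
    }
    return 0;
}
```

Calling scan_finds() on toy_stack_words with "panic" reports a match, while the
same call on toy_frame_pcs does not, which mirrors the false positive seen for
"gmain".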
---
task.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/task.c
+++ b/task.c
@@ -6633,7 +6633,7 @@ panic_search(void)
 	fd = &foreach_data;
 	fd->keys = 1;
 	fd->keyword_array[0] = FOREACH_BT;
-	fd->flags |= (FOREACH_t_FLAG|FOREACH_o_FLAG);
+	fd->flags |= FOREACH_o_FLAG;
 	dietask = lasttask = NO_TASK;