Hi, I found a problem in crash-4.0-2.12.
Summary:
The bt command does not show stack traces for some CPUs.
Condition:
This problem happens only on ia64 machines.
Both of the following conditions are needed to reproduce it:
1) Diskdump is executed via OS_INIT.
2) The machine has more than 8 CPUs.
Details:
When I ran the bt command on a vmcore that was created on a
32-CPU machine, bt did not show stack traces for some CPUs.
Please see the attached file (bt_failed.txt). Stack traces for CPU0 to
CPU7 are shown normally, but stack traces for CPU8 to CPU31 are not.
(Please don't worry about the message "unwind: bsp (xxxxxxxxx) out of
range". This is a problem with our platform.)
Cause:
I found a bug in ia64.c (around line 2679):
    ms->ia64_init_stack_size = get_array_length("ia64_init_stack",
            NULL, 0);
get_array_length() gets the length of the OS_INIT stack array, and
that length is stored in ms->ia64_init_stack_size. However, the value
that get_array_length() returns is the number of array elements, which
differs from the actual stack size in bytes, because
"ia64_init_stack" is declared as an array of u64:
    u64 ia64_init_stack[NR_CPUS*KERNEL_STACK_SIZE/8];
Therefore, the correct stack size in bytes is:
    get_array_length("ia64_init_stack", NULL, 0) * sizeof(u64)
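
To illustrate the mismatch, here is a small standalone C sketch. It is
not crash code. The values of NR_CPUS and KERNEL_STACK_SIZE are
assumptions chosen only for illustration (the real values depend on the
kernel configuration), and the helper functions init_stack_elements(),
init_stack_bytes() and cpus_covered() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration only; the real ones come from the
 * kernel configuration of the machine that produced the dump. */
#define NR_CPUS            64
#define KERNEL_STACK_SIZE  32768

typedef uint64_t u64;

static_assert(sizeof(u64) == 8, "u64 is expected to be 8 bytes");

/* Same shape as the kernel declaration quoted above. */
static u64 ia64_init_stack[NR_CPUS * KERNEL_STACK_SIZE / 8];

/* Number of array elements: the kind of value an array-length
 * query reports. */
static size_t init_stack_elements(void)
{
    return sizeof(ia64_init_stack) / sizeof(ia64_init_stack[0]);
}

/* Size of the whole array in bytes: element count times element size. */
static size_t init_stack_bytes(void)
{
    return init_stack_elements() * sizeof(u64);
}

/* How many per-CPU stacks fit in a region of the given size in bytes. */
static size_t cpus_covered(size_t bytes)
{
    return bytes / KERNEL_STACK_SIZE;
}
```

With these assumed values, cpus_covered(init_stack_elements()) is 8,
while cpus_covered(init_stack_bytes()) is 64. If the kernel in question
were built with NR_CPUS=64, using the element count as a byte size
would cover only the stacks of the first 8 CPUs, which would be
consistent with CPU0 to CPU7 being the only ones shown.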
I don't know the right way to fix this, but the attached patch
(ia64.c.patch) seems to correct the problem.
Another attached patch (test.patch) also seems to fix it,
but I don't know which is better.
Regards,
Takao Indoh