----- Original Message -----
Hello,
I tried running crash-head (HEAD: 5d172b230cf4) against today's Linus'
master on a dump obtained via dump-guest-memory in qemu, and I got the
following when the image was loaded:
please wait... (determining panic task)
bt: read error: kernel virtual address: fffffe0000007000 type: "stack
contents"
KERNEL: vmlinux
DUMPFILE: memory-verbatim.img
CPUS: 1
DATE: Wed Apr 4 16:36:47 2018
UPTIME: 00:27:48
LOAD AVERAGE: 31.11, 17.80, 10.43
TASKS: 145
NODENAME: ubuntu-virtual
RELEASE: 4.16.0-rc7-nbor
VERSION: #570 SMP Wed Apr 4 16:03:44 EEST 2018
MACHINE: x86_64 (3392 Mhz)
MEMORY: 4 GB
PANIC: ""
PID: 0
COMMAND: "swapper/0"
TASK: ffffffff82016500 [THREAD_INFO: ffffffff82016500]
CPU: 0
STATE: TASK_RUNNING
WARNING: panic task not found
crash> bt
PID: 0 TASK: ffffffff82016500 CPU: 0 COMMAND: "swapper/0"
#0 [ffffffff82003dc8] __schedule at ffffffff817ea059
bt: invalid RSP: ffffffff82003dc8 bt->stackbase/stacktop:
ffffffff82000000/ffffffff82002000 cpu: 0
The kernel was compiled with gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9)
5.4.0 20160609, which has retpoline enabled.
I have KASLR disabled (# CONFIG_RANDOMIZE_BASE is not set), and the kernel
is compiled with CONFIG_FRAME_POINTER=y.
This scenario used to work around the 4.10 timeframe. Am I doing
something wrong, or does crash still need time to work on the latest
upstream kernel code?
Presumably the latter.
If you do a "task -R stack ffffffff82016500", I'm presuming that it
shows the stack base address is ffffffff82000000. And looking at
the stackbase/stacktop values, the crash utility is presuming an 8K stack:
bt: invalid RSP: ffffffff82003dc8 bt->stackbase/stacktop:
ffffffff82000000/ffffffff82002000 cpu: 0
But the RSP is ffffffff82003dc8, which puts it beyond the 8K stack size,
so I'm presuming that the kernel is actually using 16K stacks. The most
recent kernel I have is 4.16.0-0.rc6.git3.1.fc29.x86_64, which uses 16K stacks.
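To make the arithmetic concrete, here is a minimal sketch of the bounds check implied by that error message. The helper name rsp_in_stack() is hypothetical and is not a crash function; it only tests whether an address falls inside [base, base + size).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of crash): returns true if rsp lies
 * within the stack region [base, base + size).
 */
static bool rsp_in_stack(uint64_t rsp, uint64_t base, uint64_t size)
{
    assert(size > 0);
    /* The offset form avoids overflow when base + size would wrap. */
    return rsp >= base && (rsp - base) < size;
}
```

With the values from the error above, rsp_in_stack(0xffffffff82003dc8, 0xffffffff82000000, 8192) is false, because the offset 0x3dc8 is 15816 bytes, while the same call with a size of 16384 is true.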
Here is how the crash utility determines the stack size. The x86_64 stacksize
starts out with a default size of 2 pages, as set here in x86_64_init(PRE_SYMTAB):
    case PRE_SYMTAB:
        ... [ cut ] ...
        machdep->stacksize = machdep->pagesize * 2;
        ...
Then later on in task_init(), it gets resized as shown here, where
the STACKSIZE() macro is machdep->stacksize:
    if (VALID_SIZE(task_union) && (SIZE(task_union) != STACKSIZE())) {
        error(WARNING, "\nnon-standard stack size: %ld\n",
            len = SIZE(task_union));
        machdep->stacksize = len;
    } else if (VALID_SIZE(thread_union) &&
        ((len = SIZE(thread_union)) != STACKSIZE()))
        machdep->stacksize = len;
The "task_union" no longer exists, and so it checks whether the size of
"thread_union" differs from the default stacksize, and resets the
size appropriately.
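The selection logic above can be modelled in isolation roughly as follows. This is only a sketch: pick_stacksize() is a hypothetical name, and a non-positive size is my stand-in for the case where VALID_SIZE() would be false because the type is missing from the debuginfo.

```c
#include <assert.h>

/*
 * Hypothetical model of the task_init() stack-size selection (not the
 * actual crash code). A size <= 0 stands in for "type not found".
 */
static long pick_stacksize(long default_size, long task_union_size,
                           long thread_union_size)
{
    assert(default_size > 0);

    /* Older kernels: task_union exists and overrides the default. */
    if (task_union_size > 0 && task_union_size != default_size)
        return task_union_size;
    /* Newer kernels: fall back to thread_union if it differs. */
    else if (thread_union_size > 0 && thread_union_size != default_size)
        return thread_union_size;

    /* Neither type changed anything: keep the 2-page default. */
    return default_size;
}
```

For example, pick_stacksize(8192, -1, 16384) returns 16384, which is the case described here, whereas pick_stacksize(8192, -1, -1) leaves the 8192 default in place. If the dump ends up with an 8K stacksize on a kernel that uses 16K stacks, the second case is one thing worth checking.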
On my 4.16.0-0.rc6.git3.1.fc29.x86_64 kernel, here is the thread_union:
crash> thread_union
union thread_union {
struct task_struct task;
unsigned long stack[2048];
}
SIZE: 16384
And so it gets reset:
crash> help -m | grep stacksize
stacksize: 16384
crash>
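As a quick cross-check of the 16384 figure, here is a small sketch of the size calculation. It assumes an 8-byte unsigned long and a 4096-byte page, as on x86_64, and it leaves out the task member on the assumption that task_struct is smaller than the stack array. The names thread_union_model and model_stack_pages() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_STACK_WORDS 2048  /* from "unsigned long stack[2048]" */
#define MODEL_PAGE_SIZE   4096  /* assumed x86_64 page size */

/* Illustrative stand-in for the kernel's thread_union (stack member only). */
union thread_union_model {
    uint64_t stack[MODEL_STACK_WORDS];  /* uint64_t models an 8-byte long */
};

/* 2048 words * 8 bytes = 16384 bytes, matching the SIZE shown by crash. */
static_assert(sizeof(union thread_union_model) == 16384,
              "model should match the 16K thread_union");

/* Number of pages the modelled stack occupies. */
static size_t model_stack_pages(void)
{
    return sizeof(union thread_union_model) / MODEL_PAGE_SIZE;
}
```

Under those assumptions model_stack_pages() returns 4, i.e. twice the 2-page default that x86_64_init() starts with.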
You can debug it from there. Let me know what you find.
Thanks,
Dave