On 01/23/2018 11:19 PM, Dave Anderson wrote:
----- Original Message -----
> Hi Dave,
>
> Recently I tried the crash tool with a kdump dumpfile from a kernel
> built with randomized structure layout[*], and unsurprisingly it fails.
> After looking into the different errors crash reports, I can confirm
> they are a result of the randomized structure layout.
>
> So my question is: have you ever considered supporting this feature[*]
> in crash?
> If yes, do you have any plan or technical evaluation for it?
> If no, what's the reason?
>
> [*]https://lwn.net/Articles/722293/
> --
> Sincerely,
> Cao jin
I was under the impression that the structure layout was done at
compile-time, and that the vmlinux file's debuginfo data would
represent the randomized layout. And that being the case, the
inconvenience would be that the crash session would show the
randomized layout, while the associated source code would show
the original layout.
Oh, I didn't know that vmlinux's debuginfo would represent the randomized
layout. My current understanding is that it still describes the original
layout. See the explanation below.
You didn't give any examples of how/what fails. Is it a major problem
where fundamental facilities like MEMBER_OFFSET() no longer work?
Or are there places where assumptions are made w/regard to structure
layout without checking the debuginfo data?
MEMBER_OFFSET() is one of the failure points I encountered.
The first error comes from reading the value of "init_uts_ns"
(randomized) in kernel_init, so kt->kernel_version is wrong, which then
results in invalid virtual addresses in "case POST_GDB:" of x86_64_init.
The symptom of this error is a segmentation fault caused by infinite
recursion:
[infinite...until stack overflow]
#71456 0x00000000005305d1 in x86_64_kvtop (tc=0x0,
kvaddr=18446744071828443136, paddr=0x7fffffffdf68, verbose=0) at
x86_64.c:2080
#71457 0x0000000000485005 in kvtop (tc=0x0, kvaddr=18446744071828443136,
paddr=0x7fffffffdf68, verbose=0) at memory.c:2937
#71458 0x0000000000482982 in readmem (addr=18446744071828443136,
memtype=1, buffer=0x1048150, size=4096, type=0x90ae53 "init_level4_pgt",
error_handle=1) at memory.c:2192
#71459 0x00000000005305d1 in x86_64_kvtop (tc=0x0,
kvaddr=18446744071828443136, paddr=0x7fffffffe0c8, verbose=0) at
x86_64.c:2080
#71460 0x0000000000485005 in kvtop (tc=0x0, kvaddr=18446744071828443136,
paddr=0x7fffffffe0c8, verbose=0) at memory.c:2937
#71461 0x0000000000482982 in readmem (addr=18446744071828443136,
memtype=1, buffer=0x1048150, size=4096, type=0x90ae53 "init_level4_pgt",
error_handle=1) at memory.c:2192
#71462 0x000000000052d9d8 in x86_64_init_kernel_pgd () at x86_64.c:1440
#71463 0x0000000000529612 in x86_64_init (when=3) at x86_64.c:560
#71464 0x000000000046538d in main_loop () at main.c:769
#71465 0x00000000006f15c3 in captured_command_loop (data=data@entry=0x0)
at main.c:258
#71466 0x00000000006f027a in catch_errors (func=func@entry=0x6f15b0
<captured_command_loop>, func_args=func_args@entry=0x0,
errstring=errstring@entry=0x945dc9 "", mask=mask@entry=6) at
exceptions.c:557
#71467 0x00000000006f25fe in captured_main
(data=data@entry=0x7fffffffe320) at main.c:1064
#71468 0x00000000006f027a in catch_errors (func=func@entry=0x6f1890
<captured_main>, func_args=func_args@entry=0x7fffffffe320,
errstring=errstring@entry=0x945dc9 "", mask=mask@entry=6) at
exceptions.c:557
#71469 0x00000000006f28d7 in gdb_main (args=0x7fffffffe320) at main.c:1079
#71470 gdb_main_entry (argc=<optimized out>, argv=<optimized out>) at
main.c:1099
#71471 0x0000000000518668 in gdb_main_loop (argc=2, argv=0x7fffffffe4a8)
at gdb_interface.c:76
#71472 0x00000000004651f6 in main (argc=3, argv=0x7fffffffe4a8) at
main.c:707
After using the gdb "print" command to fill in the correct value of
kt->kernel_version, the remaining errors basically come from
MEMBER_OFFSET(). From my limited understanding and observation, the
offset of a structure field retrieved through the gdb interface comes
from the original layout, so when crash uses this offset to calculate
the address of the field, it won't read the correct value of that field
from the PT_LOAD segment if the layout is randomized. The error symptom
looks like:
crash: invalid task address: ffff97c2710288e0
And the final result is that tt->running_tasks is 0 and tt->current is
null, so accessing tt->current->task results in a segmentation fault.
I am not familiar with the relationship between vmlinux's debuginfo and
crash, but currently I think crash uses gdb_interface to interact with
the debuginfo data (as the macros that wrap datatype_info() do), and the
structure layout info in the debuginfo comes from the original layout.
Am I wrong about it?
Anyway, the answer to your question is no, currently I have no
plans.
Dave
--
Sincerely,
Cao jin