Hi Yeping,
On Mon, Jul 29, 2024 at 2:20 PM yp z <wonderzyp(a)gmail.com> wrote:
Hi Lianbo and Tao,
Thank you for your suggestions, and I have rewrote a patch in two cases:
[1] with vmcoreinfo:
+ if (kernel_symbol_exists("kasan_enable_current")) {
+ min_thread_shift += 1;
+ thread_shift = (min_thread_shift < machdep->pageshift) ?
machdep->pageshift : min_thread_shift;
+ }
> BTW: can you help point out the current issue is with or without
the vmcoreinfo? Yeping
My issue is with vmcoreinfo of "SYMBOL("kasan_enable_current")", and
works well with this patch.
[2] without vmcoreinfo:
>
> What about somehow the disassembly gives us hex values like:
> #0xe? It won't work by then. Any ideas for this?
In order to use stol() as expected, I have changed the first non-numeric char to
'\0':
+ if ((pos1 = strstr(buf1, "tbnz"))) {
+ if ((pos2 = strchr(pos1, '#'))) {
+ pos2 += 1;
+ pos1 = pos2;
+ while (*pos2 != '\0') {
+ if (!((*pos2 >= '0' &&
*pos2 <= '9')
+ || (*pos2 >= 'A'
&& *pos2 <= 'F')
+ || (*pos2 >= 'a'
&& *pos2 <= 'f'))) {
There are 'x' 'X' not covered, it will not work for the "#0xe"
case.
Can we make it simpler, since the value we are looking for is the 2nd
oprand of tbnz, there must be a ',' after the value.
tbnz w2, #0x1f, 0xffff80008001bed0
Can we make it like the following? I haven't tested the following code...
if ((pos2 = strchr(pos1, '#'))) {
pos2 += 1;
for (pos1 = pos2; *pos2 != '\0' && *pos2 != ','; pos2++);
*pos2 = '\0';
thread_shift = stol(pos1, RETURN_ON_ERROR|QUIET, &errflag);
if (errflag) {
thread_shift = 0;
}
break;
}
Thanks,
Tao Liu
+ *pos2 =
'\0';
+ break;
+ }
+ ++pos2;
+ }
+ thread_shift = stol(pos1, RETURN_ON_ERROR|QUIET,
&errflag);
Thanks.
Yeping.ZHENG
lijiang <lijiang(a)redhat.com> 于2024年7月26日周五 17:23写道:
>
> On Fri, Jul 19, 2024 at 6:40 AM Tao Liu <ltao(a)redhat.com> wrote:
>>
>> Hi Lianbo,
>>
>> On Thu, Jul 18, 2024 at 10:50 PM Lianbo Jiang <lijiang(a)redhat.com> wrote:
>> >
>> > On 7/16/24 4:22 PM, Tao Liu wrote:
>> >
>> > > Hi Yeping,
>> > >
>> > > Thanks for the fix.
>> > >
>> > > On Thu, Jul 11, 2024 at 1:38 PM <wonderzyp(a)gmail.com> wrote:
>> > >> When using the crash tool to parse the ARM64 dump file with KASAN
enabled, I found that using the bt -a command will cause this tool to crash, the following
is the backtrace infomation.
>> > >>
>> > >> (gdb) bt
>> > >> #0 0x00005635ac2b166b in arm64_unwind_frame (frame=0x7ffdaf35cb70,
bt=0x7ffdaf35d430)
>> > >> at arm64.c:2821
>> > >> #1 arm64_back_trace_cmd (bt=0x7ffdaf35d430) at arm64.c:3306
>> > >> #2 0x00005635ac27b108 in back_trace (bt=bt@entry=0x7ffdaf35d430)
at kernel.c:3239
>> > >> #3 0x00005635ac2880ae in cmd_bt () at kernel.c:2863
>> > >> #4 0x00005635ac1f16dc in exec_command () at main.c:893
>> > >> #5 0x00005635ac1f192a in main_loop () at main.c:840
>> > >> #6 0x00005635ac50df81 in captured_main (data=<optimized
out>) at main.c:1284
>> > >> #7 gdb_main (args=<optimized out>) at main.c:1313
>> > >> #8 0x00005635ac50e000 in gdb_main_entry (argc=<optimized
out>, argv=<optimized out>)
>> > >> at main.c:1338
>> > >> #9 0x00005635ac1ea2a5 in main (argc=5, argv=0x7ffdaf35dde8) at
main.c:721
>> > >> Eventually, I found that it was may caused by not setting
irq_stack_size properly, and provide this patch to solve it.
>> > >>
>> > > Could you please re-draft your commit message? The original one looks
>> > > informal. E.g:
>> > >
>> > > A segfault issue was observed on KASAN enabled arm64 kernel due to the
>> > > incorrect irq_stack_size, see the following stack trace:
>> > > ...
>> > > The issue was caused by ...., and this patch will fix the issue by
....
>> > >
>> > >> From 34b28aa8c11e77d20adec4f7705a14d239c8a55f Mon Sep 17 00:00:00
2001
>> > >> From: wonderzyp <wonderzyp(a)qq.com>
>> > >> Date: Mon, 8 Jul 2024 20:11:38 +0800
>> > >> Subject: [PATCH 1131/1131] set_arm64_irq_stack_size
>> > >>
>> > >> Signed-off-by: Yeping Zheng <wonderzyp(a)gmail.com>
>> > >> ---
>> > >> arm64.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++--
>> > >> 1 file changed, 47 insertions(+), 2 deletions(-)
>> > >>
>> > >> diff --git a/arm64.c b/arm64.c
>> > >> index b3040d7..39d891b 100644
>> > >> --- a/arm64.c
>> > >> +++ b/arm64.c
>> > >> @@ -93,6 +93,7 @@ static void arm64_calc_VA_BITS(void);
>> > >> static int arm64_is_uvaddr(ulong, struct task_context *);
>> > >> static void arm64_calc_KERNELPACMASK(void);
>> > >> static int arm64_get_vmcoreinfo(unsigned long *vaddr, const char
*label, int base);
>> > >> +static ulong arm64_set_irq_stack_size(struct machine_specific
*ms);
>> > >>
>> > >> struct kernel_range {
>> > >> unsigned long modules_vaddr, modules_end;
>> > >> @@ -2223,8 +2224,14 @@ arm64_irq_stack_init(void)
>> > >> if (MEMBER_EXISTS("thread_union",
"stack")) {
>> > >> if ((sz =
MEMBER_SIZE("thread_union", "stack")) > 0)
>> > >> ms->irq_stack_size = sz;
>> > >> - } else
>> > >> - ms->irq_stack_size =
ARM64_IRQ_STACK_SIZE;
>> > >> + } else {
>> > >> + ulong res = arm64_set_irq_stack_size(ms);
>> > >> + if (res > 0){
>> > >> + ms->irq_stack_size = res;
>> > >> + } else {
>> > >> + ms->irq_stack_size =
ARM64_IRQ_STACK_SIZE;
>> > >> + }
>> > >> + }
>> > >>
>> > >> machdep->flags |= IRQ_STACKS;
>> > >>
>> > >> @@ -4921,6 +4928,44 @@ static void arm64_calc_KERNELPACMASK(void)
>> > >> }
>> > >> }
>> > >>
>> > >> +static ulong arm64_set_irq_stack_size(struct machine_specific
*ms)
>> > >> +{
>> > >> + char *string;
>> > >> + int ret;
>> > >> + int KASAN_THREAD_SHIFT = 0;
>> > >> + int MIN_THREAD_SHIFT;
>> > >> + ulong ARM64_PAGE_SHIFT;
>> > >> + ulong THREAD_SHIFT = 0;
>> > >> + ulong THREAD_SIZE;
>> > > I guess the upper case of variable names is not encouraged, though it
>> > > is the variable that comes from kernel config file.
>> > >
>> > >> + if (kt->ikconfig_flags & IKCONFIG_AVAIL) {
>> > >> + if ((ret =
get_kernel_config("CONFIG_KASAN_GENERIC", NULL) == IKCONFIG_Y) ||
>> > >> + (ret =
get_kernel_config("CONFIG_KASAN_SW_TAGS", NULL) == IKCONFIG_Y)) {
>> > >> + KASAN_THREAD_SHIFT = 1;
>> > >> + }
>> > >> + }
>> > >> + MIN_THREAD_SHIFT = 14 + KASAN_THREAD_SHIFT;
>> > >> +
>> > >> + if (kt->ikconfig_flags & IKCONFIG_AVAIL) {
>> > > Could the if condition be merged with the prior one?
>> > >
>> > >> + if ((ret =
get_kernel_config("CONFIG_VMAP_STACK", NULL)) == IKCONFIG_Y){
>> > >> + if ((ret =
get_kernel_config("CONFIG_ARM64_PAGE_SHIFT", &string)) ==
>> >
>> > The "CONFIG_ARM64_PAGE_SHIFT " has been removed since kernel
v6.9-rc1,
>> > so this can not work on the latest kernel. See:
>> >
>> > d3e5bab923d3 ("arch: simplify architecture specific page size
>> > configuration")
>> >
>> >
>> > In addition, the IKCONFIG is not available in most distributions, it
>> > can not cover this case. There is a similar discussion there:
>> >
>> >
https://www.mail-archive.com/devel@lists.crash-utility.osci.io/msg00880.html
>> >
>> > Can you try to handle the current issue with a similar solution?
>> >
>> >
>> > >> IKCONFIG_STR){
>> > >> + ARM64_PAGE_SHIFT = atol(string);
>> > >> + }
>> > >> + if (MIN_THREAD_SHIFT <
ARM64_PAGE_SHIFT){
>> > >> + THREAD_SHIFT = ARM64_PAGE_SHIFT;
>> > >> + } else {
>> > >> + THREAD_SHIFT = MIN_THREAD_SHIFT;
>> > >> + }
>> > >> + }
>> > >> + }
>> > >> +
>> > >> + if (THREAD_SHIFT == 0) {
>> > >> + return -1;
>> > >> + }
>> > >> +
>> > >> + THREAD_SIZE = ((1UL) << THREAD_SHIFT);
>> > >> + return THREAD_SIZE;
>> > >> +}
>> > > I'm OK with the approach above, since it directly came from the
kernel
>> > > source. However I'm not a fan of checking kernel configs, there
might
>> > > be kernels which are compiled without CONFIG_IKCONFIG.
>> > >
>> > > Could we add an approach here, to get the value from disassembly when
>> > > CONFIG_IKCONFIG is negative?
>> > >
>> > > kernel source: arch/arm64/kernel/entry.S:
>> > >
>> > > .macro kernel_ventry, el:req, ht:req, regsize:req, label:req
>> > > ....
>> > > add sp, sp, x0 // sp' = sp + x0
>> > > sub x0, sp, x0 // x0' = sp' - x0 = (sp + x0) - x0 = sp
>> > > tbnz x0, #THREAD_SHIFT, 0f <<<<<<<<
>> > >
>> > > $ objdump -d vmlinux
>> > > ...
>> > > ffff800080010800 <vectors>:
>> > > ffff800080010800: d10543ff sub sp, sp, #0x150
>> > > ffff800080010804: 8b2063ff add sp, sp, x0
>> > > ffff800080010808: cb2063e0 sub x0, sp, x0
>> > > ffff80008001080c: 37800080 tbnz w0, #16,
>> > > ffff80008001081c <vectors+0x1c>
<<<<<<<<<<
>> > >
>> > > It is easy to get the THREAD_SHIFT value by disassembling the tbnz
>> > > instruction. What do you think @Lianbo Jiang
>> >
>> > This is a good idea, but it still relies on compiler.
>> >
>> > As we discussed, usually I would recommend finding its values via some
>> > kernel symbols.
>>
>> Yeah, but I didn't find a suitable symbol so far... In addition, the
>
>
> It's true. For the KASAN, it is easy to find a suitable symbol, but it's not
easy for the PAGESIZE or PAGESHIFT.
>
> Given that I would suggest covering two cases:
> [1] with vmcoreinfo(see mm/kasan/common.c)
> ...
> int min_thread_shift = 14;
> if (kernel_symbol_exists("kasan_enable_current"))
> min_thread_shift += 1;
>
> thread_shift = (min_thread_shift < machdep->pageshift) ? machdep->pageshift
: min_thread_shift;
> ...
>
> [2] without vmcoreinfo
> Use Tao's solution(disassemble kernel code)
>
> The first one is better than the second one in case of having vmcoreinfo, If I
understand correctly.
> But we still need to cover the above two cases in the crash tools.
>
> BTW: can you help point out the current issue is with or without the vmcoreinfo?
Yeping
>
> Thanks
> Lianbo
>
>> above case is a little different here, the tbnz instruction is not
>> generated from C code, it is written in assembly in
>> arch/arm64/kernel/entry.S file. So I guess there would be a lower
>> possibility if tbnz instruction gets replaced or eliminated by the
>> compilers. Anyway, it's OK to me if you don't want to accept the
>> approach.
>>
>> Thanks,
>> Tao Liu
>>
>>
>> >
>> >
>> > Thanks
>> >
>> > Lianbo
>> >
>> > >
>> > > Thanks,
>> > > Tao Liu
>> > >
>> > >> +
>> > >> #endif /* ARM64 */
>> > >>
>> > >>
>> > >> --
>> > >> 2.25.1
>> > >>
>> >
>>