Hi Tao,

Thanks for your suggestion, and this method works well as expected.

Can we make it simpler, since the value we are looking for is the 2nd
oprand of tbnz, there must be a ',' after the value.

tbnz    w2, #0x1f, 0xffff80008001bed0

Can we make it like the following? I haven't tested the following code...

if ((pos2 = strchr(pos1, '#'))) {
  pos2 += 1;
  for (pos1 = pos2; *pos2 != '\0' && *pos2 != ','; pos2++);
  *pos2 = '\0';
  thread_shift = stol(pos1, RETURN_ON_ERROR|QUIET, &errflag);
  if (errflag) {
    thread_shift = 0;
  }
  break;
}


Thanks.
Yeping.ZHENG 

Tao Liu <ltao@redhat.com> 于2024年7月30日周二 08:55写道:
Hi Yeping,

On Mon, Jul 29, 2024 at 2:20 PM yp z <wonderzyp@gmail.com> wrote:
>
> Hi Lianbo and Tao,
> Thank you for your suggestions, and I have rewrote a patch in two cases:
>
> [1] with vmcoreinfo:
> +       if (kernel_symbol_exists("kasan_enable_current")) {
> +               min_thread_shift += 1;
> +               thread_shift = (min_thread_shift < machdep->pageshift) ? machdep->pageshift : min_thread_shift;
> +       }
>
This looks fine to me.

>> BTW:  can you help point out the current issue is with or without the vmcoreinfo? Yeping
>
> My issue is with vmcoreinfo of "SYMBOL("kasan_enable_current")", and works well with this patch.
>
> [2] without vmcoreinfo:
>>
>> What about somehow the disassembly gives us hex values like:
>> #0xe? It won't work by then. Any ideas for this?
>
>  In order to use stol() as expected, I have changed the first non-numeric char to '\0':
> +                       if ((pos1 = strstr(buf1, "tbnz"))) {
> +                               if ((pos2 = strchr(pos1, '#'))) {
> +                                       pos2 += 1;
> +                                       pos1 = pos2;
> +                                       while (*pos2 != '\0') {
> +                                               if (!((*pos2 >= '0' && *pos2 <= '9')
> +                                                       || (*pos2 >= 'A' && *pos2 <= 'F')
> +                                                       || (*pos2 >= 'a' && *pos2 <= 'f'))) {

There are 'x' 'X' not covered, it will not work for the "#0xe" case.
Can we make it simpler, since the value we are looking for is the 2nd
oprand of tbnz, there must be a ',' after the value.

tbnz    w2, #0x1f, 0xffff80008001bed0

Can we make it like the following? I haven't tested the following code...

if ((pos2 = strchr(pos1, '#'))) {
  pos2 += 1;
  for (pos1 = pos2; *pos2 != '\0' && *pos2 != ','; pos2++);
  *pos2 = '\0';
  thread_shift = stol(pos1, RETURN_ON_ERROR|QUIET, &errflag);
  if (errflag) {
    thread_shift = 0;
  }
  break;
}

Thanks,
Tao Liu

> +                                                       *pos2 = '\0';
> +                                                       break;
> +                                               }
> +                                               ++pos2;
> +                                       }
> +                                       thread_shift = stol(pos1, RETURN_ON_ERROR|QUIET, &errflag);
>
> Thanks.
> Yeping.ZHENG
>
> lijiang <lijiang@redhat.com> 于2024年7月26日周五 17:23写道:
>>
>> On Fri, Jul 19, 2024 at 6:40 AM Tao Liu <ltao@redhat.com> wrote:
>>>
>>> Hi Lianbo,
>>>
>>> On Thu, Jul 18, 2024 at 10:50 PM Lianbo Jiang <lijiang@redhat.com> wrote:
>>> >
>>> > On 7/16/24 4:22 PM, Tao Liu wrote:
>>> >
>>> > > Hi Yeping,
>>> > >
>>> > > Thanks for the fix.
>>> > >
>>> > > On Thu, Jul 11, 2024 at 1:38 PM <wonderzyp@gmail.com> wrote:
>>> > >> When using the crash tool to parse the ARM64 dump file with KASAN enabled, I found that using the bt -a command will cause this tool to crash, the following is the backtrace infomation.
>>> > >>
>>> > >> (gdb) bt
>>> > >> #0  0x00005635ac2b166b in arm64_unwind_frame (frame=0x7ffdaf35cb70, bt=0x7ffdaf35d430)
>>> > >>      at arm64.c:2821
>>> > >> #1  arm64_back_trace_cmd (bt=0x7ffdaf35d430) at arm64.c:3306
>>> > >> #2  0x00005635ac27b108 in back_trace (bt=bt@entry=0x7ffdaf35d430) at kernel.c:3239
>>> > >> #3  0x00005635ac2880ae in cmd_bt () at kernel.c:2863
>>> > >> #4  0x00005635ac1f16dc in exec_command () at main.c:893
>>> > >> #5  0x00005635ac1f192a in main_loop () at main.c:840
>>> > >> #6  0x00005635ac50df81 in captured_main (data=<optimized out>) at main.c:1284
>>> > >> #7  gdb_main (args=<optimized out>) at main.c:1313
>>> > >> #8  0x00005635ac50e000 in gdb_main_entry (argc=<optimized out>, argv=<optimized out>)
>>> > >>      at main.c:1338
>>> > >> #9  0x00005635ac1ea2a5 in main (argc=5, argv=0x7ffdaf35dde8) at main.c:721
>>> > >> Eventually, I found that it was may caused by not setting irq_stack_size properly, and provide this patch to solve it.
>>> > >>
>>> > > Could you please re-draft your commit message? The original one looks
>>> > > informal. E.g:
>>> > >
>>> > > A segfault issue was observed on KASAN enabled arm64 kernel due to the
>>> > > incorrect irq_stack_size, see the following stack trace:
>>> > > ...
>>> > > The issue was caused by ...., and this patch will fix the issue by ....
>>> > >
>>> > >>  From 34b28aa8c11e77d20adec4f7705a14d239c8a55f Mon Sep 17 00:00:00 2001
>>> > >> From: wonderzyp <wonderzyp@qq.com>
>>> > >> Date: Mon, 8 Jul 2024 20:11:38 +0800
>>> > >> Subject: [PATCH 1131/1131] set_arm64_irq_stack_size
>>> > >>
>>> > >> Signed-off-by: Yeping Zheng <wonderzyp@gmail.com>
>>> > >> ---
>>> > >>   arm64.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++--
>>> > >>   1 file changed, 47 insertions(+), 2 deletions(-)
>>> > >>
>>> > >> diff --git a/arm64.c b/arm64.c
>>> > >> index b3040d7..39d891b 100644
>>> > >> --- a/arm64.c
>>> > >> +++ b/arm64.c
>>> > >> @@ -93,6 +93,7 @@ static void arm64_calc_VA_BITS(void);
>>> > >>   static int arm64_is_uvaddr(ulong, struct task_context *);
>>> > >>   static void arm64_calc_KERNELPACMASK(void);
>>> > >>   static int arm64_get_vmcoreinfo(unsigned long *vaddr, const char *label, int base);
>>> > >> +static ulong arm64_set_irq_stack_size(struct machine_specific *ms);
>>> > >>
>>> > >>   struct kernel_range {
>>> > >>          unsigned long modules_vaddr, modules_end;
>>> > >> @@ -2223,8 +2224,14 @@ arm64_irq_stack_init(void)
>>> > >>                  if (MEMBER_EXISTS("thread_union", "stack")) {
>>> > >>                          if ((sz = MEMBER_SIZE("thread_union", "stack")) > 0)
>>> > >>                                  ms->irq_stack_size = sz;
>>> > >> -               } else
>>> > >> -                       ms->irq_stack_size = ARM64_IRQ_STACK_SIZE;
>>> > >> +               } else {
>>> > >> +                       ulong res = arm64_set_irq_stack_size(ms);
>>> > >> +                       if (res > 0){
>>> > >> +                               ms->irq_stack_size = res;
>>> > >> +                       } else {
>>> > >> +                               ms->irq_stack_size = ARM64_IRQ_STACK_SIZE;
>>> > >> +                       }
>>> > >> +               }
>>> > >>
>>> > >>                  machdep->flags |= IRQ_STACKS;
>>> > >>
>>> > >> @@ -4921,6 +4928,44 @@ static void arm64_calc_KERNELPACMASK(void)
>>> > >>          }
>>> > >>   }
>>> > >>
>>> > >> +static ulong arm64_set_irq_stack_size(struct machine_specific *ms)
>>> > >> +{
>>> > >> +       char *string;
>>> > >> +       int ret;
>>> > >> +       int KASAN_THREAD_SHIFT = 0;
>>> > >> +       int MIN_THREAD_SHIFT;
>>> > >> +       ulong ARM64_PAGE_SHIFT;
>>> > >> +       ulong THREAD_SHIFT = 0;
>>> > >> +       ulong THREAD_SIZE;
>>> > > I guess the upper case of variable names is not encouraged, though it
>>> > > is the variable that comes from kernel config file.
>>> > >
>>> > >> +       if (kt->ikconfig_flags & IKCONFIG_AVAIL) {
>>> > >> +               if ((ret = get_kernel_config("CONFIG_KASAN_GENERIC", NULL) == IKCONFIG_Y) ||
>>> > >> +                       (ret = get_kernel_config("CONFIG_KASAN_SW_TAGS", NULL) == IKCONFIG_Y)) {
>>> > >> +                               KASAN_THREAD_SHIFT = 1;
>>> > >> +                       }
>>> > >> +       }
>>> > >> +       MIN_THREAD_SHIFT = 14 + KASAN_THREAD_SHIFT;
>>> > >> +
>>> > >> +       if (kt->ikconfig_flags & IKCONFIG_AVAIL) {
>>> > > Could the if condition be merged with the prior one?
>>> > >
>>> > >> +               if ((ret = get_kernel_config("CONFIG_VMAP_STACK", NULL)) == IKCONFIG_Y){
>>> > >> +                       if ((ret = get_kernel_config("CONFIG_ARM64_PAGE_SHIFT", &string)) ==
>>> >
>>> > The "CONFIG_ARM64_PAGE_SHIFT " has been removed since kernel v6.9-rc1,
>>> > so this can not work on the latest kernel. See:
>>> >
>>> > d3e5bab923d3 ("arch: simplify architecture specific page size
>>> > configuration")
>>> >
>>> >
>>> > In addition, the IKCONFIG is not available in most distributions,  it
>>> > can not cover this case. There is a similar discussion there:
>>> >
>>> > https://www.mail-archive.com/devel@lists.crash-utility.osci.io/msg00880.html
>>> >
>>> > Can you try to handle the current issue with a similar solution?
>>> >
>>> >
>>> > >> IKCONFIG_STR){
>>> > >> +                               ARM64_PAGE_SHIFT = atol(string);
>>> > >> +                       }
>>> > >> +                       if (MIN_THREAD_SHIFT < ARM64_PAGE_SHIFT){
>>> > >> +                               THREAD_SHIFT = ARM64_PAGE_SHIFT;
>>> > >> +                       } else {
>>> > >> +                               THREAD_SHIFT = MIN_THREAD_SHIFT;
>>> > >> +                       }
>>> > >> +               }
>>> > >> +       }
>>> > >> +
>>> > >> +       if (THREAD_SHIFT == 0) {
>>> > >> +               return -1;
>>> > >> +       }
>>> > >> +
>>> > >> +       THREAD_SIZE = ((1UL) << THREAD_SHIFT);
>>> > >> +       return THREAD_SIZE;
>>> > >> +}
>>> > > I'm OK with the approach above, since it directly came from the kernel
>>> > > source. However I'm not a fan of checking kernel configs, there might
>>> > > be kernels which are compiled without CONFIG_IKCONFIG.
>>> > >
>>> > > Could we add an approach here, to get the value from disassembly when
>>> > > CONFIG_IKCONFIG is negative?
>>> > >
>>> > > kernel source: arch/arm64/kernel/entry.S:
>>> > >
>>> > > .macro kernel_ventry, el:req, ht:req, regsize:req, label:req
>>> > > ....
>>> > > add sp, sp, x0 // sp' = sp + x0
>>> > > sub x0, sp, x0 // x0' = sp' - x0 = (sp + x0) - x0 = sp
>>> > > tbnz x0, #THREAD_SHIFT, 0f <<<<<<<<
>>> > >
>>> > > $ objdump -d vmlinux
>>> > > ...
>>> > > ffff800080010800 <vectors>:
>>> > > ffff800080010800:       d10543ff        sub     sp, sp, #0x150
>>> > > ffff800080010804:       8b2063ff        add     sp, sp, x0
>>> > > ffff800080010808:       cb2063e0        sub     x0, sp, x0
>>> > > ffff80008001080c:       37800080        tbnz    w0, #16,
>>> > > ffff80008001081c <vectors+0x1c> <<<<<<<<<<
>>> > >
>>> > > It is easy to get the THREAD_SHIFT value by disassembling the tbnz
>>> > > instruction. What do you think @Lianbo Jiang
>>> >
>>> > This is a good idea, but it still relies on compiler.
>>> >
>>> > As we discussed, usually I would recommend finding its values via some
>>> > kernel symbols.
>>>
>>> Yeah, but I didn't find a suitable symbol so far... In addition, the
>>
>>
>> It's true. For the KASAN, it is easy to find a suitable symbol, but it's not easy for the PAGESIZE or PAGESHIFT.
>>
>> Given that I would suggest covering two cases:
>> [1] with vmcoreinfo(see mm/kasan/common.c)
>> ...
>> int min_thread_shift = 14;
>> if (kernel_symbol_exists("kasan_enable_current"))
>>     min_thread_shift += 1;
>>
>> thread_shift = (min_thread_shift < machdep->pageshift) ? machdep->pageshift : min_thread_shift;
>> ...
>>
>> [2] without vmcoreinfo
>> Use Tao's solution(disassemble kernel code)
>>
>> The first one is better than the second one in case of having vmcoreinfo, If I understand correctly.
>> But we still need to cover the above two cases in the crash tools.
>>
>> BTW:  can you help point out the current issue is with or without the vmcoreinfo? Yeping
>>
>> Thanks
>> Lianbo
>>
>>> above case is a little different here, the tbnz instruction is not
>>> generated from C code, it is written in assembly in
>>> arch/arm64/kernel/entry.S file. So I guess there would be a lower
>>> possibility if tbnz instruction gets replaced or eliminated by the
>>> compilers. Anyway, it's OK to me if you don't want to accept the
>>> approach.
>>>
>>> Thanks,
>>> Tao Liu
>>>
>>>
>>> >
>>> >
>>> > Thanks
>>> >
>>> > Lianbo
>>> >
>>> > >
>>> > > Thanks,
>>> > > Tao Liu
>>> > >
>>> > >> +
>>> > >>   #endif  /* ARM64 */
>>> > >>
>>> > >>
>>> > >> --
>>> > >> 2.25.1
>>> > >>
>>> >
>>>