Hi all,

I may found a potential bug when using qcom arm64 ramdump to parse backtrace.

Unfortunately, I actually found no processes can use the bt command correctly.

 

Ex: when start crash tool to do analyse: # crash vmlinux --kaslr=xxx DDRCS0_0.BIN@0x0000000080000000,... --machdep vabits_actual=39

 

Then seen below misleading backtrace information :  

crash> bt 16930

PID: 16930    TASK: ffffff89b3eada00  CPU: 2    COMMAND: "Firebase Backgr"

 #0 [ffffffc034c437f0] __switch_to at ffffffe0036832d4

 #1 [ffffffc034c43850] __kvm_nvhe_$d.2314 at 6be732e004cf05a0

 #2 [ffffffc034c438b0] __kvm_nvhe_$d.2314 at 86c54c6004ceff80

 #3 [ffffffc034c43950] __kvm_nvhe_$d.2314 at 55d6f96003a7b120

 #4 [ffffffc034c439f0] __kvm_nvhe_$d.2314 at 9ccec46003a80a64

 #5 [ffffffc034c43ac0] __kvm_nvhe_$d.2314 at 8cf41e6003a945c4

 #6 [ffffffc034c43b10] __kvm_nvhe_$d.2314 at a8f181e00372c818

 #7 [ffffffc034c43b40] __kvm_nvhe_$d.2314 at 6dedde600372c0d0

 #8 [ffffffc034c43b90] __kvm_nvhe_$d.2314 at 62cc07e00373d0ac

 #9 [ffffffc034c43c00] __kvm_nvhe_$d.2314 at 72fb1de00373bedc

...

     PC: 00000073f5294840   LR: 00000070d8f39ba4   SP: 00000070d4afd5d0

    X29: 00000070d4afd600  X28: b4000071efcda7f0  X27: 00000070d4afe000

    X26: 0000000000000000  X25: 00000070d9616000  X24: 0000000000000000

    X23: 0000000000000000  X22: 0000000000000000  X21: 0000000000000000

    X20: b40000728fd27520  X19: b40000728fd27550  X18: 000000702daba000

    X17: 00000073f5294820  X16: 00000070d940f9d8  X15: 00000000000000bf

    X14: 0000000000000000  X13: 00000070d8ad2fac  X12: b40000718fce5040

    X11: 0000000000000000  X10: 0000000000000070   X9: 0000000000000001

     X8: 0000000000000062   X7: 0000000000000020   X6: 0000000000000000

     X5: 0000000000000000   X4: 0000000000000000   X3: 0000000000000000

     X2: 0000000000000002   X1: 0000000000000080   X0: b40000728fd27550

    ORIG_X0: b40000728fd27550  SYSCALLNO: ffffffff  PSTATE: 40001000

 

By checking the raw data below, will see the lr (fp+8) data show the pointer which already been replaced by PAC prefix.

 

crash> bt -f

PID: 16930    TASK: ffffff89b3eada00  CPU: 2    COMMAND: "Firebase Backgr"

 #0 [ffffffc034c437f0] __switch_to at ffffffe0036832d4

    ffffffc034c437f0: ffffffc034c43850 6be732e004cf05a4

    ffffffc034c43800: ffffffe006186108 a0ed07e004cf09c4

    ffffffc034c43810: ffffff8a1a340000 ffffff8a8d343c00

    ffffffc034c43820: ffffff89b3eada00 ffffff8b780db540

    ffffffc034c43830: ffffff89b3eada00 0000000000000000

    ffffffc034c43840: 0000000000000004 712b828118484a00

 #1 [ffffffc034c43850] __kvm_nvhe_$d.2314 at 6be732e004cf05a0

    ffffffc034c43850: ffffffc034c438b0 86c54c6004ceff84

    ffffffc034c43860: 000000708070f000 ffffffc034c43938

    ffffffc034c43870: ffffff88bd822878 ffffff89b3eada00

...

 

So we check the CONFIG_ARM64_PTR_AUTH and CONFIG_ARM64_PTR_AUTH_KERNEL to double check if pac mechanism been enabled on this ramdump.

Then we use vabits to figure it out.

Fix then show the right backtrace below:

crash> bt 16930

PID: 16930    TASK: ffffff89b3eada00  CPU: 2    COMMAND: "Firebase Backgr"

 #0 [ffffffc034c437f0] __switch_to at ffffffe0036832d4

 #1 [ffffffc034c43850] __schedule at ffffffe004cf05a0

 #2 [ffffffc034c438b0] preempt_schedule_common at ffffffe004ceff80

 #3 [ffffffc034c43950] unmap_page_range at ffffffe003a7b120

 #4 [ffffffc034c439f0] unmap_vmas at ffffffe003a80a64

 #5 [ffffffc034c43ac0] exit_mmap at ffffffe003a945c4

 #6 [ffffffc034c43b10] __mmput at ffffffe00372c818

 #7 [ffffffc034c43b40] mmput at ffffffe00372c0d0

 #8 [ffffffc034c43b90] exit_mm at ffffffe00373d0ac

 #9 [ffffffc034c43c00] do_exit at ffffffe00373bedc

     PC: 00000073f5294840   LR: 00000070d8f39ba4   SP: 00000070d4afd5d0

    X29: 00000070d4afd600  X28: b4000071efcda7f0  X27: 00000070d4afe000

    X26: 0000000000000000  X25: 00000070d9616000  X24: 0000000000000000

    X23: 0000000000000000  X22: 0000000000000000  X21: 0000000000000000

    X20: b40000728fd27520  X19: b40000728fd27550  X18: 000000702daba000

    X17: 00000073f5294820  X16: 00000070d940f9d8  X15: 00000000000000bf

    X14: 0000000000000000  X13: 00000070d8ad2fac  X12: b40000718fce5040

    X11: 0000000000000000  X10: 0000000000000070   X9: 0000000000000001

     X8: 0000000000000062   X7: 0000000000000020   X6: 0000000000000000

     X5: 0000000000000000   X4: 0000000000000000   X3: 0000000000000000

     X2: 0000000000000002   X1: 0000000000000080   X0: b40000728fd27550

    ORIG_X0: b40000728fd27550  SYSCALLNO: ffffffff  PSTATE: 40001000

 

Let's use GENMASK to replace the pac pointer to fix it.

gki related commit url here:

https://lore.kernel.org/all/20230412160134.306148-4-mark.rutland@arm.com/

 

 

 

===================================================================================================================================
機密資訊 This email and any attachments to it contain confidential information and are intended solely for the use of the individual to whom it is addressed.If you are not the intended recipient or receive it accidentally, please immediately notify the sender by e-mail and delete the message and any attachments from your computer system, and destroy all hard copies. If any, please be advised that any unauthorized disclosure, copying, distribution or any action taken or omitted in reliance on this, is illegal and prohibited. Furthermore, any views or opinions expressed are solely those of the author and do not represent those of ASUSTeK. Thank you for your cooperation. For pricing information, ASUS is only entitled to set a recommendation resale price. All customers are free to set their own price as they wish.
===================================================================================================================================