On Wed, Sep 20, 2023 at 7:17 PM Aditya Gupta <adityag(a)linux.ibm.com> wrote:
Hello lijiang,
On Wed, Sep 20, 2023 at 10:21:18AM +0800, lijiang wrote:
> On Tue, Sep 19, 2023 at 2:23 PM Aditya Gupta <adityag(a)linux.ibm.com> wrote:
>
> > Hello lijiang,
> >
> > On Mon, Sep 18, 2023 at 07:34:04PM +0800, lijiang wrote:
> > > Hi, Aditya
> > > Thank you for the patch.
> > >
> > > ...
> > >
> > > Test kernel commit: ce9ecca0238b ("Linux 6.6-rc2")
> > >
> > > # ./crash /home/linux/vmlinux
> >
> > Thanks for testing it.
> >
> > This issue occurs only in case of Radix MMU.
> >
> > Overall, these are all the requirements:
> > 1. Upstream linux (master branch) (your commit, ce9ecca0238b, will also
> >    work)
> > 2. 'CONFIG_PPC_BOOK3S_64' should be 'y' in kernel config (this should be
> >    there in default configs)
> >
>
> # grep "CONFIG_PPC_BOOK3S_64" /home/linux/.config
> CONFIG_PPC_BOOK3S_64=y
>
> > 3. Check in the dmesg of the crashed kernel whether it prints 'hash-mmu' or
> >    'radix-mmu'. It should be 'radix-mmu'.
> >
> >
> # dmesg|grep mmu
> [ 0.000000] hash-mmu: Page sizes from device-tree:
> [ 0.000000] hash-mmu: base_shift=12: shift=12, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=0
> [ 0.000000] hash-mmu: base_shift=12: shift=16, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=7
> [ 0.000000] hash-mmu: base_shift=12: shift=24, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=56
> [ 0.000000] hash-mmu: base_shift=16: shift=16, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=1
> [ 0.000000] hash-mmu: base_shift=16: shift=24, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=8
> [ 0.000000] hash-mmu: base_shift=24: shift=24, sllp=0x0100, avpnm=0x00000001, tlbiel=0, penc=0
> [ 0.000000] hash-mmu: base_shift=34: shift=34, sllp=0x0120, avpnm=0x000007ff, tlbiel=0, penc=3
> [ 0.000000] hash-mmu: Initializing hash mmu with SLB
> [ 0.000000] mmu_features = 0xfc006e01
> [ 0.000000] hash-mmu: ppc64_pft_size = 0x1b
> [ 0.000000] hash-mmu: htab_hash_mask = 0xfffff
>
This seems to be using Hash MMU, hence the error doesn't come up.
Since vmemmap_list is NOT empty in the case of Hash MMU, crash works as
expected.
Thank you for the explanation, Aditya.
Can you try it on a system with Radix MMU? (Rainier/Denali systems might have
that by default)
I'm afraid I don't have any machines with Radix MMU, so it looks like I cannot
test it. Let's get back to the discussion of the patches.
The 'dmesg | grep mmu' you did is a good way to check if the system is using
'radix-mmu'.
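
As a rough illustration of that check, here is a minimal Python sketch that classifies the MMU mode from kernel log lines. It assumes the kernel's MMU setup messages carry a 'hash-mmu:' or 'radix-mmu:' prefix, as in the log quoted above; the helper name detect_mmu is hypothetical and is not part of crash or the kernel.

```python
import re

# Assumption: MMU setup messages are prefixed with "hash-mmu:" or "radix-mmu:".
_MMU_PREFIX = re.compile(r"\b(hash|radix)-mmu:")


def detect_mmu(dmesg_lines):
    """Return 'hash', 'radix', or None if no MMU prefix is found.

    Hypothetical helper for illustration only.
    """
    for line in dmesg_lines:
        match = _MMU_PREFIX.search(line)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    sample = [
        "[ 0.000000] hash-mmu: Page sizes from device-tree:",
        "[ 0.000000] mmu_features = 0xfc006e01",
    ]
    print(detect_mmu(sample))
```

The lines could come from the output of 'dmesg' on the running system, or from the log buffer of the crashed kernel.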
>
> > I guess, the system that was crashed might be using 'hash-mmu'.
> >
> > > also fails in absence of 'vmemmap_list' in upstream linux
> >
> > Yes, it will fail in the Hash MMU case, as we depend on 'vmemmap_list' in
> > that case, since the virtual-to-physical address mapping is not available
> > in the page table with Hash MMU.
> >
> > Only in the Radix MMU case will it still work, even if 'vmemmap_list' is
> > removed, since we have the mappings in the kernel page table, which is used
> > by this patch.
> >
> > Let me know if the issue still doesn't reproduce even after using a system
> > with Radix MMU.
> >
> >
> Yes, it still does not reproduce on my side. But it looks like we have the
> same system with Radix MMU, which is strange.
Actually, I meant that the current MMU should be Radix MMU. According to the
above system logs, the system is using Hash MMU.
On a system whose current MMU is Radix MMU, the error should occur, since with
the below commit in the upstream kernel:
368a0590d954 ("powerpc/book3s64/vmemmap: switch radix to use a different
vmemmap handling function")
the way the address mapping is stored for vmemmap has changed for Radix MMU.
In the case of Radix MMU, we now have the vmemmap address mapping in the kernel
page tables only. Hence 'vmemmap_list' is empty.
In case of Hash MMU, vmemmap address mapping is still stored in
'vmemmap_list', which crash uses, hence the error will not occur.
For the same reason, if we crash a system which is using Hash MMU, the kernel
still populates the mapping in 'vmemmap_list', so we need that symbol.
In Radix MMU, however, crash will still work after this patch whether the
symbol is present or missing.
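
The behaviour described above can be summarised in a small Python sketch. It is only an illustration of the decision, not crash's actual code: the function name choose_vmemmap_source and the returned strings are hypothetical.

```python
def choose_vmemmap_source(mmu, has_vmemmap_list_symbol):
    """Pick where a vmemmap address translation would come from.

    Hypothetical illustration of the reasoning in this thread:
    - Radix MMU: the mapping lives in the kernel page tables, so the
      'vmemmap_list' symbol is not needed.
    - Hash MMU: the mapping is only available through 'vmemmap_list'.
    """
    if mmu == "radix":
        return "kernel_page_table"
    if mmu == "hash":
        if has_vmemmap_list_symbol:
            return "vmemmap_list"
        raise LookupError("Hash MMU needs the 'vmemmap_list' symbol")
    raise ValueError(f"unknown MMU type: {mmu!r}")


if __name__ == "__main__":
    print(choose_vmemmap_source("radix", False))
    print(choose_vmemmap_source("hash", True))
```

The only failing combination is Hash MMU without the 'vmemmap_list' symbol.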
OK, got it. Thank you for explaining the details.
Thanks.
Lianbo
Thanks
- Aditya Gupta