On 21/09/23 11:21, lijiang wrote:
Hi, Aditya
Thank you for the patch.
From: Aditya Gupta <adityag@linux.ibm.com>
To: crash-utility@redhat.com
Cc: Hari Bathini <hbathini@linux.ibm.com>,
Mahesh J Salgaonkar
<mahesh@linux.ibm.com>,
Sourabh Jain <sourabhjain@linux.ibm.com>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Subject: [Crash-utility] [PATCH] ppc64: do page
traversal if
vmemmap_list not populated
Message-ID: <20230911093032.419388-1-adityag@linux.ibm.com>
Content-Type: text/plain; charset="US-ASCII";
x-default=true
Currently 'crash-tool' fails on vmcore collected on
upstream kernel on
PowerPC64 with the error:
crash: invalid kernel virtual address: 0 type:
"first list entry
Presently the address translation for vmemmap
addresses is done using
the vmemmap_list. But with the below commit in Linux,
vmemmap_list can
be empty, in case of Radix MMU on PowerPC64
368a0590d954: (powerpc/book3s64/vmemmap: switch
radix to use a
different vmemmap handling function)
In case vmemmap_list is empty, then it's head is NULL,
which crash tries
to access and fails due to accessing NULL.
Instead of depending on 'vmemmap_list' for address
translation for
vmemmap addresses, do a kernel pagetable walk to get
the physical
address associated with given virtual address
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
---
Testing
=======
Git tree with patch applied:
https://github.com/adi-g15-ibm/crash/tree/bugzilla-203296-list-v1
This can be tested with '/proc/vmcore' as the vmcore,
since makedumpfile
also fails in absence of 'vmemmap_list' in upstream
linux
The fix for makedumpfile will also been sent to
upstream
---
---
ppc64.c | 47
+++++++++++++++++++++++++++++++++--------------
1 file changed, 33 insertions(+), 14 deletions(-)
diff --git a/ppc64.c b/ppc64.c
index fc34006f4863..1e84c5f56773 100644
--- a/ppc64.c
+++ b/ppc64.c
@@ -1280,8 +1280,30 @@ void ppc64_vmemmap_init(void)
long backing_size, virt_addr_offset,
phys_offset, list_offset;
ulong *vmemmap_list;
char *vmemmap_buf;
- struct machine_specific *ms;
-
+ struct machine_specific *ms =
machdep->machspec;
+
+ ld = &list_data;
+ BZERO(ld, sizeof(struct list_data));
+
+ /*
+ * vmemmap_list is missing or set to 0 in the
kernel would imply
+ * vmemmap region is mapped in the kernel
pagetable. So, read vmemmap_list
+ * anyway and use the translation method
accordingly.
+ */
+ readmem(symbol_value("vmemmap_list"), KVADDR,
&ld->start, sizeof(void *),
+ "vmemmap_list", RETURN_ON_ERROR);
+ if (!ld->start) {
+ /*
+ * vmemmap_list is set to 0 or
missing. Do kernel pagetable walk
+ * for vmemmamp address translation.
+ */
+ ms->vmemmap_list = NULL;
+ ms->vmemmap_cnt = 0;
+
+ machdep->flags |= VMEMMAP_AWARE;
+ return;
+ }
+
I would suggest putting the
'readmem(symbol_value("vmemmap_list"),...)' after the
'kernel_symbol_exists("vmemmap_list")', and also check the
returned value of readmem().
Yes, I considered that, but the if condition checking for
'kernel_symbol_exists("vmemmap_list")' returns from the
function if the symbol is not found, same with the readmem
return value check earlier, we wanted to initialise the
ms->vmemmap_list and machdep->flags before that.
It's true.
For the current patch, if the symbol 'vmemmap_list' is not found, the symbol_value() will invoke the error(FATAL, ...) and ultimately exit from this function. It doesn't have any chance to execute the initialising code.
And, the initialization code relies on the result of ld->start, once the readmem() fails due to some reasons, it may also enter the initialization code. I'm not sure if that is expected behavior(if yes, the "RETURN_ON_ERROR|QUIET " would be good).
If so, the kernel_symbol_exists("vmemmap_list") may be redundant. That can be removed this time. What do you think?
The readmem used to be like this, which returns from function
if readmem failed:
- if (!readmem(symbol_value("vmemmap_list"),
- KVADDR, &ld->start, sizeof(void *),
"vmemmap_list",
- RETURN_ON_ERROR))
- return;
So, intentionally I put it above the if and removed the return
value check, since we don't want to depend on vmemmap_list
symbol being there for newer kernels anymore.
And to be future proof in case the address mapping moves to
page table for Hash MMU case also, and since vmemmap_list is
PowerPC specific, the symbol might also be removed.
Ok, got it, thanks for the information.
But, if it's a coding guidelines requirement to handle the
return values, I can think of how to handle the return value
of readmem, any suggestions ?
No, it's not a coding guidelines requirement. :-)
And other changes are fine to me.
Thank you for your reviews Lianbo :)
No problem.
Thanks.
Lianbo