----- Original Message -----
Hi Dave,
On Tue, 01 May 2012 15:24:17 -0400 (EDT)
Dave Anderson <anderson(a)redhat.com> wrote:
> With respect to the 3rd "vm -p" bug, I did some cursory debugging,
> and
> here's what I found.
>
> In all cases, the readmem() failure occurs in
> _kl_pg_table_deref_s390x() as a result of transitioning from one
> page
> of PTEs to the next, because the pointer to the "next" page of PTES
> contains 0x20, which looks to be _SEGMENT_ENTRY_INV or
> _REGION_ENTRY_INV? (not sure of the s390x nomenclature...)
>
> So you'll see something like this in the page table that points
> to the pages of PTEs:
>
> ...
> c6386e0: 0000000000000020
> 0000000000000020 ....... ....... c6386f0: 000000001608c800
> 0000000000000020 ............... c638700: 0000000000000020
> 0000000000000020 ....... ....... ...
>
> The vaddr's in the page of PTEs pointed to by c6386f0 (at
> 000000001608c800) all resolve as expected, but when the virtual
> address bumps it to c6386f8, it reads the 0x20, and passes it to
> _kl_pg_table_deref_s390x(). The user vaddr(s) that resolve to that
> next page of PTEs are legitimate, given that they are in the virtual
> region defined by the vm_area_struct. But they certainly may not be
> mapped.
>
> Anyway, it seems that there should be something that catches the
> invalid entry in s390x_vtop() -- prior to calling
> _kl_pg_table_deref_s390x()-- and return FALSE at that point.
>
> So if I make this kludge:
>
> ...
>
> /* Check if this is a large page. */
> if (entry & 0x400ULL) {
> /* Add the 1MB page offset and return the final
> value. */ *phys_addr = table + (vaddr & 0xfffffULL);
> return TRUE;
> }
>
> ======> if (entry == 0x20) return FALSE;
>
> /* Get the page table entry */
> entry = _kl_pg_table_deref_s390x(vaddr, entry & ~0x7ffULL);
> if (!entry)
> return FALSE;
>
> /* Isolate the page origin from the page table entry. */
> paddr = entry & ~0xfffULL;
>
> /* Add the page offset and return the final value. */
> *phys_addr = paddr + (vaddr & 0xfffULL);
>
> return TRUE;
> }
>
> then everything seems to work OK.
>
> So unless the calculation of the next page of PTEs is incorrect,
> which seems unlikely, it seems that the 0x20 is legitimate, and
> should be recognized? What do you think?
0x20 means that the invalid bit of a region table entry is set (bit 58).
The problem is a wrong check in _kl_rsg_table_deref_s390x() where we
check for 0x40 (instead of 0x20).
The following is the correct fix for the problem:
---
s390x.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/s390x.c
+++ b/s390x.c
@@ -568,7 +568,7 @@ static ulong _kl_rsg_table_deref_s390x(u
if ((entry & 0xcULL) != (level << 2))
return 0;
/* Check if the region table entry has the invalid bit set. */
- if (entry & 0x40ULL)
+ if (entry & 0x20ULL)
return 0;
/* Region table entry is valid and well formed. */
return entry;
Much better -- queued for crash-6.0.7.
Thanks,
Dave