On Sun, Dec 04, 2016 at 01:07:03PM -0800, Sagar Borikar wrote:
On Sun, Dec 4, 2016 at 8:16 AM, Rabin Vincent <rabin(a)rab.in> wrote:
>> read_netdump: READ_ERROR: offset not found for paddr: 271e9cc0
>>
>> crash: read error: kernel virtual address: c0bc9cc0 type: "module struct"
>
> Here's the error. Either 271e9cc0 is a valid physical address and the dump is
> incomplete, or it's not and the page table translation is returning a bogus
> physical address for c0bc9cc0.
crash> vtop c0bc9cc0
VIRTUAL   PHYSICAL
c0bc9cc0  271e9cc0

SEGMENT: ksseg
PAGE DIRECTORY: 82c30000
  PGD: 82c300c0 => 83018000
  PTE: 03018bc8 => 271e87cf
 PAGE: 271e8000

  PTE     PHYSICAL  FLAGS
271e87cf  271e8000  (PRESENT|WRITE|ACCESSED|MODIFIED|GLOBAL|VALID|DIRTY)

  PAGE    PHYSICAL  MAPPING  INDEX  CNT  FLAGS
82dc0f40  271e8000        0      0    1  40000000

0x271e9cc0 is a valid address, but it belongs to high memory (0x20000000
onwards on this platform). Also, I don't think there is any problem with
the dump: I have tested crash several times without modules and got the
correct result every time.
Have a look at the segments at the start of the log. The 271e9cc0
physical address is apparently not included in the dump:
pt_load_segment[0]:
    file_offset: 8000
     phys_start: 100000
       phys_end: 703fff
      zero_fill: 0
pt_load_segment[1]:
    file_offset: 60c000
     phys_start: 44000
       phys_end: 144000
      zero_fill: 0
pt_load_segment[2]:
    file_offset: 70c000
     phys_start: 144000
       phys_end: 4300000
      zero_fill: 0
pt_load_segment[3]:
    file_offset: 48c8000
     phys_start: d200000
       phys_end: d200000
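
The lookup that fails here can be sketched in a few lines of Python. This is
only an illustration, not the crash utility's actual code: the segment values
are copied from the log above, find_segment() is a hypothetical helper, and
treating phys_end as an exclusive upper bound is an assumption.

```python
# PT_LOAD segments as printed in the log: (file_offset, phys_start, phys_end).
SEGMENTS = [
    (0x8000,    0x100000,  0x703fff),
    (0x60c000,  0x44000,   0x144000),
    (0x70c000,  0x144000,  0x4300000),
    (0x48c8000, 0xd200000, 0xd200000),
]


def find_segment(paddr, segments=SEGMENTS):
    """Return (segment index, file offset) for paddr, or None if uncovered.

    Assumption: phys_end is an exclusive upper bound. The first matching
    segment wins if several ranges overlap.
    """
    for index, (file_offset, phys_start, phys_end) in enumerate(segments):
        if phys_start <= paddr < phys_end:
            return index, file_offset + (paddr - phys_start)
    return None


# The address from the read error is above every phys_end in the list.
print(find_segment(0x271e9cc0))
# An address inside segment 1 resolves to an offset in the dump file.
print(find_segment(0x50000))
```

With these values the first call finds no segment, which is consistent with
the "offset not found for paddr" message.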
> To check the page table translation, use "vtop <addr>" (example below)
> to see how crash comes to its result. You'll then have to manually walk
> the page tables for this particular virtual address and verify that the
> correct PGD and PTE entries are being read. It could be easier if you
> first use vmalloc_to_page() and page_address() in your kernel to print
> out the correct physical address for some known vmalloc'd address.
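
The manual walk can be cross-checked with a little arithmetic. The Python
sketch below reproduces the addresses in the vtop output from this thread,
assuming 16 KB pages and 4-byte PGD/PTE entries. Neither value is stated in
the thread; they are inferred because they make the numbers match (with 4 KB
pages the page offset would be 0xcc0 rather than 0x1cc0). The names are
illustrative and are not taken from crash or the kernel.

```python
PAGE_SHIFT = 14                                   # assumption: 16 KB pages
ENTRY_SIZE = 4                                    # assumption: 32-bit entries
PTRS_PER_PTE = (1 << PAGE_SHIFT) // ENTRY_SIZE    # 4096 entries per table
PGDIR_SHIFT = PAGE_SHIFT + PTRS_PER_PTE.bit_length() - 1   # 14 + 12 = 26


def walk(vaddr, pgd_base, pte_table_base, pte_value):
    """Return (PGD entry address, PTE entry address, physical address)."""
    page_mask = (1 << PAGE_SHIFT) - 1
    # The top bits of the virtual address index the page directory.
    pgd_addr = pgd_base + (vaddr >> PGDIR_SHIFT) * ENTRY_SIZE
    # The middle bits index the page table the PGD entry points to.
    pte_index = (vaddr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)
    pte_addr = pte_table_base + pte_index * ENTRY_SIZE
    # vtop prints PAGE as the PTE value with the low (flag) bits cleared.
    page = pte_value & ~page_mask
    return pgd_addr, pte_addr, page | (vaddr & page_mask)


# Inputs taken from the vtop output: PAGE DIRECTORY, the PGD entry's value,
# and the PTE entry's value.
pgd_addr, pte_addr, paddr = walk(0xc0bc9cc0, 0x82c30000, 0x83018000, 0x271e87cf)
print(hex(pgd_addr), hex(pte_addr), hex(paddr))
```

The computed PGD entry address and physical address match the vtop output.
The PTE entry address differs from the printed 03018bc8 only in the top bit,
presumably because vtop shows the physical address of the kseg0 pointer.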
As the driver works fine, I think the kernel's translation is OK. A wrong
physical address translation would have prevented the nvme driver from
running, and stress testing with the driver is fine. I will still go
through the PTE entries, though.
I wasn't implying that the kernel's virt-to-phys translation was wrong,
but rather that the crash utility's translation might be. But if
271e9cc0 is a valid physical address on your platform then the
translation itself is probably fine.