kernel module parsing failure - mips - Crash-utility


Sagar Borikar

Friday, 2 December 2016

6:06 p.m.

Hi Dave,

With 7.1.7, crash works for MIPS when all drivers are built into the kernel. When I make the driver loadable and panic the kernel, crash doesn't locate some symbols correctly.

please wait... (gathering module symbol data)
crash: invalid size request: 0 type: "pgd page"

I debugged further and found that PGD_ORDER yields an incorrect number, due to which the PGD_SIZE macro results in 0.

Just for fun, I replaced PGD_ORDER with 0 (I know it's incorrect) and it went ahead, but I couldn't run the "mod" command successfully, as it threw the following error:

crash> mod
mod: cannot access vmalloc'd module memory

Any idea?

Thanks
Sagar


Dave Anderson

Friday, 2 December

9:45 p.m.

----- Original Message -----

...

Clearly it's failing to translate any kernel virtual address that is not unity-mapped. I have 3 sample 3.19-based MIPS vmcores on hand, and crash-7.1.7 can translate both mapped user-space and vmalloc'd module addresses.

But I am completely unfamiliar with the particulars of the MIPS architecture. MIPS support in the crash utility was written by, and its maintenance is done by, Rabin Vincent. He is a member of this list, but just in case he missed your post, I've cc'd this message to both of his registered email addresses.

Dave

...

> Thanks
> Sagar

--
Crash-utility mailing list
Crash-utility(a)redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility

Rabin Vincent

Saturday, 3 December

3:59 a.m.

On Fri, Dec 02, 2016 at 04:06:20PM -0800, Sagar Borikar wrote:

...

> With 7.1.7, crash is working for MIPS when all drivers are embedded
> inside kernel.
> When I make the driver loadable and panic the kernel, crash doesn't
> locate some symbols correctly.
>
> please wait... (gathering module symbol data)
> crash: invalid size request: 0 type: "pgd page"
>
> debugged further and find that PGD_ORDER provides incorrect number
> due to which the PGD_SIZE macro results in 0.
>
> Just for fun, I replaced PGD_ORDER with 0(I know its incorrect) and it
> went ahead but couldn't run "mod" command successfully as it threw
> following error
>
> crash> mod
> mod: cannot access vmalloc'd module memory

In order to access vmalloc'd memory we need to interpret the page tables correctly. This isn't needed when the modules are built in, since then the memory is in the direct-mapped kseg0 segment. So the "mod" failure is just a consequence of replacing PGD_ORDER with 0.

So the first error should be fixed properly before attempting "mod". What kernel version is this and what page size do you use?

Try running the "help -m" and "mach" commands (you can skip module loading with --no_modules to get to the crash> prompt) and check if the values for the various page table sizes and bits match what your kernel is using.

crash> help -m
...
pagesize: 4096
pageshift: 12
pagemask: fffffffffffff000
pageoffset: fff
pgdir_shift: 22
ptrs_per_pgd: 1024
ptrs_per_pte: 1024
...

crash-mips> mach
PAGE SIZE: 4096

_PAGE_PRESENT: 00000001
_PAGE_READ: 00000002
_PAGE_WRITE: 00000004
_PAGE_ACCESSED: 00000008
_PAGE_MODIFIED: 00000010
_PAGE_GLOBAL: 00000020
_PAGE_VALID: 00000040
_PAGE_DIRTY: 00000080
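One way to do the suggested comparison is to derive the expected "help -m" values from the kernel's page size and compare them with what crash prints. The sketch below is only an illustration: the formulas are modelled on the 32-bit MIPS two-level page table layout as I understand it (4-byte PTEs, a 2 GB user segment), the function name `expected_machdep` is hypothetical, and none of this is taken from crash's own source.

```python
def expected_machdep(pagesize, pte_size=4, user_segment_size=0x80000000):
    """Derive page-table parameters from a page size.

    Assumptions (not from the thread): 32-bit MIPS, two-level page tables,
    4-byte PTEs, and a page directory that covers twice the user segment.
    """
    pageshift = pagesize.bit_length() - 1          # log2 of a power of two
    pte_t_log2 = pte_size.bit_length() - 1         # log2(sizeof(pte_t))
    pgdir_shift = 2 * pageshift - pte_t_log2       # address bits below the PGD index
    return {
        "pagesize": pagesize,
        "pageshift": pageshift,
        "pagemask": ~(pagesize - 1) & 0xFFFFFFFFFFFFFFFF,  # printed as 64 bits
        "pageoffset": pagesize - 1,
        "pgdir_shift": pgdir_shift,
        "ptrs_per_pgd": 2 * (user_segment_size >> pgdir_shift),
        "ptrs_per_pte": pagesize // pte_size,
    }


if __name__ == "__main__":
    for size in (4096, 16384):
        values = expected_machdep(size)
        print(", ".join(
            f"{key}: {value:x}" if key in ("pagemask", "pageoffset") else f"{key}: {value}"
            for key, value in values.items()
        ))
```

Under these assumptions a 4K page size reproduces the values in the sample output above, so a kernel built with a different page size should show different numbers in every field.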

Sagar Borikar

3:15 p.m.

On Sat, Dec 3, 2016 at 1:59 AM, Rabin Vincent <rabin(a)rab.in> wrote:

...

> On Fri, Dec 02, 2016 at 04:06:20PM -0800, Sagar Borikar wrote:
> > With 7.1.7, crash is working for MIPS when all drivers are embedded
> > inside kernel.
> > When I make the driver loadable and panic the kernel, crash doesn't
> > locate some symbols correctly.
> >
> > please wait... (gathering module symbol data)
> > crash: invalid size request: 0 type: "pgd page"
> >
> > debugged further and find that PGD_ORDER provides incorrect number
> > due to which the PGD_SIZE macro results in 0.
> >
> > Just for fun, I replaced PGD_ORDER with 0(I know its incorrect) and it
> > went ahead but couldn't run "mod" command successfully as it threw
> > following error
> >
> > crash> mod
> > mod: cannot access vmalloc'd module memory
>
> In order to access vmalloc'd memory we need interpret the page tables
> correctly. This isn't needed when the modules are built in since then
> the memory will be in the direct-mapped kseg0 segment. So the "mod"
> failure is just a consequence of replacing PGD_ORDER with 0.

Yes, mod will not work, as I said earlier. Crash was exiting, hence I wanted a quick workaround.

...

> So the first error should be fixed properly before attempting "mod". What
> kernel version is this and what page size do to use?

4.4.20 kernel and page size is 16K

...

> Try running the "help -m" and "mach" commands (you can skip module loading
> with --no_modules to get to the crash> prompt) and check if the values for
> the various page table sizes and bits match what your kernel is using.
>
> crash> help -m
> ...
> pagesize: 4096
> pageshift: 12
> pagemask: fffffffffffff000
> pageoffset: fff
> pgdir_shift: 22
> ptrs_per_pgd: 1024
> ptrs_per_pte: 1024
> ...
>
> crash-mips> mach
> PAGE SIZE: 4096
>
> _PAGE_PRESENT: 00000001
> _PAGE_READ: 00000002
> _PAGE_WRITE: 00000004
> _PAGE_ACCESSED: 00000008
> _PAGE_MODIFIED: 00000010
> _PAGE_GLOBAL: 00000020
> _PAGE_VALID: 00000040
> _PAGE_DIRTY: 00000080

This appears to be correct info for my platform.

Thanks
Sagar

...


Dave Anderson

Sunday, 4 December

7:17 a.m.


If your kernel has a 16k page size and "help -m" shows 4096 as above, then there's one major problem. The calculation of page size in mips.c is based upon the difference between the "swapper_pg_dir" symbol and the next symbol above it:

crash> sym -n swapper_pg_dir
806d0000 (B) swapper_pg_dir
806d1000 (B) invalid_pte_table
crash>

What does the command above show on your kernel?

Dave
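The rule described here is simple subtraction, so it can be checked by hand from the "sym -n" output. A minimal sketch, assuming only what the message states (page size equals the gap between swapper_pg_dir and the next symbol); the function name is hypothetical and this is not the code in mips.c:

```python
def pagesize_from_symbols(swapper_pg_dir, next_symbol):
    """Return the page size implied by the gap between two adjacent symbols."""
    gap = next_symbol - swapper_pg_dir
    if gap <= 0 or gap & (gap - 1):
        # A page size must be a positive power of two; anything else means
        # the symbols are not laid out the way this rule expects.
        raise ValueError(f"implausible page size: {gap:#x}")
    return gap


if __name__ == "__main__":
    # Addresses from the sample kernel shown in this message.
    print(pagesize_from_symbols(0x806d0000, 0x806d1000))
```

For the sample kernel the gap is 0x1000, that is, 4096 bytes. A 16K kernel would be expected to show a gap of 0x4000 between the same two symbols.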

...


Rabin Vincent

8:05 a.m.

On Sat, Dec 03, 2016 at 01:15:23PM -0800, Sagar Borikar wrote:

...

> > So the first error should be fixed properly before attempting "mod". What
> > kernel version is this and what page size do to use?
>
> 4.4.20 kernel and page size is 16K

I'd only tested 4K pages before, but I've now provisioned a QEMU Malta machine with 16K pages.

I've now sent a patch to the list which fixes the definition of __PGD_ORDER. With only that patch, the pgd size errors are gone and "mod" works for my 16K kernel/dump:

crash> sym -n swapper_pg_dir
80540000 (B) swapper_pg_dir
80544000 (B) invalid_pte_table

crash> sys
KERNEL: vmlinux-16k-unload
DUMPFILES: /var/tmp/ramdump_elf_wZxP65 [temporary ELF header]
rawdump-16k-unload
CPUS: 1
DATE: Sun Dec 4 14:46:08 2016
UPTIME: 00:00:21
LOAD AVERAGE: 0.31, 0.07, 0.02
TASKS: 54
NODENAME: buildroot
RELEASE: 4.4.27-dirty
VERSION: #12 SMP Sun Dec 4 14:45:32 CET 2016
MACHINE: mips (unknown Mhz)
MEMORY: 128 MB
PANIC: "Kernel panic - not syncing: Fatal exception"

crash> help -m | grep -A7 pagesize
pagesize: 16384
pageshift: 14
pagemask: ffffffffffffc000
pageoffset: 3fff
pgdir_shift: 26
ptrs_per_pgd: 64
ptrs_per_pte: 4096
stacksize: 16384

crash> mod
MODULE NAME SIZE OBJECT FILE
c01050c0 null_blk 5333 (not loaded) [CONFIG_KALLSYMS]

With the fixed __PGD_ORDER, PGD_ORDER gets the value zero, and that is correct for my kernel. __PGD_ORDER isn't used anywhere other than in PGD_ORDER, so even your hack of forcing PGD_ORDER to zero would have fixed "mod" for me.

So there could be some additional problem in your case besides the incorrect __PGD_ORDER. Could you please apply the fix, run crash with the -d8 argument, and post the full log?

Thanks.
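To see why PGD_ORDER ends up as zero for a 16K kernel, it helps to work the arithmetic through. The sketch below assumes the formula used by the kernel's 32-bit MIPS page table header as I recall it (`32 - 3 * PAGE_SHIFT + PGD_T_LOG2 + PTE_T_LOG2`, clamped at zero) with 4-byte pgd and pte entries. It is not the text of the patch mentioned above, and the Python function names are hypothetical.

```python
def raw_pgd_order(pageshift, pgd_t_log2=2, pte_t_log2=2):
    """Assumed __PGD_ORDER formula; the result may be negative."""
    return 32 - 3 * pageshift + pgd_t_log2 + pte_t_log2


def pgd_order(pageshift):
    """PGD_ORDER: the raw order clamped so it is never below zero."""
    return max(raw_pgd_order(pageshift), 0)


def pgd_size(pageshift):
    """PGD_SIZE: one page shifted left by PGD_ORDER."""
    return (1 << pageshift) << pgd_order(pageshift)


if __name__ == "__main__":
    for shift in (12, 14):
        print(f"pageshift {shift}: raw order {raw_pgd_order(shift)}, "
              f"PGD_ORDER {pgd_order(shift)}, PGD_SIZE {pgd_size(shift)}")
```

Under these assumptions the raw order is already zero for 4K pages and negative for 16K pages, so the clamp only matters for the larger page size. That would be consistent with the problem showing up only on a 16K kernel, and with the 16384-byte "pgd page" read that appears in the -d8 log later in the thread.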

Sagar Borikar

9:10 a.m.

On Sun, Dec 4, 2016 at 6:05 AM, Rabin Vincent <rabin(a)rab.in> wrote:

...


With your earlier patches, it's the same result as I was getting, i.e.

please wait... (gathering module symbol data)
WARNING: cannot access vmalloc'd module memory

crash> mod
mod: cannot access vmalloc'd module memory

Here is the detailed output of help -m and mach:

flags: 1 (KSYMS_START) [0/153]
kvbase: 80000000
identity_map_base: 80000000
pagesize: 16384
pageshift: 14
pagemask: ffffffffffffc000
pageoffset: 3fff
pgdir_shift: 26
ptrs_per_pgd: 64
ptrs_per_pte: 4096
stacksize: 16384
hz: 1000
memsize: 261865472 (0xf9bc000)
bits: 32
nr_irqs: 360
eframe_search: mips_eframe_search()
back_trace: mips_back_trace_cmd()
processor_speed: mips_processor_speed()
uvtop: mips_uvtop()
kvtop: mips_kvtop()
get_task_pgd: mips_get_task_pgd()
dump_irq: generic_dump_irq()
show_interrupts: generic_show_interrupts()
get_irq_affinity: generic_get_irq_affinity()
get_stack_frame: mips_get_stack_frame()
get_stackbase: generic_get_stackbase()
get_stacktop: generic_get_stacktop()
translate_pte: mips_translate_pte()
memory_size: generic_memory_size()
vmalloc_start: mips_vmalloc_start()
is_task_addr: mips_is_task_addr()
verify_symbol: mips_verify_symbol()
dis_filter: generic_dis_filter()
cmd_mach: mips_cmd_mach()
get_smp_cpus: mips_get_smp_cpus()
is_kvaddr: generic_is_kvaddr()
is_uvaddr: generic_is_uvaddr()
verify_paddr: generic_verify_paddr()
init_kernel_pgd: NULL
value_to_symbol: generic_machdep_value_to_symbol()
line_number_hooks: NULL
last_pgd_read: 82c30000
last_pmd_read: 0
last_ptbl_read: 3018000
pgd: 8a71b70
pmd: 0
ptbl: 8a75b78
section_size_bits: 26
max_physmem_bits: 32
sections_per_root: 0
machspec: 8696240

crash> mach
HZ: 1000
PAGE SIZE: 16384

_PAGE_PRESENT: 00000001
_PAGE_READ: 00000020
_PAGE_WRITE: 00000002
_PAGE_ACCESSED: 00000004
_PAGE_MODIFIED: 00000008
_PAGE_GLOBAL: 00000040
_PAGE_VALID: 00000080
_PAGE_NO_READ: 00000020
_PAGE_NO_EXEC: 00000010
_PAGE_DIRTY: 00000100

Please find the attached log with the -d8 output.

Thanks
Sagar

...


Rabin Vincent

10:16 a.m.

On Sun, Dec 04, 2016 at 07:10:38AM -0800, Sagar Borikar wrote:

...

<readmem: 806d2c68, KVADDR, "modules", 4, (FOE), 868e5ec>
<read_kdump: addr: 806d2c68 paddr: 6d2c68 cnt: 4>
read_netdump: addr: 806d2c68 paddr: 6d2c68 cnt: 4 offset: 5dac68
GETBUF(416 -> 0)

please wait... (gathering module symbol data)module: c0bc9cc0

...

<readmem: c0bc9cc0, KVADDR, "module struct", 416, (ROE|Q), 86b25e0>
<readmem: 82c30000, KVADDR, "pgd page", 16384, (FOE), 948a2a0>
<read_kdump: addr: 82c30000 paddr: 2c30000 cnt: 16384>
read_netdump: addr: 82c30000 paddr: 2c30000 cnt: 16384 offset: 31f8000
<readmem: 3018000, PHYSADDR, "page table", 16384, (FOE), 948e2a8>
<read_kdump: addr: 3018000 paddr: 3018000 cnt: 16384>
read_netdump: addr: 0 paddr: 3018000 cnt: 16384 offset: 35e0000
<read_kdump: addr: c0bc9cc0 paddr: 271e9cc0 cnt: 416>
read_netdump: READ_ERROR: offset not found for paddr: 271e9cc0

crash: read error: kernel virtual address: c0bc9cc0 type: "module struct"

Here's the error. Either 271e9cc0 is a valid physical address and the dump is incomplete, or it's not and the page table translation is returning a bogus physical address for c0bc9cc0.

To check the page table translation, use "vtop <addr>" (example below) to see how crash comes to its result. You'll then have to manually walk the page tables for this particular virtual address and verify that the correct PGD and PTE entries are being read. It could be easier if you first use vmalloc_to_page() and page_address() in your kernel to print out the correct physical address for some known vmalloc'd address.

crash> vtop c01050c0
VIRTUAL   PHYSICAL
c01050c0  71550c0

SEGMENT: ksseg
PAGE DIRECTORY: 80540000
  PGD: 805400c0 => 872b0000
  PTE: 072b0104 => 071545ef
 PAGE: 07154000

  PTE    PHYSICAL  FLAGS
71545ef  7154000   (PRESENT|READ|WRITE|ACCESSED|MODIFIED|GLOBAL|VALID|NO_READ|DIRTY)

  PAGE    PHYSICAL  MAPPING  INDEX  CNT  FLAGS
81038e60  7154000   0        0      1    40000000
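The manual walk suggested here comes down to a few shifts and masks. The sketch below reproduces the addresses in the vtop example above for a 16K kernel (pageshift 14, pgdir_shift 26, 4096 four-byte PTEs per table, as printed by "help -m" in this thread). Two points are assumptions rather than statements from the thread: the PGD entry is treated as a kseg0 virtual address that converts to physical by subtracting 0x80000000, and the page address is obtained by masking off the low bits of the PTE, which matches the output shown but may not be how crash extracts it internally. All names are hypothetical.

```python
PAGESHIFT = 14                  # 16K pages
PGDIR_SHIFT = 26
PTRS_PER_PTE = 4096
ENTRY_SIZE = 4                  # bytes per PGD/PTE entry on a 32-bit kernel
KSEG0_BASE = 0x80000000         # assumed direct-mapped segment base
PAGE_OFFSET_MASK = (1 << PAGESHIFT) - 1


def pgd_entry_address(vaddr, pgd_base):
    """Virtual address of the PGD slot that covers vaddr."""
    return pgd_base + (vaddr >> PGDIR_SHIFT) * ENTRY_SIZE


def pte_entry_address(vaddr, pgd_entry):
    """Physical address of the PTE slot, given the value read from the PGD."""
    table_paddr = pgd_entry - KSEG0_BASE
    index = (vaddr >> PAGESHIFT) & (PTRS_PER_PTE - 1)
    return table_paddr + index * ENTRY_SIZE


def physical_address(vaddr, pte):
    """Page base taken from the PTE, plus the offset within the page."""
    return (pte & ~PAGE_OFFSET_MASK) | (vaddr & PAGE_OFFSET_MASK)


if __name__ == "__main__":
    vaddr = 0xc01050c0
    print(f"PGD slot: {pgd_entry_address(vaddr, 0x80540000):08x}")
    print(f"PTE slot: {pte_entry_address(vaddr, 0x872b0000):08x}")
    print(f"PHYSICAL: {physical_address(vaddr, 0x071545ef):x}")
```

Feeding in the PGD and PTE values that crash actually read for a failing address shows whether the slots it picked are the ones this arithmetic predicts.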

Sagar Borikar

3:07 p.m.

On Sun, Dec 4, 2016 at 8:16 AM, Rabin Vincent <rabin(a)rab.in> wrote:

...

> On Sun, Dec 04, 2016 at 07:10:38AM -0800, Sagar Borikar wrote:
> > <readmem: 806d2c68, KVADDR, "modules", 4, (FOE), 868e5ec>
> > <read_kdump: addr: 806d2c68 paddr: 6d2c68 cnt: 4>
> > read_netdump: addr: 806d2c68 paddr: 6d2c68 cnt: 4 offset: 5dac68
> > GETBUF(416 -> 0)
> > please wait... (gathering module symbol data)module: c0bc9cc0
> > <readmem: c0bc9cc0, KVADDR, "module struct", 416, (ROE|Q), 86b25e0>
> > <readmem: 82c30000, KVADDR, "pgd page", 16384, (FOE), 948a2a0>
> > <read_kdump: addr: 82c30000 paddr: 2c30000 cnt: 16384>
> > read_netdump: addr: 82c30000 paddr: 2c30000 cnt: 16384 offset: 31f8000
> > <readmem: 3018000, PHYSADDR, "page table", 16384, (FOE), 948e2a8>
> > <read_kdump: addr: 3018000 paddr: 3018000 cnt: 16384>
> > read_netdump: addr: 0 paddr: 3018000 cnt: 16384 offset: 35e0000
> > <read_kdump: addr: c0bc9cc0 paddr: 271e9cc0 cnt: 416>
> > read_netdump: READ_ERROR: offset not found for paddr: 271e9cc0
> >
> > crash: read error: kernel virtual address: c0bc9cc0 type: "module struct"
>
> Here's the error. Either 271e9cc0 is a valid physical address and the dump is
> incomplete, or it's not and the page table translation is returning a bogus
> physical address for c0bc9cc0.

crash> vtop c0bc9cc0
VIRTUAL   PHYSICAL
c0bc9cc0  271e9cc0

SEGMENT: ksseg
PAGE DIRECTORY: 82c30000
  PGD: 82c300c0 => 83018000
  PTE: 03018bc8 => 271e87cf
 PAGE: 271e8000

  PTE     PHYSICAL  FLAGS
271e87cf  271e8000  (PRESENT|WRITE|ACCESSED|MODIFIED|GLOBAL|VALID|DIRTY)

  PAGE    PHYSICAL  MAPPING  INDEX  CNT  FLAGS
82dc0f40  271e8000  0        0      1    40000000

0x271e9cc0 is a valid address, but it belongs to high memory (0x20000000 onwards on this platform). Also, I don't think there is any problem with the dump, as I have tested crash several times without modules and got correct results every time.

Are you accounting for high memory?

...

> To check the page table translation, use "vtop <addr>" (example below)
> to see how crash comes to its result. You'll have to then manually walk
> the page tables for this particular virtual address and verify that the
> correct PGD and PTE entries are being read. It could be easier if use
> vmalloc_to_page() and page_address() first in your kernel to print out
> the correct physical address for some known vmalloc'd address.

As the driver works fine, I think the kernel's translation is OK. A wrong physical address translation would have prevented the nvme driver from running. Stress testing with the driver is fine. But I will still go through the PTE entries.

Dave,

...

> crash> sym -n swapper_pg_dir
> 806d0000 (B) swapper_pg_dir
> 806d1000 (B) invalid_pte_table
> crash>
>
> What does the command above on your kernel show?

crash> sym -n swapper_pg_dir
82c30000 (B) swapper_pg_dir
82c34000 (B) invalid_pte_table

Thanks
Sagar

...


Rabin Vincent

3:47 p.m.

On Sun, Dec 04, 2016 at 01:07:03PM -0800, Sagar Borikar wrote:

...

> On Sun, Dec 4, 2016 at 8:16 AM, Rabin Vincent <rabin(a)rab.in> wrote:
> >> read_netdump: READ_ERROR: offset not found for paddr: 271e9cc0
> >>
> >> crash: read error: kernel virtual address: c0bc9cc0 type: "module struct"
> >
> > Here's the error. Either 271e9cc0 is a valid physical address and the dump is
> > incomplete, or it's not and the page table translation is returning a bogus
> > physical address for c0bc9cc0.
>
> crash> vtop c0bc9cc0
> VIRTUAL PHYSICAL
> c0bc9cc0 271e9cc0
>
> SEGMENT: ksseg
> PAGE DIRECTORY: 82c30000
> PGD: 82c300c0 => 83018000
> PTE: 03018bc8 => 271e87cf
> PAGE: 271e8000
>
> PTE PHYSICAL FLAGS
> 271e87cf 271e8000 (PRESENT|WRITE|ACCESSED|MODIFIED|GLOBAL|VALID|DIRTY)
>
> PAGE PHYSICAL MAPPING INDEX CNT FLAGS
> 82dc0f40 271e8000 0 0 1 40000000
>
> 0x271e9cc0 is a valid address but it belongs to high mem(0x20000000
> onwards for this platform). Also I don't think there is any problem in
> dump as I have done several testing of crash without modules and every
> time I have got correct result.

Have a look at the segments at the start of the log. The 271e9cc0 physical address is apparently not included in the dump:

pt_load_segment[0]:
  file_offset: 8000
  phys_start: 100000
  phys_end: 703fff
  zero_fill: 0
pt_load_segment[1]:
  file_offset: 60c000
  phys_start: 44000
  phys_end: 144000
  zero_fill: 0
pt_load_segment[2]:
  file_offset: 70c000
  phys_start: 144000
  phys_end: 4300000
  zero_fill: 0
pt_load_segment[3]:
  file_offset: 48c8000
  phys_start: d200000
  phys_end: d200000
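The segment list can be checked mechanically. The sketch below looks a physical address up in the PT_LOAD segments listed above and returns the file offset it would be read from. It assumes a first-match search with phys_end treated as an exclusive bound; that rule is my assumption rather than something stated in the thread, although it does reproduce the offsets printed by read_netdump in the -d8 log quoted earlier (paddr 6d2c68 at offset 5dac68, 2c30000 at 31f8000, 3018000 at 35e0000). The names are hypothetical.

```python
# (file_offset, phys_start, phys_end) for each PT_LOAD segment in the log.
SEGMENTS = [
    (0x8000, 0x100000, 0x703fff),
    (0x60c000, 0x44000, 0x144000),
    (0x70c000, 0x144000, 0x4300000),
    (0x48c8000, 0xd200000, 0xd200000),
]


def paddr_to_offset(paddr, segments=SEGMENTS):
    """Return the dump-file offset for paddr, or None if no segment holds it."""
    for file_offset, phys_start, phys_end in segments:
        # Assumption: the first segment whose range contains paddr wins.
        if phys_start <= paddr < phys_end:
            return file_offset + (paddr - phys_start)
    return None


if __name__ == "__main__":
    for paddr in (0x6d2c68, 0x2c30000, 0x3018000, 0x271e9cc0):
        offset = paddr_to_offset(paddr)
        result = "not in dump" if offset is None else f"offset {offset:x}"
        print(f"paddr {paddr:x}: {result}")
```

The highest address covered by any segment is below 0x20000000, so every highmem address, including 271e9cc0, falls outside the dump.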

...

> > To check the page table translation, use "vtop <addr>" (example below)
> > to see how crash comes to its result. You'll have to then manually walk
> > the page tables for this particular virtual address and verify that the
> > correct PGD and PTE entries are being read. It could be easier if use
> > vmalloc_to_page() and page_address() first in your kernel to print out
> > the correct physical address for some known vmalloc'd address.
>
> As the driver works fine, I think kernel translation looks ok. Wrong
> physical address translation would have failed the nvme driver to run.
> Stress testing with the driver is fine. But still would go through the
> PTE entries.

I wasn't implying that the kernel's virt-to-phys translation was wrong, but rather that the crash utility's translation might be. But if 271e9cc0 is a valid physical address on your platform then the translation itself is probably fine.

Sagar Borikar

4:24 p.m.

On Sun, Dec 4, 2016 at 1:47 PM, Rabin Vincent <rabin(a)rab.in> wrote:

...


Right. I looked at it in more detail. kexec looks for "System RAM" entries under /proc/iomem to build the memory segments, which are passed to the crash notes. But for some reason the MIPS kernel doesn't add high memory to the /proc/iomem resource list. Refer to the resource_init() function in arch/mips/kernel/setup.c:

	start = boot_mem_map.map[i].addr;
	end = boot_mem_map.map[i].addr + boot_mem_map.map[i].size - 1;
	if (start >= HIGHMEM_START)
		continue;
	if (end >= HIGHMEM_START)
		end = HIGHMEM_START - 1;

Because of this, the highmem segment is not passed by kexec.

Sagar
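The effect of that loop can be modelled in a few lines. The sketch below mirrors only the clamping logic quoted above and leaves out everything else resource_init() does. HIGHMEM_START is set to 0x20000000 as stated earlier in the thread for this platform, and the memory map passed in is a made-up example, not the reporter's actual layout.

```python
HIGHMEM_START = 0x20000000  # start of high memory on this platform, per the thread


def system_ram_resources(boot_mem_map):
    """Return the (start, end) ranges the quoted loop would register.

    boot_mem_map is a list of (addr, size) pairs. Ranges entirely in high
    memory are skipped; ranges that straddle the boundary are truncated.
    """
    resources = []
    for addr, size in boot_mem_map:
        start = addr
        end = addr + size - 1
        if start >= HIGHMEM_START:
            continue                      # whole region is highmem: dropped
        if end >= HIGHMEM_START:
            end = HIGHMEM_START - 1       # keep only the lowmem part
        resources.append((start, end))
    return resources


if __name__ == "__main__":
    # Hypothetical map: 256 MB of low memory plus 256 MB of high memory.
    example_map = [(0x00000000, 0x10000000), (0x20000000, 0x10000000)]
    for start, end in system_ram_resources(example_map):
        print(f"{start:08x}-{end:08x} : System RAM")
```

With this example map only the low region is listed, so a tool that builds its segments from the "System RAM" entries would never see the second region.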

...

