On Fri, Mar 6, 2020 at 6:13 AM Dave Anderson <anderson(a)redhat.com> wrote:
----- Original Message -----
> On Thu, Mar 5, 2020 at 1:07 PM Santosh <ysan99(a)gmail.com> wrote:
> >
> > On Thu, Mar 5, 2020 at 12:54 PM Dave Anderson <anderson(a)redhat.com> wrote:
> > >
> > > > > I suspect that it's a problem with either the --kaslr offset and/or
> > > > > the phys_base value that you have used.
> > > >
> > > > Is there a method to know or print kaslr & phys_base in a running
> > > > Linux system?
> > >
> > > They are normally passed in the VMCOREINFO data that is contained in an
> > > ELF PT_NOTE in the dumpfile header. For example, here's a dump of the
> > > normal VMCOREINFO data, where the phys_base and KASLR offsets are down
> > > near the bottom:
> > >
> > > OSRELEASE=4.18.0-185.el8.x86_64
> > > PAGESIZE=4096
> > > SYMBOL(init_uts_ns)=ffffffffbd812540
> > > SYMBOL(node_online_map)=ffffffffbda0f520
> > > SYMBOL(swapper_pg_dir)=ffffffffbd80a000
> > > SYMBOL(_stext)=ffffffffbc600000
> > > SYMBOL(vmap_area_list)=ffffffffbd8d78b0
> > > SYMBOL(mem_section)=ffff956a3ffd2000
> > > LENGTH(mem_section)=2048
> > > SIZE(mem_section)=16
> > > OFFSET(mem_section.section_mem_map)=0
> > > SIZE(page)=64
> > > SIZE(pglist_data)=171968
> > > SIZE(zone)=1472
> > > SIZE(free_area)=88
> > > SIZE(list_head)=16
> > > SIZE(nodemask_t)=128
> > > OFFSET(page.flags)=0
> > > OFFSET(page._refcount)=52
> > > OFFSET(page.mapping)=24
> > > OFFSET(page.lru)=8
> > > OFFSET(page._mapcount)=48
> > > OFFSET(page.private)=40
> > > OFFSET(page.compound_dtor)=16
> > > OFFSET(page.compound_order)=17
> > > OFFSET(page.compound_head)=8
> > > OFFSET(pglist_data.node_zones)=0
> > > OFFSET(pglist_data.nr_zones)=171232
> > > OFFSET(pglist_data.node_start_pfn)=171240
> > > OFFSET(pglist_data.node_spanned_pages)=171256
> > > OFFSET(pglist_data.node_id)=171264
> > > OFFSET(zone.free_area)=192
> > > OFFSET(zone.vm_stat)=1296
> > > OFFSET(zone.spanned_pages)=112
> > > OFFSET(free_area.free_list)=0
> > > OFFSET(list_head.next)=0
> > > OFFSET(list_head.prev)=8
> > > OFFSET(vmap_area.va_start)=0
> > > OFFSET(vmap_area.list)=48
> > > LENGTH(zone.free_area)=11
> > > SYMBOL(log_buf)=ffffffffbd85b140
> > > SYMBOL(log_buf_len)=ffffffffbd85b13c
> > > SYMBOL(log_first_idx)=ffffffffbe319778
> > > SYMBOL(clear_idx)=ffffffffbe319744
> > > SYMBOL(log_next_idx)=ffffffffbe319768
> > > SIZE(printk_log)=16
> > > OFFSET(printk_log.ts_nsec)=0
> > > OFFSET(printk_log.len)=8
> > > OFFSET(printk_log.text_len)=10
> > > OFFSET(printk_log.dict_len)=12
> > > LENGTH(free_area.free_list)=5
> > > NUMBER(NR_FREE_PAGES)=0
> > > NUMBER(PG_lru)=5
> > > NUMBER(PG_private)=12
> > > NUMBER(PG_swapcache)=9
> > > NUMBER(PG_swapbacked)=18
> > > NUMBER(PG_slab)=8
> > > NUMBER(PG_hwpoison)=22
> > > NUMBER(PG_head_mask)=32768
> > > NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)=-129
> > > NUMBER(HUGETLB_PAGE_DTOR)=2
> > > NUMBER(PAGE_OFFLINE_MAPCOUNT_VALUE)=-257
> > > ===============> NUMBER(phys_base)=16437477376
> > > SYMBOL(init_top_pgt)=ffffffffbd80a000
> > > NUMBER(pgtable_l5_enabled)=0
> > > SYMBOL(node_data)=ffffffffbda0ad20
> > > LENGTH(node_data)=1024
> > > ===============> KERNELOFFSET=3b600000
> > > NUMBER(KERNEL_IMAGE_SIZE)=1073741824
> > > NUMBER(sme_mask)=0
> > > CRASHTIME=1583350919
> > >
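For reference, the ELF notes in a dumpfile can also be inspected directly
with readelf (just a sketch; recent binutils decode the VMCOREINFO note
body as text, while older versions may only show the note name and size):

   $ readelf -n vmcore      # "vmcore" here stands for whatever the dumpfile is named

If the note body isn't decoded, the strings/grep approach used later in
this thread works just as well.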
> > > But in your Azure-generated dumpfile, I note that each cpu's NT_PRSTATUS
> > > note contains junk data, and while it does have a VMCOREINFO note, it
> > > contains this:
> > >
> > > Elf64_Nhdr:
> > > n_namesz: 11 ("VMCOREINFO")
> > > n_descsz: 42
> > > n_type: 0 (unused)
> > > FAKE1=IGNORE1
> > > FAKE2=IGNORE2
> > > FAKE3=IGNORE3
> > >
> > > So that's why you need to pass in the two arguments.
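For reference, the resulting invocation looks roughly like this, using the
example offsets from the VMCOREINFO dump above (a sketch only -- the values
must be the ones from the crashed system, phys_base is an x86_64 --machdep
option, and the exact option syntax is in the crash(8) man page):

   $ crash vmlinux vmcore --kaslr 0x3b600000 --machdep phys_base=0x3d3c00000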
> > >
> > > Now, the crash utility should be able to be brought up successfully
> > > on a live system without passing the arguments. And once you've done
> > > that, you could get the values like this:
> > >
> > > crash> help -m | grep phys_base
> > > phys_base: 3d3c00000
> > > crash> help -k | grep relocate
> > > relocate: ffffffffc4a00000 (KASLR offset: 3b600000 / 950MB)
> > > crash>
> > >
> > > But since they change with each reboot, you would have to capture them
> > > while running on the live system, and save them somewhere for a
> > > subsequent crash. So that goes back to my question -- how did you get
> > > the numbers that you used?
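One way to capture and stash them ahead of time might be along these lines
(just a sketch; the output file name is made up, and crash needs the running
kernel's debuginfo vmlinux to start on the live system):

   $ printf 'help -m\nhelp -k\nquit\n' | crash -s > /var/tmp/kaslr-info.$(uname -r).txt
   $ grep -e 'phys_base:' -e 'relocate:' /var/tmp/kaslr-info.$(uname -r).txt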
> >
> > The numbers I got by simply grepping through the coredump strings:
> > $ strings vm1_numa_4gb_5cpu.coredump | grep -v strings | grep
> > 'KERNELOFFSET=\|NUMBER(phys_base)='
> >
> > The machine is still running, and I cross-verified those numbers with
> > crash; they were correct.
> >
> > crash> p vmcoreinfo_data+1600
> > $1 = (unsigned char *) 0xffff917d3cde1640
> >
"poison)=22\nNUMBER(PG_head_mask)=32768\nNUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)=-128\nNUMBER(HUGETLB_PAGE_DTOR)=2\nNUMBER(phys_base)=4355784704\nSYMBOL(init_top_pgt)=ffffffff82a0a000\nSYMBOL(node_data)=ffffffff82c5d780\nLENGTH(node_data)=1024\nKERNELOFFSET=600000\nNUMBER"...
> >
> > Now it appears to me that something is wrong in the Azure-generated dump file.
>
> It seems to have something to do with numa:
>
> santosh@u1804lts:~$ cat /proc/sys/kernel/numa_balancing
> 1
>
> HyperV VM with 1 numa node (numa_balancing = 0) -- Linux with nokaslr
> -- vm2core -- ELF coredump -- crash tool -- Ok
> HyperV VM with 1 numa node (numa_balancing = 0) -- Linux with kaslr --
> vm2core -- ELF coredump -- crash tool -- Ok
> HyperV VM with 2 numa nodes (numa_balancing = 1) -- Linux with nokaslr
> -- vm2core -- ELF coredump -- crash tool -- Ok
> HyperV VM with 2 numa nodes (numa_balancing = 1) -- Linux with kaslr
> -- vm2core -- ELF coredump -- crash tool -- Not ok
>
> Do we have to specify the numa topology somehow to the crash tool, or
> should it already be handled in the coredump file?
Definitely not. The crash utility is only interested in:
1. kernel virtual address values -- which KASLR modifies from the values
compiled into the vmlinux file,
2. translating those kernel virtual addresses into physical addresses, and
3. accessing those physical addresses from the memory source.
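As a rough illustration of points 1 and 2, using the example VMCOREINFO
values quoted above and the usual x86_64 kernel-text mapping:

   randomized _stext = compiled _stext + KERNELOFFSET
                     = 0xffffffff81000000 + 0x3b600000 = 0xffffffffbc600000
   phys(_stext)      = randomized _stext - __START_KERNEL_map + phys_base
                     = 0xffffffffbc600000 - 0xffffffff80000000 + 0x3d3c00000
                     = 0x410200000

A wrong KASLR offset breaks the first step; a wrong phys_base breaks the second.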
As I understand it, numa_balancing is concerned with user-space virtual
address mapping, where the kernel may re-map an underlying physical
address from one NUMA node to another. User-space memory is never
accessed by the crash utility unless requested by a run-time command
that explicitly specifies it.
Dave
Hi Dave,
I did some more experiments and found that it has nothing to do with numa.
I also found that the issue gets resolved when I insert
"SYMBOL(_stext)=" into vmcoreinfo.
Meaning that sometimes crash needs the _stext value along with kaslr & phys_base.
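For anyone hitting the same thing: the randomized _stext value can be read
from the live system ahead of time, e.g. (a sketch; needs root, otherwise
kallsyms prints zeroed addresses):

   $ sudo grep ' _stext$' /proc/kallsyms
   ffffffffbc600000 T _stext

and it goes into vmcoreinfo in the same SYMBOL(_stext)=<address> form shown
in the dump above.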
Thanks,
Santosh
--
Crash-utility mailing list
Crash-utility(a)redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility