> I suspect that it's a problem with either the --kaslr offset and/or
> the phys_base value that you have used.
Is there a method to know or print the kaslr offset & phys_base in a running Linux system?
They are normally passed in the VMCOREINFO data that is contained in an ELF PT_NOTE
in the dumpfile header. For example, here's a dump of the normal VMCOREINFO data,
where the phys_base and KASLR offsets are down near the bottom:
OSRELEASE=4.18.0-185.el8.x86_64
PAGESIZE=4096
SYMBOL(init_uts_ns)=ffffffffbd812540
SYMBOL(node_online_map)=ffffffffbda0f520
SYMBOL(swapper_pg_dir)=ffffffffbd80a000
SYMBOL(_stext)=ffffffffbc600000
SYMBOL(vmap_area_list)=ffffffffbd8d78b0
SYMBOL(mem_section)=ffff956a3ffd2000
LENGTH(mem_section)=2048
SIZE(mem_section)=16
OFFSET(mem_section.section_mem_map)=0
SIZE(page)=64
SIZE(pglist_data)=171968
SIZE(zone)=1472
SIZE(free_area)=88
SIZE(list_head)=16
SIZE(nodemask_t)=128
OFFSET(page.flags)=0
OFFSET(page._refcount)=52
OFFSET(page.mapping)=24
OFFSET(page.lru)=8
OFFSET(page._mapcount)=48
OFFSET(page.private)=40
OFFSET(page.compound_dtor)=16
OFFSET(page.compound_order)=17
OFFSET(page.compound_head)=8
OFFSET(pglist_data.node_zones)=0
OFFSET(pglist_data.nr_zones)=171232
OFFSET(pglist_data.node_start_pfn)=171240
OFFSET(pglist_data.node_spanned_pages)=171256
OFFSET(pglist_data.node_id)=171264
OFFSET(zone.free_area)=192
OFFSET(zone.vm_stat)=1296
OFFSET(zone.spanned_pages)=112
OFFSET(free_area.free_list)=0
OFFSET(list_head.next)=0
OFFSET(list_head.prev)=8
OFFSET(vmap_area.va_start)=0
OFFSET(vmap_area.list)=48
LENGTH(zone.free_area)=11
SYMBOL(log_buf)=ffffffffbd85b140
SYMBOL(log_buf_len)=ffffffffbd85b13c
SYMBOL(log_first_idx)=ffffffffbe319778
SYMBOL(clear_idx)=ffffffffbe319744
SYMBOL(log_next_idx)=ffffffffbe319768
SIZE(printk_log)=16
OFFSET(printk_log.ts_nsec)=0
OFFSET(printk_log.len)=8
OFFSET(printk_log.text_len)=10
OFFSET(printk_log.dict_len)=12
LENGTH(free_area.free_list)=5
NUMBER(NR_FREE_PAGES)=0
NUMBER(PG_lru)=5
NUMBER(PG_private)=12
NUMBER(PG_swapcache)=9
NUMBER(PG_swapbacked)=18
NUMBER(PG_slab)=8
NUMBER(PG_hwpoison)=22
NUMBER(PG_head_mask)=32768
NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)=-129
NUMBER(HUGETLB_PAGE_DTOR)=2
NUMBER(PAGE_OFFLINE_MAPCOUNT_VALUE)=-257
===============> NUMBER(phys_base)=16437477376
SYMBOL(init_top_pgt)=ffffffffbd80a000
NUMBER(pgtable_l5_enabled)=0
SYMBOL(node_data)=ffffffffbda0ad20
LENGTH(node_data)=1024
===============> KERNELOFFSET=3b600000
NUMBER(KERNEL_IMAGE_SIZE)=1073741824
NUMBER(sme_mask)=0
CRASHTIME=1583350919
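As a rough illustration of how those two fields can be pulled out of a VMCOREINFO text dump like the one above, here is a minimal Python sketch. The function name parse_vmcoreinfo is hypothetical, and the sketch assumes NUMBER() values are decimal and KERNELOFFSET is hex without a 0x prefix, which is consistent with the values in this dump:

```python
def parse_vmcoreinfo(text):
    """Return a dict mapping each VMCOREINFO key to its string value."""
    info = {}
    for line in text.splitlines():
        # Drop any leading "=====> " emphasis marker like the one used above.
        line = line.strip().lstrip("=> ").strip()
        key, sep, value = line.partition("=")
        if sep:
            info[key] = value
    return info


sample = """\
SYMBOL(_stext)=ffffffffbc600000
===============> NUMBER(phys_base)=16437477376
===============> KERNELOFFSET=3b600000
"""

info = parse_vmcoreinfo(sample)
phys_base = int(info["NUMBER(phys_base)"])      # assumed decimal
kaslr_offset = int(info["KERNELOFFSET"], 16)    # assumed hex, no 0x prefix

print(f"phys_base:    {phys_base:x}")
print(f"KASLR offset: {kaslr_offset:x} / {kaslr_offset >> 20}MB")
```

With the values from this dump, phys_base comes out as 3d3c00000 in hex and the KASLR offset as 3b600000 (950MB).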
But in your Azure-generated dumpfile, I note that each cpu's NT_PRSTATUS note
contains junk data, and while it does have a VMCOREINFO note, it contains this:
Elf64_Nhdr:
n_namesz: 11 ("VMCOREINFO")
n_descsz: 42
n_type: 0 (unused)
FAKE1=IGNORE1
FAKE2=IGNORE2
FAKE3=IGNORE3
So that's why you need to pass in the two arguments.
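To make the point concrete, a small Python sketch along these lines could check whether a VMCOREINFO note carries the two fields. The helper name missing_keys is hypothetical, and the reconstruction of the note body assumes each of the three lines ends in a newline, which happens to add up to the 42-byte n_descsz shown above:

```python
REQUIRED = ("NUMBER(phys_base)", "KERNELOFFSET")


def missing_keys(note_text):
    """Return the required VMCOREINFO keys that are absent from note_text."""
    keys = {line.partition("=")[0]
            for line in note_text.splitlines() if "=" in line}
    return [k for k in REQUIRED if k not in keys]


# Assumed byte layout of the Azure note body: three newline-terminated lines.
azure_note = "FAKE1=IGNORE1\nFAKE2=IGNORE2\nFAKE3=IGNORE3\n"

print(len(azure_note))           # length of the reconstructed note body
print(missing_keys(azure_note))  # both required keys are missing
```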
Now, the crash utility should be able to be brought up successfully
on a live system without passing the arguments. And once you've done
that, you could get the values like this:
crash> help -m | grep phys_base
phys_base: 3d3c00000
crash> help -k | grep relocate
relocate: ffffffffc4a00000 (KASLR offset: 3b600000 / 950MB)
crash>
But since they change with each reboot, you would have to capture them
while running on the live system, and save them somewhere for a subsequent
crash. So that goes back to my question -- how did you get the numbers
that you used?
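If you do capture them on the live system, something like the following Python sketch could turn the saved "help -m" and "help -k" output into arguments for a later session. The function names are hypothetical, and the exact argument spelling ("--kaslr <offset>" and "-m phys_base=<value>") is my assumption of the usual form, so check it against the crash help output for your version:

```python
import re


def parse_crash_values(help_m_text, help_k_text):
    """Extract (phys_base, kaslr_offset) as ints from saved crash output."""
    phys = re.search(r"phys_base:\s*([0-9a-f]+)", help_m_text)
    kaslr = re.search(r"KASLR offset:\s*([0-9a-f]+)", help_k_text)
    if not phys or not kaslr:
        raise ValueError("phys_base or KASLR offset not found in the output")
    return int(phys.group(1), 16), int(kaslr.group(1), 16)


def crash_args(phys_base, kaslr_offset):
    """Build the argument list (assumed spelling) for a later crash session."""
    return ["--kaslr", f"0x{kaslr_offset:x}",
            "-m", f"phys_base=0x{phys_base:x}"]


help_m = "phys_base: 3d3c00000"
help_k = "relocate: ffffffffc4a00000  (KASLR offset: 3b600000 / 950MB)"

phys_base, kaslr_offset = parse_crash_values(help_m, help_k)
print(" ".join(crash_args(phys_base, kaslr_offset)))

# In the output above, "relocate" is the negated KASLR offset as a 64-bit value.
print(f"{(-kaslr_offset) & (2**64 - 1):x}")
```

The values are only valid for the boot they were captured from, so they would need to be stored alongside the dumpfile they belong to.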
Dave