----- Original Message -----
> -----Original Message-----
> From: Dave Anderson <anderson(a)redhat.com>
> Sent: Wednesday, June 5, 2019 2:56 PM
> To: Nuno Das Neves <Nuno.Das(a)microsoft.com>
> Cc: Nuno Das Neves <Nuno.Das(a)microsoft.com>
> Subject: Fwd: [Crash-utility] [PATCH] Fix unsigned signed comparison
> causing segfault for small VMCOREINFO notes
>
>
>
> ----- Forwarded Message -----
> From: "Dave Anderson" <anderson(a)redhat.com>
> To: "Discussion list for crash utility usage, maintenance and
development"
> <crash-utility(a)redhat.com>
> Sent: Wednesday, June 5, 2019 4:18:02 PM
> Subject: Re: [Crash-utility] [PATCH] Fix unsigned signed comparison causing
> segfault for small VMCOREINFO notes
>
>
>
> ----- Original Message -----
> > Hi,
> >
> > This is a fix for a signed/unsigned comparison bug in vmcoreinfo_read_string.
> > The bug causes a segmentation fault if size_vmcoreinfo + 1 is smaller than
> > the length of the key string passed in.
>
> I suppose that's true, but can you describe the instance where that actually
happened?
I'm debugging some issues with vm2core
(
https://github.com/Azure/azure-linux-utils/tree/master/vm2core), a tool for
converting Hyper-V snapshots to elf core files that can be analyzed with Crash.
There was a problem where versions of Crash newer than 7.1.5 crashed with elf
files generated by vm2core due to this issue.
> Can you show the actual note contents as shown by "help -D"?
Elf64_Nhdr:
n_namesz: 11 ("VMCOREINFO")
n_descsz: 10
n_type: 0 (unused)
VMCOREINFO
So it seems the note doesn't really contain valid data as defined in
Documentation/kdump/vmcoreinfo.txt.
I've done some additional testing and it appears this note isn't needed by
Crash at all - I can simply patch vm2core so it doesn't add the note.
So, I suppose this fix isn't required to solve my particular issue.
I don't think you should avoid adding the VMCOREINFO note contents from the kernel.
While the vast majority of VMCOREINFO items are only used by makedumpfile(8),
several items are used by the crash utility. For example, the x86_64 requires
things like the kernel's phys_offset value (at least if it's non-zero), either
the relocated "_stext" symbol value or KERNELOFFSET value for KASLR kernels,
possibly the KERNEL_IMAGE_SIZE, and to be able to recognize 5-level page tables.
However, could the VMCOREINFO note legitimately contain a single field e.g.
"PAGE_SIZE=4096"?
If so, this is just 14 characters; 14 + 1 <
strlen("NUMBER(KERNEL_IMAGE_SIZE)") - this string is used as the argument to
vmcoreinfo_read_string on line 202 of x86_64.c.
No, it would never contains a single field. Here's a typical dump on a 5.0.0 x86_64
kernel:
crash> help -D
... [ cut ] ...
n_namesz: 11 ("VMCOREINFO")
n_descsz: 1980
n_type: 0 (unused)
OSRELEASE=5.0.0+
PAGESIZE=4096
SYMBOL(init_uts_ns)=ffffffffa9615540
SYMBOL(node_online_map)=ffffffffa976e068
SYMBOL(swapper_pg_dir)=ffffffffa960e000
SYMBOL(_stext)=ffffffffa8200000
SYMBOL(vmap_area_list)=ffffffffa9666230
SYMBOL(mem_section)=ffff90adbbff8000
LENGTH(mem_section)=2048
SIZE(mem_section)=16
OFFSET(mem_section.section_mem_map)=0
SIZE(page)=64
SIZE(pglist_data)=14016
SIZE(zone)=1280
SIZE(free_area)=72
SIZE(list_head)=16
SIZE(nodemask_t)=8
OFFSET(page.flags)=0
OFFSET(page._refcount)=52
OFFSET(page.mapping)=24
OFFSET(page.lru)=8
OFFSET(page._mapcount)=48
OFFSET(page.private)=40
OFFSET(page.compound_dtor)=16
OFFSET(page.compound_order)=17
OFFSET(page.compound_head)=8
OFFSET(pglist_data.node_zones)=0
OFFSET(pglist_data.nr_zones)=13344
OFFSET(pglist_data.node_start_pfn)=13352
OFFSET(pglist_data.node_spanned_pages)=13368
OFFSET(pglist_data.node_id)=13376
OFFSET(zone.free_area)=192
OFFSET(zone.vm_stat)=1088
OFFSET(zone.spanned_pages)=112
OFFSET(free_area.free_list)=0
OFFSET(list_head.next)=0
OFFSET(list_head.prev)=8
OFFSET(vmap_area.va_start)=0
OFFSET(vmap_area.list)=48
LENGTH(zone.free_area)=11
SYMBOL(log_buf)=ffffffffa964b250
SYMBOL(log_buf_len)=ffffffffa964b24c
SYMBOL(log_first_idx)=ffffffffa9d71668
SYMBOL(clear_idx)=ffffffffa9d71634
SYMBOL(log_next_idx)=ffffffffa9d71658
SIZE(printk_log)=16
OFFSET(printk_log.ts_nsec)=0
OFFSET(printk_log.len)=8
OFFSET(printk_log.text_len)=10
OFFSET(printk_log.dict_len)=12
LENGTH(free_area.free_list)=4
NUMBER(NR_FREE_PAGES)=0
NUMBER(PG_lru)=4
NUMBER(PG_private)=13
NUMBER(PG_swapcache)=10
NUMBER(PG_swapbacked)=19
NUMBER(PG_slab)=9
NUMBER(PG_head_mask)=65536
NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)=-129
NUMBER(HUGETLB_PAGE_DTOR)=2
NUMBER(PAGE_OFFLINE_MAPCOUNT_VALUE)=-257
NUMBER(phys_base)=-497025024
SYMBOL(init_top_pgt)=ffffffffa960e000
NUMBER(pgtable_l5_enabled)=0
SYMBOL(node_data)=ffffffffa976d680
LENGTH(node_data)=64
KERNELOFFSET=27200000
NUMBER(KERNEL_IMAGE_SIZE)=1073741824
NUMBER(sme_mask)=0
CRASHTIME=1555324290
I would guess the problem is how much awareness your vm2core facility has
concerning the target kernel, e.g., can it can find the physical
memory containing the VMCOREINFO data. On the target system,
the physical address and a page-granularity size can be located here:
$ cat /sys/kernel/vmcoreinfo
2a21197c0 1000
$
The kernel symbols of the data is "vmcoreinfo_data", and the actual
size of the note is "vmcoreinfo_size". Or in 4.19 and later kernels,
/proc/kcore now has the VMCOREINFO note appended.
The KVM developers went through the same process as you are, and they
eventually implemented a mechanism where they pass the vmcoreinfo data from
the target kernel to the KVM host that is doing a "virsh dump".
In any case, even though it's theoretically impossible to have a zero-length
or very small VMCOREINFO note, your patch does makes logical sense, so I will
apply it.
Thanks,
Dave
Regards,
Nuno
>
> Thanks,
> Dave
>
>
> >
> > Signed-off-by: Nuno Das Neves <nudasnev(a)microsoft.com>
> > ---
> > netdump.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/netdump.c b/netdump.c
> > index 40f9cde..d257ecd 100644
> > --- a/netdump.c
> > +++ b/netdump.c
> > @@ -1838,7 +1838,7 @@ vmcoreinfo_read_string(const char *key)
> > return NULL;
> >
> > /* the '+ 1' is the equal sign */
> > - for (i = 0; i < (size_vmcoreinfo - key_length + 1); i++) {
> > + for (i = 0; i < (int)(size_vmcoreinfo - key_length + 1); i++) {
> > /*
> > * We must also check if we're at the beginning of VMCOREINFO
> > * or the separating newline is there, and of course if we
> > --
> > 1.8.3.1
> >
> > --
> > Crash-utility mailing list
> > Crash-utility(a)redhat.com
> >
https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.red...
>
utility&data=02%7C01%7CNuno.Das%40microsoft.com%7C3a91428c54b34ed7855d08d6ea008afe%7C72f988bf86f141af91ab2d7cd0
>
11db47%7C1%7C0%7C636953685382744097&sdata=BVID2b0QYkDmBip0bzPjX%2Fl4vltJtf78JrTE7pnqSIM%3D&reserved=0
> >