----- Original Message -----
> On Thu, 2019-04-18 at 15:02 +0100, Pierguido Lambri wrote:
> > Hello,
> >
> > Today, while I was looking into a vmcore, I suddenly got the message
> > in $SUBJECT.
> > It started after I searched the process stack pages (search -t),
> > and every command I ran afterwards kept printing that message.
> > For example:
> >
> > $ retrace-server-interact 603967269 crash
> > ...
> > crash> search -t ffff88040a0d5280
> >
> > search: invalid list entry: 0
> >
> > search: invalid list entry: 0
> >
> > search: invalid list entry: 0
> > PID: 606 TASK: ffff88082d226eb0 CPU: 5 COMMAND: "xfsaild/dm-0"
> > ffff88083ff5b948: ffff88040a0d5280
> > ffff88083ff5b990: ffff88040a0d5280
> > ffff88083ff5baa8: ffff88040a0d5280
> > ffff88083ff5baf0: ffff88040a0d5280
> > ffff88083ff5bcf0: ffff88040a0d5280
> > ffff88083ff5bd38: ffff88040a0d5280
> > ffff88083ff5bd98: ffff88040a0d5280
> >
> >
> > WARNING: malloc/free mismatch (29/32)
> >
> > crash> ps -m | grep UN
> > [ 0 00:00:00.146] [UN] PID: 1811 TASK: ffff880c17bd1fa0 CPU: 1 COMMAND: "cp"
> > WARNING: malloc/free mismatch (29/32)
> >
> > I guess this comes from a possibly corrupted vmcore (I have only seen
> > it with this vmcore),
> > but I wonder why every new command keeps returning the same message.
> >
> > Thanks,
> >
> > Pier
> >
>
> FWIW, I just pulled this up after plambri pinged me. This is the
> backtrace that is being hit though I've not dug in more:
>
> Breakpoint 3, do_list (ld=0x7ffffffea6c0) at tools.c:3820
> 3820 error(INFO, "\ninvalid list entry: 0\n");
> (gdb) list
> 3815 return -1;
> 3816 }
> 3817
> 3818 if (next == 0) {
> 3819 if (ld->flags & LIST_HEAD_FORMAT) {
> 3820 error(INFO, "\ninvalid list entry: 0\n");
> 3821 if (close_hq_on_return)
> 3822 hq_close();
> 3823 return -1;
> 3824 }
> (gdb) bt
> #0 do_list (ld=0x7ffffffea6c0) at tools.c:3820
> #1 0x000000000047ec82 in dump_vmap_area (vi=0x7ffffffed0d0) at memory.c:8724
> #2 dump_vmlist (vi=0x7ffffffed0d0) at memory.c:8590
> #3 0x000000000047f3eb in last_vmalloc_address () at memory.c:16792
> #4 0x0000000000515e6b in x86_64_get_kvaddr_ranges (vrp=0x7fffffffd340) at x86_64.c:8706
> #5 0x000000000049c6ae in cmd_search () at memory.c:13988
> #6 0x0000000000465f9c in exec_command () at main.c:879
> #7 0x00000000004661ca in main_loop () at main.c:826
> #8 0x00000000006b21a3 in captured_command_loop (data=<value optimized out>) at main.c:258
> #9 0x00000000006b0a8b in catch_errors (func=0x6b2190 <captured_command_loop>, func_args=0x0, errstring=0x90c106 "", mask=6) at exceptions.c:557
> #10 0x00000000006b3076 in captured_main (data=<value optimized out>) at main.c:1064
> #11 0x00000000006b0a8b in catch_errors (func=0x6b22b0 <captured_main>, func_args=0x7fffffffe2e0, errstring=0x90c106 "", mask=6) at exceptions.c:557
> #12 0x00000000006b1fa4 in gdb_main (args=<value optimized out>) at main.c:1079
> #13 0x00000000006b1fde in gdb_main_entry (argc=<value optimized out>, argv=<value optimized out>) at main.c:1099
> #14 0x0000000000467030 in main (argc=3, argv=0x7fffffffe458) at main.c:707
Hmmm, the vmap_area list is a list_head type list, so there should never be
a NULL "next" pointer.
I'm guessing that "kmem -v" also fails? The last vmap_area entry should point
back to the global "vmap_area_list" list header, for example:
crash> kmem -v | tail
ffff96e7ecaaca80 ffff96e54c89c400 ffffffffc0e54000 - ffffffffc0e5a000 24576
ffff96e757ffe380 ffff96e4be98f3c0 ffffffffc0e5d000 - ffffffffc0e6d000 65536
ffff96e467b33400 ffff96e6a3ae1a00 ffffffffc0e6d000 - ffffffffc0e73000 24576
ffff96e85cf4e600 ffff96e752c52b40 ffffffffc0e77000 - ffffffffc0e7c000 20480
ffff96e85cf4e380 ffff96e5506c6c00 ffffffffc0e7c000 - ffffffffc0e81000 20480
ffff96e802baa500 ffff96e5506c69c0 ffffffffc0e81000 - ffffffffc0e86000 20480
ffff96e802baac00 ffff96e5506c6cc0 ffffffffc0e86000 - ffffffffc0e8c000 24576
ffff96e574196f80 ffff96e55ffd6c80 ffffffffc0e90000 - ffffffffc0e95000 20480
ffff96e574196680 ffff96e55ffd6880 ffffffffc0e95000 - ffffffffc0e9a000 20480
ffff96e87c222800 ffff96e5496ca680 ffffffffc0e9a000 - ffffffffc0ea4000 40960
crash> vmap_area ffff96e87c222800
struct vmap_area {
va_start = 18446744072651120640,
va_end = 18446744072651161600,
flags = 4,
rb_node = {
__rb_parent_color = 18446628510972342169,
rb_right = 0x0,
rb_left = 0xffff96e574196698
},
list = {
next = 0xffffffffae69af90,
prev = 0xffff96e5741966b0
},
purge_list = {
next = 0x0,
prev = 0xdead000000000200
},
vm = 0xffff96e5496ca680,
callback_head = {
next = 0x0,
func = 0xffff96e71d51aa00
}
}
crash> sym 0xffffffffae69af90
ffffffffae69af90 (D) vmap_area_list
crash>
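
To illustrate the invariant above (a list_head list is circular, so the
last entry's "next" points back to the list header and is never NULL),
here is a minimal standalone sketch. It is not crash's do_list(); the
names walk_list and max_entries are made up for illustration, and the
struct is only a stand-in for the kernel's struct list_head.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct list_head (illustration only). */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/*
 * Walk a circular list starting at its header.
 *
 * Returns the number of entries when the walk arrives back at head,
 *   -1 if a NULL next pointer is found (the "invalid list entry: 0" case),
 *   -2 if more than max_entries are visited without reaching head
 *      (hypothetical guard against a loop that bypasses the header).
 */
static long walk_list(const struct list_head *head, long max_entries)
{
    const struct list_head *cur;
    long count = 0;

    assert(head != NULL);

    for (cur = head->next; cur != head; cur = cur->next) {
        if (cur == NULL)
            return -1;      /* a healthy list_head list never hits this */
        if (++count > max_entries)
            return -2;
    }
    return count;
}
```

On an intact list the walk ends when it reaches the header again, which
is what the vmap_area_list symbol lookup above shows for the last entry.
A return of -1 corresponds to the condition being reported in this
thread.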
Dave
Yeah kmem -v fails as well:
crash> kmem -v
kmem: invalid list entry: 0
WARNING: malloc/free mismatch (29/30)
crash>
There's no indication of an error when crash loads though, only after
running these commands. Do you think this is a damaged vmcore where the
damage is not obvious?