On Mon, Aug 19, 2024 at 3:19 PM Tao Liu <ltao@redhat.com> wrote:
Hi Lianbo,

Thanks for the patch. I have run the tests and found no regressions, so ack.


Thank you for helping with the test, Tao.

Applied:
https://github.com/crash-utility/crash/commit/79b93ecb2e72ec211918c07b0a857b11a18726fc

Thanks
Lianbo

Thanks,
Tao Liu

On Fri, Aug 16, 2024 at 2:30 PM Lianbo Jiang <lijiang@redhat.com> wrote:
>
> In production environments, some vmcores are incomplete, for example
> with a partial header or corrupted data. When the crash tool attempts
> to parse such a vmcore, it may fail as below:
>
>   $ ./crash --osrelease vmcore
>   Bus error (core dumped)
>
> or
>
>   $ crash vmlinux vmcore
>   ...
>   Bus error (core dumped)
>  $
>
> GDB call trace:
>
>   $ gdb /home/lijiang/src/crash/crash /tmp/core.126301
>   Core was generated by `./crash --osrelease /home/lijiang/src/39317/vmcore'.
>   Program terminated with signal SIGBUS, Bus error.
>   #0  __memcpy_evex_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:831
>   831             LOAD_ONE_SET((%rsi), PAGE_SIZE, %VMM(4), %VMM(5), %VMM(6), %VMM(7))
>   (gdb) bt
>   #0  __memcpy_evex_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:831
>   #1  0x0000000000651096 in read_dump_header (file=0x7ffc59ddff5f "/home/lijiang/src/39317/vmcore") at diskdump.c:820
>   #2  0x0000000000651cf3 in is_diskdump (file=0x7ffc59ddff5f "/home/lijiang/src/39317/vmcore") at diskdump.c:1042
>   #3  0x0000000000502ac9 in get_osrelease (dumpfile=0x7ffc59ddff5f "/home/lijiang/src/39317/vmcore") at main.c:1938
>   #4  0x00000000004fb2e8 in main (argc=3, argv=0x7ffc59dde3a8) at main.c:271
>   (gdb) frame 1
>   #1  0x0000000000651096 in read_dump_header (file=0x7ffc59ddff5f "/home/lijiang/src/39317/vmcore") at diskdump.c:820
>   820                   memcpy(dd->dumpable_bitmap, dd->bitmap + bitmap_len/2,
>
> This can happen when accessing a page of the mapped buffer that lies
> beyond the end of the underlying file (see the mmap() man page).
>
> Add a check to avoid such issues where possible, although it cannot
> guarantee correct behavior in every extreme situation.
>
> Fixes: a3344239743b ("diskdump: use mmap/madvise to improve the start-up")
> Reported-by: Buland Kumar Singh <bsingh@redhat.com>
> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
> ---
>  diskdump.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/diskdump.c b/diskdump.c
> index 1f7118cacfc6..ce3cbb7b12dd 100644
> --- a/diskdump.c
> +++ b/diskdump.c
> @@ -805,6 +805,22 @@ restart:
>                         goto err;
>                 }
>         } else {
> +               struct stat sbuf;
> +               if (fstat(dd->dfd, &sbuf) != 0) {
> +                       error(INFO, "Cannot fstat the dump file\n");
> +                       goto err;
> +               }
> +
> +               /*
> +                * For memory regions mapped with the mmap(), attempts access to
> +                * a page of the buffer that lies beyond the end of the mapped file,
> +                * which may cause SIGBUS(see the mmap() man page).
> +                */
> +               if (bitmap_len + offset > sbuf.st_size) {
> +                       error(INFO, "Mmap: Beyond the end of mapped file, corrupted?\n");
> +                       goto err;
> +               }
> +
>                 dd->bitmap = mmap(NULL, bitmap_len, PROT_READ,
>                                         MAP_SHARED, dd->dfd, offset);
>                 if (dd->bitmap == MAP_FAILED)
> --
> 2.45.1
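
For anyone who wants to try the idea outside of crash, below is a minimal
standalone sketch of the same kind of pre-mmap() size check. The helper name
range_within_file() is hypothetical and is not part of crash. It assumes the
caller already has an open file descriptor plus the offset and length it
intends to map. Unlike the check in the patch, it compares by subtraction, so
the sum of offset and length is never computed; that is a choice made for
this sketch only, not something the patch does.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of crash).
 *
 * Returns  1 if the byte range [offset, offset + len) lies entirely
 *            inside the file behind fd,
 *          0 if any part of the range is beyond the end of the file,
 *         -1 if fstat() fails.
 */
static int range_within_file(int fd, off_t offset, off_t len)
{
	struct stat sbuf;

	if (fstat(fd, &sbuf) != 0)
		return -1;

	if (offset < 0 || len < 0)
		return 0;

	/* Subtract instead of adding, so offset + len cannot overflow. */
	if (offset > sbuf.st_size || len > sbuf.st_size - offset)
		return 0;

	return 1;
}
```

In terms of the patch, this would roughly correspond to calling
range_within_file(dd->dfd, offset, bitmap_len) before the mmap() call and
taking the error path when the result is not 1. As the commit message notes,
a check like this only catches truncation that is visible in the file size.
It does not help if the file shrinks after the check or if the header fields
themselves are corrupted in other ways.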