February 2024 - Crash-utility - Crash Utility List Archives

[RFC PATCH 0/9] Add feature to validate page descriptor table in kdump-compressed format

by HATAYAMA Daisuke

I've made an RFC patch set that makes the sanity check of the page descriptor table in the kdump-compressed format stricter.

This work arose from a past issue in which a produced crash dump file was broken not only in its data segment but also in its headers, including the page descriptor table. I explained this briefly on crash-devel in the following thread:

https://listman.redhat.com/archives/crash-utility/2023-September/010957.html

In that issue, I was never able to find the root cause, because by the time I began investigating, the system on which the problem had been reproduced was no longer available.

This patch set aims to let us figure out this kind of issue more quickly and in more detail using the crash utility. The code is based on the tool I wrote to analyze the broken crash dump from that issue.

This is still incomplete; for example, the sanity check does not yet support split dump files in the kdump-compressed format, or old header versions up to 5.

I would appreciate any comments on this RFC version.

HATAYAMA Daisuke (9):
  diskdump: Add stat object in diskdump_data
  diskdump: Add function sanity_check_page_desc() that sanity checks an entry of page descriptor table
  diskdump: Add function check_kdump_headers() that validates page descriptor table
  defs.h: Introduce flag VALIDATE_KDUMP_HEADERS
  diskdump, main: Add --validate_kdump_headers command-line option
  diskdump: Make sanity check in cache_page() strict
  help: Add description of --validate_kdump_headers command-line option
  man: Add description of --validate_kdump_headers command-line option
  diskdump, debug: Print elapsed time consumed in validation of page descriptor table

 crash.8    |  4 +++
 defs.h     |  1 +
 diskdump.c | 74 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 help.c     |  4 +++
 main.c     |  5 ++++
 5 files changed, 87 insertions(+), 1 deletion(-)

--
2.43.1
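To illustrate the kind of per-entry check the cover letter describes, here is a minimal sketch that validates one page descriptor against the block size and the file size. It is not the code from the patch set. The struct only approximates the page descriptor layout of the kdump-compressed format, and the helper name `page_desc_is_sane`, its parameters, and the flag mask values are all assumptions made for this example.

```c
#include <stdint.h>
#include <stdbool.h>

/* Approximation of a kdump-compressed page descriptor (assumed layout). */
struct page_desc_sketch {
	int64_t  offset;      /* file offset of the page data */
	uint32_t size;        /* stored (possibly compressed) size in bytes */
	uint32_t flags;       /* per-page compression flags */
	uint64_t page_flags;  /* page flags, not checked in this sketch */
};

/* Assumed values for the zlib, lzo, snappy and zstd compression bits. */
#define SKETCH_KNOWN_FLAGS (0x1u | 0x2u | 0x4u | 0x20u)

/*
 * Returns true when the entry could describe a real page:
 *  - the size is non-zero and no larger than one block,
 *  - the data starts after the headers and ends inside the file,
 *  - no flag bits outside the assumed known set are present.
 */
static bool
page_desc_is_sane(const struct page_desc_sketch *pd, uint32_t block_size,
		  int64_t data_start, int64_t file_size)
{
	if (pd->size == 0 || pd->size > block_size)
		return false;
	if (pd->offset < data_start ||
	    pd->offset > file_size - (int64_t)pd->size)
		return false;
	if (pd->flags & ~SKETCH_KNOWN_FLAGS)
		return false;
	return true;
}
```

A validator in this style would walk the whole descriptor table once and count the failures, rather than stopping at the first bad entry, so that the extent of the corruption becomes visible.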


[PATCH v4] arm64: Add vmemmap support

by Huang Shijie

If the kernel exports vmemmap, then crash can use that symbol to optimize access; vmemmap is just an array of page structs, after all.

This patch:
1.) Gets "vmemmap" from the vmcore file. If "vmemmap" can be used, we implement arm64_vmemmap_is_page_ptr and set machdep->is_page_ptr to it.
2.) Implements the fast page_to_pfn code in arm64_vmemmap_is_page_ptr.
3.) Dumps it in "help -m".

Test result:
Without this patch:
  #files -p xxx > /dev/null
  (xxx is the inode of vmlinux, which is 441M)
  This took about 185 seconds.

With this patch:
  #files -p xxx > /dev/null
  (xxx is the inode of vmlinux, which is 441M)
  This took 3 seconds.

Signed-off-by: Huang Shijie <shijie(a)os.amperecomputing.com>
---
v3 --> v4:
  Use "files -p" to measure the time.
  Dump it in "help -m"
---
 arm64.c | 26 ++++++++++++++++++++++++++
 defs.h  |  1 +
 2 files changed, 27 insertions(+)

diff --git a/arm64.c b/arm64.c
index 57965c6..fc4ba64 100644
--- a/arm64.c
+++ b/arm64.c
@@ -117,6 +117,28 @@ static void arm64_calc_kernel_start(void)
 	ms->kimage_end = (sp ? sp->value : 0);
 }
 
+static int
+arm64_vmemmap_is_page_ptr(ulong addr, physaddr_t *phys)
+{
+	ulong size = SIZE(page);
+	ulong pfn, nr;
+
+
+	if (IS_SPARSEMEM() && (machdep->flags & VMEMMAP) &&
+	    (addr >= VMEMMAP_VADDR && addr <= VMEMMAP_END) &&
+	    !((addr - VMEMMAP_VADDR) % size)) {
+
+		pfn = (addr - machdep->machspec->vmemmap) / size;
+		nr = pfn_to_section_nr(pfn);
+		if (valid_section_nr(nr)) {
+			if (phys)
+				*phys = PTOB(pfn);
+			return TRUE;
+		}
+	}
+	return FALSE;
+}
+
 /*
  * Do all necessary machine-specific setup here. This is called several times
  * during initialization.
@@ -382,6 +404,9 @@ arm64_init(int when)
 		machdep->stacksize = ARM64_STACK_SIZE;
 		machdep->flags |= VMEMMAP;
+		/* If vmemmap exists, it means kernel enabled CONFIG_SPARSEMEM_VMEMMAP */
+		if (arm64_get_vmcoreinfo(&ms->vmemmap, "SYMBOL(vmemmap)", NUM_HEX))
+			machdep->is_page_ptr = arm64_vmemmap_is_page_ptr;
 		machdep->uvtop = arm64_uvtop;
 		machdep->is_uvaddr = arm64_is_uvaddr;
@@ -1096,6 +1121,7 @@ arm64_dump_machdep_table(ulong arg)
 	fprintf(fp, " vmemmap_vaddr: %016lx\n", ms->vmemmap_vaddr);
 	fprintf(fp, " vmemmap_end: %016lx\n", ms->vmemmap_end);
 	if (machdep->flags & NEW_VMEMMAP) {
+		fprintf(fp, " vmemmap: %016lx\n", ms->vmemmap);
 		fprintf(fp, " kimage_text: %016lx\n", ms->kimage_text);
 		fprintf(fp, " kimage_end: %016lx\n", ms->kimage_end);
 		fprintf(fp, " kimage_voffset: %016lx\n", ms->kimage_voffset);
diff --git a/defs.h b/defs.h
index 0558d13..3431a32 100644
--- a/defs.h
+++ b/defs.h
@@ -3486,6 +3486,7 @@ struct machine_specific {
 	ulong CONFIG_ARM64_KERNELPACMASK;
 	ulong physvirt_offset;
 	ulong struct_page_size;
+	ulong vmemmap;
 };
 
 struct arm64_stackframe {
--
2.40.1
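The speed-up comes from replacing a search with plain array arithmetic. The standalone sketch below shows that arithmetic in isolation. It is not the patch itself: the function name `vmemmap_page_to_phys`, the single `vmemmap_base` parameter, and the 4 KiB page shift are assumptions for illustration, and the range and section-validity checks that the patch performs are left out.

```c
#include <stdint.h>
#include <stdbool.h>

#define SKETCH_PAGE_SHIFT 12   /* assumed 4 KiB pages */

/*
 * Treat vmemmap as an array of struct page: the index of an element is
 * the page frame number, and pfn << PAGE_SHIFT is the physical address.
 * Returns false when addr cannot be an element of that array.
 */
static bool
vmemmap_page_to_phys(uint64_t addr, uint64_t vmemmap_base,
		     uint64_t page_struct_size, uint64_t *phys)
{
	if (page_struct_size == 0 || addr < vmemmap_base)
		return false;
	if ((addr - vmemmap_base) % page_struct_size)
		return false;   /* not aligned to an array element */

	uint64_t pfn = (addr - vmemmap_base) / page_struct_size;

	if (phys)
		*phys = pfn << SKETCH_PAGE_SHIFT;
	return true;
}
```

With a 64-byte struct page, the sixth element of the array (index 5) maps to physical address 5 << 12 = 0x5000, with no lookup required.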


Re: Adding the zram decompression algorithm "lzo-rle" to support kernel versions >= 5.1

by Tao Liu

Hi Yulong,

Thanks for your patch!

On Mon, Feb 26, 2024 at 3:20 PM Yulong TANG 汤玉龙 <yulong.tang(a)nio.com> wrote:
>
> In Linux 5.1, the ZRAM block driver has changed its default compressor from "lzo" to "lzo-rle" to enhance LZO compression support. However, crash does not support the improved LZO algorithm, resulting in failure when reading memory.
>
> change default compressor : ce82f19fd5809f0cf87ea9f753c5cc65ca0673d6
>
>
> The issue was discovered when using the extension 'gcore' to generate a process coredump, which was found to be incomplete and unable to be opened properly with gdb.
>
>
> This patch is for Crash-utility tool, it enables the Crash-utility to support decompression of the "lzo-rle" compression algorithm used in zram. The patch has been tested with vmcore files from kernel version 5.4, and successfully allows reading of memory compressed with the zram compression algorithm.

I have no objection to the lzo-rle decompression feature for crash. However, I have some concerns about your patch:

The patch you attached is a "lzorle_decompress_safe" implementation copied from the kernel source code. One drawback of copying kernel source code is that the kernel is constantly evolving: the code you copied today may be updated later, and to support different kernel versions we would need to keep a bunch of switch(kernel_version) and case code for compatibility, which is what we are trying to avoid.

In addition, the code you copied has deliberately dropped the "if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)" part, which may also cause problems, and as far as I know there is no good way in crash to determine the kernel config status. Please feel free to correct me if I'm wrong.

I'm thinking of another way to implement this: copy the related kernel function's binary into crash and execute it there. Of course, the kernel function needs to meet some limitations, but in my test it works at least for some simple functions.

So could you please modify the following trial patch and give it a try?

diff --git a/memory.c b/memory.c
index b84e974..998ccbd 100644
--- a/memory.c
+++ b/memory.c
@@ -1555,6 +1555,14 @@ cmd_rd(void)
 	}
 
 	display_memory(addr, count, flag, memtype, outputfile);
+
+	char (*code)(char *, char *) = (char (*)(char *, char *))copy_kernel_function("strcat");
+	if (code) {
+		char buf[64] = "ABCD";
+		char src[] = "abcd";
+		fprintf(fp, ">>>>>>>>> %p %s\n", strcat(buf, src), buf);
+		free(code);
+	}
 }
 
 /*
diff --git a/tools.c b/tools.c
index 1f8448c..e57bf87 100644
--- a/tools.c
+++ b/tools.c
@@ -7006,3 +7006,48 @@ get_subsys_private(char *kset_name, char *target_name)
 	return private;
 }
+
+void *copy_kernel_code(ulong kvaddr_start, ulong kvaddr_end)
+{
+	void *code_buf = NULL;
+	int res;
+
+	res = posix_memalign(&code_buf, machdep->pagesize,
+			kvaddr_end - kvaddr_start);
+	if (res)
+		goto fail;
+	res = mprotect(code_buf, kvaddr_end - kvaddr_start,
+			PROT_READ|PROT_WRITE|PROT_EXEC);
+	if (res)
+		goto fail;
+	memset(code_buf, 0, kvaddr_end - kvaddr_start);
+	readmem(kvaddr_start, KVADDR, code_buf, kvaddr_end - kvaddr_start,
+		"read kernel code", FAULT_ON_ERROR);
+
+	return code_buf;
+fail:
+	if (code_buf)
+		free(code_buf);
+	return NULL;
+}
+
+void *copy_kernel_function(char *func_name)
+{
+	struct syment *sp_start, *sp_end;
+	void *code;
+
+	if (!symbol_exists(func_name))
+		error(FATAL, "kernel function %s not exist!\n", func_name);
+
+	sp_start = symbol_search(func_name);
+	sp_end = next_symbol(NULL, sp_start);
+	if (!sp_start || !sp_end)
+		goto fail;
+
+	code = copy_kernel_code(sp_start->value, sp_end->value);
+	if (!code)
+		goto fail;
+	return code;
+fail:
+	return NULL;
+}

You can modify the

  char (*code)(char *, char *) = (char (*)(char *, char *))copy_kernel_function("strcat");

part and use it in diskdump.c:try_zram_decompress() as something like:

  int (*code)(unsigned char *, size_t, unsigned char *, size_t *) =
	(int (*)(unsigned char *, size_t, unsigned char *, size_t *))copy_kernel_function("lzo1x_decompress_safe");
  code(arg1, arg2, arg3, arg4);
  ...

That way we don't need to maintain the lzo1x_decompress_safe() source code; we always get the lzo1x_decompress_safe() function in binary form at runtime, which is compatible with the current kernel.

Thanks,
Tao Liu

>
> Thanks and regards,
> Yulong
>
>
> --
> Crash-utility mailing list -- devel(a)lists.crash-utility.osci.io
> To unsubscribe send an email to devel-leave(a)lists.crash-utility.osci.io
> https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/
> Contribution Guidelines: https://github.com/crash-utility/crash/wiki
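The long function-pointer cast suggested in the reply is easier to read with a typedef, and a small wrapper can keep the "no routine available" case explicit. The sketch below shows only that calling pattern. The names `decompress_fn`, `stub_decompress` and `try_decompress` are hypothetical, and the stub merely copies bytes; it stands in for a routine that would be obtained at runtime and performs no LZO decompression.

```c
#include <stddef.h>
#include <string.h>

/* Pointer type matching the four-argument signature used in the cast above. */
typedef int (*decompress_fn)(const unsigned char *src, size_t src_len,
			     unsigned char *dst, size_t *dst_len);

/* Stand-in for a routine obtained at runtime; it only copies bytes. */
static int
stub_decompress(const unsigned char *src, size_t src_len,
		unsigned char *dst, size_t *dst_len)
{
	if (src_len > *dst_len)
		return -1;          /* output buffer too small */
	memcpy(dst, src, src_len);
	*dst_len = src_len;
	return 0;
}

/*
 * Hypothetical wrapper: call through the pointer when a routine was
 * found, otherwise report failure so the caller can fall back.
 */
static int
try_decompress(decompress_fn fn, const unsigned char *src, size_t src_len,
	       unsigned char *dst, size_t *dst_len)
{
	if (!fn)
		return -1;
	return fn(src, src_len, dst, dst_len);
}
```

Checking the pointer in one place means that a failed lookup degrades to an ordinary read error instead of a jump through a null pointer.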


[Crash-utility][PATCH] LoongArch64: Fixed link errors when build on LOONGARCH64 machine

by Ming Wang

The following link errors occur when building on a LOONGARCH64 machine:

/usr/bin/ld: proc-service.o: in function `.LVL71':
proc-service.c:(.text+0x324): undefined reference to `fill_gregset
...
/usr/bin/ld: proc-service.o: in function `.LVL77':
proc-service.c:(.text+0x364): undefined reference to `supply_gregset
...
/usr/bin/ld: proc-service.o: in function `.LVL87':
proc-service.c:(.text+0x3c4): undefined reference to `fill_fpregset
...
/usr/bin/ld: proc-service.o: in function `.LVL93':
proc-service.c:(.text+0x404): undefined reference to `supply_fpregset
collect2: error: ld returned 1 exit status

The cause is that functions such as fill_gregset are declared but never defined. This patch fixes the error by adding the missing definitions.

Reported-by: Xiujie Jiang <jiangxiujie(a)kylinos.cn>
Signed-off-by: Ming Wang <wangming01(a)loongson.cn>
---
 gdb-10.2.patch | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/gdb-10.2.patch b/gdb-10.2.patch
index a7018a2..a418209 100644
--- a/gdb-10.2.patch
+++ b/gdb-10.2.patch
@@ -16057,3 +16057,43 @@ exit 0
 m10200-dis.c
 m10200-opc.c
 m10300-dis.c
+--- gdb-10.2/gdb/loongarch-linux-tdep.c.orig
++++ gdb-10.2/gdb/loongarch-linux-tdep.c
+@@ -707,3 +707,37 @@ _initialize_loongarch_linux_tdep ()
+   gdbarch_register_osabi (bfd_arch_loongarch, bfd_mach_loongarch64,
+ 			  GDB_OSABI_LINUX, loongarch_linux_init_abi);
+ }
++
++/* Wrapper functions. These are only used by libthread_db. */
++#include <sys/procfs.h>
++void
++supply_gregset (struct regcache *regcache,
++		const prgregset_t *gregset)
++{
++  loongarch_elf_gregset.supply_regset (NULL, regcache, -1, gregset,
++				       sizeof (prgregset_t));
++}
++
++void
++fill_gregset (const struct regcache *regcache,
++	      prgregset_t *gregset, int regno)
++{
++  loongarch_elf_gregset.collect_regset (NULL, regcache, regno, gregset,
++					sizeof (prgregset_t));
++}
++
++void
++supply_fpregset (struct regcache *regcache,
++		 const prfpregset_t *fpregset)
++{
++  loongarch_elf_fpregset.supply_regset (NULL, regcache, -1, fpregset,
++					sizeof (prfpregset_t));
++}
++
++void
++fill_fpregset (const struct regcache *regcache,
++	       prfpregset_t *fpregset, int regno)
++{
++  loongarch_elf_fpregset.collect_regset (NULL, regcache, regno, fpregset,
++					 sizeof (prfpregset_t));
++}
--
2.39.2


[PATCH v9 0/5] Improve stack unwind on ppc64

by Aditya Gupta

The Problem:
============

Currently, crash is unable to show function arguments and local variables as gdb can, and moving between frames ('up'/'down') does not work in crash.

Crash has 'gdb passthroughs' for things gdb can do, but the gdb passthroughs 'bt', 'frame', 'info locals', 'up', and 'down' do not work either. This is because gdb does not get the register values from `crash_target::fetch_registers`, which in turn uses `machdep->get_cpu_reg`, which is not implemented for PPC64.

Proposed Solution:
==================

Fix the gdb passthroughs by implementing "machdep->get_cpu_reg" for PPC64. This way, "gdb mode in crash" will support this feature for both ELF and kdump-compressed vmcore formats, whereas plain "gdb" would only have supported the ELF format.

Other features of 'gdb', such as viewing backtraces, registers, variables, arguments, and local variables, and moving up and down stack frames, can then be used with any ppc64 vmcore, whether it is in ELF or kdump-compressed format.

Note: This doesn't support live debugging on ppc64, since the registers are not available to be read.

Implications on Architectures:
====================================

No architecture other than PPC64 is affected, except for the 'frame' command. As mentioned in patch #2, since 'frame' will no longer be prohibited, it will print:

    crash> frame
    #0 <unavailable> in ?? ()

instead of the previous 'prohibited' message:

    crash> frame
    crash: prohibited gdb command: frame

The major change is in 'gdb mode' on PPC64: instead of failing with errors saying that there is no frame or that the PC could not be obtained, it will print the frames and local variables.

Testing:
========

Git tree with this patch series applied:
https://github.com/adi-g15-ibm/crash/tree/stack-unwind-v9

To test various gdb passthroughs:

    (crash) set
    (crash) set gdb on
    gdb> thread
    gdb> bt
    gdb> info threads
    gdb> info threads
    gdb> info locals
    gdb> info variables irq_rover_lock
    gdb> info args
    gdb> thread 2
    gdb> set gdb off
    (crash) set
    (crash) set -c 6
    (crash) gdb thread
    (crash) bt
    (crash) gdb bt
    (crash) frame
    (crash) gdb up
    (crash) gdb down
    (crash) info locals

Known Issues:
=============

1. In gdb mode, 'bt' might fail to show a backtrace in a few vmcores collected from older kernels. This is a known issue caused by a register mismatch, and its fix has been merged upstream. It can also cause some 'invalid kernel virtual address' errors while gdb unwinds the stack registers.

   Commit: https://github.com/torvalds/linux/commit/b684c09f09e7a6af3794d4233ef78581...

Fixing GDB passthroughs on other architectures
==============================================

Much of the work needed to make gdb passthroughs such as 'gdb bt', 'gdb thread', and 'gdb info locals' work has been done by the patches introducing 'machdep->get_cpu_reg' and by this series, which fixes some issues in that. Other architectures should be able to fix these gdb functionalities by simply implementing 'machdep->get_cpu_reg (cpu, regno, ...)'. The reasoning behind that is explained with a diagram in the commit description of patch #1.

I will assist with my findings/observations from fixing it on ppc64 whenever needed.

Changelog:
==========

V9:
 + minor change in patch #5: sync gdb context on a 'set' and 'set -p'
 + add taoliu's patch for using current context, and fixes in ppc64_get_cpu_reg

V8:
 + use get_active_task instead of depending on CURRENT_CONTEXT in ppc64_get_cpu_reg
 + rebase to upstream/master (5977936c0a91)

V7:
 + move changes in gdb-10.2.patch to the end (minor change in patch #3,4,5)
 + fix a memory leak in ppc64_get_cpu_reg (minor change in patch #1)
 + use ascii diagram in patch #1 description

V6:
 + changes in patch #5: fix bug introduced in v5 that caused initial gdb thread to be thread 1

V5:
 + changes in patch #1: made ppc64_get_cpu_reg static, and remove unreachable code
 + changes in patch #3: fixed typo 'ppc64_renum' instead of 'ppc64_regnum', remove unneeded if condition
 + changes in patch #5: implement refresh regcache on per thread, instead of all threads at once

V4:
 + fix segmentation fault in live debugging (change in patch #1)
 + mention live debugging not supported in cover letter and patch #1
 + fixed some checkpatch warnings (change in patch #5)

V3:
 + default gdb thread will be the crashing thread, instead of being thread '0'
 + synchronise crash cpu and gdb thread context
 + fix bug in gdb_interface, that replaced gdb's output stream, losing output in some cases, such as info threads and extra output in info variables
 + fix 'info threads'

RFC V2:
 - removed patch implementing 'frame', 'up', 'down' in crash
 - updated the cover letter by removing the mention of those commands other than the respective gdb passthrough

Aditya Gupta (5):
  ppc64: correct gdb passthroughs by implementing machdep->get_cpu_reg
  remove 'frame' from prohibited commands list
  synchronise cpu context changes between crash/gdb
  fix gdb_interface: restore gdb's output streams at end of gdb_interface
  fix 'info threads' command

 crash_target.c  |  44 ++++++++++++++++
 defs.h          | 130 +++++++++++++++++++++++++++++++++++++++++++++++-
 gdb-10.2.patch  | 110 +++++++++++++++++++++++++++++++++++++++-
 gdb_interface.c |   2 +-
 kernel.c        |  47 +++++++++++++++--
 ppc64.c         |  95 +++++++++++++++++++++++++++++++++--
 task.c          |  14 ++++++
 tools.c         |   2 +-
 8 files changed, 434 insertions(+), 10 deletions(-)

--
2.41.0
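For readers considering the same fix on another architecture, the sketch below shows the general shape of such a register callback against a mocked register snapshot. It is not the ppc64 implementation from this series. The five-argument signature is an assumption based on the '(cpu, regno, ...)' form quoted in the cover letter, and `cpu_regs_sketch`, `saved_regs` and `sketch_get_cpu_reg` are hypothetical names.

```c
#include <stdint.h>
#include <string.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 2
#define SKETCH_NR_REGS 4   /* hypothetical; a real port covers every register gdb asks for */

/* Hypothetical per-CPU register snapshot, as might be recovered from a vmcore. */
struct cpu_regs_sketch {
	bool     valid;
	uint64_t gpr[SKETCH_NR_REGS];
};

static struct cpu_regs_sketch saved_regs[SKETCH_NR_CPUS] = {
	{ true,  { 0x10, 0x20, 0x30, 0x40 } },
	{ false, { 0 } },   /* e.g. a CPU whose registers were not saved */
};

/*
 * Copy register 'regno' of 'cpu' into 'value' and return true, or
 * return false when the register is not available, in which case the
 * caller would report it as unavailable.
 */
static int
sketch_get_cpu_reg(int cpu, int regno, const char *name, int size, void *value)
{
	(void)name;   /* unused in this sketch */

	if (cpu < 0 || cpu >= SKETCH_NR_CPUS || !saved_regs[cpu].valid)
		return false;
	if (regno < 0 || regno >= SKETCH_NR_REGS)
		return false;
	if (size != (int)sizeof(uint64_t))
		return false;

	memcpy(value, &saved_regs[cpu].gpr[regno], sizeof(uint64_t));
	return true;
}
```

Returning false rather than a zero value matters here: a register reported as 0 would let the unwinder continue with bogus data, while an unavailable register stops it cleanly.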


crash8.0.4 cannot get source line nums of functions from ko on android15-k6.6

by 王天明 (Tianming Wang)

Hi,

I use crash8.0.4_arm64 to parse a ramdump of android15-k6.6, load the symbols of a ko, and disassemble the functions in the ko with the "dis -lx" command. I can get the assembly instructions, but I cannot get the corresponding lines of source code. The following errors are reported when using the command:

GNU_RESOLVE_TEXT_ADDR: returned via gdb_error_hook (1 buffer in use)
GNU_GET_FUNCTION_RANGE: returned via gdb_error_hook (1 buffer in use

When I disassemble the ko with a disassembly tool that comes with Android, such as objdump, I can see the source lines corresponding to the functions in the ko.

So, is this a problem with the gdb tool, or is there something wrong with my ko symbols?

Thanks a lot.


[PATCH] vmware_guestdump: Various format versions support

by Alexey Makhalov

There are several versions of debug.guest format. Current version of the code is able to parse only version 4. Improve parser to support other known versions. Split data structures on sub-structures and introduce a helper functions to calculate a gap between them based on the version number. Implement additional data structure (struct mainmeminfo_old) and logic specifically for original (version 1) format support. Signed-off-by: Alexey Makhalov <alexey.makhalov(a)broadcom.com> --- vmware_guestdump.c | 316 ++++++++++++++++++++++++++++++++------------- 1 file changed, 229 insertions(+), 87 deletions(-) diff --git a/vmware_guestdump.c b/vmware_guestdump.c index 5be26c8..5c7ee4d 100644 --- a/vmware_guestdump.c +++ b/vmware_guestdump.c @@ -2,6 +2,8 @@ * vmware_guestdump.c * * Copyright (c) 2020 VMware, Inc. + * Copyright (c) 2024 Broadcom. All Rights Reserved. The term "Broadcom" + * refers to Broadcom Inc. and/or its subsidiaries. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by @@ -13,7 +15,7 @@ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * - * Author: Alexey Makhalov <amakhalov(a)vmware.com> + * Author: Alexey Makhalov <alexey.makhalov(a)broadcom.com> */ #include "defs.h" @@ -21,20 +23,31 @@ #define LOGPRX "vmw: " -#define GUESTDUMP_VERSION 4 -#define GUESTDUMP_MAGIC1 1 -#define GUESTDUMP_MAGIC2 0 - +/* + * debug.guest file layout + * 00000000: guest dump header, it includes: + * 1. Version (4 bytes) \ + * 2. Number of Virtual CPUs (4 bytes) } - struct guestdumpheader + * 3. Reserved gap + * 4. Main Memory information - struct mainmeminfo{,_old} + * (use get_vcpus_offset() to get total size of guestdumpheader) + * vcpus_offset: ---------\ + * 1. struct vcpu_state1 \ + * 2. reserved gap } num_vcpus times + * 3. struct vcpu_state2 / + * 4. 
4KB of reserved data / + * --------/ + * + */ struct guestdumpheader { uint32_t version; uint32_t num_vcpus; - uint8_t magic1; - uint8_t reserved1; - uint32_t cpu_vendor; - uint64_t magic2; +} __attribute__((packed)) hdr; + +struct mainmeminfo { uint64_t last_addr; uint64_t memsize_in_pages; - uint32_t reserved2; + uint32_t reserved1; uint32_t mem_holes; struct memhole { uint64_t ppn; @@ -42,14 +55,36 @@ struct guestdumpheader { } holes[2]; } __attribute__((packed)); -struct vcpu_state { +/* Used by version 1 only */ +struct mainmeminfo_old { + uint64_t last_addr; + uint32_t memsize_in_pages; + uint32_t reserved1; + uint32_t mem_holes; + struct memhole1 { + uint32_t ppn; + uint32_t pages; + } holes[2]; + /* There are additional fields, see get_vcpus_offset() calculation. */ +} __attribute__((packed)); + +/* First half of vcpu_state */ +struct vcpu_state1 { uint32_t cr0; uint64_t cr2; uint64_t cr3; uint64_t cr4; uint64_t reserved1[10]; uint64_t idt_base; - uint16_t reserved2[21]; +} __attribute__((packed)); + +/* + * Unused fields between vcpu_state1 and vcpu_state2 swill be skipped. + * See get_vcpu_gapsize() calculation. + */ + +/* Second half of vcpu_state */ +struct vcpu_state2 { struct x86_64_pt_regs { uint64_t r15; uint64_t r14; @@ -76,9 +111,41 @@ struct vcpu_state { uint8_t reserved3[65]; } __attribute__((packed)); +/* + * Returns the size of the guest dump header. + */ +static inline long +get_vcpus_offset(uint32_t version, int mem_holes) +{ + switch (version) { + case 1: /* ESXi 6.7 and older */ + return sizeof(struct guestdumpheader) + 13 + sizeof(struct mainmeminfo_old) + + (mem_holes == -1 ? 
0 : 8 * mem_holes + 4); + case 3: /* ESXi 6.8 */ + return sizeof(struct guestdumpheader) + 14 + sizeof(struct mainmeminfo); + case 4: /* ESXi 7.0 */ + case 5: /* ESXi 8.0 */ + return sizeof(struct guestdumpheader) + 14 + sizeof(struct mainmeminfo); + case 6: /* ESXi 8.0u2 */ + return sizeof(struct guestdumpheader) + 15 + sizeof(struct mainmeminfo); + + } + return 0; +} + +/* + * Returns the size of reserved (unused) fields in the middle of vcpu_state structure. + */ +static inline long +get_vcpu_gapsize(uint32_t version) +{ + if (version < 4) + return 45; + return 42; +} /* - * vmware_guestdump is extension to vmware_vmss with ability to debug + * vmware_guestdump is an extension to the vmware_vmss with ability to debug * debug.guest and debug.vmem files. * * debug.guest.gz and debug.vmem.gz can be obtained using following @@ -86,73 +153,136 @@ struct vcpu_state { * monitor.mini-suspend_on_panic = TRUE * monitor.suspend_on_triplefault = TRUE * - * guestdump (debug.guest) is simplified version of *.vmss which does - * not contain full VM state, but minimal guest state, such as memory + * guestdump (debug.guest) is a simplified version of the *.vmss which does + * not contain a full VM state, but minimal guest state, such as a memory * layout and CPUs state, needed for debugger. is_vmware_guestdump() * and vmware_guestdump_init() functions parse guestdump header and - * populate vmss data structure (from vmware_vmss.c). As result, all + * populate vmss data structure (from vmware_vmss.c). In result, all * handlers (except mempry_dump) from vmware_vmss.c can be reused. * - * debug.guest does not have dedicated header magic or signature for - * its format. To probe debug.guest we need to perform header fields - * and file size validity. In addition, check for the filename - * extension, which must be ".guest". + * debug.guest does not have a dedicated header magic or file format signature + * To probe debug.guest we need to perform series of validations. 
In addition, + * we check for the filename extension, which must be ".guest". */ - int is_vmware_guestdump(char *filename) { - struct guestdumpheader hdr; + struct mainmeminfo mmi; + long vcpus_offset; FILE *fp; - uint64_t filesize, holes_sum = 0; + uint64_t filesize, expected_filesize, holes_sum = 0; int i; if (strcmp(filename + strlen(filename) - 6, ".guest")) return FALSE; - if ((fp = fopen(filename, "r")) == NULL) { + if ((fp = fopen(filename, "r")) == NULL) { error(INFO, LOGPRX"Failed to open '%s': [Error %d] %s\n", - filename, errno, strerror(errno)); + filename, errno, strerror(errno)); return FALSE; - } + } if (fread(&hdr, sizeof(struct guestdumpheader), 1, fp) != 1) { error(INFO, LOGPRX"Failed to read '%s' from file '%s': [Error %d] %s\n", - "guestdumpheader", filename, errno, strerror(errno)); + "guestdumpheader", filename, errno, strerror(errno)); + fclose(fp); + return FALSE; + } + + vcpus_offset = get_vcpus_offset(hdr.version, -1 /* Unknown yet, adjust it later */); + + if (!vcpus_offset) { + if (CRASHDEBUG(1)) + error(INFO, LOGPRX"Not supported version %d\n", hdr.version); fclose(fp); return FALSE; } + if (hdr.version == 1) { + struct mainmeminfo_old tmp; + if (fseek(fp, vcpus_offset - sizeof(struct mainmeminfo_old), SEEK_SET) == -1) { + if (CRASHDEBUG(1)) + error(INFO, LOGPRX"Failed to fseek '%s': [Error %d] %s\n", + filename, errno, strerror(errno)); + fclose(fp); + return FALSE; + } + + if (fread(&tmp, sizeof(struct mainmeminfo_old), 1, fp) != 1) { + if (CRASHDEBUG(1)) + error(INFO, LOGPRX"Failed to read '%s' from file '%s': [Error %d] %s\n", + "mainmeminfo_old", filename, errno, strerror(errno)); + fclose(fp); + return FALSE; + } + mmi.last_addr = tmp.last_addr; + mmi.memsize_in_pages = tmp.memsize_in_pages; + mmi.mem_holes = tmp.mem_holes; + mmi.holes[0].ppn = tmp.holes[0].ppn; + mmi.holes[0].pages = tmp.holes[0].pages; + mmi.holes[1].ppn = tmp.holes[1].ppn; + mmi.holes[1].pages = tmp.holes[1].pages; + /* vcpu_offset adjustment for mem_holes is 
required only for version 1. */ + vcpus_offset = get_vcpus_offset(hdr.version, mmi.mem_holes); + } else { + if (fseek(fp, vcpus_offset - sizeof(struct mainmeminfo), SEEK_SET) == -1) { + if (CRASHDEBUG(1)) + error(INFO, LOGPRX"Failed to fseek '%s': [Error %d] %s\n", + filename, errno, strerror(errno)); + fclose(fp); + return FALSE; + } + + if (fread(&mmi, sizeof(struct mainmeminfo), 1, fp) != 1) { + if (CRASHDEBUG(1)) + error(INFO, LOGPRX"Failed to read '%s' from file '%s': [Error %d] %s\n", + "mainmeminfo", filename, errno, strerror(errno)); + fclose(fp); + return FALSE; + } + } if (fseek(fp, 0L, SEEK_END) == -1) { - error(INFO, LOGPRX"Failed to fseek '%s': [Error %d] %s\n", - filename, errno, strerror(errno)); + if (CRASHDEBUG(1)) + error(INFO, LOGPRX"Failed to fseek '%s': [Error %d] %s\n", + filename, errno, strerror(errno)); fclose(fp); return FALSE; } filesize = ftell(fp); fclose(fp); - if (hdr.mem_holes > 2) - goto unrecognized; + if (mmi.mem_holes > 2) { + if (CRASHDEBUG(1)) + error(INFO, LOGPRX"Unexpected mmi.mem_holes value %d\n", + mmi.mem_holes); + return FALSE; + } - for (i = 0; i < hdr.mem_holes; i++) { + for (i = 0; i < mmi.mem_holes; i++) { /* hole start page */ - vmss.regions[i].startpagenum = hdr.holes[i].ppn; + vmss.regions[i].startpagenum = mmi.holes[i].ppn; /* hole end page */ - vmss.regions[i].startppn = hdr.holes[i].ppn + hdr.holes[i].pages; - holes_sum += hdr.holes[i].pages; + vmss.regions[i].startppn = mmi.holes[i].ppn + mmi.holes[i].pages; + holes_sum += mmi.holes[i].pages; + } + + if ((mmi.last_addr + 1) != ((mmi.memsize_in_pages + holes_sum) << VMW_PAGE_SHIFT)) { + if (CRASHDEBUG(1)) + error(INFO, LOGPRX"Memory size check failed\n"); + return FALSE; } - if (hdr.version != GUESTDUMP_VERSION || - hdr.magic1 != GUESTDUMP_MAGIC1 || - hdr.magic2 != GUESTDUMP_MAGIC2 || - (hdr.last_addr + 1) != ((hdr.memsize_in_pages + holes_sum) << VMW_PAGE_SHIFT) || - filesize != sizeof(struct guestdumpheader) + - hdr.num_vcpus * (sizeof (struct vcpu_state) + 
VMW_PAGE_SIZE)) - goto unrecognized; + expected_filesize = vcpus_offset + hdr.num_vcpus * (sizeof(struct vcpu_state1) + + get_vcpu_gapsize(hdr.version) + sizeof(struct vcpu_state2) + VMW_PAGE_SIZE); + if (filesize != expected_filesize) { + if (CRASHDEBUG(1)) + error(INFO, LOGPRX"Incorrect file size: %d != %d\n", + filesize, expected_filesize); + return FALSE; + } - vmss.memsize = hdr.memsize_in_pages << VMW_PAGE_SHIFT; - vmss.regionscount = hdr.mem_holes + 1; + vmss.memsize = mmi.memsize_in_pages << VMW_PAGE_SHIFT; + vmss.regionscount = mmi.mem_holes + 1; vmss.memoffset = 0; vmss.num_vcpus = hdr.num_vcpus; return TRUE; @@ -169,7 +299,8 @@ vmware_guestdump_init(char *filename, FILE *ofp) FILE *fp = NULL; int i, result = TRUE; char *vmem_filename = NULL; - struct vcpu_state vs; + struct vcpu_state1 vs1; + struct vcpu_state2 vs2; char *p; if (!machine_type("X86") && !machine_type("X86_64")) { @@ -180,14 +311,14 @@ vmware_guestdump_init(char *filename, FILE *ofp) goto exit; } - if ((fp = fopen(filename, "r")) == NULL) { + if ((fp = fopen(filename, "r")) == NULL) { error(INFO, LOGPRX"Failed to open '%s': [Error %d] %s\n", filename, errno, strerror(errno)); result = FALSE; goto exit; - } + } - if (fseek(fp, sizeof(struct guestdumpheader), SEEK_SET) == -1) { + if (fseek(fp, get_vcpus_offset(hdr.version, vmss.regionscount - 1), SEEK_SET) == -1) { error(INFO, LOGPRX"Failed to fseek '%s': [Error %d] %s\n", filename, errno, strerror(errno)); result = FALSE; @@ -203,7 +334,19 @@ vmware_guestdump_init(char *filename, FILE *ofp) } for (i = 0; i < vmss.num_vcpus; i++) { - if (fread(&vs, sizeof(struct vcpu_state), 1, fp) != 1) { + if (fread(&vs1, sizeof(struct vcpu_state1), 1, fp) != 1) { + error(INFO, LOGPRX"Failed to read '%s' from file '%s': [Error %d] %s\n", + "vcpu_state", filename, errno, strerror(errno)); + result = FALSE; + goto exit; + } + if (fseek(fp, get_vcpu_gapsize(hdr.version), SEEK_CUR) == -1) { + error(INFO, LOGPRX"Failed to read '%s' from file '%s': [Error %d] 
%s\n",
+				"vcpu_state", filename, errno, strerror(errno));
+			result = FALSE;
+			goto exit;
+		}
+		if (fread(&vs2, sizeof(struct vcpu_state2), 1, fp) != 1) {
 			error(INFO, LOGPRX"Failed to read '%s' from file '%s': [Error %d] %s\n",
 				"vcpu_state", filename, errno, strerror(errno));
 			result = FALSE;
@@ -217,29 +360,29 @@ vmware_guestdump_init(char *filename, FILE *ofp)
 		}
 		vmss.vcpu_regs[i] = 0;
-		vmss.regs64[i]->rax = vs.regs64.rax;
-		vmss.regs64[i]->rcx = vs.regs64.rcx;
-		vmss.regs64[i]->rdx = vs.regs64.rdx;
-		vmss.regs64[i]->rbx = vs.regs64.rbx;
-		vmss.regs64[i]->rbp = vs.regs64.rbp;
-		vmss.regs64[i]->rsp = vs.regs64.rsp;
-		vmss.regs64[i]->rsi = vs.regs64.rsi;
-		vmss.regs64[i]->rdi = vs.regs64.rdi;
-		vmss.regs64[i]->r8 = vs.regs64.r8;
-		vmss.regs64[i]->r9 = vs.regs64.r9;
-		vmss.regs64[i]->r10 = vs.regs64.r10;
-		vmss.regs64[i]->r11 = vs.regs64.r11;
-		vmss.regs64[i]->r12 = vs.regs64.r12;
-		vmss.regs64[i]->r13 = vs.regs64.r13;
-		vmss.regs64[i]->r14 = vs.regs64.r14;
-		vmss.regs64[i]->r15 = vs.regs64.r15;
-		vmss.regs64[i]->idtr = vs.idt_base;
-		vmss.regs64[i]->cr[0] = vs.cr0;
-		vmss.regs64[i]->cr[2] = vs.cr2;
-		vmss.regs64[i]->cr[3] = vs.cr3;
-		vmss.regs64[i]->cr[4] = vs.cr4;
-		vmss.regs64[i]->rip = vs.regs64.rip;
-		vmss.regs64[i]->rflags = vs.regs64.eflags;
+		vmss.regs64[i]->rax = vs2.regs64.rax;
+		vmss.regs64[i]->rcx = vs2.regs64.rcx;
+		vmss.regs64[i]->rdx = vs2.regs64.rdx;
+		vmss.regs64[i]->rbx = vs2.regs64.rbx;
+		vmss.regs64[i]->rbp = vs2.regs64.rbp;
+		vmss.regs64[i]->rsp = vs2.regs64.rsp;
+		vmss.regs64[i]->rsi = vs2.regs64.rsi;
+		vmss.regs64[i]->rdi = vs2.regs64.rdi;
+		vmss.regs64[i]->r8 = vs2.regs64.r8;
+		vmss.regs64[i]->r9 = vs2.regs64.r9;
+		vmss.regs64[i]->r10 = vs2.regs64.r10;
+		vmss.regs64[i]->r11 = vs2.regs64.r11;
+		vmss.regs64[i]->r12 = vs2.regs64.r12;
+		vmss.regs64[i]->r13 = vs2.regs64.r13;
+		vmss.regs64[i]->r14 = vs2.regs64.r14;
+		vmss.regs64[i]->r15 = vs2.regs64.r15;
+		vmss.regs64[i]->idtr = vs1.idt_base;
+		vmss.regs64[i]->cr[0] = vs1.cr0;
+		vmss.regs64[i]->cr[2] = vs1.cr2;
+		vmss.regs64[i]->cr[3] = vs1.cr3;
+		vmss.regs64[i]->cr[4] = vs1.cr4;
+		vmss.regs64[i]->rip = vs2.regs64.rip;
+		vmss.regs64[i]->rflags = vs2.regs64.eflags;
 		vmss.vcpu_regs[i] = REGS_PRESENT_ALL;
 	}
@@ -268,9 +411,9 @@ vmware_guestdump_init(char *filename, FILE *ofp)
 	fprintf(ofp, LOGPRX"vmem file: %s\n\n", vmem_filename);
 	if (CRASHDEBUG(1)) {
-		vmware_guestdump_memory_dump(ofp);
-		dump_registers_for_vmss_dump();
-	}
+		vmware_guestdump_memory_dump(ofp);
+		dump_registers_for_vmss_dump();
+	}
 exit:
 	if (fp)
@@ -296,24 +439,23 @@ exit:
 int
 vmware_guestdump_memory_dump(FILE *ofp)
 {
+	uint64_t holes_sum = 0;
+	unsigned i;
+
+	fprintf(ofp, "vmware_guestdump:\n");
 	fprintf(ofp, " Header: version=%d num_vcpus=%llu\n",
-		GUESTDUMP_VERSION, (ulonglong)vmss.num_vcpus);
+		hdr.version, (ulonglong)vmss.num_vcpus);
 	fprintf(ofp, "Total memory: %llu\n", (ulonglong)vmss.memsize);
-	if (vmss.regionscount > 1) {
-		uint64_t holes_sum = 0;
-		unsigned i;
-		fprintf(ofp, "Memory regions[%d]:\n", vmss.regionscount);
-		fprintf(ofp, " [0x%016x-", 0);
-		for (i = 0; i < vmss.regionscount - 1; i++) {
-			fprintf(ofp, "0x%016llx]\n", (ulonglong)vmss.regions[i].startpagenum << VMW_PAGE_SHIFT);
-			fprintf(ofp, " [0x%016llx-", (ulonglong)vmss.regions[i].startppn << VMW_PAGE_SHIFT);
-			holes_sum += vmss.regions[i].startppn - vmss.regions[i].startpagenum;
-		}
-		fprintf(ofp, "0x%016llx]\n", (ulonglong)vmss.memsize + (holes_sum << VMW_PAGE_SHIFT));
+	fprintf(ofp, "Memory regions[%d]:\n", vmss.regionscount);
+	fprintf(ofp, " [0x%016x-", 0);
+	for (i = 0; i < vmss.regionscount - 1; i++) {
+		fprintf(ofp, "0x%016llx]\n", (ulonglong)vmss.regions[i].startpagenum << VMW_PAGE_SHIFT);
+		fprintf(ofp, " [0x%016llx-", (ulonglong)vmss.regions[i].startppn << VMW_PAGE_SHIFT);
+		holes_sum += vmss.regions[i].startppn - vmss.regions[i].startpagenum;
 	}
+	fprintf(ofp, "0x%016llx]\n", (ulonglong)vmss.memsize + (holes_sum << VMW_PAGE_SHIFT));
 	return TRUE;
 }
--
2.39.0
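The region arithmetic in the last hunk can be sketched in isolation. This is only an illustration of the loop as it reads in the diff, not code from the patch: struct hole, struct range, compute_regions() and PAGE_SHIFT_ASSUMED are hypothetical names, and the shift of 12 (4 KiB pages) is an assumption standing in for VMW_PAGE_SHIFT. Each table entry is treated as a hole that starts at startpagenum and ends where memory resumes at startppn, so N regions are described by N - 1 entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT_ASSUMED 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for one entry of vmss.regions[]. */
struct hole {
	uint64_t startpagenum;	/* page number where the hole begins */
	uint64_t startppn;	/* page number where memory resumes */
};

/* One memory region as a pair of byte addresses. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Fill out[0..nregions-1] with the regions implied by nregions - 1 holes
 * and the total memory size in bytes. Returns the number of ranges written.
 */
size_t
compute_regions(const struct hole *holes, unsigned nregions,
		uint64_t memsize, struct range *out)
{
	uint64_t holes_sum = 0;	/* total size of all holes, in pages */
	uint64_t start = 0;	/* the first region starts at address 0 */
	unsigned i;

	assert(out != NULL);
	if (nregions == 0)
		return 0;

	/* "i + 1 < nregions" avoids unsigned wrap-around of nregions - 1. */
	for (i = 0; i + 1 < nregions; i++) {
		out[i].start = start;
		out[i].end = holes[i].startpagenum << PAGE_SHIFT_ASSUMED;
		start = holes[i].startppn << PAGE_SHIFT_ASSUMED;
		holes_sum += holes[i].startppn - holes[i].startpagenum;
	}

	/* The last region ends at the memory size plus everything skipped. */
	out[i].start = start;
	out[i].end = memsize + (holes_sum << PAGE_SHIFT_ASSUMED);
	return nregions;
}
```

For example, with 0x300000 bytes of memory and a single hole from page 0x100 to page 0x200, the sketch yields the two regions [0x0-0x100000] and [0x200000-0x400000]. With one region and no holes the loop body never runs and the result is a single range from 0 to memsize.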


[ANNOUNCE] Tao Liu's participation in crash utility co-maintainers

by HAGIO KAZUHITO(萩尾　一仁)

Hi,

Thank you, everyone, for your continued contributions to the crash utility.

We'd like to announce that Tao Liu is joining the co-maintainers of the crash utility.

Tao Liu <ltao(a)redhat.com>

Tao has contributed to the crash utility for about three years and has made many improvements and bug fixes. The maple tree support, the improvements to module symbol search, and the interactions between crash and gdb were especially important.

I believe that Tao will do a great job improving the crash utility and the community as a co-maintainer, and I appreciate his participation.

Thanks,
Kazu


Re: [ANNOUNCE] Tao Liu's participation in crash-utility co-maintainers

by Lianbo Jiang

On 2/27/24 11:27, devel-request(a)lists.crash-utility.osci.io wrote:
> Date: Mon, 26 Feb 2024 07:30:34 +0000
> From: HAGIO KAZUHITO(萩尾　一仁)<k-hagio-ab(a)nec.com>
> Subject: [Crash-utility] [ANNOUNCE] Tao Liu's participation in crash
> utility co-maintainers
> To:"devel(a)lists.crash-utility.osci.io"
> <devel(a)lists.crash-utility.osci.io>
> Message-ID:<e177c37f-22a8-0adc-883b-f0bbe61b342e(a)nec.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hi,
>
> Thank you everyone for your continued contribution to the crash utility.
>
> We'd like to announce Tao Liu's participation in the co-maintainers of
> the crash utility.
>
> Tao Liu<ltao(a)redhat.com>
>
> Tao has contributed to the crash utility for about three years, and has
> made many improvements and bug fixes. Especially the maple tree
> support, the improvements in module symbol search and interactions
> between crash and gdb were important.
>
> I believe that Tao will do a great job to improve the crash utility and
> the community as a co-maintainer too, and appreciate his participation.

Welcome, Tao!

Also please allow me to take this opportunity to thank everyone for their contributions on crash-utility. It is precisely because of everyone's great efforts that crash-utility is getting better and better.

Thanks.
Lianbo

>
> Thanks,
> Kazu


Adding the zram decompression algorithm "lzo-rle" to support kernel versions >= 5.1

by Yulong TANG 汤玉龙

In Linux 5.1, the zram block driver changed its default compressor from "lzo" to "lzo-rle", an improved variant of LZO. However, crash does not support the "lzo-rle" algorithm, so reading memory compressed with it fails.

Commit that changed the default compressor: ce82f19fd5809f0cf87ea9f753c5cc65ca0673d6 <https://github.com/torvalds/linux/commit/ce82f19fd5809f0cf87ea9f753c5cc65...>

The issue was discovered when using the 'gcore' extension to generate a process core dump: the resulting core file was incomplete and could not be opened properly with gdb.

This patch is for the crash utility. It enables crash to decompress zram pages compressed with the "lzo-rle" algorithm. The patch has been tested with vmcore files from kernel version 5.4, where it successfully reads memory compressed with that algorithm.

Thanks and regards,
Yulong
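As a rough illustration of what supporting a second compressor name involves, here is a minimal sketch of mapping the zram compressor name to a decompressor kind. It is not taken from the patch or from crash: enum zcomp_kind, zcomp_table and zcomp_kind_from_name() are hypothetical names, and the sketch covers only the two names mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical identifiers; not taken from crash or the kernel. */
enum zcomp_kind {
	ZCOMP_UNSUPPORTED = 0,
	ZCOMP_LZO,
	ZCOMP_LZO_RLE
};

/* Compressor names as zram reports them, limited to the two in this post. */
static const struct {
	const char *name;
	enum zcomp_kind kind;
} zcomp_table[] = {
	{ "lzo",     ZCOMP_LZO },
	{ "lzo-rle", ZCOMP_LZO_RLE },
};

/* Map a compressor name to a kind; unknown names are reported, not guessed. */
enum zcomp_kind
zcomp_kind_from_name(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(zcomp_table) / sizeof(zcomp_table[0]); i++) {
		/* Exact match: a prefix compare would treat "lzo-rle" as "lzo". */
		if (strcmp(name, zcomp_table[i].name) == 0)
			return zcomp_table[i].kind;
	}
	return ZCOMP_UNSUPPORTED;
}
```

The exact string comparison is the point of the sketch: the two names share a prefix but identify different stream formats, so each needs its own decompression path. Returning ZCOMP_UNSUPPORTED for anything else lets the caller fail with a clear message instead of producing an incomplete read.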
