April 2016 - Crash-utility - Crash Utility List Archives

[PATCH 00/10] teach crash to work with "live" ramdump

by Oleg Nesterov

Hi Dave,

Recently I used crash-tool for the first time and I was pleasantly surprised; it really looks like a very useful and handy debugging tool ;) And I was surprised again when I figured out that it can be used to debug the live system on the same machine. Cool!

Now I am wondering if we can teach it to debug live guests running under qemu/kvm. This certainly looks possible, since qemu supports the gdb remote protocol. But this obviously needs more work. And afaics crash-tool needs some fixes anyway; see 01/10-07/10. It probably needs more changes: so far I have only tried to audit task.c and kernel.c. I tried to document every change, but I am very new to this code, so I can easily be wrong.

So can't we teach it to work with live/raw RAM dumpfiles for a start? See 09/10 and 10/10. With these changes I can run qemu-kvm with the

	-object memory-backend-file,id=MEM,size=128m,mem-path=/tmp/MEM,share=on -numa node,memdev=MEM

options and then do

	$ crash path-to-guests-vmlinux live:/tmp/MEM@0

to debug the live guest. No need to dump-guest-memory and restart /usr/bin/crash, which can be slow.

What do you think?

Oleg.

 defs.h    |  1 +
 filesys.c |  8 ++++----
 kernel.c  |  7 +++----
 main.c    | 11 +++++++++++
 ramdump.c | 49 ++++++++++++++++++++++++++++++++++---------------
 task.c    | 13 ++++++-------
 tools.c   |  2 +-
 7 files changed, 60 insertions(+), 31 deletions(-)
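To make the "live:/tmp/MEM@0" idea concrete, here is a minimal sketch of how a file-backed RAM image could be read at a guest-physical address. It is not taken from crash's ramdump.c; the function names map_ram_image() and guest_phys_ptr() are hypothetical, and the sketch assumes the image's first byte corresponds to the physical base address given after the "@".

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: map a RAM image read-only. MAP_SHARED means that
 * later guest writes to the backing file are visible through the mapping.
 * Returns NULL on failure.
 */
const void *map_ram_image(const char *path, size_t *len)
{
    struct stat st;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    void *map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping outlives the descriptor */
    if (map == MAP_FAILED)
        return NULL;
    *len = (size_t)st.st_size;
    return map;
}

/*
 * Hypothetical helper: translate guest-physical address `paddr` into a
 * pointer inside the mapping, where the image's first byte is physical
 * address `base`. Returns NULL unless [paddr, paddr + len) lies entirely
 * inside the image.
 */
const void *guest_phys_ptr(const void *map, size_t map_len,
                           uint64_t base, uint64_t paddr, size_t len)
{
    assert(map != NULL);
    if (paddr < base)
        return NULL;
    uint64_t off = paddr - base;
    if (off > map_len || len > map_len - off)
        return NULL;
    return (const char *)map + off;
}
```

A caller would map /tmp/MEM once and then resolve each physical read through guest_phys_ptr(); because the guest keeps running, any data read this way can change between two accesses.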

10 years, 2 months


[ANNOUNCE] crash version 7.1.5 is available

by Dave Anderson

Download from: http://people.redhat.com/anderson
            or https://github.com/crash-utility/crash/releases

The github master branch serves as a development branch that will contain all patches that are queued for the next release:

  $ git clone git://github.com/crash-utility/crash.git

Changelog:

 - Fix for the handling of Xen DomU ELF dumpfiles to prevent the pre-gathering of p2m frames during session initialization, which is unnecessary since ELF files contain the mapping information in their ".xen_p2m" section. Without the patch, it is possible that the crash session may be unnecessarily aborted if the p2m frame-gathering fails, for example, if the CR3 value in the header is invalid.
   (ptesarik(a)suse.com)

 - Fix for the translation of X86_64 virtual addresses in the vsyscall region between 0xffffffffff600000 and 0xffffffffffe00000. Without the patch, the reading of addresses in that region returns invalid data; in addition, the "vtop" command for an address in that region shows an invalid physical address under the "PHYSICAL" column.
   (nakajima.akira(a)nttcom.co.jp, anderson(a)redhat.com)

 - Make the "zero excluded" mode default behavior when analyzing SADUMP dumpfiles because some Fujitsu troubleshooting software assumes the behavior. Also, fix the "set -v" option to show the "zero_excluded" internal variable as "on" if it has been set when analyzing SADUMP dumpfiles.
   (d.hatayama(a)jp.fujitsu.com)

 - Fix for the "bt" command to properly pull the stack and frame pointer registers from the NT_PRSTATUS notes of 32-bit tasks running in user-mode on ARM64. Without the patch, the "bt" command utilizes ptregs->sp and ptregs->regs[29] for 32-bit tasks instead of the architecturally-mapped ptregs->regs[13] and ptregs->regs[11], which yields unpredictable/invalid results, and possibly a segmentation violation.
   (drjones(a)redhat.com)

 - Fix for the "ps -t" option in 3.17 and later kernels that contain commit ccbf62d8a284cf181ac28c8e8407dd077d90dd4b, which changed the task_struct.start_time member from a struct timespec to a u64. Without the patch, the "RUN TIME" value is nonsensical.
   (anderson(a)redhat.com)

 - Fix for the changes made to the kernel module structure introduced by this kernel commit for Linux 4.5 and later kernels:
     commit 7523e4dc5057e157212b4741abd6256e03404cf1
     module: use a structure to encapsulate layout.
   Without the patch, the crash session fails during initialization with the error message: "crash: invalid structure member offset: module_core_size".
   (sebott(a)linux.vnet.ibm.com)

 - The crash utility has not supported Xen dom0 and domU dumpfiles since this Linux 3.19 commit:
     commit 054954eb051f35e74b75a566a96fe756015352c8
     xen: switch to linear virtual mapped sparse p2m list
   This patch resurrects support for dom0 dumpfiles only. Without the patch, the crash session fails during session initialization with the message "crash: cannot resolve p2m_top".
   (daniel.kiper(a)oracle.com)

 - Fix for the replacements made to the kernel's cpu_possible_mask, cpu_online_mask, cpu_present_mask and cpu_active_mask symbols in this kernel commit for Linux 4.5 and later kernels:
     commit 5aec01b834fd6f8ca49d1aeede665b950d0c148e
     kernel/cpu.c: eliminate cpu_*_mask
   Without the patch, behavior is architecture-specific, dependent upon whether the cpu mask values are used to calculate the number of cpus. For example, ARM64 crash sessions fail during session initialization with the error message "crash: zero-size memory allocation! (called from <address>)", whereas X86_64 sessions come up normally, but invalid cpu mask values of zero are stored internally.
   (anderson(a)redhat.com)

 - Fixes for "[-Werror=misleading-indentation]" compiler warnings that are generated by the following files, when building X86_64 in a Fedora Rawhide environment with gcc-6.0.0:
     gdb-7.6/bfd/coff-i386.c
     gdb-7.6/bfd/coff-x86_64.c
     kernel.c
     x86_64.c
     lkcd_common.c
   Without the patch, the warnings in the bfd library files are treated as errors, and abort the build. The three instances in the top-level crash source code directory are non-fatal. There are several other gdb-specific instances that are non-fatal and are not addressed.
   (anderson(a)redhat.com)

 - Fix for a "[-Werror=shift-negative-value]" compiler warning that is generated by "gdb-7.6/opcodes/arm-dis.c" when building crash with "make target=ARM64" on an x86_64 host with gcc-6.0.0. Without the patch, the warning is treated as an error and the build is aborted.
   (anderson(a)redhat.com)

 - Fix for a series of "[-Werror=shift-negative-value]" compiler warnings that are generated by "gdb-7.6/bfd/elf64-ppc.c" and "gdb-7.6/opcodes/ppc-opc.c" when building with "make target=PPC64" on an x86_64 host with gcc-6.0.0. Without the patch, the warnings are treated as errors and the build is aborted.
   (anderson(a)redhat.com)

 - Fix for a "[-Werror=unused-const-variable]" compiler warning that is generated by "gdb-7.6/opcodes/mips-dis.c" when building with "make target=MIPS" on an x86_64 host with gcc-6.0.0. Without the patch, the warning is treated as an error and the build is aborted.
   (anderson(a)redhat.com)

 - Configure the embedded gdb module with "--disable-sim" in order to bypass the unnecessary build of the libsim.a library.
   (anderson(a)redhat.com)

 - Implement support for per-cpu IRQ stacks on the ARM64 architecture, which were introduced in Linux 4.5 by this commit:
     commit 132cd887b5c54758d04bf25c52fa48f45e843a30
     arm64: Modify stack trace and dump for use with irq_stack
   Without the patch, if an active task was operating on its per-cpu IRQ stack on dumpfiles generated by kdump, its backtrace would start at the exception frame that was laid down on the process stack. This patch also adds support for "bt -E" to search IRQ stacks for exception frames, and the "mach" command displays the addresses of each per-cpu IRQ stack.
   (anderson(a)redhat.com)

 - Fixes for "[-Werror=misleading-indentation]" compiler warnings that are generated by the following files, when building X86_64 in a Fedora Rawhide environment with gcc-6.0.0:
     gdb-7.6/gdb/ada-lang.c
     gdb-7.6/gdb/linux-record.c
     gdb-7.6/gdb/inflow.c
     gdb-7.6/gdb/printcmd.c
     gdb-7.6/gdb/c-typeprint.c
   Without the patch, warnings in the gdb-7.6/gdb directory are not treated as errors, and are non-fatal to the build.
   (anderson(a)redhat.com)

 - Further fix for the symbol name changes made to the kernel's cpu_online_mask, cpu_possible_mask, cpu_present_mask and cpu_active_mask symbols in Linux 4.5 and later kernels for when the crash session is brought up with "crash -d<debug-level>". Without the patch, the cpus found in each mask are displayed like this example:
     cpu_possible_(null): cpus: 0 1 2 3 4 5 6 7
     cpu_present_(null): cpus: 0 1
     cpu_online_(null): cpus: 0 1
     cpu_active_(null): cpus: 0 1
   The "(null)" string segments above should read "mask".
   (anderson(a)redhat.com)

 - Fix for the changes made to the kernel module structure introduced by this kernel commit for Linux 4.5 and later kernels:
     commit 8244062ef1e54502ef55f54cced659913f244c3e
     modules: fix longstanding /proc/kallsyms vs module insertion race.
   Without the patch, the crash session fails during initialization with the error message: "crash: invalid structure member offset: module_num_symtab".
   (anderson(a)redhat.com)

 - Fix for the "dis <function | address>" option if the function or address is the highest text symbol value in a kernel module. Without the patch, the disassembly may continue past the end of the function, or may show nothing at all. The patch utilizes in-kernel kallsyms symbol size information instead of disassembling until reaching the address of the next symbol in the module.
   (anderson(a)redhat.com)

 - Fix for the "irq -s" option in Linux 4.2 and later kernels. Without the patch, the irq_chip.name string (e.g. "IO-APIC", "PCI-MSI", etc.) is missing from the display.
   (rabin.vincent(a)axis.com)

 - Improvement of the accuracy of the allocated objects count for each kmem_cache shown by "kmem -s" in kernels configured with CONFIG_SLUB. Without the patch, the values under the ALLOCATED column may be too large because cached per-cpu objects are counted as allocated.
   (vinayakm.list(a)gmail.com)

 - Fixes to address two gcc-4.1.2 compiler warnings introduced by the previous patch:
     memory.c: In function ‘count_cpu_partial’:
     memory.c:17958: warning: comparison is always false due to limited range of data type
     memory.c: In function ‘count_partial’:
     memory.c:18729: warning: comparison is always false due to limited range of data type
   (anderson(a)redhat.com)

 - Introduction of the "whatis -r" and "whatis -m" options. The -r option searches for data structures of a specified size or within a range of specified sizes. The -m option searches for data structures that contain a member of a given type. If a structure contains another structure, the members of the embedded structure will also be subject to the search. The type string may be a substring of the data type name. The output displays the size and name of the data structure.
   (Alexandr_Terekhov(a)epam.com, anderson(a)redhat.com)

 - Apply a fuzz factor of zero to the re-application of a modified version of the gdb-7.6.patch in a pre-existing build directory. Without the patch, it is possible that a previously-applied patch could be applied a second time without the fuzz restriction.
   (anderson(a)redhat.com)

 - Include sys/sysmacros.h explicitly in filesys.c for the definitions of major(), minor() and makedev(). These functions are defined in the sys/sysmacros.h header, not sys/types.h. Linux C libraries are updating to drop the implicit include, so we need to include it explicitly.
   (vapier(a)gentoo.org)

 - Fix for "kmem -[sS]" options for kernels configured with CONFIG_SLUB. Without the patch, the count displayed in the ALLOCATED column may be too large, and the "kmem -S" display of allocated/free status of individual objects may be incorrect.
   (hirofumi(a)mail.parknet.co.jp)

 - Fix for "kmem -[sS]" options for kernels configured with CONFIG_SLUB. Without the patch, if a freelist pointer is corrupt, the address of the slab page being referenced may not be displayed by the error message, showing something like: "kmem: kmalloc-32: slab: 0 invalid freepointer: 6e652f323a302d74".
   (hirofumi(a)mail.parknet.co.jp)

 - Fix for the "vm -p" option on kernels that are not configured with CONFIG_SWAP. Without the patch, the command may fail prematurely with the message "nr_swapfiles doesn't exist in this kernel".
   (rabinv(a)axis.com)

 - Introduction of ARM64 support for 64K pages with 3-level page tables and 48 VA bits. Until now, support has only existed for 64K pages with 2-level page tables, and 4K pages with 3-level page tables.
   (jim.hull(a)hpe.com)

 - Fix for the "vm -p" and "vtop <user virtual address>" commands if a user page is swapped out. Without the patch, the "/dev" component of the swap file pathname may be missing from its display.
   (anderson(a)redhat.com)

 - Fix for the x86_64 "vm -p" command to properly emulate the kernel's pte_present() function, which checks for either _PAGE_PRESENT or _PAGE_PROTNONE to be set. Without the patch, user pages whose PTE does not have _PAGE_PRESENT bit set are misconstrued as SWAP pages with an "(unknown swap location)" along with a bogus OFFSET value.
   (anderson(a)redhat.com)

 - When reading a task's task_struct.flags field, check for its size, which was changed from an unsigned long to an unsigned int.
   (dave.kleikamp(a)oracle.com)

 - Introduction of support for the 64-bit SPARC V9 architecture. This version supports running against a live kernel. Compressed kdump support is also here, but the crash dump support for the kernel, kexec-tools, and makedumpfile is still pending. Initial work was done by Karl Volz with help from Bob Picco.
   (dave.kleikamp(a)oracle.com)

 - Account for the Linux 3.17 increase of the ARM64 MAX_PHYSMEM_BITS definition from 40 to 48.
   (Johan.Erlandsson(a)sonymobile.com)
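Several of the fixes above follow the same pattern: a kernel structure member changed type, so the reader has to pick the decoding from the member's size. The "ps -t" item is a simple case. The sketch below is not crash's code; start_time_ns() and struct ktimespec64 are hypothetical names, and it assumes a 64-bit kernel where struct timespec is two 64-bit fields and the newer u64 member holds nanoseconds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout of a 64-bit kernel's struct timespec (illustration only). */
struct ktimespec64 {
    int64_t tv_sec;
    int64_t tv_nsec;
};

/*
 * Hypothetical decoder: convert the raw bytes of task_struct.start_time to
 * nanoseconds, choosing the layout from the member size that the debuginfo
 * reports for the kernel being examined.
 */
uint64_t start_time_ns(const void *raw, size_t member_size)
{
    if (member_size == sizeof(struct ktimespec64)) {
        /* Pre-3.17 layout: seconds plus nanoseconds. */
        struct ktimespec64 ts;
        memcpy(&ts, raw, sizeof(ts));
        return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
    }

    /* 3.17 and later: assumed to be a plain u64 nanosecond count. */
    assert(member_size == sizeof(uint64_t));
    uint64_t ns;
    memcpy(&ns, raw, sizeof(ns));
    return ns;
}
```

Decoding by size rather than by kernel version keeps the reader working on distribution kernels that backport the change.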

10 years, 2 months


[PATCH] arm64: support MAX_PHYSMEM_BITS=48

by Erlandsson, Johan

Hi,

This matches the update made to the file 'arch/arm64/include/asm/sparsemem.h' by:

  commit 07a15dd55a3d65f81b4b09eab293f4afc720b082
  arm64: mm: update max pa bits to 48
---
 arm64.c | 5 ++++-
 defs.h  | 1 +
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arm64.c b/arm64.c
index d1c9c3e..34c8c59 100644
--- a/arm64.c
+++ b/arm64.c
@@ -267,7 +267,10 @@ arm64_init(int when)
 	case POST_GDB:
 		arm64_calc_virtual_memory_ranges();
 		machdep->section_size_bits = _SECTION_SIZE_BITS;
-		machdep->max_physmem_bits = _MAX_PHYSMEM_BITS;
+		if (THIS_KERNEL_VERSION >= LINUX(3,17,0))
+			machdep->max_physmem_bits = _MAX_PHYSMEM_BITS_3_17;
+		else
+			machdep->max_physmem_bits = _MAX_PHYSMEM_BITS;
 		ms = machdep->machspec;
 
 		if (THIS_KERNEL_VERSION >= LINUX(4,0,0)) {
diff --git a/defs.h b/defs.h
index a1746cc..a09fa9a 100644
--- a/defs.h
+++ b/defs.h
@@ -2965,6 +2965,7 @@ typedef signed int s32;
 
 #define _SECTION_SIZE_BITS 30
 #define _MAX_PHYSMEM_BITS 40
+#define _MAX_PHYSMEM_BITS_3_17 48
 
 typedef unsigned long long __u64;
 typedef unsigned long long u64;
-- 
2.4.2
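The version gate in this patch can be tried in isolation. The following is a standalone sketch, not crash's code: KVER() only imitates the spirit of crash's LINUX(x,y,z) macro (the exact bit layout is an assumption), and arm64_max_physmem_bits() is a hypothetical helper name.

```c
#include <assert.h>

/* Version encoding for this sketch; the bit layout is an assumption. */
#define KVER(x, y, z) \
    (((unsigned)(x) << 16) + ((unsigned)(y) << 8) + (unsigned)(z))

#define MAX_PHYSMEM_BITS_PRE_3_17 40   /* value before Linux 3.17 */
#define MAX_PHYSMEM_BITS_3_17     48   /* value from Linux 3.17 onward */

/* Hypothetical helper: pick the ARM64 MAX_PHYSMEM_BITS for a kernel version. */
int arm64_max_physmem_bits(unsigned kernel_version)
{
    /* Sanity check: the encoding must order versions correctly. */
    assert(KVER(3, 17, 0) > KVER(3, 16, 255));

    return kernel_version >= KVER(3, 17, 0) ? MAX_PHYSMEM_BITS_3_17
                                            : MAX_PHYSMEM_BITS_PRE_3_17;
}
```

With this encoding a sublevel above 255 would overflow into the minor field, so a real implementation has to clamp or widen that component.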

10 years, 2 months


[PATCH V3 0/3] sparc64 support for crash utility

by Dave Kleikamp

These patches add support for the sparc64 architecture. This supports running against a live kernel. Diskdump support is also here, but the crashdump support for the kernel, kexec-tools, and makedumpfile is still pending. Initial work was done by Karl Volz with help from Bob Picco.

V3:
 * Broke task_struct_flags fix into separate patch
 * A lot of various cleanups suggested by Sam Ravnborg
 * Implemented sparc64_dump_machdep_table()

V2:
 * Use SIZE(task_struct_flags) instead of just changing ULONG to UINT
 * Put new code in get_idle_threads() inside #ifdef SPARC64

Dave Kleikamp (3):
  Use proper size for task_struct->flags
  Implement byte-by-byte memory access facilitators
  crash-utility: Support for sparc64 architecture

 Makefile            |    9 +-
 configure.c         |   23 +
 defs.h              |  180 ++++++++-
 diskdump.c          |   36 ++-
 lkcd_vmdump_v2_v3.h |    2 +-
 sparc64.c           | 1253 +++++++++++++++++++++++++++++++++++++++++++++++++++
 symbols.c           |   10 +
 task.c              |   23 +-
 8 files changed, 1527 insertions(+), 9 deletions(-)
 create mode 100644 sparc64.c
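The first two patches in the series address related problems: task_struct.flags may be 4 or 8 bytes wide, and sparc64 traps on misaligned loads. A minimal sketch of one way to handle both is shown below. It is not the code from the series; read_task_flags() is a hypothetical name, and the buffer stands in for a raw copy of a task_struct.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: fetch task_struct.flags from a raw copy of the
 * structure. `size` is the member size reported by the debuginfo, either
 * sizeof(unsigned int) or sizeof(unsigned long). memcpy() copies byte by
 * byte, so a misaligned `offset` cannot trap on strict-alignment machines.
 */
unsigned long read_task_flags(const char *task_buf, size_t offset, size_t size)
{
    if (size == sizeof(unsigned int)) {
        unsigned int v;
        memcpy(&v, task_buf + offset, sizeof(v));
        return v;
    }

    assert(size == sizeof(unsigned long));
    unsigned long v;
    memcpy(&v, task_buf + offset, sizeof(v));
    return v;
}
```

Reading through a correctly typed temporary also avoids picking up four bytes of a neighbouring member on big-endian hosts, which is what a fixed-width ULONG read of a 4-byte field would do.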

10 years, 2 months


[PATCH] ARM64 support for 3-level page tables with 64K pages

by Jim Hull

Adds ARM64 support for 3-level page tables with 64K pages and 48 VA bits.
---
 arm64.c | 126 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------
 defs.h  |  28 +++++++++++----
 2 files changed, 133 insertions(+), 21 deletions(-)

diff --git a/arm64.c b/arm64.c
index f6ea7a1..d1c9c3e 100644
--- a/arm64.c
+++ b/arm64.c
@@ -34,6 +34,7 @@ static void arm64_init_kernel_pgd(void);
 static int arm64_kvtop(struct task_context *, ulong, physaddr_t *, int);
 static int arm64_uvtop(struct task_context *, ulong, physaddr_t *, int);
 static int arm64_vtop_2level_64k(ulong, ulong, physaddr_t *, int);
+static int arm64_vtop_3level_64k(ulong, ulong, physaddr_t *, int);
 static int arm64_vtop_3level_4k(ulong, ulong, physaddr_t *, int);
 static ulong arm64_get_task_pgd(ulong);
 static void arm64_irq_stack_init(void);
@@ -188,15 +189,29 @@ arm64_init(int when)
 			break;
 
 		case 65536:
-			machdep->flags |= VM_L2_64K;
-			machdep->ptrs_per_pgd = PTRS_PER_PGD_L2_64K;
-			if ((machdep->pgd =
-			    (char *)malloc(PTRS_PER_PGD_L2_64K * 8)) == NULL)
-				error(FATAL, "cannot malloc pgd space.");
-			if ((machdep->ptbl =
-			    (char *)malloc(PTRS_PER_PTE_L2_64K * 8)) == NULL)
-				error(FATAL, "cannot malloc ptbl space.");
-			machdep->pmd = NULL;  /* not used */
+			if (machdep->machspec->VA_BITS > PGDIR_SHIFT_L3_64K) {
+				machdep->flags |= VM_L3_64K;
+				machdep->ptrs_per_pgd = PTRS_PER_PGD_L3_64K;
+				if ((machdep->pgd =
+				    (char *)malloc(PTRS_PER_PGD_L3_64K * 8)) == NULL)
+					error(FATAL, "cannot malloc pgd space.");
+				if ((machdep->pmd =
+				    (char *)malloc(PTRS_PER_PMD_L3_64K * 8)) == NULL)
+					error(FATAL, "cannot malloc pmd space.");
+				if ((machdep->ptbl =
+				    (char *)malloc(PTRS_PER_PTE_L3_64K * 8)) == NULL)
+					error(FATAL, "cannot malloc ptbl space.");
+			} else {
+				machdep->flags |= VM_L2_64K;
+				machdep->ptrs_per_pgd = PTRS_PER_PGD_L2_64K;
+				if ((machdep->pgd =
+				    (char *)malloc(PTRS_PER_PGD_L2_64K * 8)) == NULL)
+					error(FATAL, "cannot malloc pgd space.");
+				if ((machdep->ptbl =
+				    (char *)malloc(PTRS_PER_PTE_L2_64K * 8)) == NULL)
+					error(FATAL, "cannot malloc ptbl space.");
+				machdep->pmd = NULL;  /* not used */
+			}
 			machdep->pud = NULL;  /* not used */
 			break;
 
@@ -379,6 +394,8 @@ arm64_dump_machdep_table(ulong arg)
 		fprintf(fp, "%sPHYS_OFFSET", others++ ? "|" : "");
 	if (machdep->flags & VM_L2_64K)
 		fprintf(fp, "%sVM_L2_64K", others++ ? "|" : "");
+	if (machdep->flags & VM_L3_64K)
+		fprintf(fp, "%sVM_L3_64K", others++ ? "|" : "");
 	if (machdep->flags & VM_L3_4K)
 		fprintf(fp, "%sVM_L3_4K", others++ ? "|" : "");
 	if (machdep->flags & VMEMMAP)
@@ -410,10 +427,14 @@ arm64_dump_machdep_table(ulong arg)
 	fprintf(fp, " processor_speed: arm64_processor_speed()\n");
 	fprintf(fp, " uvtop: arm64_uvtop()->%s()\n",
 		machdep->flags & VM_L3_4K ?
-		"arm64_vtop_3level_4k" : "arm64_vtop_2level_64k");
+		"arm64_vtop_3level_4k" :
+		machdep->flags & VM_L3_64K ?
+		"arm64_vtop_3level_64k" : "arm64_vtop_2level_64k");
 	fprintf(fp, " kvtop: arm64_kvtop()->%s()\n",
 		machdep->flags & VM_L3_4K ?
-		"arm64_vtop_3level_4k" : "arm64_vtop_2level_64k");
+		"arm64_vtop_3level_4k" :
+		machdep->flags & VM_L3_64K ?
+		"arm64_vtop_3level_64k" : "arm64_vtop_2level_64k");
 	fprintf(fp, " get_task_pgd: arm64_get_task_pgd()\n");
 	fprintf(fp, " dump_irq: generic_dump_irq()\n");
 	fprintf(fp, " get_stack_frame: arm64_get_stack_frame()\n");
@@ -719,10 +740,12 @@ arm64_kvtop(struct task_context *tc, ulong kvaddr, physaddr_t *paddr, int verbos
 	kernel_pgd = vt->kernel_pgd[0];
 	*paddr = 0;
 
-	switch (machdep->flags & (VM_L2_64K|VM_L3_4K))
+	switch (machdep->flags & (VM_L2_64K|VM_L3_64K|VM_L3_4K))
 	{
 	case VM_L2_64K:
 		return arm64_vtop_2level_64k(kernel_pgd, kvaddr, paddr, verbose);
+	case VM_L3_64K:
+		return arm64_vtop_3level_64k(kernel_pgd, kvaddr, paddr, verbose);
 	case VM_L3_4K:
 		return arm64_vtop_3level_4k(kernel_pgd, kvaddr, paddr, verbose);
 	default:
@@ -740,10 +763,12 @@ arm64_uvtop(struct task_context *tc, ulong uvaddr, physaddr_t *paddr, int verbos
 
 	*paddr = 0;
 
-	switch (machdep->flags & (VM_L2_64K|VM_L3_4K))
+	switch (machdep->flags & (VM_L2_64K|VM_L3_64K|VM_L3_4K))
 	{
 	case VM_L2_64K:
 		return arm64_vtop_2level_64k(user_pgd, uvaddr, paddr, verbose);
+	case VM_L3_64K:
+		return arm64_vtop_3level_64k(user_pgd, uvaddr, paddr, verbose);
 	case VM_L3_4K:
 		return arm64_vtop_3level_4k(user_pgd, uvaddr, paddr, verbose);
 	default:
@@ -820,6 +845,78 @@ no_page:
 	return FALSE;
 }
 
+static int
+arm64_vtop_3level_64k(ulong pgd, ulong vaddr, physaddr_t *paddr, int verbose)
+{
+	ulong *pgd_base, *pgd_ptr, pgd_val;
+	ulong *pmd_base, *pmd_ptr, pmd_val;
+	ulong *pte_base, *pte_ptr, pte_val;
+
+	if (verbose)
+		fprintf(fp, "PAGE DIRECTORY: %lx\n", pgd);
+
+	pgd_base = (ulong *)pgd;
+	FILL_PGD(pgd_base, KVADDR, PTRS_PER_PGD_L3_64K * sizeof(ulong));
+	pgd_ptr = pgd_base + (((vaddr) >> PGDIR_SHIFT_L3_64K) & (PTRS_PER_PGD_L3_64K - 1));
+	pgd_val = ULONG(machdep->pgd + PAGEOFFSET(pgd_ptr));
+	if (verbose)
+		fprintf(fp, " PGD: %lx => %lx\n", (ulong)pgd_ptr, pgd_val);
+	if (!pgd_val)
+		goto no_page;
+
+	/*
+	 * #define __PAGETABLE_PUD_FOLDED
+	 */
+
+	pmd_base = (ulong *)PTOV(pgd_val & PHYS_MASK & (s32)machdep->pagemask);
+	FILL_PMD(pmd_base, KVADDR, PTRS_PER_PMD_L3_64K * sizeof(ulong));
+	pmd_ptr = pmd_base + (((vaddr) >> PMD_SHIFT_L3_64K) & (PTRS_PER_PMD_L3_64K - 1));
+	pmd_val = ULONG(machdep->pmd + PAGEOFFSET(pmd_ptr));
+	if (verbose)
+		fprintf(fp, " PMD: %lx => %lx\n", (ulong)pmd_ptr, pmd_val);
+	if (!pmd_val)
+		goto no_page;
+
+	if ((pmd_val & PMD_TYPE_MASK) == PMD_TYPE_SECT) {
+		ulong sectionbase = (pmd_val & SECTION_PAGE_MASK_512MB) & PHYS_MASK;
+		if (verbose) {
+			fprintf(fp, " PAGE: %lx (512MB)\n\n", sectionbase);
+			arm64_translate_pte(pmd_val, 0, 0);
+		}
+		*paddr = sectionbase + (vaddr & ~SECTION_PAGE_MASK_512MB);
+		return TRUE;
+	}
+
+	pte_base = (ulong *)PTOV(pmd_val & PHYS_MASK & (s32)machdep->pagemask);
+	FILL_PTBL(pte_base, KVADDR, PTRS_PER_PTE_L3_64K * sizeof(ulong));
+	pte_ptr = pte_base + (((vaddr) >> machdep->pageshift) & (PTRS_PER_PTE_L3_64K - 1));
+	pte_val = ULONG(machdep->ptbl + PAGEOFFSET(pte_ptr));
+	if (verbose)
+		fprintf(fp, " PTE: %lx => %lx\n", (ulong)pte_ptr, pte_val);
+	if (!pte_val)
+		goto no_page;
+
+	if (pte_val & PTE_VALID) {
+		*paddr = (PAGEBASE(pte_val) & PHYS_MASK) + PAGEOFFSET(vaddr);
+		if (verbose) {
+			fprintf(fp, " PAGE: %lx\n\n", PAGEBASE(*paddr));
+			arm64_translate_pte(pte_val, 0, 0);
+		}
+	} else {
+		if (IS_UVADDR(vaddr, NULL))
+			*paddr = pte_val;
+		if (verbose) {
+			fprintf(fp, "\n");
+			arm64_translate_pte(pte_val, 0, 0);
+		}
+		goto no_page;
+	}
+
+	return TRUE;
+no_page:
+	return FALSE;
+}
+
 static int
 arm64_vtop_3level_4k(ulong pgd, ulong vaddr, physaddr_t *paddr, int verbose)
 {
@@ -2348,9 +2445,10 @@ arm64_calc_virtual_memory_ranges(void)
 
 	STRUCT_SIZE_INIT(page, "page");
 
-	switch (machdep->flags & (VM_L2_64K|VM_L3_4K))
+	switch (machdep->flags & (VM_L2_64K|VM_L3_64K|VM_L3_4K))
 	{
 	case VM_L2_64K:
+	case VM_L3_64K:
 		PUD_SIZE = PGDIR_SIZE_L2_64K;
 		break;
 	case VM_L3_4K:
diff --git a/defs.h b/defs.h
index 56ae06c..d1b49d0 100644
--- a/defs.h
+++ b/defs.h
@@ -2815,7 +2815,7 @@ typedef u64 pte_t;
 
 typedef signed int s32;
 
-/*
+/*
  * 3-levels / 4K pages
  */
 #define PTRS_PER_PGD_L3_4K (512)
@@ -2823,10 +2823,23 @@ typedef signed int s32;
 #define PTRS_PER_PTE_L3_4K (512)
 #define PGDIR_SHIFT_L3_4K (30)
 #define PGDIR_SIZE_L3_4K ((1UL) << PGDIR_SHIFT_L3_4K)
-#define PGDIR_MASK_L3 4K (~(PGDIR_SIZE_L3_4K-1))
+#define PGDIR_MASK_L3_4K (~(PGDIR_SIZE_L3_4K-1))
 #define PMD_SHIFT_L3_4K (21)
-#define PMD_SIZE_L3_4K (1UL << PMD_SHIFT_4K)
-#define PMD_MASK_L3 4K (~(PMD_SIZE_4K-1))
+#define PMD_SIZE_L3_4K (1UL << PMD_SHIFT_L3_4K)
+#define PMD_MASK_L3_4K (~(PMD_SIZE_L3_4K-1))
+
+/*
+ * 3-levels / 64K pages
+ */
+#define PTRS_PER_PGD_L3_64K (64)
+#define PTRS_PER_PMD_L3_64K (8192)
+#define PTRS_PER_PTE_L3_64K (8192)
+#define PGDIR_SHIFT_L3_64K (42)
+#define PGDIR_SIZE_L3_64K ((1UL) << PGDIR_SHIFT_L3_64K)
+#define PGDIR_MASK_L3_64K (~(PGDIR_SIZE_L3_64K-1))
+#define PMD_SHIFT_L3_64K (29)
+#define PMD_SIZE_L3_64K (1UL << PMD_SHIFT_L3_64K)
+#define PMD_MASK_L3_64K (~(PMD_SIZE_L3_64K-1))
 
 /*
  * 2-levels / 64K pages
@@ -2868,9 +2881,10 @@ typedef signed int s32;
 #define KSYMS_START (0x1)
 #define PHYS_OFFSET (0x2)
 #define VM_L2_64K (0x4)
-#define VM_L3_4K (0x8)
-#define KDUMP_ENABLED (0x10)
-#define IRQ_STACKS (0x20)
+#define VM_L3_64K (0x8)
+#define VM_L3_4K (0x10)
+#define KDUMP_ENABLED (0x20)
+#define IRQ_STACKS (0x40)
 
 /*
  * sources: Documentation/arm64/memory.txt
-- 
2.1.4
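The shift and table-size constants added to defs.h determine how a 48-bit virtual address is split across the three levels: 6 bits of PGD index, 13 bits of PMD index, 13 bits of PTE index and a 16-bit page offset. The sketch below only illustrates that split; it does not walk any page table. The constants are copied from the patch, while PAGE_SHIFT_64K, struct l3_64k_index and split_vaddr_l3_64k() are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from the patch's defs.h additions. */
#define PTRS_PER_PGD_L3_64K 64
#define PTRS_PER_PMD_L3_64K 8192
#define PTRS_PER_PTE_L3_64K 8192
#define PGDIR_SHIFT_L3_64K  42
#define PMD_SHIFT_L3_64K    29
#define PAGE_SHIFT_64K      16   /* 64K pages; name chosen for this sketch */

struct l3_64k_index {
    unsigned pgd, pmd, pte;      /* table indices at each level */
    uint64_t offset;             /* byte offset inside the 64K page */
};

/* Hypothetical helper: decompose a virtual address into its four fields. */
struct l3_64k_index split_vaddr_l3_64k(uint64_t vaddr)
{
    struct l3_64k_index ix;

    /* 6 PGD bits on top of a 42-bit shift cover exactly 48 VA bits. */
    assert(PGDIR_SHIFT_L3_64K + 6 == 48);

    ix.pgd = (unsigned)((vaddr >> PGDIR_SHIFT_L3_64K) & (PTRS_PER_PGD_L3_64K - 1));
    ix.pmd = (unsigned)((vaddr >> PMD_SHIFT_L3_64K) & (PTRS_PER_PMD_L3_64K - 1));
    ix.pte = (unsigned)((vaddr >> PAGE_SHIFT_64K) & (PTRS_PER_PTE_L3_64K - 1));
    ix.offset = vaddr & ((1ULL << PAGE_SHIFT_64K) - 1);
    return ix;
}
```

Because each index is masked, the all-ones upper bits of a kernel address do not leak into the PGD index.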

10 years, 2 months


[PATCH 1/2] Fix cpu_slab freelist handling on SLUB

by OGAWA Hirofumi

Hi,

A SLUB cpu_slab has two freelists: cpu_slab->freelist, which is used by the local cpu, and cpu_slab->page->freelist, which receives frees from remote cpus. So we have to check both freelists to get the details.

Note that page->inuse only accounts for objects freed to page->freelist, not to cpu_slab->freelist, so the total number of free objects is

  (page->objects - page->inuse) + count(cpu_slab->freelist)
---
 memory.c | 213 ++++++++++++++++++++++++++++----------------------------------
 1 file changed, 99 insertions(+), 114 deletions(-)

diff -puN memory.c~crash-slub-freelist-fix memory.c
--- crash-64/memory.c~crash-slub-freelist-fix	2016-04-18 02:29:57.743774055 +0900
+++ crash-64-hirofumi/memory.c	2016-04-18 02:32:30.999515870 +0900
@@ -17914,15 +17914,62 @@ bailout:
 	FREEBUF(si->cache_buf);
 }
 
+static ushort slub_page_objects(struct meminfo *si, ulong page)
+{
+	ulong objects_vaddr;
+	ushort objects;
+
+	/*
+	 * Pre-2.6.27, the object count and order were fixed in the
+	 * kmem_cache structure. Now they may change, say if a high
+	 * order slab allocation fails, so the per-slab object count
+	 * is kept in the slab.
+	 */
+	if (VALID_MEMBER(page_objects)) {
+		objects_vaddr = page + OFFSET(page_objects);
+		if (si->flags & SLAB_BITFIELD)
+			objects_vaddr += sizeof(ushort);
+		if (!readmem(objects_vaddr, KVADDR, &objects,
+			     sizeof(ushort), "page.objects", RETURN_ON_ERROR))
+			return 0;
+		/*
+		 * Strip page.frozen bit.
+		 */
+		if (si->flags & SLAB_BITFIELD) {
+			if (__BYTE_ORDER == __LITTLE_ENDIAN) {
+				objects <<= 1;
+				objects >>= 1;
+			}
+			if (__BYTE_ORDER == __BIG_ENDIAN)
+				objects >>= 1;
+		}
+
+		if (CRASHDEBUG(1) && (objects != si->objects))
+			error(NOTE, "%s: slab: %lx oo objects: %ld "
+			      "slab objects: %d\n",
+			      si->curname, si->slab,
+			      si->objects, objects);
+
+		if (objects == (ushort)(-1)) {
+			error(INFO, "%s: slab: %lx invalid page.objects: -1\n",
+			      si->curname, si->slab);
+			return 0;
+		}
+	} else
+		objects = (ushort)si->objects;
+
+	return objects;
+}
+
 static short
 count_cpu_partial(struct meminfo *si, int cpu)
 {
 	short cpu_partial_inuse, cpu_partial_objects, free_objects;
-	ulong cpu_partial, objects_vaddr;
+	ulong cpu_partial;
 
 	free_objects = 0;
 
-	if (VALID_MEMBER(kmem_cache_cpu_partial)) {
+	if (VALID_MEMBER(kmem_cache_cpu_partial) && VALID_MEMBER(page_objects)) {
 		readmem(ULONG(si->cache_buf + OFFSET(kmem_cache_cpu_slab)) +
 			kt->__per_cpu_offset[cpu] + OFFSET(kmem_cache_cpu_partial),
 			KVADDR, &cpu_partial, sizeof(ulong),
@@ -17939,27 +17986,13 @@ count_cpu_partial(struct meminfo *si, in
 				return 0;
 			if (cpu_partial_inuse == -1)
 				return 0;
-			if (VALID_MEMBER(page_objects)) {
-				objects_vaddr = cpu_partial + OFFSET(page_objects);
-				if (si->flags & SLAB_BITFIELD)
-					objects_vaddr += sizeof(ushort);
-				if (!readmem(objects_vaddr, KVADDR,
-				    &cpu_partial_objects, sizeof(ushort),
-				    "page.objects", RETURN_ON_ERROR))
-					return 0;
-				if (si->flags & SLAB_BITFIELD) {
-					if (__BYTE_ORDER == __LITTLE_ENDIAN) {
-						cpu_partial_objects <<= 1;
-						cpu_partial_objects >>= 1;
-					}
-					if (__BYTE_ORDER == __BIG_ENDIAN)
-						cpu_partial_objects >>= 1;
-				}
-				if (cpu_partial_objects == (short)(-1))
-					return 0;
-				free_objects +=
-					cpu_partial_objects - cpu_partial_inuse;
-			}
+
+			cpu_partial_objects = slub_page_objects(si,
+								cpu_partial);
+			if (!cpu_partial_objects)
+				return 0;
+			free_objects += cpu_partial_objects - cpu_partial_inuse;
+
 			readmem(cpu_partial + OFFSET(page_next), KVADDR,
 				&cpu_partial, sizeof(ulong), "page.next",
 				RETURN_ON_ERROR);
@@ -18011,14 +18044,12 @@ get_kmem_cache_slub_data(long cmd, struc
 				    KVADDR, &inuse, sizeof(short),
 				    "page inuse", RETURN_ON_ERROR))
 					return FALSE;
-				if (!cpu_freelist)
-					if (!readmem(cpu_slab_ptr + OFFSET(page_freelist),
-					    KVADDR, &cpu_freelist, sizeof(ulong),
-					    "page freelist", RETURN_ON_ERROR))
-						return FALSE;
+				objects = slub_page_objects(si, cpu_slab_ptr);
+				if (!objects)
+					return FALSE;
 
-				free_objects +=
-					count_free_objects(si, cpu_freelist);
+				free_objects += objects - inuse;
+				free_objects += count_free_objects(si, cpu_freelist);
 				free_objects += count_cpu_partial(si, i);
 
 				if (!node_total_avail)
@@ -18255,7 +18286,7 @@ static int
 do_slab_slub(struct meminfo *si, int verbose)
 {
 	physaddr_t paddr;
-	ulong vaddr, objects_vaddr;
+	ulong vaddr;
 	ushort inuse, objects;
 	ulong freelist, cpu_freelist, cpu_slab_ptr;
 	int i, free_objects, cpu_slab, is_free, node;
@@ -18287,50 +18318,17 @@ do_slab_slub(struct meminfo *si, int ver
 	if (!readmem(si->slab + OFFSET(page_freelist), KVADDR, &freelist,
 	    sizeof(void *), "page.freelist", RETURN_ON_ERROR))
 		return FALSE;
-	/*
-	 * Pre-2.6.27, the object count and order were fixed in the
-	 * kmem_cache structure. Now they may change, say if a high
-	 * order slab allocation fails, so the per-slab object count
-	 * is kept in the slab.
-	 */
-	if (VALID_MEMBER(page_objects)) {
-		objects_vaddr = si->slab + OFFSET(page_objects);
-		if (si->flags & SLAB_BITFIELD)
-			objects_vaddr += sizeof(ushort);
-		if (!readmem(objects_vaddr, KVADDR, &objects,
-			sizeof(ushort), "page.objects", RETURN_ON_ERROR))
-			return FALSE;
-		/*
-		 * Strip page.frozen bit.
-		 */
-		if (si->flags & SLAB_BITFIELD) {
-			if (__BYTE_ORDER == __LITTLE_ENDIAN) {
-				objects <<= 1;
-				objects >>= 1;
-			}
-			if (__BYTE_ORDER == __BIG_ENDIAN)
-				objects >>= 1;
-		}
-
-		if (CRASHDEBUG(1) && (objects != si->objects))
-			error(NOTE, "%s: slab: %lx oo objects: %ld "
-			      "slab objects: %d\n",
-			      si->curname, si->slab,
-			      si->objects, objects);
-		if (objects == (ushort)(-1)) {
-			error(INFO, "%s: slab: %lx invalid page.objects: -1\n",
-			      si->curname, si->slab);
-			return FALSE;
-		}
-	} else
-		objects = (ushort)si->objects;
+	objects = slub_page_objects(si, si->slab);
+	if (!objects)
+		return FALSE;
 
 	if (!verbose) {
 		DUMP_SLAB_INFO_SLUB();
 		return TRUE;
 	}
 
+	cpu_freelist = 0;
 	for (i = 0, cpu_slab = -1; i < kt->cpus; i++) {
 		cpu_slab_ptr = get_cpu_slab_ptr(si, i, &cpu_freelist);
 
@@ -18342,11 +18340,15 @@ do_slab_slub(struct meminfo *si, int ver
 			 * Later slub scheme uses the per-cpu freelist
 			 * so count the free objects by hand.
 			 */
-			if (cpu_freelist)
-				freelist = cpu_freelist;
-			if ((free_objects = count_free_objects(si, freelist)) < 0)
+			if ((free_objects = count_free_objects(si, cpu_freelist)) < 0)
 				return FALSE;
-			inuse = si->objects - free_objects;
+			/*
+			 * If the object is freed on foreign cpu, the
+			 * object is liked to page->freelist.
+			 */
+			if (freelist)
+				free_objects += objects - inuse;
+			inuse = objects - free_objects;
 			break;
 		}
 	}
@@ -18377,28 +18379,31 @@ do_slab_slub(struct meminfo *si, int ver
 	for (p = vaddr; p < vaddr + objects * si->size; p += si->size) {
 		hq_open();
 		is_free = FALSE;
-		for (is_free = 0, q = freelist; q;
-			q = get_freepointer(si, (void *)q)) {
+		/* Search an object on both of freelist and cpu_freelist */
+		ulong lists[] = { freelist, cpu_freelist, };
+		for (int i = 0; i < sizeof(lists) / sizeof(lists[0]); i++) {
+			for (is_free = 0, q = lists[i]; q;
+			     q = get_freepointer(si, (void *)q)) {
 
-			if (q == BADADDR) {
-				hq_close();
-				return FALSE;
-			}
-			if (q & PAGE_MAPPING_ANON)
-				break;
-			if (p == q) {
-				is_free = TRUE;
-				break;
-			}
-			if (!hq_enter(q)) {
-				hq_close();
-				error(INFO,
-				    "%s: slab: %lx duplicate freelist object: %lx\n",
-				    si->curname, si->slab, q);
-				return FALSE;
+				if (q == BADADDR) {
+					hq_close();
+					return FALSE;
+				}
+				if (q & PAGE_MAPPING_ANON)
+					break;
+				if (p == q) {
+					is_free = TRUE;
+					goto found_object;
+				}
+				if (!hq_enter(q)) {
+					hq_close();
+					error(INFO, "%s: slab: %lx duplicate freelist object: %lx\n",
+					      si->curname, si->slab, q);
+					return FALSE;
+				}
 			}
-
 		}
+	found_object:
 		hq_close();
 
 		if (si->flags & ADDRESS_SPECIFIED) {
@@ -18677,7 +18682,7 @@ compound_head(ulong page)
 long
 count_partial(ulong node, struct meminfo *si, ulong *free)
 {
-	ulong list_head, next, last, objects_vaddr;
+	ulong list_head, next, last;
 	short inuse, objects;
 	ulong total_inuse;
 	ulong count = 0;
@@ -18708,31 +18713,11 @@ count_partial(ulong node, struct meminfo
 		total_inuse += inuse;
 
 		if (VALID_MEMBER(page_objects)) {
-			objects_vaddr = last + OFFSET(page_objects);
-			if (si->flags & SLAB_BITFIELD)
-				objects_vaddr += sizeof(ushort);
-			if (!readmem(objects_vaddr, KVADDR, &objects,
-				sizeof(ushort), "page.objects", RETURN_ON_ERROR)) {
-				hq_close();
-				return -1;
-			}
-
-			if (si->flags & SLAB_BITFIELD) {
-				if (__BYTE_ORDER == __LITTLE_ENDIAN) {
-					objects <<= 1;
-					objects >>= 1;
-				}
-				if (__BYTE_ORDER == __BIG_ENDIAN)
-					objects >>= 1;
-			}
-
-			if (objects == (short)(-1)) {
-				error(INFO, "%s: slab: %lx invalid page.objects: -1\n",
-					si->curname, last);
+			objects = slub_page_objects(si, last);
+			if (!objects) {
 				hq_close();
 				return -1;
 			}
-
 			*free += objects - inuse;
 		}
 
_

10 years, 2 months


[PATCH] Make vm -p work without swap

by Rabin Vincent

From: Rabin Vincent <rabinv(a)axis.com>

On kernels without swap, vm -p currently errors out with the message "nr_swapfiles doesn't exist in this kernel". By handling this case gracefully instead of erroring out, we make it work on such kernels.
---
 memory.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/memory.c b/memory.c
index b0ecb05..693516e 100644
--- a/memory.c
+++ b/memory.c
@@ -15550,6 +15550,9 @@ swap_location(ulonglong pte, char *buf)
 	if (!pte)
 		return NULL;
 
+	if (!symbol_exists("nr_swapfiles") || !symbol_exists("swap_info"))
+		return NULL;
+
 	if (THIS_KERNEL_VERSION >= LINUX(2,6,0))
 		sprintf(buf, "%s OFFSET: %lld",
 			get_swapdev(__swp_type(pte), swapdev), (ulonglong)__swp_offset(pte));
@@ -15570,12 +15573,6 @@ get_swapdev(ulong type, char *buf)
 	ulong swap_info, swap_info_ptr, swap_file;
 	ulong vfsmnt;
 
-	if (!symbol_exists("nr_swapfiles"))
-		error(FATAL, "nr_swapfiles doesn't exist in this kernel!\n");
-
-	if (!symbol_exists("swap_info"))
-		error(FATAL, "swap_info doesn't exist in this kernel!\n");
-
 	swap_info_init();
 
 	swap_info = symbol_value("swap_info");
--
2.7.0
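The patch above moves the symbol check out of get_swapdev(), where a missing symbol was fatal, and into swap_location(), where it simply produces no result. A minimal, self-contained sketch of that pattern follows. The symbol list, `symbol_in_list()` and the `pte >> 8` offset decoding are hypothetical stand-ins chosen for illustration; they are not crash's real `symbol_exists()` or `__swp_offset()`.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a symbol lookup: scan a NULL-terminated name list. */
static int symbol_in_list(const char *const *symbols, const char *name)
{
	for (; *symbols; symbols++)
		if (strcmp(*symbols, name) == 0)
			return 1;
	return 0;
}

/*
 * Describe the swap location encoded in a PTE, or return NULL when there
 * is nothing to report.  A kernel built without swap is treated like an
 * empty PTE: the caller gets NULL instead of a fatal error.
 */
static const char *swap_location_sketch(const char *const *symbols,
					unsigned long long pte,
					char *buf, size_t buflen)
{
	if (!pte)
		return NULL;

	if (!symbol_in_list(symbols, "nr_swapfiles") ||
	    !symbol_in_list(symbols, "swap_info"))
		return NULL;

	/* Assumption for this sketch only: the offset sits above the low 8 bits. */
	snprintf(buf, buflen, "OFFSET: %llu", pte >> 8);
	return buf;
}
```

A caller that already treats NULL as "no swap information for this page" needs no further changes, which is presumably why the check fits better in swap_location() than in get_swapdev().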

10 years, 2 months


[PATCH] Add the proccgroup extension

by Nikolay Borisov

Initial version of a crash module which can be used to show which cgroups the currently active process is a member of.
---
Hello,

This is a simple crash extension that I hacked up over the weekend. In my case, when I look at a kernel crash dump I want to quickly understand which cgroups the current process is a member of. Currently it uses the process from the current context, but this might change in the future. Here is an example output:

crash> show_cgroups
subsys: cpuset               cgroup: c6666
subsys: cpu                  cgroup: c6666
subsys: cpuacct              cgroup: c6666
subsys: io                   cgroup: c6666
subsys: memory               cgroup: c6666
subsys: devices              cgroup: c6666
subsys: freezer              cgroup: c6666
subsys: perf_event           cgroup: c6666
subsys: pids                 cgroup: c6666

I have tested this on 4.6-rc2 with and without cgroup support enabled. I'm just sending this to get an initial idea of whether I have used crash's facilities correctly and to canvass for future ideas. I'm aware there are already 2 cgroup modules, but when I tried running either of them, they complained of no cgroup support or the command did nothing. In any case, provided that the code is OK, I guess this can be used as a good example of how to traverse structures with crash.

TODO:
* Make the command understand either a task_struct pointer or a pid being passed.
* Add support for pre-3.15 kernels (the cgroup name struct changed to kernfs at that point)
* Whatever people think might be useful

 extensions/proccgroup.c | 136 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 136 insertions(+)
 create mode 100644 extensions/proccgroup.c

diff --git a/extensions/proccgroup.c b/extensions/proccgroup.c
new file mode 100644
index 0000000..fceeaf6
--- /dev/null
+++ b/extensions/proccgroup.c
@@ -0,0 +1,136 @@
+/*
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * Nikolay Borisov <n.borisov.lkml(a)gmail.com>
+ */
+
+#include "defs.h"
+
+#define CGROUP_PATH_MAX
+void proccgroup_init(void);
+void proccgroup_fini(void);
+
+void show_proc_cgroups(void);
+char *help_proc_cgroups[];
+
+static struct command_table_entry command_table[] = {
+	{ "show_cgroups", show_proc_cgroups, help_proc_cgroups, 0},
+	{ NULL },
+};
+
+
+void __attribute__((constructor))
+echo_init(void)
+{
+	register_extension(command_table);
+}
+
+void __attribute__((destructor))
+echo_fini(void) { }
+
+
+static void get_cgroup_name(ulong cgroup, char *buf, size_t buflen)
+{
+
+	ulong kernfs_node;
+	ulong cgroup_name_ptr;
+	ulong kernfs_parent;
+
+	/* Get cgroup->kn */
+	readmem(cgroup + MEMBER_OFFSET("cgroup", "kn"), KVADDR, &kernfs_node, sizeof(void *),
+		"cgroup->kn", RETURN_ON_ERROR);
+
+	readmem(kernfs_node + MEMBER_OFFSET("kernfs_node", "parent"), KVADDR, &kernfs_parent, sizeof(void *),
+		"kernfs_node->parent", RETURN_ON_ERROR);
+
+	if (kernfs_parent == 0) {
+		sprintf(buf, "/");
+		return;
+	}
+
+	/* Get kn->name */
+	readmem(kernfs_node + MEMBER_OFFSET("kernfs_node", "name"), KVADDR, &cgroup_name_ptr, sizeof(void *),
+		"kernfs_node->name", RETURN_ON_ERROR);
+
+	read_string(cgroup_name_ptr, buf, buflen-1);
+}
+
+
+static void get_subsys_name(ulong subsys, char *buf, size_t buflen)
+{
+
+	ulong subsys_name_ptr;
+	ulong cgroup_subsys_ptr;
+
+	/* Get cgroup->kn */
+	readmem(subsys + MEMBER_OFFSET("cgroup_subsys_state", "ss"), KVADDR, &cgroup_subsys_ptr, sizeof(void *),
+		"cgroup_subsys_state->ss", RETURN_ON_ERROR);
+
+	readmem(cgroup_subsys_ptr + MEMBER_OFFSET("cgroup_subsys", "name"), KVADDR, &subsys_name_ptr, sizeof(void *),
+		"cgroup_subsys->name", RETURN_ON_ERROR);
+	read_string(subsys_name_ptr, buf, buflen-1);
+}
+
+static void get_kn_cgroup_name(ulong cgroup, ulong subsys)
+{
+
+	char cgroup_name[BUFSIZE];
+	char subsys_name[BUFSIZE];
+
+	get_cgroup_name(cgroup, cgroup_name, BUFSIZE);
+	get_subsys_name(subsys, subsys_name, BUFSIZE);
+
+	fprintf(fp, "subsys: %-20s cgroup: %s\n", subsys_name, cgroup_name);
+}
+
+void
+show_proc_cgroups(void)
+{
+	ulong cgroups_subsys_ptr = 0;
+	int subsyscount;
+	int i;
+
+	if (!MEMBER_EXISTS("task_struct", "cgroups")) {
+		fprintf(fp, "No cgroup support found\n");
+		return;
+	}
+
+	/* Get address of task_struct->cgroups */
+	readmem(CURRENT_TASK() + MEMBER_OFFSET("task_struct", "cgroups"),
+		KVADDR, &cgroups_subsys_ptr, sizeof(void *),
+		"task_struct->cgroups", RETURN_ON_ERROR);
+
+	subsyscount = MEMBER_SIZE("css_set", "subsys") / sizeof(void *);
+
+	for (i = 0; i < subsyscount; i++) {
+		ulong subsys_ptr;
+		ulong subsys_base = cgroups_subsys_ptr + MEMBER_OFFSET("css_set", "subsys");
+		ulong cgroup;
+
+		/* Get css_set->subsys[i] address */
+		readmem(subsys_base + (i * sizeof(void*)), KVADDR, &subsys_ptr, sizeof(void *),
+			"css_set->subsys[i]", RETURN_ON_ERROR);
+		/* Get cgroup_subsys_state -> cgroup */
+		readmem(subsys_ptr + MEMBER_OFFSET("cgroup_subsys_state", "cgroup"), KVADDR, &cgroup, sizeof(void *),
+			"cgroup_subsys_state->cgroup", RETURN_ON_ERROR);
+
+		/* Handle the 2 cases of cgroup_name and the kernfs one */
+		if (MEMBER_EXISTS("cgroup", "kn")) {
+			get_kn_cgroup_name(cgroup, subsys_ptr);
+		} else if (MEMBER_EXISTS("cgroup", "name")) {
+			fprintf(fp, "Unsupported kernel version");
+		}
+	}
+}
+
+char *help_proc_cgroups[] = {
+	"show_cgroup", /* command name */
+	"Show which cgroups is the current process member of", /* short description */
+	" ", /* argument synopsis, or " " if none */
+	NULL
+};
+
+
--
2.5.0
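The extension above is a chain of pointer reads: each step adds a member offset to an address, reads one pointer from the dump, and uses the result as the next base address. The posted code passes RETURN_ON_ERROR but does not look at what readmem() returns, so a failed read would leave the next base address uninitialized. The sketch below shows the same walk with each hop checked. It is a standalone illustration over a flat byte buffer, not code against crash's API: `struct dump`, `read_ptr()` and `chase()` are hypothetical names, and "addresses" are simply byte offsets into the buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flat memory image; an "address" is a byte offset into base. */
struct dump {
	const unsigned char *base;
	size_t size;
};

/*
 * Read one 64-bit pointer stored at addr + off.
 * Returns 1 on success, 0 if the read would fall outside the image.
 */
static int read_ptr(const struct dump *d, uint64_t addr, uint64_t off,
		    uint64_t *out)
{
	uint64_t at;

	if (addr > d->size || off > d->size - addr)
		return 0;
	at = addr + off;
	if (d->size - at < sizeof(*out))
		return 0;
	memcpy(out, d->base + at, sizeof(*out));
	return 1;
}

/*
 * Follow n pointer hops starting at addr: at each hop, read the pointer
 * stored at (current address + offs[i]) and continue from its value.
 * Stops at the first failed read instead of using a stale value.
 */
static int chase(const struct dump *d, uint64_t addr,
		 const uint64_t *offs, size_t n, uint64_t *out)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!read_ptr(d, addr, offs[i], &addr))
			return 0;
	*out = addr;
	return 1;
}
```

In the extension itself, the equivalent would be to test each readmem() return value and bail out of the command on failure; the offsets would come from MEMBER_OFFSET() as in the patch.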

10 years, 2 months


[Patch 0/2] Request data structures of particular size.

by Alexandr Terekhov

Hello Dave,

Here is some brief background for the patch. We had a problem: a page contained a structure that we could not identify, although we could estimate its approximate size. We also knew that the structure contained two adjacent lists. Even so, it was really difficult to tell exactly which structure it was.

This patch adds that kind of functionality: you can list the data structures whose size falls within a given range and which contain fields of a particular type.
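The search described above amounts to a filter over structure descriptions: keep a structure if its size lies within the requested range and it has enough members of the requested type. The sketch below illustrates only that filter. The `struct_desc` and `member_desc` types and the matching function are hypothetical and are not taken from the patch, which presumably obtains this information from the debuginfo; the adjacency of the two lists is not modeled here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one structure member. */
struct member_desc {
	const char *name;
	const char *type;
};

/* Hypothetical description of one structure. */
struct struct_desc {
	const char *name;
	size_t size;
	const struct member_desc *members;
	size_t nmembers;
};

/* Count the members of s whose type name equals type. */
static size_t count_members_of_type(const struct struct_desc *s,
				    const char *type)
{
	size_t i, n = 0;

	for (i = 0; i < s->nmembers; i++)
		if (strcmp(s->members[i].type, type) == 0)
			n++;
	return n;
}

/*
 * Non-zero if s is between min and max bytes long (inclusive) and has
 * at least min_count members of the given type.
 */
static int struct_matches(const struct struct_desc *s, size_t min, size_t max,
			  const char *type, size_t min_count)
{
	return s->size >= min && s->size <= max &&
	       count_members_of_type(s, type) >= min_count;
}
```

For the case in the mail, the query would be something like "size within the estimated range, at least two members of type struct list_head", applied to every structure known to the session.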

10 years, 3 months


wrong values shown by kmem -s

by vinayak menon

Hi,

With a ramdump of a system with 386MB of available memory, "kmem -s" shows something like this:

CACHE     NAME          OBJSIZE  ALLOCATED  TOTAL   SLABS  SSIZE
dd801c00  kmalloc-2048     2048     670869  670960  41935    32k

If we use the above numbers to calculate it, the memory occupied by the objects goes beyond the total system memory. I think the output is wrong because with SLUB, slabs can be of different sizes depending on the order of allocation, whereas here the objects are calculated on the assumption that all slabs are of a fixed size.

The attached patch worked for me.

Thanks,
Vinayak
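The numbers in the report can be checked by hand. 670869 allocated objects of 2048 bytes come to 1,373,939,712 bytes, roughly 1.28 GiB, which is far more than 386MB. Also, 41935 slabs times (32k / 2048) objects per slab gives exactly 670960, the TOTAL column, which is consistent with the total being derived from one fixed slab size. The sketch below contrasts that calculation with summing a per-slab object count. It assumes "32k" means 32768 bytes; the function names and the per-slab counts are hypothetical, and it is not the attached patch.

```c
#include <stddef.h>

/* Bytes occupied by nobjs objects of objsize bytes each. */
static unsigned long long objects_bytes(unsigned long long nobjs,
					unsigned long long objsize)
{
	return nobjs * objsize;
}

/* Total objects when every slab is assumed to be slab_bytes long. */
static unsigned long long total_fixed_size(unsigned long long slabs,
					   unsigned long long slab_bytes,
					   unsigned long long objsize)
{
	return slabs * (slab_bytes / objsize);
}

/*
 * Total objects when each slab reports its own object count, as is
 * needed when slabs of different orders coexist in one cache.
 */
static unsigned long long total_per_slab(const unsigned short *objects,
					 size_t nslabs)
{
	unsigned long long total = 0;
	size_t i;

	for (i = 0; i < nslabs; i++)
		total += objects[i];
	return total;
}
```

With slabs holding, say, 16, 8 and 4 objects, the per-slab sum is 28, whereas the fixed-size formula would report 48 for the same three slabs.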

10 years, 3 months


