Crash-utility May 2008

devel@lists.crash-utility.osci.io

7 participants
6 discussions

[PATCH] Add xen_phys_start value in the crash info note

by Itsuro ODA

Hi,

This patch lets the vmcore utilities (e.g. crash, makedumpfile) get the
relocation address of the xen hypervisor from a vmcore. The utilities need
this address to find the data of the hypervisor structures.

Note that this patch does not raise any compatibility issue for the
utilities (that I know of) or for the other components of xen.

Signed-off-by: Itsuro Oda <oda(a)valinux.co.jp>

diff -r f681c4de91fc xen/arch/x86/crash.c
--- a/xen/arch/x86/crash.c Wed May 28 16:14:10 2008 +0100
+++ b/xen/arch/x86/crash.c Fri May 30 08:40:50 2008 +0900
@@ -102,6 +102,7 @@ void machine_crash_shutdown(void)
     hvm_cpu_down();
     info = kexec_crash_save_info();
+    info->xen_phys_start = xen_phys_start;
     info->dom0_pfn_to_mfn_frame_list_list =
         arch_get_pfn_to_mfn_frame_list_list(dom0);
 }
diff -r f681c4de91fc xen/include/xen/elfcore.h
--- a/xen/include/xen/elfcore.h Wed May 28 16:14:10 2008 +0100
+++ b/xen/include/xen/elfcore.h Fri May 30 08:39:40 2008 +0900
@@ -66,6 +66,7 @@ typedef struct {
     unsigned long xen_compile_time;
     unsigned long tainted;
 #if defined(__i386__) || defined(__x86_64__)
+    unsigned long xen_phys_start;
     unsigned long dom0_pfn_to_mfn_frame_list_list;
 #endif
 #if defined(__ia64__)

--
Itsuro ODA <oda(a)valinux.co.jp>

handle x86_64 xen code/data relocation

by Itsuro ODA

Hi all,

Recent versions of xen (e.g. RHEL5.2, 3.2.0) on x86_64 move the
physical (machine) address of the xen code/data area after the system
has started up. The start address of this area is stored in
'xen_phys_start'. Thus, to get the machine address of a xen text symbol
from its virtual address, calculate
"va - __XEN_VIRT_START + xen_phys_start".

The crash and makedumpfile commands need the value of xen_phys_start.
They know the virtual address of the 'xen_phys_start' symbol but have no
way to extract its value.

My first thought is to add the xen_phys_start value to the CRASHINFO
ElfNote section. (Plan A: patch for the xen hypervisor code attached.)
It is the smallest modification necessary overall.

On the other hand, there is an opinion that it is better to upgrade a
user package than a hypervisor or kernel package. The xen_phys_start
value can be obtained from /proc/iomem.

-------------------------------------------------------
# cat /proc/iomem
...
7e600000-7f5fffff : Hypervisor code and data   *** this line
...
-------------------------------------------------------

So kexec-tools can handle it, in theory. Plan B is for kexec-tools to
add another ElfNote section which holds the xen_phys_start value. The
attached patch works well, though I am concerned that it is a bit tricky.

Which plan is better? Or is there a better implementation? Please
comment. (Note that the crash and makedumpfile modifications are about
the same size for both plans.)

Thanks.
Itsuro Oda

=== Plan A (modify the xen hypervisor. It is for RHEL5.2 but almost the same for other versions) ===
--- include/xen/elfcore.h.org 2008-04-17 14:11:41.000000000 +0900
+++ include/xen/elfcore.h 2008-04-17 14:11:57.000000000 +0900
@@ -66,6 +66,7 @@
     unsigned long xen_compile_time;
     unsigned long tainted;
 #ifdef CONFIG_X86
+    unsigned long xen_phys_start;
     unsigned long dom0_pfn_to_mfn_frame_list_list;
 #endif
 } crash_xen_info_t;
--- arch/x86/crash.c.org 2008-04-17 14:12:51.000000000 +0900
+++ arch/x86/crash.c 2008-04-17 14:13:13.000000000 +0900
@@ -102,6 +102,7 @@
     hvm_disable();
     info = kexec_crash_save_info();
+    info->xen_phys_start = xen_phys_start;
     info->dom0_pfn_to_mfn_frame_list_list =
         arch_get_pfn_to_mfn_frame_list_list(dom0);
 }
================================================================

=== Plan B (modify the kexec-tools. proof of concept version) ===
diff -ru kexec-tools-testing-20080324.org/kexec/arch/x86_64/crashdump-x86_64.c kexec-tools-testing-20080324/kexec/arch/x86_64/crashdump-x86_64.c
--- kexec-tools-testing-20080324.org/kexec/arch/x86_64/crashdump-x86_64.c 2008-03-21 13:16:28.000000000 +0900
+++ kexec-tools-testing-20080324/kexec/arch/x86_64/crashdump-x86_64.c 2008-04-22 15:15:08.000000000 +0900
@@ -73,6 +73,25 @@
     return -1;
 }
+static int get_hypervisor_paddr(struct kexec_info *info)
+{
+    uint64_t start;
+
+    if (!xen_present())
+        return 0;
+
+    if (parse_iomem_single("Hypervisor code and data\n", &start, NULL) == 0) {
+        info->hypervisor_paddr_start = start;
+#ifdef DEBUG
+        printf("kernel load physical addr start = 0x%016Lx\n", start);
+#endif
+        return 0;
+    }
+
+    fprintf(stderr, "Cannot determine hypervisor physical load addr\n");
+    return -1;
+}
+
 /* Retrieve info regarding virtual address kernel has been compiled for and
  * size of the kernel from /proc/kcore. Current /proc/kcore parsing from
  * from kexec-tools fails because of malformed elf notes. A kernel patch has
@@ -581,6 +600,9 @@
     if (get_kernel_paddr(info))
         return -1;
+    if (get_hypervisor_paddr(info))
+        return -1;
+
     if (get_kernel_vaddr_and_size(info))
         return -1;
@@ -620,6 +642,9 @@
      */
     elfcorehdr = add_buffer(info, tmp, sz, 16*1024, align, min_base, max_addr, -1);
+    if (info->hypervisor_paddr_start && xen_present()) {
+        *(info->hypervisor_paddr_loc) += elfcorehdr;
+    }
     if (delete_memmap(memmap_p, elfcorehdr, sz) < 0)
         return -1;
     cmdline_add_memmap(mod_cmdline, memmap_p);
diff -ru kexec-tools-testing-20080324.org/kexec/crashdump.c kexec-tools-testing-20080324/kexec/crashdump.c
--- kexec-tools-testing-20080324.org/kexec/crashdump.c 2008-03-21 13:16:28.000000000 +0900
+++ kexec-tools-testing-20080324/kexec/crashdump.c 2008-04-22 15:33:47.000000000 +0900
@@ -36,8 +36,10 @@
 #define FUNC crash_create_elf64_headers
 #define EHDR Elf64_Ehdr
 #define PHDR Elf64_Phdr
+#define NHDR Elf64_Nhdr
 #include "crashdump-elf.c"
 #undef ELF_WIDTH
+#undef NHDR
 #undef PHDR
 #undef EHDR
 #undef FUNC
@@ -46,8 +48,10 @@
 #define FUNC crash_create_elf32_headers
 #define EHDR Elf32_Ehdr
 #define PHDR Elf32_Phdr
+#define NHDR Elf32_Nhdr
 #include "crashdump-elf.c"
 #undef ELF_WIDTH
+#undef NHDR
 #undef PHDR
 #undef EHDR
 #undef FUNC
diff -ru kexec-tools-testing-20080324.org/kexec/crashdump-elf.c kexec-tools-testing-20080324/kexec/crashdump-elf.c
--- kexec-tools-testing-20080324.org/kexec/crashdump-elf.c 2008-01-11 12:13:48.000000000 +0900
+++ kexec-tools-testing-20080324/kexec/crashdump-elf.c 2008-04-22 15:35:16.000000000 +0900
@@ -1,6 +1,6 @@
-#if !defined(FUNC) || !defined(EHDR) || !defined(PHDR)
-#error FUNC, EHDR and PHDR must be defined
+#if !defined(FUNC) || !defined(EHDR) || !defined(PHDR) || !defined(NHDR)
+#error FUNC, EHDR, PHDR and NHDR must be defined
 #endif
 #if (ELF_WIDTH == 64)
@@ -37,6 +37,7 @@
     uint64_t vmcoreinfo_addr, vmcoreinfo_len;
     int has_vmcoreinfo = 0;
     int (*get_note_info)(int cpu, uint64_t *addr, uint64_t *len);
+    int has_hypervisor_paddr_start = 0;
     if (xen_present())
         nr_cpus = xen_get_nr_phys_cpus();
@@ -78,6 +79,11 @@
         sz += sizeof(PHDR);
     }
+    if (info->hypervisor_paddr_start && xen_present()) {
+        sz += sizeof(PHDR) + sizeof(NHDR) + 4 + sizeof(unsigned long);
+        has_hypervisor_paddr_start = 1;
+    }
+
     /*
      * Make sure the ELF core header is aligned to at least 1024.
      * We do this because the secondary kernel gets the ELF core
@@ -168,6 +174,22 @@
         dbgprintf_phdr("vmcoreinfo header", phdr);
     }
+    if (has_hypervisor_paddr_start) {
+        phdr = (PHDR *) bufp;
+        bufp += sizeof(PHDR);
+        phdr->p_type = PT_NOTE;
+        phdr->p_flags = 0;
+        phdr->p_offset = phdr->p_paddr = 0;
+        phdr->p_vaddr = 0;
+        phdr->p_filesz = phdr->p_memsz = sizeof(NHDR) + 4 + sizeof(unsigned long);
+        phdr->p_align = 0;
+
+        (elf->e_phnum)++;
+        dbgprintf_phdr("hypervisor phys addr header", phdr);
+
+        info->hypervisor_paddr_loc = (unsigned long *)&phdr->p_offset;
+    }
+
     /* Setup an PT_LOAD type program header for the region where
      * Kernel is mapped if info->kern_size is non-zero.
      */
@@ -225,6 +247,24 @@
         (elf->e_phnum)++;
         dbgprintf_phdr("Elf header", phdr);
     }
+
+    if (has_hypervisor_paddr_start) {
+        NHDR *nhdr;
+        unsigned int offset = (void *)bufp - *buf;
+
+        nhdr = (NHDR *) bufp;
+        bufp += sizeof(NHDR);
+        nhdr->n_namesz = 4;
+        nhdr->n_descsz = sizeof(unsigned long);
+        nhdr->n_type = 0x1000003;
+        memcpy(bufp, "Xen", 4);
+        bufp += 4;
+        *((unsigned long *)bufp) = info->hypervisor_paddr_start;
+        bufp += sizeof(unsigned long);
+
+        *(info->hypervisor_paddr_loc) = offset;
+    }
+
     return 0;
 }
diff -ru kexec-tools-testing-20080324.org/kexec/kexec.h kexec-tools-testing-20080324/kexec/kexec.h
--- kexec-tools-testing-20080324.org/kexec/kexec.h 2008-03-21 13:16:28.000000000 +0900
+++ kexec-tools-testing-20080324/kexec/kexec.h 2008-04-22 15:08:57.000000000 +0900
@@ -123,6 +123,8 @@
     unsigned long kern_vaddr_start;
     unsigned long kern_paddr_start;
     unsigned long kern_size;
+    unsigned long hypervisor_paddr_start;
+    unsigned long *hypervisor_paddr_loc;
 };
 void usage(void);
======================================================================================

--
Itsuro ODA <oda(a)valinux.co.jp>
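
The two steps described in this message (reading the start of the "Hypervisor code and data" range from /proc/iomem, then applying "va - __XEN_VIRT_START + xen_phys_start") can be sketched in a few lines of C. This is only an illustration, not code from crash, makedumpfile or kexec-tools: the function names iomem_line_start() and xen_va_to_ma() are made up for this sketch, and EXAMPLE_XEN_VIRT_START is an assumed value that a real tool would have to take from the hypervisor's headers.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed, illustrative value only: the real __XEN_VIRT_START depends on
 * the hypervisor version and must come from its headers. */
#define EXAMPLE_XEN_VIRT_START UINT64_C(0xffff828c80000000)

/*
 * Hypothetical helper: parse one /proc/iomem-style line of the form
 * "start-end : label". If the label matches, store the start address
 * and return 0; otherwise return -1.
 */
static int iomem_line_start(const char *line, const char *label,
                            uint64_t *start)
{
    uint64_t s, e;
    int consumed = 0;

    /* %n records how many characters were read up to the label. */
    if (sscanf(line, " %" SCNx64 "-%" SCNx64 " : %n", &s, &e, &consumed) != 2)
        return -1;
    if (consumed == 0 ||
        strncmp(line + consumed, label, strlen(label)) != 0)
        return -1;

    *start = s;
    return 0;
}

/*
 * Hypothetical helper: translate a xen virtual address to a machine
 * address using the formula quoted in the message above.
 */
static uint64_t xen_va_to_ma(uint64_t va, uint64_t xen_virt_start,
                             uint64_t xen_phys_start)
{
    assert(va >= xen_virt_start);
    return va - xen_virt_start + xen_phys_start;
}
```

With the /proc/iomem line shown above, iomem_line_start() would yield 0x7e600000, and a symbol 0x1000 bytes above the assumed virtual start would translate to machine address 0x7e601000.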

[Patch] Fix backtrace of xen-ia64

by Akio Takebe

Hi,

This patch improves backtrace of xen-ia64.
If the cpu map is not contiguous, we cannot get the stack address.

Signed-off-by: Akio Takebe <takebe_akio(a)jp.fujitsu.com>

Best Regards,

Akio Takebe

---
--- crash-4.0-6.3.orig/ia64.c 2008-04-30 02:39:16.000000000 +0900
+++ crash-4.0-6.3/ia64.c 2008-05-21 15:05:53.000000000 +0900
@@ -4167,14 +4167,14 @@ ia64_in_mca_stack_hyper(ulong addr, stru
     if (!machdep->kvtop(NULL, addr, &paddr, 0))
         return 0;
-    __per_cpu_mca = (ulong *)GETBUF(sizeof(ulong) * xht->pcpus);
+    __per_cpu_mca = (ulong *)GETBUF(sizeof(ulong) * plen);
     if (!readmem(symbol_value("__per_cpu_mca"), KVADDR, __per_cpu_mca,
-        sizeof(ulong) * xht->pcpus, "__per_cpu_mca", RETURN_ON_ERROR|QUIET))
+        sizeof(ulong) * plen, "__per_cpu_mca", RETURN_ON_ERROR|QUIET))
         return 0;
     if (CRASHDEBUG(1)) {
-        for (i = 0; i < xht->pcpus; i++) {
+        for (i = 0; i < plen; i++) {
             fprintf(fp, "__per_cpu_mca[%d]: %lx\n", i, __per_cpu_mca[i]);
         }
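
A small sketch of why a per-CPU array sized by the number of CPUs is too short when the cpu map has holes. The diff does not show how plen is computed, so this assumes it is the full length of the __per_cpu_mca array (highest CPU id plus one); per_cpu_array_len() is a hypothetical name used only here.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the array length needed to index by CPU id,
 * i.e. the highest id plus one. For a contiguous map {0,1,2,3} this equals
 * the CPU count (4). For a sparse map {0,2,5} it is 6 although only three
 * CPUs are present, so an array of three entries would miss ids 2 and 5.
 */
static size_t per_cpu_array_len(const unsigned int *cpu_ids, size_t ncpus)
{
    size_t len = 0;

    for (size_t i = 0; i < ncpus; i++) {
        size_t needed = (size_t)cpu_ids[i] + 1;
        if (needed > len)
            len = needed;
    }
    return len;
}
```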

[PATCH] Use backtrace() instead of __builtin_return_address()

by Bernhard Walle

When crash is compiled with gcc 4.3 and -O2, the
__builtin_return_address() causes crash to crash. See also [1] for a
discussion about that.

The gcc documentation [2] says

  __builtin_return_address()

    On some machines it may be impossible to determine the return
    address of any function other than the current one; in such cases,
    or when the top of the stack has been reached, this function will
    return 0 or a random value. In addition, __builtin_frame_address
    may be used to determine if the top of the stack has been reached.

    This function should only be used with a nonzero argument for
    debugging purposes.

Even the __builtin_frame_address() does not work here.

Instead of checking if the crash is built with -O2 and introducing new
preprocessor checks here, I use the backtrace() function which is
available via glibc. This works here (tested without the other patch
which brought my attention to this bug).

Since crash only runs on Linux (IIRC), the glibc dependency should not
be a problem.

Signed-off-by: Bernhard Walle <bwalle(a)suse.de>

[1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=165992
[2] http://gcc.gnu.org/onlinedocs/gcc/Return-Address.html

---
 defs.h | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

--- a/defs.h
+++ b/defs.h
@@ -1803,15 +1803,7 @@ struct alias_data { /* c
 static inline void save_return_address(ulong *retaddr)
 {
-    retaddr[0] = (ulong) __builtin_return_address(0);
-#if defined(X86) || defined(PPC) || defined(X86_64) || defined(PPC64)
-    if (__builtin_frame_address(1))
-        retaddr[1] = (ulong) __builtin_return_address(1);
-    if (__builtin_frame_address(2))
-        retaddr[2] = (ulong) __builtin_return_address(2);
-    if (__builtin_frame_address(3))
-        retaddr[3] = (ulong) __builtin_return_address(3);
-#endif
+    backtrace(retaddr, 4);
 }
 #endif /* !GDB_COMMON */
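
For reference, a minimal standalone sketch of the glibc backtrace() call the patch switches to. It is not the crash code itself: save_return_addresses() and NUM_RETADDRS are names chosen for this example, and the buffer is declared as void *[] to match the backtrace() prototype in <execinfo.h> (the patch passes a ulong array instead).

```c
#include <assert.h>
#include <execinfo.h>
#include <string.h>

#define NUM_RETADDRS 4

/*
 * Hypothetical helper: capture up to NUM_RETADDRS addresses of the current
 * call chain. Slots that backtrace() does not fill stay NULL. Returns the
 * number of addresses actually captured.
 */
static int save_return_addresses(void *retaddr[NUM_RETADDRS])
{
    int n;

    memset(retaddr, 0, NUM_RETADDRS * sizeof(retaddr[0]));
    n = backtrace(retaddr, NUM_RETADDRS);
    assert(n >= 0 && n <= NUM_RETADDRS);
    return n;
}
```

One point that may be worth checking when comparing the two approaches: backtrace() records the call chain starting from the function that calls it, so its slot indices need not line up one-to-one with __builtin_return_address(0..3).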

[PATCH] Use address_space.__nrpages for RT kernel

by Bernhard Walle

The patch mapping_nrpages.patch from RT kernel

  Subject: mm/fs: abstract address_space::nrpages

  Currently the tree_lock protects mapping->nrpages, this will not be
  possible much longer. Hence abstract the access to this variable so that
  it can be easily replaced by an atomic_ulong_t.

  Signed-off-by: Peter Zijlstra <a.p.zijlstra(a)chello.nl>

renames address_space.nrpages to address_space.__nrpages. This patch
implements that renaming for crash if address_space.nrpages is invalid.

Signed-off-by: Bernhard Walle <bwalle(a)suse.de>

---
 memory.c | 3 +++
 1 file changed, 3 insertions(+)

--- a/memory.c
+++ b/memory.c
@@ -320,6 +320,9 @@ vm_init(void)
     MEMBER_OFFSET_INIT(block_device_bd_disk, "block_device", "bd_disk");
     MEMBER_OFFSET_INIT(inode_i_mapping, "inode", "i_mapping");
     MEMBER_OFFSET_INIT(address_space_nrpages, "address_space", "nrpages");
+    if (INVALID_MEMBER(address_space_nrpages))
+        MEMBER_OFFSET_INIT(address_space_nrpages, "address_space", "__nrpages");
+
     MEMBER_OFFSET_INIT(gendisk_major, "gendisk", "major");
     MEMBER_OFFSET_INIT(gendisk_fops, "gendisk", "fops");
     MEMBER_OFFSET_INIT(gendisk_disk_name, "gendisk", "disk_name");
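
The fallback pattern in this patch (try the mainline member name, then the renamed one) can be illustrated outside crash as follows. Everything here is a stand-in: struct member_entry, lookup_member_offset() and member_offset_first() are hypothetical, and the table simply plays the role that the kernel's debuginfo plays for MEMBER_OFFSET_INIT. Any offsets placed in such a table are arbitrary example numbers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one debuginfo record. */
struct member_entry {
    const char *struct_name;
    const char *member;
    long offset;
};

/* Return the member's offset from the table, or -1 if it is not present. */
static long lookup_member_offset(const struct member_entry *tab, size_t n,
                                 const char *struct_name, const char *member)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(tab[i].struct_name, struct_name) == 0 &&
            strcmp(tab[i].member, member) == 0)
            return tab[i].offset;
    }
    return -1;
}

/*
 * Try each candidate member name in order and return the first valid
 * offset, or -1 if none of the names exists in this kernel.
 */
static long member_offset_first(const struct member_entry *tab, size_t n,
                                const char *struct_name,
                                const char *const *names, size_t nnames)
{
    for (size_t i = 0; i < nnames; i++) {
        long off = lookup_member_offset(tab, n, struct_name, names[i]);
        if (off >= 0)
            return off;
    }
    return -1;
}
```

Calling member_offset_first() with the candidates {"nrpages", "__nrpages"} gives the same result on a mainline-style table and on an RT-style table, which is the effect the INVALID_MEMBER() check achieves in the patch.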

Re: source line numbers and modules (on x86_64)

by Mike Snitzer

Hi,

I searched the archives and found that you've discussed an issue I'm
seeing with x86_64 kernels where crash doesn't have line numbers for
modules' symbols:
https://www.redhat.com/archives/crash-utility/2008-January/msg00021.html

I'm using crash-4.0-6.3 on a RHEL5U1 x86_64 system with a custom
2.6.22.19 kernel.

Given that the RHEL5U1 x86_64 kernels clearly do provide accurate line
numbers for modules, has anyone identified how that is? I have to
believe the redhat kernel is patched to fix this issue.

I looked over the various redhat patches that are applied to RHEL5's
2.6.18 sources but can't see a patch that stands out as specifically
addressing this x86_64 issue. But I could easily be overlooking some
patch.

please advise, thanks.
Mike

ps. please cc me as I've not yet been able to join the list
