June 2006 - Crash-utility - Crash Utility List Archives

[patch 1/1] fix incomplete FILENAME in "swap" command

by Lin Feng Shen

From: Lin Feng Shen <shenlinf(a)cn.ibm.com>

dump_swap_info() invokes get_pathname() to get the file name of the swap
device. If swap_vfsmnt isn't defined but old_block_size is defined in the
kernel's struct swap_info_struct (e.g. a 2.6 kernel), 0 is passed as the
vfsmnt to get_pathname(). In this case, get_pathname() won't go up to the
parent vfsmnt, so the file name of the swap space is shown incompletely.

Here is an example. crash-4.0-25 was installed and udev was mounted on
/dev in sles10. Swap /dev/sda2 was active before kdump. The "swap"
command in crash showed an incomplete swap file name.

---------------
crash> swap
FILENAME   TYPE        SIZE       USED   PCT  PRIORITY
/sda2      PARTITION   2104504k   0k     0%   -1
---------------

The solution is to retrieve the vfsmnt from swap_file, just as
file_to_dentry() does to get the dentry, and then pass it to
get_pathname().

Signed-off-by: Lin Feng Shen <shenlinf(a)cn.ibm.com>
---

diff -ruNp crash-4.0-2.18.orig/defs.h crash-4.0-2.18/defs.h
--- crash-4.0-2.18.orig/defs.h	2006-06-02 02:51:15.000000000 -0400
+++ crash-4.0-2.18/defs.h	2006-06-02 03:06:17.000000000 -0400
@@ -3031,6 +3031,7 @@ void close_tmpfile2(void);
 void open_files_dump(ulong, int, struct reference *);
 void get_pathname(ulong, char *, int, int, ulong);
 ulong file_to_dentry(ulong);
+ulong file_to_vfsmnt(ulong);
 void nlm_files_dump(void);
 int get_proc_version(void);
 int file_checksum(char *, long *);
diff -ruNp crash-4.0-2.18.orig/filesys.c crash-4.0-2.18/filesys.c
--- crash-4.0-2.18.orig/filesys.c	2006-06-02 02:51:14.000000000 -0400
+++ crash-4.0-2.18/filesys.c	2006-06-02 03:11:34.000000000 -0400
@@ -2544,6 +2544,20 @@ file_to_dentry(ulong file)
 }
 
 /*
+ *  Get the vfsmnt associated with a file.
+ */
+ulong
+file_to_vfsmnt(ulong file)
+{
+	char *file_buf;
+	ulong vfsmnt;
+
+	file_buf = fill_file_cache(file);
+	vfsmnt = ULONG(file_buf + OFFSET(file_f_vfsmnt));
+	return vfsmnt;
+}
+
+/*
  *  get_pathname() fills in a pathname string for an ending dentry
  *  See __d_path() in the kernel for help fixing problems.
  */
diff -ruNp crash-4.0-2.18.orig/memory.c crash-4.0-2.18/memory.c
--- crash-4.0-2.18.orig/memory.c	2006-06-02 02:51:15.000000000 -0400
+++ crash-4.0-2.18/memory.c	2006-06-02 03:06:31.000000000 -0400
@@ -10293,7 +10293,7 @@ dump_swap_info(ulong swapflags, ulong *t
 		} else if (VALID_MEMBER
 			(swap_info_struct_old_block_size)) {
 			get_pathname(file_to_dentry(swap_file),
-				buf, BUFSIZE, 1, 0);
+				buf, BUFSIZE, 1, file_to_vfsmnt(swap_file));
 		} else {
 			get_pathname(swap_file, buf, BUFSIZE, 1, 0);
 		}
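
To illustrate why a missing vfsmnt truncates the name, here is a minimal sketch of a path walk that crosses mount boundaries. It is not crash's get_pathname() or the kernel's __d_path(); the toy_dentry and toy_mount structures and the toy_pathname() function are hypothetical stand-ins invented for this example, on the assumption that a filesystem root is marked by a dentry that is its own parent.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the kernel's dentry and vfsmount. */
struct toy_dentry {
    const char *name;              /* path component; "/" for a filesystem root */
    struct toy_dentry *parent;     /* a filesystem root points to itself */
};

struct toy_mount {
    struct toy_dentry *root;       /* root dentry of this mounted filesystem */
    struct toy_dentry *mountpoint; /* dentry in the parent mount it sits on */
    struct toy_mount *parent;      /* NULL for the topmost mount */
};

/* Prepend "/name" in front of the string that currently starts at pos. */
static char *toy_prepend(char *buf, char *pos, const char *name)
{
    size_t len = strlen(name);

    assert((size_t)(pos - buf) >= len + 1);  /* caller's buffer must be large enough */
    pos -= len;
    memcpy(pos, name, len);
    *--pos = '/';
    return pos;
}

/*
 * Build the path of dentry at the tail of buf and return a pointer to its
 * first character. With mnt == NULL the walk cannot leave the file's own
 * filesystem, so everything above the mount point is lost.
 */
char *toy_pathname(struct toy_dentry *dentry, struct toy_mount *mnt,
                   char *buf, size_t buflen)
{
    char *pos;

    assert(buflen >= 2);
    pos = buf + buflen - 1;
    *pos = '\0';

    for (;;) {
        if (dentry->parent == dentry) {      /* reached a filesystem root */
            if (mnt == NULL || mnt->parent == NULL)
                break;                       /* nowhere further up to go */
            dentry = mnt->mountpoint;        /* continue in the parent mount */
            mnt = mnt->parent;
            continue;
        }
        pos = toy_prepend(buf, pos, dentry->name);
        dentry = dentry->parent;
    }
    if (*pos == '\0')
        *--pos = '/';                        /* the path is the root itself */
    return pos;
}
```

With a toy tree in which a second filesystem is mounted on "dev" and contains "sda2", passing NULL as the mount stops the walk at that filesystem's root, while passing the mount lets the walk continue through the mount point.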


Michael Holzheu/Germany/IBM is out of the office.

by Michael Holzheu

I will be out of the office starting 26.06.2006 and will not return until 03.07.2006.


crash version 4.0-2.31 is available

by Dave Anderson

- Bumped crash-internal NR_CPUS for x86 and ia64; added a warning message
  to "recompile crash" and forced an initialization failure when the
  kernel's configured NR_CPUS is greater than the maximum allowed NR_CPUS
  value compiled into crash. (maneesh(a)in.ibm.com, anderson(a)redhat.com)
- Fix for initialization failure indicating a kernel/memory-source
  mismatch when x86 kernel configures its physical memory start address
  higher than the traditional 1MB starting point. (anderson(a)redhat.com)
- Fix for kernels that have replaced the "system_utsname" data structure
  with contents of the "init_uts_ns" data structure. This fixes a
  "crash: cannot resolve system_utsname" initialization failure.
  (pbadari(a)us.ibm.com, anderson(a)redhat.com)
- Fix for large LKCD dumpfiles that resulted in an initialization time
  failure indicating "fixme, need to add more zones (ZONE_ALLOC)". When
  statically-defined ZONE_ALLOC value is too small, the fix expands the
  zone size dynamically. (indou.takao(a)jp.fujitsu.com)
- Fix for "kmem -i" failure when the "all_bdevs" block_device list is
  empty. Part of the command output would be displayed, followed by
  "kmem: invalid kernel virtual address: 0 type: inode buffer".
  (anderson(a)redhat.com)
- First pass at supporting a xen hypervisor kexec/kdump vmcore as the
  dumpfile format for the dom0 vmlinux. Developed/tested OK on an x86
  vmlinux/vmcore set supplied by horms(a)verge.net.au. Code for x86_64 is
  in place, but untested. (anderson(a)redhat.com)
- Also in place, but untested, is initial support for xen x86 PAE
  kernels. (anderson(a)redhat.com)

Download from: http://people.redhat.com/anderson


crash with Xen image from kdump

by Dave Anderson

> I added some code to kdump to have it record CR3 for dom0. This is
> done using a second note in the per-cpu notes area, which for now
> just stores a single 4byte entity, the mfn of that CPU in dom0
> if it was present in dom0.
>
> I have made a dump available that includes this. The tarball
> also includes the kernels, xen, symbol files, and patches to xen.
> If you want to find the cr3 saving code its in ./arch/x86/crash.c
>
> I plan to post this update to xen-devel shortly, hopefully tomorrow,
> after upporting to the latest xen tree (I'm still working off about
> 3 weeks ago's tree).
>
> http://packages.vergenet.net/tmp/xen-unstable.hg+kexec-20060616.tar.bz2

OK -- here's a proof-of-concept running the dom0 vmlinux against
the xen kdump:

# crash vmlinux vmcore

crash 4.0-2.31-rc1
Copyright (C) 2002, 2003, 2004, 2005, 2006  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005  Fujitsu Limited
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.

GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...

      KERNEL: vmlinux
    DUMPFILE: vmcore
        CPUS: 2
        DATE: Wed Jun 14 15:05:01 2006
      UPTIME: 00:04:40
LOAD AVERAGE: 1.22, 0.39, 0.13
       TASKS: 94
    NODENAME: aiko.lab.ultramonkey.org
     RELEASE: 2.6.16.13-xen
     VERSION: #7 SMP Fri Jun 9 16:25:32 JST 2006
     MACHINE: i686  (866 Mhz)
      MEMORY: 887.4 MB
       PANIC: "SysRq : Trigger a crashdump"
         PID: 3949
     COMMAND: "do_kdump"
        TASK: f3e64030  [THREAD_INFO: f3dba000]
         CPU: 1
       STATE: TASK_RUNNING (SYSRQ)

crash> bt -a
PID: 0      TASK: c02ce460  CPU: 0   COMMAND: "swapper"
 #0 [c030ff34] schedule at c028e648
 #1 [c030ffb0] cpu_idle at c0103e9f

PID: 3949   TASK: f3e64030  CPU: 1   COMMAND: "do_kdump"
 #0 [f3dbbed8] crash_kexec at c0140c45
 #1 [f3dbbf28] __handle_sysrq at c01f54e4
 #2 [f3dbbf54] write_sysrq_trigger at c019cbff
 #3 [f3dbbf6c] vfs_write at c0168dbf
 #4 [f3dbbf90] sys_write at c0169736
 #5 [f3dbbfb8] system_call at c0105542
    EAX: 00000004  EBX: 00000001  ECX: 080f8408  EDX: 00000002
    DS:  007b      ESI: 00000002  ES:  007b      EDI: b7f007c0
    SS:  007b      ESP: bfb5ffc8  EBP: bfb5ffe4
    CS:  0073      EIP: b7e93028  ERR: 00000004  EFLAGS: 00000246
crash>

As I discussed earlier, given that this is a writable-page-table
kernel, having any legitimate CR3 (I just use the first one found in
the ELF header), I first get the value of "max_pfn" (x86), and then
the value of "phys_to_machine_mapping", which makes up dom0's
"phys_to_machine_mapping[max_pfn]" array. From that, all subsequent
pseudo-physical address requests can be translated into the physical
address for the existing read_netdump() function to access.

As we talked about before, this won't work for shadow-page-table
kernels; for those I would need to have the
"pfn_to_mfn_frame_list_list" mfn value from the shared, per-domain,
"arch_shared_info" structure(s). With that single value, the
phys_to_machine_mapping[] array can be resurrected for both
writable- and shadow-page-table kernels.

Also, with either the cr3 or pfn_to_mfn_frame_list_list schemes,
if those values were made available for *all* of the other domains
instead of just dom0, then we could run a crash session against any
of the domains on the system.

In any case, this is pretty cool for starters...

BTW, I've created a new n_type value to handle this particular
invocation, which I understand will be subject to change. Note
that the spelling in your PT_NOTE is a bit strange:

crash> help -n
...
Elf32_Nhdr:
               n_namesz: 18 ("Xen Domanin-0 CR3")
               n_descsz: 4
                 n_type: 10000001 (NT_XEN_KDUMP_CR3)
                         00027227
...
crash>

Anyway, I'll do the same thing for x86_64 (untested) and update
the crash release so you'll have something to work with.

Thanks,
  Dave
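
The translation step described above (pseudo-physical address to machine address through the phys_to_machine_mapping[] array) can be sketched as follows. This is only an illustration, not crash's code: the function name, the 4 KiB page size, the 32-bit table entries, and the TOY_BAD_ADDR error value are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12                                /* assumed 4 KiB pages */
#define TOY_PAGE_MASK  ((UINT64_C(1) << TOY_PAGE_SHIFT) - 1)
#define TOY_BAD_ADDR   UINT64_MAX                        /* hypothetical error value */

/*
 * Translate a pseudo-physical address into a machine address.
 * p2m[pfn] holds the machine frame number (mfn) backing pseudo-physical
 * frame pfn; max_pfn is the number of entries in the table.
 */
uint64_t toy_pseudo_to_machine(const uint32_t *p2m, size_t max_pfn,
                               uint64_t pseudo)
{
    uint64_t pfn = pseudo >> TOY_PAGE_SHIFT;

    assert(p2m != NULL);
    if (pfn >= max_pfn)
        return TOY_BAD_ADDR;                 /* outside the domain's memory */

    /* Swap the frame number, keep the offset within the page. */
    return ((uint64_t)p2m[pfn] << TOY_PAGE_SHIFT) | (pseudo & TOY_PAGE_MASK);
}
```

The resulting machine address is what a dumpfile reader would then look up in the vmcore; a frame number beyond max_pfn is rejected rather than read past the end of the table.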


Increase NR_CPUS

by Maneesh Soni

Hi Dave,

crash seg faults while opening a kdump with NR_CPUS=128, due to a
buffer overflow in max_cpudata_limit() on an i386 system.

--------
kmem_cache_s_array_nodes:

        if (!readmem(cache+OFFSET(kmem_cache_s_array), KVADDR, &cpudata[0],
            sizeof(ulong) * ARRAY_LENGTH(kmem_cache_s_array),
            "array cache array", RETURN_ON_ERROR))
                goto bail_out;

        for (i = max_limit = 0; (i < ARRAY_LENGTH(kmem_cache_s_array)) &&
             cpudata[i]; i++) {
                if (!readmem(cpudata[i]+OFFSET(array_cache_limit),
                    KVADDR, &limit, sizeof(int),
                    "array cache limit", RETURN_ON_ERROR))
                        goto bail_out;
                if (limit > max_limit)
                        max_limit = limit;
        }

        *cpus = i;        <<<<<< faults here
--------

The first readmem() call overwrites the parameter "cpus" on the stack.
ARRAY_LENGTH gives 128, whereas we have only 32 elements in
cpudata[NR_CPUS].

Although the default NR_CPUS in the kernel source is 32, it can go up
to 256 depending on the kernel config option CONFIG_NR_CPUS. So in
crash it should be defined as the maximum NR_CPUS. Please find the
patch below, which sets NR_CPUS to the maximum for the affected
architectures.

--- crash-4.0-2.30/defs.h	2006-06-07 01:16:33.000000000 +0530
+++ crash-4.0-2.30-fix/defs.h	2006-06-24 04:29:35.000000000 +0530
@@ -56,7 +56,7 @@
 #define FALSE (0)
 
 #ifdef X86
-#define NR_CPUS  (32)
+#define NR_CPUS  (256)
 #endif
 #ifdef X86_64
 #define NR_CPUS  (256)
@@ -68,7 +68,7 @@
 #define NR_CPUS  (32)
 #endif
 #ifdef IA64
-#define NR_CPUS  (512)
+#define NR_CPUS  (1024)
 #endif
 #ifdef PPC64
 #define NR_CPUS  (128)

Thanks
Maneesh
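
As a side illustration of the overflow described above, the sketch below clamps the number of entries copied to the capacity of the destination array and reports when the source was larger. It shows a different, purely defensive technique rather than the change in the patch (which raises NR_CPUS); TOY_NR_CPUS and toy_read_cpudata() are hypothetical names, and a plain memcpy() stands in for readmem().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_NR_CPUS 32   /* capacity compiled into the tool (assumed value) */

/*
 * Copy per-cpu pointers from src into dst, which must have room for
 * TOY_NR_CPUS entries. kernel_len is the array length reported by the
 * kernel's debug data and may exceed that capacity. Returns the number
 * of entries copied and sets *truncated when the source did not fit.
 */
size_t toy_read_cpudata(unsigned long *dst, const unsigned long *src,
                        size_t kernel_len, int *truncated)
{
    size_t n = kernel_len < TOY_NR_CPUS ? kernel_len : TOY_NR_CPUS;

    assert(dst != NULL && src != NULL && truncated != NULL);
    *truncated = (kernel_len > TOY_NR_CPUS);
    memcpy(dst, src, n * sizeof(*dst));      /* never writes past dst */
    return n;
}
```

A caller would treat a set truncated flag as a reason to warn or fail, since silently dropping the remaining entries would give incomplete results.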


crash-4.0-2.30 broken on 2.6.17-rc6-mm2 ?

by Badari Pulavarty

Known issue ?

Thanks,
Badari

elm3b29:~/vec.mm/linux-2.6.17-rc6 # /tmp/crash ./System.map vmlinux

crash 4.0-2.30
Copyright (C) 2002, 2003, 2004, 2005, 2006  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005  Fujitsu Limited
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.

GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

crash: cannot resolve "system_utsname"


Re: [Crash-utility] crash-4.0-2.30 broken on 2.6.17-rc6-mm2 ?

by Dave Anderson

> Badari Pulavarty wrote:
>
> > Just realized that -mm has patches to remove system_utsname.
> > Thing to watch out for future changes for crash.
> >
>
> What does sys_uname() read?
>
> Dave
>

I put it at the top of the crash.TODO list...

If anybody wants to take a crack at it before I get around to it,
be my guest.

Dave


ERROR: fixme, need to add more zones (ZONE_ALLOC)

by Takao Indoh

Hi,

When I tested LKCD on a machine which has big memory, an error occurred
and the following message was displayed.

fixme, need to add more zones (ZONE_ALLOC)

This message means that zone size (the size is defined statically by
ZONE_ALLOC) is too small. Attached patch fixes crash to expand zone
size dynamically.

Regards,
Takao Indoh

diff -Nurp crash-4.0-2.30.org/lkcd_common.c crash-4.0-2.30/lkcd_common.c
--- crash-4.0-2.30.org/lkcd_common.c	2006-06-14 22:14:09.000000000 +0900
+++ crash-4.0-2.30/lkcd_common.c	2006-06-15 12:34:08.000000000 +0900
@@ -670,6 +670,8 @@ save_offset(uint64_t paddr, off_t off)
 {
 	uint64_t zone, page;
 	int ii, ret;
+	int max_zones;
+	struct physmem_zone *zones;
 
 	zone = paddr & lkcd->zone_mask;
 
@@ -696,6 +698,7 @@ save_offset(uint64_t paddr, off_t off)
 		lkcd->num_zones++;
 	}
 
+retry:
 	/* find the zone */
 	for (ii=0; ii < lkcd->num_zones; ii++) {
 		if (lkcd->zones[ii].start == zone) {
@@ -737,8 +740,20 @@ save_offset(uint64_t paddr, off_t off)
 			ret = 1;
 			lkcd->num_zones++;
 		} else {
-			lkcd_print("fixme, need to add more zones (ZONE_ALLOC)\n");
-			exit(1);
+			/* need to expand zone */
+			max_zones = lkcd->max_zones * 2;
+			zones = malloc(max_zones * sizeof(struct physmem_zone));
+			if (!zones) {
+				return -1; /* This should be fatal */
+			}
+			BZERO(zones, max_zones * sizeof(struct physmem_zone));
+			memcpy(zones, lkcd->zones,
+				lkcd->max_zones * sizeof(struct physmem_zone));
+			free(lkcd->zones);
+
+			lkcd->zones = zones;
+			lkcd->max_zones = max_zones;
+			goto retry;
 		}
 	}
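
The grow-by-doubling idea in the patch can be shown in isolation with a small sketch. It uses realloc() in place of the patch's malloc/memcpy/free sequence, and the toy_zone and toy_zone_table types and both functions are hypothetical names for this example, not the LKCD structures in crash.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for a physical memory zone record. */
struct toy_zone {
    uint64_t start;
};

struct toy_zone_table {
    struct toy_zone *zones;
    size_t num_zones;   /* entries in use */
    size_t max_zones;   /* entries allocated */
};

/* Allocate a zeroed table with room for initial entries. Returns 0 or -1. */
int toy_zone_table_init(struct toy_zone_table *t, size_t initial)
{
    assert(t != NULL && initial > 0);
    t->zones = calloc(initial, sizeof(*t->zones));
    t->num_zones = 0;
    t->max_zones = t->zones ? initial : 0;
    return t->zones ? 0 : -1;
}

/* Append a zone, doubling the table when it is full. Returns 0 or -1. */
int toy_zone_add(struct toy_zone_table *t, uint64_t start)
{
    if (t->num_zones == t->max_zones) {
        size_t new_max = t->max_zones * 2;
        struct toy_zone *grown = realloc(t->zones, new_max * sizeof(*grown));

        if (!grown)
            return -1;                       /* the old table is still valid */
        /* Zero only the newly added half; existing entries are preserved. */
        memset(grown + t->max_zones, 0,
               (new_max - t->max_zones) * sizeof(*grown));
        t->zones = grown;
        t->max_zones = new_max;
    }
    t->zones[t->num_zones++].start = start;
    return 0;
}
```

Doubling keeps the number of reallocations logarithmic in the final zone count, and checking the realloc() result before overwriting the old pointer means an allocation failure does not lose the existing table.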


Re: crash with Xen dom0 image from kdump

by Horms

On Mon, Jun 12, 2006 at 01:41:07PM +0900, Kazuo Moriwaka wrote:
> Hi,
>
> I'm not clear about shadow mode; are vcpu->arch.shadow_table need for
> shadow-mode domains?

To be perfectly honest, I'm not clear about it either. The code below
probably does not cover shadow mode, but it should be easy enough to
fix, probably using vcpu->arch.shadow_table as you suggest. I'll look
into it some more.

--
西門宝曼 (サイモン・ホーマン) | Simon Horman (Horms)


Re: crash with Xen dom0 image from kdump

by David Anderson

> So it would appear that given the mfn of the xen_start_info structure
> of a domain, then the pfn_to_mfn_frame_list_list could be tracked down.

Even easier -- just getting the mfn of the "shared_info" page would be
even less painful. Take the xen_start_info page right out of the
picture...

Dave
