[patch 1/1] fix incomplete FILENAME in "swap" command
by Lin Feng Shen
From: Lin Feng Shen <shenlinf(a)cn.ibm.com>
dump_swap_info() invokes get_pathname() to get the file name of the swap
device.
If swap_vfsmnt isn't defined but old_block_size is defined in the kernel
struct swap_info_struct (e.g. 2.6 kernel), 0 is passed as vfsmnt to
get_pathname().
In this case, get_pathname() won't go up to the parent vfsmnt, so the file
name of the swap space is shown incompletely.
Here is an example. crash-4.0-25 was installed and udev was mounted on /dev
in sles10. Swap /dev/sda2 was active before kdump. The "swap" sub-command
in crash showed an incomplete swap file name.
---------------
crash> swap
FILENAME TYPE SIZE USED PCT PRIORITY
/sda2 PARTITION 2104504k 0k 0% -1
---------------
The solution is to retrieve vfsmnt from swap_file just as file_to_dentry()
does to get dentry, and then pass it to get_pathname().
Signed-off-by: Lin Feng Shen <shenlinf(a)cn.ibm.com>
---
diff -ruNp crash-4.0-2.18.orig/defs.h crash-4.0-2.18/defs.h
--- crash-4.0-2.18.orig/defs.h 2006-06-02 02:51:15.000000000 -0400
+++ crash-4.0-2.18/defs.h 2006-06-02 03:06:17.000000000 -0400
@@ -3031,6 +3031,7 @@ void close_tmpfile2(void);
void open_files_dump(ulong, int, struct reference *);
void get_pathname(ulong, char *, int, int, ulong);
ulong file_to_dentry(ulong);
+ulong file_to_vfsmnt(ulong);
void nlm_files_dump(void);
int get_proc_version(void);
int file_checksum(char *, long *);
diff -ruNp crash-4.0-2.18.orig/filesys.c crash-4.0-2.18/filesys.c
--- crash-4.0-2.18.orig/filesys.c 2006-06-02 02:51:14.000000000 -0400
+++ crash-4.0-2.18/filesys.c 2006-06-02 03:11:34.000000000 -0400
@@ -2544,6 +2544,20 @@ file_to_dentry(ulong file)
}
/*
+ * Get the vfsmnt associated with a file.
+ */
+ulong
+file_to_vfsmnt(ulong file)
+{
+ char *file_buf;
+ ulong vfsmnt;
+
+ file_buf = fill_file_cache(file);
+ vfsmnt = ULONG(file_buf + OFFSET(file_f_vfsmnt));
+ return vfsmnt;
+}
+
+/*
* get_pathname() fills in a pathname string for an ending dentry
* See __d_path() in the kernel for help fixing problems.
*/
diff -ruNp crash-4.0-2.18.orig/memory.c crash-4.0-2.18/memory.c
--- crash-4.0-2.18.orig/memory.c 2006-06-02 02:51:15.000000000 -0400
+++ crash-4.0-2.18/memory.c 2006-06-02 03:06:31.000000000 -0400
@@ -10293,7 +10293,7 @@ dump_swap_info(ulong swapflags, ulong *t
} else if (VALID_MEMBER
(swap_info_struct_old_block_size)) {
get_pathname(file_to_dentry(swap_file),
- buf, BUFSIZE, 1, 0);
+ buf, BUFSIZE, 1, file_to_vfsmnt(swap_file));
} else {
get_pathname(swap_file, buf, BUFSIZE, 1, 0);
}
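To illustrate why passing 0 as vfsmnt truncates the name, here is a minimal sketch of a path walk that can only cross a mount point when it is given a vfsmnt. It is a toy model, not the crash or kernel code: struct toy_dentry, struct toy_vfsmnt and toy_pathname() are hypothetical names, and the real get_pathname() reads these fields from the dumpfile instead of following C pointers.

```c
#include <stdio.h>
#include <string.h>

/* Toy stand-ins for the kernel's dentry and vfsmount; only the fields
 * needed to walk a path are modelled (hypothetical, for illustration). */
struct toy_dentry {
    const char *name;
    struct toy_dentry *parent;      /* points to itself at a filesystem root */
};

struct toy_vfsmnt {
    struct toy_dentry *root;        /* root dentry of this mounted filesystem */
    struct toy_dentry *mountpoint;  /* dentry in the parent fs it is mounted on */
    struct toy_vfsmnt *parent;      /* NULL for the top-level mount */
};

/* Prepend "/name" to the string already held in buf. */
static void prepend(char *buf, size_t len, const char *name)
{
    char tmp[256];

    snprintf(tmp, sizeof(tmp), "/%s%s", name, buf);
    snprintf(buf, len, "%s", tmp);
}

/* Build the path of dentry into buf. When mnt is NULL the walk stops at
 * the root of the dentry's own filesystem and cannot cross the mount point. */
void toy_pathname(struct toy_dentry *dentry, struct toy_vfsmnt *mnt,
                  char *buf, size_t len)
{
    buf[0] = '\0';
    for (;;) {
        if (dentry->parent == dentry) {     /* reached a filesystem root */
            if (!mnt || !mnt->parent)
                break;                      /* no (more) mounts to climb */
            dentry = mnt->mountpoint;       /* hop into the parent filesystem */
            mnt = mnt->parent;
            continue;
        }
        prepend(buf, len, dentry->name);
        dentry = dentry->parent;
    }
    if (buf[0] == '\0')
        snprintf(buf, len, "/");
}
```

With a "sda2" dentry on a filesystem mounted at /dev, calling toy_pathname() with a NULL mnt gives "/sda2", while passing the mount gives "/dev/sda2", which mirrors the before and after behaviour of the patch.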
18 years, 5 months
crash version 4.0-2.31 is available
by Dave Anderson
- Bumped crash-internal NR_CPUS for x86 and ia64; added a warning
message to "recompile crash" and forced an initialization failure
when the kernel's configured NR_CPUS is greater than the maximum
allowed NR_CPUS value compiled into crash.
(maneesh(a)in.ibm.com, anderson(a)redhat.com)
- Fix for initialization failure indicating a kernel/memory-source
mismatch when x86 kernel configures its physical memory start
address higher than the traditional 1MB starting point.
(anderson(a)redhat.com)
- Fix for kernels that have replaced the "system_utsname" data
structure with contents of the "init_uts_ns" data structure.
This fixes a "crash: cannot resolve system_utsname" initialization
failure. (pbadari(a)us.ibm.com, anderson(a)redhat.com)
- Fix for large LKCD dumpfiles that resulted in an initialization
time failure indicating "fixme, need to add more zones (ZONE_ALLOC)".
When statically-defined ZONE_ALLOC value is too small, the fix
expands the zone size dynamically. (indou.takao(a)jp.fujitsu.com)
- Fix for "kmem -i" failure when the "all_bdevs" block_device list
is empty. Part of the command output would be displayed, followed by
"kmem: invalid kernel virtual address: 0 type: inode buffer".
(anderson(a)redhat.com)
- First pass at supporting a xen hypervisor kexec/kdump vmcore as the
dumpfile format for the dom0 vmlinux. Developed/tested OK on an x86
vmlinux/vmcore set supplied by horms(a)verge.net.au. Code for x86_64
is in place, but untested. (anderson(a)redhat.com)
- Also in place, but untested, is initial support for xen x86 PAE
kernels. (anderson(a)redhat.com)
Download from: http://people.redhat.com/anderson
18 years, 5 months
crash with Xen image from kdump
by Dave Anderson
> I added some code to kdump to have it record CR3 for dom0. This is
> done using a second note in the per-cpu notes area, which for now
> just stores a single 4-byte entity, the mfn of that CPU in dom0
> if it was present in dom0.
>
> I have made a dump available that includes this. The tarball
> also includes the kernels, xen, symbol files, and patches to xen.
> If you want to find the cr3 saving code, it's in ./arch/x86/crash.c
>
> I plan to post this update to xen-devel shortly, hopefully tomorrow,
> after forward-porting to the latest xen tree (I'm still working off a
> tree from about 3 weeks ago).
>
> http://packages.vergenet.net/tmp/xen-unstable.hg+kexec-20060616.tar.bz2
OK -- here's a proof-of-concept running the dom0 vmlinux against the
xen kdump:
# crash vmlinux vmcore
crash 4.0-2.31-rc1
Copyright (C) 2002, 2003, 2004, 2005, 2006 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005 Fujitsu Limited
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
KERNEL: vmlinux
DUMPFILE: vmcore
CPUS: 2
DATE: Wed Jun 14 15:05:01 2006
UPTIME: 00:04:40
LOAD AVERAGE: 1.22, 0.39, 0.13
TASKS: 94
NODENAME: aiko.lab.ultramonkey.org
RELEASE: 2.6.16.13-xen
VERSION: #7 SMP Fri Jun 9 16:25:32 JST 2006
MACHINE: i686 (866 Mhz)
MEMORY: 887.4 MB
PANIC: "SysRq : Trigger a crashdump"
PID: 3949
COMMAND: "do_kdump"
TASK: f3e64030 [THREAD_INFO: f3dba000]
CPU: 1
STATE: TASK_RUNNING (SYSRQ)
crash> bt -a
PID: 0 TASK: c02ce460 CPU: 0 COMMAND: "swapper"
#0 [c030ff34] schedule at c028e648
#1 [c030ffb0] cpu_idle at c0103e9f
PID: 3949 TASK: f3e64030 CPU: 1 COMMAND: "do_kdump"
#0 [f3dbbed8] crash_kexec at c0140c45
#1 [f3dbbf28] __handle_sysrq at c01f54e4
#2 [f3dbbf54] write_sysrq_trigger at c019cbff
#3 [f3dbbf6c] vfs_write at c0168dbf
#4 [f3dbbf90] sys_write at c0169736
#5 [f3dbbfb8] system_call at c0105542
EAX: 00000004 EBX: 00000001 ECX: 080f8408 EDX: 00000002
DS: 007b ESI: 00000002 ES: 007b EDI: b7f007c0
SS: 007b ESP: bfb5ffc8 EBP: bfb5ffe4
CS: 0073 EIP: b7e93028 ERR: 00000004 EFLAGS: 00000246
crash>
As I discussed earlier, given that this is a writable-page-table
kernel, and having any legitimate CR3 (I just use the first one found
in the ELF header), I first get the value of "max_pfn" (x86),
and then the value of "phys_to_machine_mapping", which is the address of
dom0's "phys_to_machine_mapping[max_pfn]" array. From that, all
subsequent pseudo-physical address requests can be translated
into the physical address for the existing read_netdump() function
to access. As we talked about before, this won't work for
shadow-page-table kernels; for those I would need to have the
"pfn_to_mfn_frame_list_list" mfn value from the shared,
per-domain, "arch_shared_info" structure(s). With that single
value, the phys_to_machine_mapping[] array can be resurrected
for both writable- and shadow-page-table kernels.
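The translation step described above can be sketched roughly as follows. This is only an illustration under assumptions, not the code in crash: toy_pseudo_to_machine() and the TOY_* macros are hypothetical names, a 4KB page size is assumed, and the table is modelled as an in-memory array of 32-bit entries rather than something read from the vmcore.

```c
#include <stdint.h>
#include <stddef.h>

#define TOY_PAGE_SHIFT 12                               /* assumes 4KB pages */
#define TOY_PAGE_MASK  ((1ULL << TOY_PAGE_SHIFT) - 1)
#define TOY_BAD_ADDR   UINT64_MAX                       /* hypothetical error value */

/* Translate a pseudo-physical address into a machine address using a
 * pfn -> mfn table with max_pfn entries. Returns TOY_BAD_ADDR when the
 * pfn lies outside the table. */
uint64_t toy_pseudo_to_machine(const uint32_t *p2m, size_t max_pfn,
                               uint64_t paddr)
{
    uint64_t pfn = paddr >> TOY_PAGE_SHIFT;

    if (pfn >= max_pfn)
        return TOY_BAD_ADDR;

    /* swap the page frame number, keep the offset within the page */
    return ((uint64_t)p2m[pfn] << TOY_PAGE_SHIFT) | (paddr & TOY_PAGE_MASK);
}
```

The result is the address that would then be handed to the dumpfile read routine; the bounds check against max_pfn is the reason max_pfn has to be read first.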
Also, with either the cr3 or pfn_to_mfn_frame_list_list schemes,
if those values were made available for *all* of the other domains
instead of just dom0, then we could run a crash session against
any of the domains on the system.
In any case, this is pretty cool for starters...
BTW, I've created a new n_type value to handle this particular
invocation, which I understand will be subject to change.
Note that the spelling in your PT_NOTE is a bit strange:
crash> help -n
...
Elf32_Nhdr:
n_namesz: 18 ("Xen Domanin-0 CR3")
n_descsz: 4
n_type: 10000001 (NT_XEN_KDUMP_CR3)
00027227
...
crash>
Anyway, I'll do the same thing for x86_64 (untested) and
update the crash release so you'll have something to work
with.
Thanks,
Dave
18 years, 5 months
Increase NR_CPUS
by Maneesh Soni
Hi Dave,
crash seg faults while opening a kdump with NR_CPUS=128, due to a buffer overflow
in max_cpudata_limit() on an i386 system.
--------
kmem_cache_s_array_nodes:
if (!readmem(cache+OFFSET(kmem_cache_s_array),
KVADDR, &cpudata[0],
sizeof(ulong) * ARRAY_LENGTH(kmem_cache_s_array),
"array cache array", RETURN_ON_ERROR))
goto bail_out;
for (i = max_limit = 0; (i < ARRAY_LENGTH(kmem_cache_s_array)) &&
cpudata[i]; i++) {
if (!readmem(cpudata[i]+OFFSET(array_cache_limit),
KVADDR, &limit, sizeof(int),
"array cache limit", RETURN_ON_ERROR))
goto bail_out;
if (limit > max_limit)
max_limit = limit;
}
*cpus = i; <<<<<< faults here
--------
The first readmem() call overwrites the parameter "cpus" on the stack. ARRAY_LENGTH
gives 128 whereas we have only 32 elements in cpudata[NR_CPUS].
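For illustration only, the general guard against this kind of overrun is to compare the kernel-side array length with the local buffer capacity before copying. The sketch below is not crash code: toy_read_cpudata() and TOY_NR_CPUS are hypothetical, and memcpy() stands in for readmem().

```c
#include <stddef.h>
#include <string.h>

#define TOY_NR_CPUS 32   /* capacity compiled into the tool (illustrative value) */

/* Copy the per-CPU pointers from a kernel-side array of kernel_len entries
 * into dst, which holds TOY_NR_CPUS entries. Returns the number of entries
 * copied, or -1 if the kernel array is larger than the local buffer. */
int toy_read_cpudata(unsigned long *dst, const unsigned long *kernel_array,
                     size_t kernel_len)
{
    if (kernel_len > TOY_NR_CPUS)
        return -1;          /* refuse instead of writing past the end of dst */

    memcpy(dst, kernel_array, kernel_len * sizeof(unsigned long));
    return (int)kernel_len;
}
```

Failing early like this turns a silent stack corruption into a reportable error; raising the compiled-in limit then only changes how large a kernel configuration is accepted.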
Although the default NR_CPUS in the kernel source is 32, it can go up to
256 depending on the kernel config option CONFIG_NR_CPUS. So, in crash it
should be defined as the maximum NR_CPUS. Please find the patch below, which
makes sure crash has the maximum NR_CPUS for the various architectures.
--- crash-4.0-2.30/defs.h 2006-06-07 01:16:33.000000000 +0530
+++ crash-4.0-2.30-fix/defs.h 2006-06-24 04:29:35.000000000 +0530
@@ -56,7 +56,7 @@
#define FALSE (0)
#ifdef X86
-#define NR_CPUS (32)
+#define NR_CPUS (256)
#endif
#ifdef X86_64
#define NR_CPUS (256)
@@ -68,7 +68,7 @@
#define NR_CPUS (32)
#endif
#ifdef IA64
-#define NR_CPUS (512)
+#define NR_CPUS (1024)
#endif
#ifdef PPC64
#define NR_CPUS (128)
Thanks
Maneesh
18 years, 5 months
crash-4.0-2.30 broken on 2.6.17-rc6-mm2 ?
by Badari Pulavarty
Known issue ?
Thanks,
Badari
elm3b29:~/vec.mm/linux-2.6.17-rc6 # /tmp/crash ./System.map vmlinux
crash 4.0-2.30
Copyright (C) 2002, 2003, 2004, 2005, 2006 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005 Fujitsu Limited
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
crash: cannot resolve "system_utsname"
18 years, 5 months
Re: [Crash-utility] crash-4.0-2.30 broken on 2.6.17-rc6-mm2 ?
by Dave Anderson
> Badari Pulavarty wrote:
>
> > Just realized that -mm has patches to remove system_utsname.
> > Something to watch out for in future changes to crash.
> >
>
> What does sys_uname() read?
>
> Dave
>
I put it at the top of the crash.TODO list...
If anybody wants to take a crack at it before
I get around to it, be my guest.
Dave
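One possible shape for handling a renamed kernel symbol is to try a list of candidate names in order and use the first one that resolves. The sketch below is a toy under assumptions: toy_first_symbol() is a hypothetical helper working on a plain array of names, not the crash symbol-table interface, and it says nothing about where the utsname data sits inside the replacement structure.

```c
#include <stddef.h>
#include <string.h>

/* Return the first name from candidates that is present in symtab,
 * or NULL if none of them is. Candidates are tried in the order given. */
const char *toy_first_symbol(const char *const *symtab, size_t nsyms,
                             const char *const *candidates, size_t ncand)
{
    size_t c, s;

    for (c = 0; c < ncand; c++)
        for (s = 0; s < nsyms; s++)
            if (strcmp(symtab[s], candidates[c]) == 0)
                return candidates[c];
    return NULL;
}
```

Called with the candidates "system_utsname" and then "init_uts_ns", older kernels resolve the first name and newer ones fall through to the second; a NULL return is the case that should still produce a clear initialization error.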
18 years, 5 months
ERROR: fixme, need to add more zones (ZONE_ALLOC)
by Takao Indoh
Hi,
When I tested LKCD on a machine with a large amount of memory,
an error occurred and the following message was displayed.
fixme, need to add more zones (ZONE_ALLOC)
This message means that the zone size (defined statically
by ZONE_ALLOC) is too small. The attached patch
fixes crash to expand the zone size dynamically.
Regards,
Takao Indoh
diff -Nurp crash-4.0-2.30.org/lkcd_common.c crash-4.0-2.30/lkcd_common.c
--- crash-4.0-2.30.org/lkcd_common.c 2006-06-14 22:14:09.000000000 +0900
+++ crash-4.0-2.30/lkcd_common.c 2006-06-15 12:34:08.000000000 +0900
@@ -670,6 +670,8 @@ save_offset(uint64_t paddr, off_t off)
{
uint64_t zone, page;
int ii, ret;
+ int max_zones;
+ struct physmem_zone *zones;
zone = paddr & lkcd->zone_mask;
@@ -696,6 +698,7 @@ save_offset(uint64_t paddr, off_t off)
lkcd->num_zones++;
}
+retry:
/* find the zone */
for (ii=0; ii < lkcd->num_zones; ii++) {
if (lkcd->zones[ii].start == zone) {
@@ -737,8 +740,20 @@ save_offset(uint64_t paddr, off_t off)
ret = 1;
lkcd->num_zones++;
} else {
- lkcd_print("fixme, need to add more zones (ZONE_ALLOC)\n");
- exit(1);
+ /* need to expand zone */
+ max_zones = lkcd->max_zones * 2;
+ zones = malloc(max_zones * sizeof(struct physmem_zone));
+ if (!zones) {
+ return -1; /* This should be fatal */
+ }
+ BZERO(zones, max_zones * sizeof(struct physmem_zone));
+ memcpy(zones, lkcd->zones,
+ lkcd->max_zones * sizeof(struct physmem_zone));
+ free(lkcd->zones);
+
+ lkcd->zones = zones;
+ lkcd->max_zones = max_zones;
+ goto retry;
}
}
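For comparison, the same doubling step can also be written with realloc(), which keeps the existing entries without a separate copy. This is an alternative sketch, not the code in the patch above: struct toy_zone, struct toy_zone_table and toy_grow_zones() are hypothetical names and the zone layout is simplified.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for the LKCD zone bookkeeping (hypothetical). */
struct toy_zone {
    unsigned long long start;
    void *pages;
};

struct toy_zone_table {
    struct toy_zone *zones;
    int num_zones;
    int max_zones;
};

/* Double the capacity of the table. Returns 0 on success, or -1 on
 * allocation failure, in which case the existing table is left untouched. */
int toy_grow_zones(struct toy_zone_table *t)
{
    int new_max = t->max_zones ? t->max_zones * 2 : 1;
    struct toy_zone *z;

    z = realloc(t->zones, (size_t)new_max * sizeof(*z));
    if (!z)
        return -1;

    /* zero only the newly added tail; existing entries are preserved */
    memset(z + t->max_zones, 0,
           (size_t)(new_max - t->max_zones) * sizeof(*z));

    t->zones = z;
    t->max_zones = new_max;
    return 0;
}
```

A caller would invoke toy_grow_zones() when num_zones reaches max_zones and then retry the insertion, much like the "goto retry" in the patch.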
18 years, 5 months
Re: crash with Xen dom0 image from kdump
by Horms
On Mon, Jun 12, 2006 at 01:41:07PM +0900, Kazuo Moriwaka wrote:
> Hi,
>
> I'm not clear about shadow mode; is vcpu->arch.shadow_table needed for
> shadow-mode domains?
To be perfectly honest, I'm not clear about it either.
The code below probably does not cover shadow mode, but
it should be easy enough to fix, probably using vcpu->arch.shadow_table
as you suggest. I'll look into it some more.
--
西門 宝曼 (サイモン・ホーマン) | Simon Horman (Horms)
18 years, 5 months
Re: crash with Xen dom0 image from kdump
by David Anderson
> So it would appear that given the mfn of the xen_start_info structure
> of a domain, then the pfn_to_mfn_frame_list_list could be tracked down.
Even easier -- just getting the mfn of the "shared_info" page would be
less painful. That takes the xen_start_info page right out of the
picture...
Dave
18 years, 5 months