March 2009 - Crash-utility - Crash Utility List Archives

by Anirudh Srinivasan

Hello friends,

I was setting up a netdump server at my workplace. I followed this procedure:

Server Configuration:

1. Verify that the netdump server is installed: rpm -q netdump-server. If it is not installed, install it by running the command: up2date netdump-server.
2. After the netdump server package is installed, change the password for the "netdump" user to something that you know: passwd netdump
3. Enable the netdump server: chkconfig netdump-server on
4. Start the netdump server: service netdump-server start

Client Configuration:

1. Verify that the netdump client is installed: rpm -q netdump. If it is not installed, install it by running the command: up2date netdump.
2. Edit /etc/sysconfig/netdump and add the following line: NETDUMPADDR=192.168.0.5
   (192.168.0.5 should be changed to the IP address of the netdump server.)
3. Enter the following command and give the netdump password when prompted: service netdump propagate
4. Enable the netdump client: chkconfig netdump on
5. Start the netdump client: service netdump start

After doing this, I get the following message:

# service netdump start
netdump: cannot arp <ipaddress>
netdump: cannot find <ipaddress> in arp cache
netdump: can't resolve <ipaddress> MAC address
netdump server address resolution [FAILED]

What could be the reason for this? How can I solve it?

Thanks,
Anirudh Srinivasan
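One possibility worth ruling out with the error above: ARP only resolves addresses on the directly attached network segment, so the lookup can fail when NETDUMPADDR names a host on a different subnet. The Python sketch below reads NETDUMPADDR from sysconfig-style text and checks it against a client interface. The helper names are hypothetical, and the client address 192.168.1.20/24 is an assumed example, not a value from the setup described above.

```python
import ipaddress


def parse_netdump_addr(config_text):
    """Return the NETDUMPADDR value from sysconfig-style text, or None."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "NETDUMPADDR":
            return value.strip().strip('"')
    return None


def on_same_subnet(server_ip, client_interface):
    """True if server_ip lies in the client's directly attached network."""
    network = ipaddress.ip_interface(client_interface).network
    return ipaddress.ip_address(server_ip) in network


config = "# /etc/sysconfig/netdump\nNETDUMPADDR=192.168.0.5\n"
server = parse_netdump_addr(config)
# 192.168.1.20/24 is an assumed client address, used for illustration only.
print(server, on_same_subnet(server, "192.168.1.20/24"))
```

If the check reports a different subnet, that alone does not prove it is the cause here; a down interface, a firewall, or a mistyped address would produce similar symptoms.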

16 years, 4 months


live crash(4.0-5.0.3) invocation fails on rhel5

by Nipul Gandhi

Hi all -

What am I doing wrong here?

[root@wal-rhel5-04 kern]# uname -a
Linux wal-rhel5-04 2.6.18-92.el5 #1 SMP Tue Apr 29 13:16:15 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux

Using the vmlinux installed by these debuginfo RPMs:

kernel-debuginfo-common-2.6.18-92.el5
kernel-debug-debuginfo-2.6.18-92.el5

[root@wal-rhel5-04 kern]# crash /usr/lib/debug/lib/modules/2.6.18-92.el5debug/vmlinux
:
:
WARNING: /usr/lib/debug/lib/modules/2.6.18-92.el5debug/vmlinux
         and /proc/version do not match!

WARNING: /proc/version indicates kernel version: 2.6.18-92.el5

crash: please use the vmlinux file for that kernel version, or try using
       the System.map for that kernel version as an additional argument.

[root@wal-rhel5-04 tmp]# cat /proc/version
Linux version 2.6.18-92.el5 (brewbuilder(a)ls20-bc2-13.build.redhat.com) (gcc version 4.1.2 20071124 (Red Hat 4.1.2-41)) #1 SMP Tue Apr 29 13:16:15 EDT 2008

I tried using the System.map as an additional argument as well, but then it segfaulted:

# crash /usr/lib/debug/lib/modules/2.6.18-92.el5debug/vmlinux /boot/System.map-2.6.18-92.el5
Segmentation fault (core dumped)

Thanks in advance for any help.

-Nipul
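The warning above is about the vmlinux file and /proc/version disagreeing. The Python sketch below is not how crash performs that check; it only compares the release string in /proc/version with the modules directory name in the vmlinux path, using the two strings quoted in the message. The helper names are hypothetical.

```python
import re


def running_release(proc_version):
    """Extract the kernel release from the text of /proc/version."""
    match = re.match(r"Linux version (\S+)", proc_version)
    return match.group(1) if match else None


def vmlinux_release(path):
    """Extract the modules directory name from a debuginfo vmlinux path."""
    match = re.search(r"/lib/modules/([^/]+)/vmlinux$", path)
    return match.group(1) if match else None


proc_version = "Linux version 2.6.18-92.el5 (gcc version 4.1.2 20071124) #1 SMP"
vmlinux = "/usr/lib/debug/lib/modules/2.6.18-92.el5debug/vmlinux"

running = running_release(proc_version)
debuginfo = vmlinux_release(vmlinux)
print("running kernel:   ", running)
print("vmlinux built for:", debuginfo)
print("match:", running == debuginfo)
```

With the strings from the transcript, the two names differ by the "debug" suffix on the modules directory.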

16 years, 4 months


Re: [Crash-utility] Question about timestamp output for "sys"

by Dave Anderson

----- "James Washer" <washer(a)trlp.com> wrote:
> The time is aware of MY timezone (easily tested).. but I'm still not
> sure if it is the time of the panic... or some later time
>
> On Mon, 2009-03-30 at 12:08 -0700, James Washer wrote:
> > If I run 'sys', I see timestamps such as
> > DATE: Thu Mar 26 08:53:13 2009
> >
> > What "time" is this.. the time the panic occurred? The time the dump was
> > "collected"? Is it Zulu time zone, is it my (the crash investigator's)
> > time zone, is it the time zone of the system that crashed?

It's a ctime() translation of the contents of the kernel's "xtime" timespec structure. So running on a live system, you can see it change.

On a dumpfile, that's a good question, because thinking about it, it may have slightly different meanings depending upon the dumpfile-creation mechanism used. So, for example, on a netdump or diskdump it's whatever was last there when the kernel memory containing the data structure was copied to disk or over the network. With a kdump, it would still be getting bumped up until the point where the kernel transitions/kexec's into the secondary kernel, right?

Anyway, it's *somewhere* around the time of the panic...

Dave
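Since ctime() formats a seconds-since-epoch count in the local time zone of the process that calls it, the same stored value prints differently depending on where the dump is examined, which fits the observation quoted above. A small Python sketch of that behaviour follows; the tv_sec value is an assumption, built by treating the DATE line from the question as if it were UTC.

```python
import calendar
import time


def sys_date_local(tv_sec):
    """Format an epoch seconds count as ctime() does, in local time."""
    return time.ctime(tv_sec)


def sys_date_utc(tv_sec):
    """Format the same count in UTC, using the same layout as ctime()."""
    return time.asctime(time.gmtime(tv_sec))


# Assumed value: "Thu Mar 26 08:53:13 2009" interpreted as UTC.
tv_sec = calendar.timegm((2009, 3, 26, 8, 53, 13, 0, 0, 0))

print("tv_sec:", tv_sec)
print("UTC:   ", sys_date_utc(tv_sec))
print("local: ", sys_date_local(tv_sec))
```

Running this with different TZ settings changes only the "local" line; the stored seconds count is the same.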

16 years, 4 months


Question about timestamp output for "sys"

by James Washer

If I run 'sys', I see timestamps such as

DATE: Thu Mar 26 08:53:13 2009

What "time" is this: the time the panic occurred? The time the dump was "collected"? Is it Zulu time zone, is it my (the crash investigator's) time zone, or is it the time zone of the system that crashed?

Thanks
- jim

16 years, 4 months


crash version 4.0-8.8 is available

by Dave Anderson

- If a live kernel crash session fails during initialization due to read errors, and it appears to be because the running kernel was configured with CONFIG_STRICT_DEVMEM, display this warning message: "crash: This kernel may be configured with CONFIG_STRICT_DEVMEM, which renders /dev/mem unusable as a live memory source."
  (anderson(a)redhat.com)

- Fix for the "bt" command to prevent a segmentation violation seen with an x86_64 Egenera/LKCD dumpfile where the starting stack hooks for the active tasks in the dumpfile header were nonsensical.
  (anderson(a)redhat.com)

- Fix for the chronological display of the kernel printk buffer data by the "log" output if the administrator has cleared the buffer with syslog() or klogctl().
  (oomichi(a)mxs.nes.nec.co.jp)

- Change the message displayed when supplying a non-process stack address as an argument to "bt -S". Because the supplied address is typically valid, such as a hard or soft IRQ stack address, the message will indicate "non-process address" instead of "invalid stack address".
  (anderson(a)redhat.com)

- The crash-<release>.src.rpm will create an additional binary crash-extensions-<release>.rpm file containing the sial.so and dminfo.so extension modules. The modules will be installed in the /usr/lib[64]/crash/extensions directory.
  (holzheu(a)linux.vnet.ibm.com, anderson(a)redhat.com)

- If a shared-object filename passed to the "extend" command is not expressed with a fully-qualified pathname, the following directories will be searched in the order shown, and the first instance of the file that is found will be selected:

    1. the current working directory
    2. the directory specified in the CRASH_EXTENSIONS shell environment variable
    3. /usr/lib64/crash/extensions (64-bit architectures)
    4. /usr/lib/crash/extensions

  The same rules will be applied when unloading shared object files with "extend -u <shared-object>". Without the patch, only files in the current directory or those specified with a fully-qualified pathname were accepted.
  (anderson(a)redhat.com)

- Changed the manner in which the "bt" command determines which PID 0 swapper task was interrupted by an ia64 INIT or MCA exception. There is an existing ia64 INIT/MCA handler bug which incorrectly writes the pseudo task's command name in its comm[] name string such that the cpu number may not be part of the string. If that happens without this patch, the "bt" command fails to make the link back to the interrupted task, and displays the error message: "bt: unwind: failed to locate return link (ip=0x0)!"
  (anderson(a)redhat.com)

- Removed an unused initialized variable in get_task_mem_usage().
  (junkoi2004(a)gmail.com)

- Added a debug-level 8 statement in readmem() that will display the current input address and its translated physical address under the existing debug-level 4 "<readmem: ...>" debug line, put in place to aid in debugging read and/or seek errors.
  (anderson(a)redhat.com)

Download from: http://people.redhat.com/anderson
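The "extend" search order described in the notes above can be sketched in Python as follows. This is an illustration of the documented rules only, not crash's own code; the function names and the injectable `exists` parameter are hypothetical.

```python
import os
import posixpath


def extension_search_dirs(cwd, environ, is_64bit):
    """Directories to search, in the order listed in the release notes."""
    dirs = [cwd]
    env_dir = environ.get("CRASH_EXTENSIONS")
    if env_dir:
        dirs.append(env_dir)
    if is_64bit:
        dirs.append("/usr/lib64/crash/extensions")
    dirs.append("/usr/lib/crash/extensions")
    return dirs


def find_extension(name, cwd, environ, is_64bit, exists=os.path.exists):
    """Return the first matching path for a shared object, or None."""
    if posixpath.isabs(name):
        # Fully-qualified pathnames bypass the directory search.
        return name if exists(name) else None
    for directory in extension_search_dirs(cwd, environ, is_64bit):
        candidate = posixpath.join(directory, name)
        if exists(candidate):
            return candidate
    return None


print(extension_search_dirs("/root", {"CRASH_EXTENSIONS": "/opt/ext"}, True))
```

Passing `exists` as a parameter keeps the lookup testable without touching the real filesystem.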

16 years, 4 months


Fwd: crash seek error

by Dave Anderson

----- Forwarded Message -----
From: "Dharmosoth Seetharam" <seetharam_21(a)yahoo.com>
To: "Dave Anderson" <anderson(a)redhat.com>
Sent: Wednesday, March 11, 2009 2:12:01 PM GMT -05:00 US/Canada Eastern
Subject: Re: crash seek error

Hi Dave,

I have compiled the latest crash tool and tried it with the dump file; it looks good.
Thanks for your quick suggestion.

Sure, I will also include the mailing list.

Thanks a lot.

regards,
Seetharam

16 years, 5 months


Re: [Crash-utility] Re: crash seek error

by Dave Anderson

----- "Dave Anderson" <anderson(a)redhat.com> wrote:
> ----- "Dharmosoth Seetharam" <seetharam_21(a)yahoo.com> wrote:
> > dump_header:
> > dh_magic_number: 618f23ed (DUMP_MAGIC_NUMBER)
> > dh_version: 8 (LKCD_DUMP_V8)
> > dh_header_size: 734
> > dh_dump_level: f
> > (DUMP_LEVEL_HEADER|DUMP_LEVEL_KERN|DUMP_LEVEL_USED|DUMP_LEVEL_ALL)
> > dh_page_size: 4096
> > dh_memory_size: 524153
> > dh_memory_start: c0000000
> > dh_memory_end: 618f23ed
> > dh_num_pages: 524153
> > dh_panic_string: Compulsory dump(stat of mkexec was set as 2).
> > dh_time: Tue Mar 10 17:48:00 2009
> > dh_utsname_sysname: Linux
> > dh_utsname_nodename: Assam
> > dh_utsname_release: 2.6.12-5MKEXEC
> > dh_utsname_version: #7 SMP Thu Mar 5 15:25:22 IST 2009
> > dh_utsname_machine: i686
> > dh_utsname_domainname: (none)
> > dh_current_task: efc4a020
> > dh_dump_compress: 0 (DUMP_COMPRESS_NONE)
> > dh_dump_flags: 80000000 ()
> > dh_dump_device: 0
> > unknown page flag in dump: 2de
> > found DUMP_DH_END
> > <readmem: 8015564b, KVADDR, "x86_omit_frame_pointer", 8, (ROE), 7fbbe228>
> > crash: seek error: kernel virtual address: 8015564b type: "x86_omit_frame_pointer"
> > <readmem: 804b1210, KVADDR, "xtime", 8, (FOE), 834d234>
> > crash: seek error: kernel virtual address: 804b1210 type: "xtime"
> > [root@Assam ~]#
> >
> > can you please help me in this.

One other thing to look at...

> > dh_memory_start: c0000000

The failing kernel virtual addresses are 8015564b and 804b1210, so apparently you're running a kernel configured with a 2G/2G split? I'm not sure whether the crash utility even works with that configuration. Crash does support the old RHEL4 "hugemem" 4G/4G kernels, but I've never worked with a 2G/2G kernel. In any case, it may work by dumb luck -- to be sure, first try to run crash on the live system.

Anyway, even though the dump header advertises a kernel configured with the traditional 3G/1G split (with kernel memory starting at c0000000), that "dh_memory_start" field is not used by the crash utility.

Dave
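The reasoning about the split can be made concrete with a little arithmetic. For directly mapped kernel memory on 32-bit x86, the physical address is the virtual address minus the kernel base (PAGE_OFFSET). The Python sketch below applies that to the two failing addresses. It is a simplification that ignores vmalloc and highmem mappings, and the 0x80000000 base for a 2G/2G split is an assumption about how this particular kernel was built.

```python
PAGE_OFFSET_3G_1G = 0xC0000000  # traditional split, as advertised by dh_memory_start
PAGE_OFFSET_2G_2G = 0x80000000  # assumed base for a 2G/2G split


def lowmem_virt_to_phys(vaddr, page_offset):
    """Direct-mapped translation: phys = vaddr - PAGE_OFFSET.

    Returns None when vaddr lies below the kernel base, meaning it is
    not a kernel address under that split.
    """
    if vaddr < page_offset:
        return None
    return vaddr - page_offset


for vaddr in (0x8015564B, 0x804B1210):
    for label, base in (("3G/1G", PAGE_OFFSET_3G_1G), ("2G/2G", PAGE_OFFSET_2G_2G)):
        phys = lowmem_virt_to_phys(vaddr, base)
        shown = "below kernel base" if phys is None else hex(phys)
        print(f"{vaddr:#x} with {label}: {shown}")
```

Both addresses fall below c0000000, so they only make sense as kernel addresses if the kernel base is lower than the header suggests.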

16 years, 5 months


Re: crash seek error

by Dave Anderson

----- "Dharmosoth Seetharam" <seetharam_21(a)yahoo.com> wrote:
> Hi,
>
> I have configured the linux kernel 2.6.12 to support the kernel crash dump using the
> "mini kernel dump" method.

Sorry, but I have no clue what the "mini kernel dump" method is. Although from the output below, it looks to be an LKCD derivative.

>
> I have few questions please help me.
>
> details:
> kernel : linux 2.6.12
> arch : i386
> distr: centOS
> System RAM : 8G
>
> 1) While writing dump to block device its got hung after writing 4GB
>
> 2) I have reduced my SYSTEM RAM to 2G and tried it dumped 2G to block device
>
> But, crash tool unable to read it.
> following is the error
>
> ----
> [root@Assam ~]# crash -d7 /root/linux-2.6.12/vmlinux
> /scratch/dump/20090310175831/lkcd_dump
> crash 4.0-2.15

Your crash version is remarkably old -- 3+ years old -- and it's always worth your while to update to the latest version.

> Copyright (C) 2002, 2003, 2004, 2005 Red Hat, Inc.
> Copyright (C) 2004, 2005 IBM Corporation
> Copyright (C) 1999-2005 Hewlett-Packard Co
> Copyright (C) 2005 Fujitsu Limited
> Copyright (C) 2005 NEC Corporation
> Copyright (C) 1999, 2002 Silicon Graphics, Inc.
> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
> This program is free software, covered by the GNU General Public License,
> and you are welcome to change it and/or distribute copies of it under
> certain conditions. Enter "help copying" to see the conditions.
> This program has absolutely no warranty. Enter "help warranty" for details.
> crash: diskdump: dump does not have panic dump header
> dump_header:
> dh_magic_number: 618f23ed (DUMP_MAGIC_NUMBER)
> dh_version: 8 (LKCD_DUMP_V8)
> dh_header_size: 734
> dh_dump_level: f
> (DUMP_LEVEL_HEADER|DUMP_LEVEL_KERN|DUMP_LEVEL_USED|DUMP_LEVEL_ALL)
> dh_page_size: 4096
> dh_memory_size: 524153
> dh_memory_start: c0000000
> dh_memory_end: 618f23ed
> dh_num_pages: 524153
> dh_panic_string: Compulsory dump(stat of mkexec was set as 2).
> dh_time: Tue Mar 10 17:48:00 2009
> dh_utsname_sysname: Linux
> dh_utsname_nodename: Assam
> dh_utsname_release: 2.6.12-5MKEXEC
> dh_utsname_version: #7 SMP Thu Mar 5 15:25:22 IST 2009
> dh_utsname_machine: i686
> dh_utsname_domainname: (none)
> dh_current_task: efc4a020
> dh_dump_compress: 0 (DUMP_COMPRESS_NONE)
> dh_dump_flags: 80000000 ()
> dh_dump_device: 0
> unknown page flag in dump: 2de
> found DUMP_DH_END
> <readmem: 8015564b, KVADDR, "x86_omit_frame_pointer", 8, (ROE), 7fbbe228>
> crash: seek error: kernel virtual address: 8015564b type: "x86_omit_frame_pointer"
> <readmem: 804b1210, KVADDR, "xtime", 8, (FOE), 834d234>
> crash: seek error: kernel virtual address: 804b1210 type: "xtime"
> [root@Assam ~]#
>
> can you please help me in this.

Maybe, maybe not...

Seek errors are meant to indicate that the translation from kernel virtual address to physical address to dumpfile location either (1) ended up with a dumpfile offset that was not accessible, or (2) could not find the physical page associated with the virtual address in the dumpfile.

I can't really help you with LKCD particulars, and like I mentioned above, I don't know what the "mini kernel dump" version of LKCD is, but I do note above that the dumpfile is being recognized as version LKCD_DUMP_V8. And http://people.redhat.com/anderson/crash.changelog.html contains this change to 4.0-2.17 that fixed something in your version 4.0-2.15:

4.0-2.17
  - Fix to resurrect LKCD version 8 support, inadvertently broken in 4.0-2.15. (troy.heber(a)hp.com)
  - Fix for "net -S" failures in certain 2.6 kernels that failed with "net: cannot determine what an inet_sock structure is" message; shows embedded sock structure instead of failing. (anonymous donor)
  - Fix for erroneous "net -s" source/destination address and port values in certain 2.6 kernels; added "net -s" source/destination address and port values for IPv6 sockets. (anderson(a)redhat.com)
  (12/16/05)

4.0-2.16
  - Fix for the x86_64 backtrace code to search all of the exception stacks for the origin of the active tasks' backtrace when the information is not available in the dumpfile header. Up until now, the search was made in the process stack, the per-cpu IRQ stack, and the per-cpu NMI exception stack; this patch looks at all 3 exception stacks in 2.4 kernels (NMI, STACKFAULT and DOUBLEFAULT), and all 5 exception stacks in 2.6 kernels (NMI, STACKFAULT, DOUBLEFAULT, DEBUG and MCE).
  - Fix to remove erroneous warning message re: the task cpu not being the same as the IRQ or exception stack cpu, which was displayed when doing a non-context-sensitive "bt -E" on an x86_64.
  (12/12/05)

4.0-2.15
  - Applied Kurt Rader's (kdrader(a)us.ibm.com) patch for SUSE SLES 9 "bigsmp" kernel LKCD dumpfiles, to fix "conflicting page" abort caused by a dumpfile header that is larger than the formerly hard-wired header size.
  - Fix for ppc64-only segmentation violation when running "bt" on the panic task when run against a dumpfile created by the diskdump facility's new compressed format.
  (12/02/05)

Perhaps upgrading to the latest version (4.0-7.7) will help?

Dave

> thanks in advance.
>
> regards,
> Seetharam
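The two seek-error cases described above can be illustrated with a toy model in Python. This is not crash's readmem() and not the LKCD file layout: the page index (a dict from page frame number to file offset), the `SeekError` class, and the `dump_offset` helper are all hypothetical, and the translation assumes directly mapped kernel memory only.

```python
PAGE_SIZE = 4096  # dh_page_size from the dump header above


class SeekError(Exception):
    """Raised when a virtual address cannot be located in the dumpfile."""


def dump_offset(vaddr, page_offset, page_index):
    """Translate a kernel virtual address to a dumpfile offset.

    page_index is a hypothetical dict: page frame number -> file offset
    of that page's data within the dumpfile.
    """
    if vaddr < page_offset:
        raise SeekError(f"{vaddr:#x} is below the kernel base {page_offset:#x}")
    paddr = vaddr - page_offset
    pfn, within_page = divmod(paddr, PAGE_SIZE)
    if pfn not in page_index:
        raise SeekError(f"physical page {pfn:#x} not found in dumpfile")
    return page_index[pfn] + within_page


# Made-up index holding a single page, for illustration only.
index = {0x155: 8192}
print(dump_offset(0x8015564B, 0x80000000, index))

try:
    dump_offset(0x8015564B, 0xC0000000, index)
except SeekError as err:
    print("seek error:", err)
```

In this model a wrong assumption about the kernel base and a page missing from the dump both surface as the same kind of error, which is why the message alone does not identify the cause.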

16 years, 5 months

