netdump starting problem
by Anirudh Srinivasan

Hello friends,
I was setting up a netdump server at my workplace and followed this
procedure:
Server Configuration:
   1. Verify that the netdump server is installed: rpm -q netdump-server.
      If it is not installed, install it by running: up2date netdump-server
   2. After the netdump server package is installed, change the password
      for the "netdump" user to something that you know: passwd netdump
   3. Enable the netdump server: chkconfig netdump-server on
   4. Start the netdump server: service netdump-server start
Client Configuration:
   1. Verify that the netdump client is installed: rpm -q netdump. If it
      is not installed, install it by running: up2date netdump
   2. Edit /etc/sysconfig/netdump and add the following line:
      NETDUMPADDR=192.168.0.5
      (192.168.0.5 should be changed to the IP address of the netdump server.)
   3. Enter the following command and give the netdump password when
      prompted: service netdump propagate
   4. Enable the netdump client: chkconfig netdump on
   5. Start the netdump client: service netdump start
After doing this, I get the following message:
# service netdump start
netdump: cannot arp <ipaddress>
netdump: cannot find <ipaddress>in arp cache
netdump: can't resolve <ipaddress> MAC address
netdump server address resolution                          [FAILED]
What could be the reason for this? How can I solve it?
Thanks
Anirudh Srinivasan
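
A minimal sketch of one sanity check for the failure above, in Python. ARP only resolves addresses on the directly attached network, so one possible cause (an assumption, nothing in this thread confirms it) is that NETDUMPADDR is mistyped or points at a host outside the client's subnet. The helper names and the 192.168.0.0/24 client network are hypothetical.

```python
import ipaddress


def read_netdumpaddr(config_text):
    """Return the NETDUMPADDR value from sysconfig-style text, or None."""
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments and lines that are not KEY=VALUE assignments.
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "NETDUMPADDR":
            return value.strip().strip('"') or None
    return None


def on_local_network(server_ip, client_network):
    """True if server_ip lies inside client_network, e.g. '192.168.0.0/24'."""
    return ipaddress.ip_address(server_ip) in ipaddress.ip_network(client_network)


if __name__ == "__main__":
    # Hypothetical file contents and client network, for illustration only.
    sample = "# netdump client settings\nNETDUMPADDR=192.168.0.5\n"
    addr = read_netdumpaddr(sample)
    print(addr, on_local_network(addr, "192.168.0.0/24"))
```

If the address is valid and on the local network, checking that the server answers a ping from the client would be a reasonable next step.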

16 years, 7 months

live crash (4.0-5.0.3) invocation fails on RHEL5
by Nipul Gandhi

Hi all,
What am I doing wrong here?
[root@wal-rhel5-04 kern]# uname -a
Linux wal-rhel5-04 2.6.18-92.el5 #1 SMP Tue Apr 29 13:16:15 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
Using the vmlinux installed by these debuginfo RPMs:
kernel-debuginfo-common-2.6.18-92.el5
kernel-debug-debuginfo-2.6.18-92.el5
[root@wal-rhel5-04 kern]# crash /usr/lib/debug/lib/modules/2.6.18-92.el5debug/vmlinux
:
:
WARNING: /usr/lib/debug/lib/modules/2.6.18-92.el5debug/vmlinux
         and /proc/version do not match!
WARNING: /proc/version indicates kernel version: 2.6.18-92.el5
crash: please use the vmlinux file for that kernel version, or try using
       the System.map for that kernel version as an additional argument.
[root@wal-rhel5-04 tmp]# cat /proc/version
Linux version 2.6.18-92.el5 (brewbuilder@ls20-bc2-13.build.redhat.com) (gcc version 4.1.2 20071124 (Red Hat 4.1.2-41)) #1 SMP Tue Apr 29 13:16:15 EDT 2008
I also tried passing the System.map as an additional argument, but then crash segfaulted.
# crash /usr/lib/debug/lib/modules/2.6.18-92.el5debug/vmlinux /boot/System.map-2.6.18-92.el5
Segmentation fault (core dumped)
Thanks in advance for any help.
-Nipul
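
The vmlinux path in the session above names 2.6.18-92.el5debug, while uname and /proc/version report 2.6.18-92.el5, so the file appears to belong to the debug kernel variant (kernel-debug-debuginfo) rather than to the running kernel. A minimal Python sketch of that comparison follows; the function names are hypothetical, and the path layout is taken from the command line above. If the plain kernel-debuginfo package for 2.6.18-92.el5 is installed, its vmlinux would presumably be at the path that debuginfo_vmlinux() returns, but that is an assumption, not something this thread confirms.

```python
def debuginfo_vmlinux(release):
    """Build the debuginfo vmlinux path for a kernel release string."""
    return "/usr/lib/debug/lib/modules/%s/vmlinux" % release


def release_from_vmlinux_path(path):
    """Extract <release> from a .../modules/<release>/vmlinux path."""
    parts = path.split("/")
    return parts[parts.index("modules") + 1]


def matches_running_kernel(vmlinux_path, running_release):
    """True if the vmlinux path was built for the running kernel release."""
    return release_from_vmlinux_path(vmlinux_path) == running_release


if __name__ == "__main__":
    used = "/usr/lib/debug/lib/modules/2.6.18-92.el5debug/vmlinux"
    running = "2.6.18-92.el5"  # release reported by uname in the post
    print(matches_running_kernel(used, running))
    print(debuginfo_vmlinux(running))
```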

16 years, 7 months

Re: [Crash-utility] Question about timestamp output for "sys"
by Dave Anderson

----- "James Washer" <washer(a)trlp.com> wrote:
> The time is aware of MY timezone (easily tested), but I'm still not
> sure if it is the time of the panic or some later time.
> 
> On Mon, 2009-03-30 at 12:08 -0700, James Washer wrote:
> > If I run 'sys', I see timestamps such as
> > 			   DATE: Thu Mar 26 08:53:13 2009
> > 
> > What "time" is this: the time the panic occurred? The time the dump was
> > "collected"? Is it in the Zulu time zone, in my (the crash investigator's)
> > time zone, or in the time zone of the system that crashed?
It's a ctime() translation of the contents of the kernel's "xtime" timespec
structure.  So running on a live system, you can see it change.  
On a dumpfile, that's a good question, because thinking about it, it may have
slightly different meanings depending upon the dumpfile-creation mechanism used. 
So, for example, on a netdump or diskdump it's whatever was last there when the
kernel memory containing the data structure was copied to disk or over the
network.  With a kdump, it would still be getting bumped up until the point
where the kernel transitions/kexec's into the secondary kernel, right?
Anyway, it's *somewhere* around the time of the panic...
Dave
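
Since the DATE line is a ctime() rendering of a seconds count, the string depends on the time zone of the machine running crash rather than on that of the machine that crashed, which matches James's observation. A minimal Python sketch of that behaviour follows, using Python's time module as a stand-in for the C library call; the function names are hypothetical.

```python
import time


def sys_date_local(tv_sec):
    """Render a seconds-since-epoch value the way ctime() does (local zone)."""
    return time.ctime(tv_sec)


def sys_date_utc(tv_sec):
    """Render the same value in UTC, independent of the local time zone."""
    return time.asctime(time.gmtime(tv_sec))


if __name__ == "__main__":
    tv_sec = 0  # placeholder value; a real one would come from xtime.tv_sec
    print("local:", sys_date_local(tv_sec))
    print("UTC:  ", sys_date_utc(tv_sec))
```

Running this with different TZ settings changes the first line but not the second.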

16 years, 7 months

Question about timestamp output for "sys"
by James Washer

If I run 'sys', I see timestamps such as
			   DATE: Thu Mar 26 08:53:13 2009
What "time" is this: the time the panic occurred? The time the dump was
"collected"? Is it in the Zulu time zone, in my (the crash investigator's)
time zone, or in the time zone of the system that crashed?
Thanks
 - jim

16 years, 7 months

crash version 4.0-8.8 is available
by Dave Anderson

- If a live kernel crash session fails during initialization due to 
  read errors, and it appears to be because the running kernel was 
  configured with CONFIG_STRICT_DEVMEM, display this warning message:
  "crash: This kernel may be configured with CONFIG_STRICT_DEVMEM, 
  which renders /dev/mem unusable as a live memory source."
  (anderson(a)redhat.com)
- Fix for the "bt" command to prevent a segmentation violation seen
  with an x86_64 Egenera/LKCD dumpfile where the starting stack hooks 
  for the active tasks in the dumpfile header were nonsensical.
  (anderson(a)redhat.com)
- Fix for the chronological display of the kernel printk buffer data
  by the "log" output if the administrator has cleared the buffer
  with syslog() or klogctl().  (oomichi(a)mxs.nes.nec.co.jp)
- Change the message displayed when supplying a non-process stack
  address as an argument to "bt -S".  Because the supplied address
  is typically valid, such as a hard or soft IRQ stack address,
  the message will indicate "non-process address" instead of 
  "invalid stack address".  (anderson(a)redhat.com)
- The crash-<release>.src.rpm will create an additional binary
  crash-extensions-<release>.rpm file containing the sial.so and 
  dminfo.so extension modules.  The modules will be installed in the 
  /usr/lib[64]/crash/extensions directory.
  (holzheu(a)linux.vnet.ibm.com, anderson(a)redhat.com)
- If a shared-object filename passed to the "extend" command is not
  expressed with a fully-qualified pathname, the following directories
  will be searched in the order shown, and the first instance of the 
  file that is found will be selected:
    1. the current working directory
    2. the directory specified in the CRASH_EXTENSIONS shell 
       environment variable
    3. /usr/lib64/crash/extensions (64-bit architectures)
    4. /usr/lib/crash/extensions
  The same rules will be applied when unloading shared object files
  with "extend -u <shared-object>".  Without the patch, only files
  in the current directory or those specified with a fully-qualified
  pathname were accepted.  (anderson(a)redhat.com)
- Changed the manner in which the "bt" command determines which PID 0 
  swapper task was interrupted by an ia64 INIT or MCA exception.
  There is an existing ia64 INIT/MCA handler bug which incorrectly 
  writes the pseudo task's command name in its comm[] name string
  such that the cpu number may not be part of the string.  If that
  happens without this patch, the "bt" command fails to make the link
  back to the interrupted task, and displays the error message:
  "bt: unwind: failed to locate return link (ip=0x0)!"
  (anderson(a)redhat.com)
- Removed an unused initialized variable in get_task_mem_usage().
  (junkoi2004(a)gmail.com) 
- Added a debug-level 8 statement in readmem() that will display the
  current input address and its translated physical address under the
  existing debug-level 4 "<readmem: ...>" debug line, put in place to
  aid in debugging read and/or seek errors.  
  (anderson(a)redhat.com)
Download from: http://people.redhat.com/anderson
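
The search order described in the "extend" item above can be sketched in Python as follows. This is an illustration of the documented rules, not the crash utility's actual C code; the function names and the is_64bit and exists parameters are hypothetical.

```python
import os


def extension_search_dirs(environ=None, is_64bit=True):
    """Return the directories searched for a relative shared-object name."""
    env = os.environ if environ is None else environ
    dirs = [os.getcwd()]                          # 1. current working directory
    ext_dir = env.get("CRASH_EXTENSIONS")
    if ext_dir:
        dirs.append(ext_dir)                      # 2. CRASH_EXTENSIONS
    if is_64bit:
        dirs.append("/usr/lib64/crash/extensions")  # 3. 64-bit only
    dirs.append("/usr/lib/crash/extensions")      # 4. fallback
    return dirs


def resolve_extension(name, environ=None, is_64bit=True, exists=os.path.isfile):
    """Return the first matching path for name, or None if nothing matches."""
    if os.path.isabs(name):
        # A fully-qualified pathname bypasses the directory search.
        return name if exists(name) else None
    for directory in extension_search_dirs(environ, is_64bit):
        candidate = os.path.join(directory, name)
        if exists(candidate):
            return candidate
    return None


if __name__ == "__main__":
    print(extension_search_dirs({"CRASH_EXTENSIONS": "/opt/crash-ext"}))
```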

16 years, 7 months

Fwd: crash seek error
by Dave Anderson

----- Forwarded Message -----
From: "Dharmosoth Seetharam" <seetharam_21(a)yahoo.com>
To: "Dave Anderson" <anderson(a)redhat.com>
Sent: Wednesday, March 11, 2009 2:12:01 PM GMT -05:00 US/Canada Eastern
Subject: Re: crash seek error
Hi Dave, 
I have compiled the latest crash tool and tried it with the dump file; it looks good.
Thanks for your quick suggestion.
Sure, I will also include the mailing list.
Thanks a lot.
regards, 
Seetharam 

16 years, 7 months

Re: [Crash-utility] Re: crash seek error
by Dave Anderson

----- "Dave Anderson" <anderson(a)redhat.com> wrote:
> ----- "Dharmosoth Seetharam" <seetharam_21(a)yahoo.com> wrote:
> > dump_header:
> > dh_magic_number: 618f23ed (DUMP_MAGIC_NUMBER)
> > dh_version: 8 (LKCD_DUMP_V8)
> > dh_header_size: 734
> > dh_dump_level: f
> > (DUMP_LEVEL_HEADER|DUMP_LEVEL_KERN|DUMP_LEVEL_USED|DUMP_LEVEL_ALL)
> > dh_page_size: 4096
> > dh_memory_size: 524153
> > dh_memory_start: c0000000
> > dh_memory_end: 618f23ed
> > dh_num_pages: 524153
> > dh_panic_string: Compulsory dump(stat of mkexec was set as 2).
> > dh_time: Tue Mar 10 17:48:00 2009
> > dh_utsname_sysname: Linux
> > dh_utsname_nodename: Assam
> > dh_utsname_release: 2.6.12-5MKEXEC
> > dh_utsname_version: #7 SMP Thu Mar 5 15:25:22 IST 2009
> > dh_utsname_machine: i686
> > dh_utsname_domainname: (none)
> > dh_current_task: efc4a020
> > dh_dump_compress: 0 (DUMP_COMPRESS_NONE)
> > dh_dump_flags: 80000000 ()
> > dh_dump_device: 0
> > unknown page flag in dump: 2de
> > found DUMP_DH_END
> > <readmem: 8015564b, KVADDR, "x86_omit_frame_pointer", 8, (ROE), 7fbbe228>
> > crash: seek error: kernel virtual address: 8015564b type: "x86_omit_frame_pointer"
> > <readmem: 804b1210, KVADDR, "xtime", 8, (FOE), 834d234> 
> > crash: seek error: kernel virtual address: 804b1210 type: "xtime"
> > [root@Assam ~]#
> >
> > can you please help me in this.
One other thing to look at...
> > dh_memory_start: c0000000
The failing kernel virtual addresses are 8015564b and 804b1210, so apparently
you're running a kernel configured with a 2G/2G split?  I'm not sure
whether the crash utility even works with that configuration?  Crash does
support the old RHEL4 "hugemem" 4G/4G kernels, but I've never worked with
a 2G/2G kernel.  In any case, it may work by dumb luck -- to be sure, first
try to run crash on the live system.
Anyway, even though the dump header advertises a kernel configured with
the traditional 3G/1G split (with kernel memory starting at c0000000),
that "dh_memory_start" field is not used by the crash utility.
Dave
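
A minimal Python sketch of why the split matters: the failing addresses fall below the traditional 3G/1G kernel base of 0xc0000000, so they only look like kernel addresses if the kernel base is lower. The 0x80000000 base for a 2G/2G split is an assumption; the thread does not state the actual value used by this kernel. Function and constant names are hypothetical.

```python
KERNEL_BASE_3G_1G = 0xC0000000  # traditional split, as in dh_memory_start
KERNEL_BASE_2G_2G = 0x80000000  # assumed base for a 2G/2G split


def is_kernel_address(vaddr, kernel_base):
    """True if a 32-bit virtual address lies at or above the kernel base."""
    return kernel_base <= vaddr <= 0xFFFFFFFF


if __name__ == "__main__":
    # The two failing addresses from the readmem lines above.
    for vaddr in (0x8015564B, 0x804B1210):
        print(hex(vaddr),
              "3G/1G:", is_kernel_address(vaddr, KERNEL_BASE_3G_1G),
              "2G/2G:", is_kernel_address(vaddr, KERNEL_BASE_2G_2G))
```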

16 years, 7 months

Re: crash seek error
by Dave Anderson

----- "Dharmosoth Seetharam" <seetharam_21(a)yahoo.com> wrote:
> Hi,
> 
> I have configured the linux kernel 2.6.12 to support the kernel crash dump using the
> "mini kernel dump" method.
Sorry, but I have no clue what the "mini kernel dump" method is.
Although from the output below, it looks to be an LKCD derivative.
> 
> I have a few questions; please help me.
> 
> details:
> kernel : linux 2.6.12
> arch : i386
> distr: centOS
> System RAM : 8G
> 
> 1) While writing the dump to a block device, it hung after writing 4GB.
> 
> 2) I reduced my system RAM to 2G and tried again; it dumped 2G to the block device.
> 
> But the crash tool is unable to read it.
> The following is the error:
> 
> ----
> [root@Assam ~]# crash -d7 /root/linux-2.6.12/vmlinux
> /scratch/dump/20090310175831/lkcd_dump
> crash 4.0-2.15
Your crash version is remarkably old -- 3+ years old -- and it's 
always worth your while to update to the latest version.
 
> Copyright (C) 2002, 2003, 2004, 2005 Red Hat, Inc.
> Copyright (C) 2004, 2005 IBM Corporation
> Copyright (C) 1999-2005 Hewlett-Packard Co
> Copyright (C) 2005 Fujitsu Limited
> Copyright (C) 2005 NEC Corporation
> Copyright (C) 1999, 2002 Silicon Graphics, Inc.
> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
> This program is free software, covered by the GNU General Public License,
> and you are welcome to change it and/or distribute copies of it under
> certain conditions. Enter "help copying" to see the conditions.
> This program has absolutely no warranty. Enter "help warranty" for details.
> crash: diskdump: dump does not have panic dump header
> dump_header:
> dh_magic_number: 618f23ed (DUMP_MAGIC_NUMBER)
> dh_version: 8 (LKCD_DUMP_V8)
> dh_header_size: 734
> dh_dump_level: f
> (DUMP_LEVEL_HEADER|DUMP_LEVEL_KERN|DUMP_LEVEL_USED|DUMP_LEVEL_ALL)
> dh_page_size: 4096
> dh_memory_size: 524153
> dh_memory_start: c0000000
> dh_memory_end: 618f23ed
> dh_num_pages: 524153
> dh_panic_string: Compulsory dump(stat of mkexec was set as 2).
> dh_time: Tue Mar 10 17:48:00 2009
> dh_utsname_sysname: Linux
> dh_utsname_nodename: Assam
> dh_utsname_release: 2.6.12-5MKEXEC
> dh_utsname_version: #7 SMP Thu Mar 5 15:25:22 IST 2009
> dh_utsname_machine: i686
> dh_utsname_domainname: (none)
> dh_current_task: efc4a020
> dh_dump_compress: 0 (DUMP_COMPRESS_NONE)
> dh_dump_flags: 80000000 ()
> dh_dump_device: 0
> unknown page flag in dump: 2de
> found DUMP_DH_END
> <readmem: 8015564b, KVADDR, "x86_omit_frame_pointer", 8, (ROE), 7fbbe228>
> crash: seek error: kernel virtual address: 8015564b type: "x86_omit_frame_pointer"
> <readmem: 804b1210, KVADDR, "xtime", 8, (FOE), 834d234>
> crash: seek error: kernel virtual address: 804b1210 type: "xtime"
> [root@Assam ~]#
>
> can you please help me in this.
Maybe, maybe not...
Seek errors are meant to indicate that the translation from kernel
virtual address to physical address to dumpfile location either:
 (1) ended up with a dumpfile offset that was not accessible, or
 (2) could not find the physical page associated with the virtual
     address in the dumpfile.
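
The two failure cases can be sketched roughly as follows in Python. This is a simplified illustration, not the crash utility's code: it assumes a directly mapped kernel address (physical = virtual - kernel base) and a hypothetical page_index dictionary mapping page frame numbers to dumpfile offsets.

```python
PAGE_SIZE = 4096  # matches dh_page_size in the header above


class SeekError(Exception):
    """Raised when a virtual address cannot be mapped to readable dump data."""


def dumpfile_offset(vaddr, kernel_base, page_index, file_size):
    """Translate a kernel virtual address to an offset in the dumpfile."""
    if vaddr < kernel_base:
        raise SeekError("address %x is below the kernel base" % vaddr)
    paddr = vaddr - kernel_base              # assumed direct mapping
    pfn, within_page = divmod(paddr, PAGE_SIZE)
    if pfn not in page_index:
        # Case (2): the physical page was never written to the dumpfile.
        raise SeekError("physical page %x not found in dumpfile" % pfn)
    offset = page_index[pfn] + within_page
    if offset >= file_size:
        # Case (1): the computed offset lies outside the readable file.
        raise SeekError("dumpfile offset %d is not accessible" % offset)
    return offset


if __name__ == "__main__":
    index = {0x155: 8192}  # hypothetical: page frame 0x155 stored at offset 8192
    print(dumpfile_offset(0xC015564B, 0xC0000000, index, 1 << 20))
```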
I can't really help you with LKCD particulars, and like I mentioned
above, I don't know what the "mini kernel dump" version of LKCD is,
but I do note above that the dumpfile is being recognized as version 
LKCD_DUMP_V8.  And http://people.redhat.com/anderson/crash.changelog.html
contains this change to 4.0-2.17 that fixed something in your version
4.0-2.15:
  4.0-2.17 - Fix to resurrect LKCD version 8 support, inadvertently broken in 
             4.0-2.15. (troy.heber(a)hp.com)
           - Fix for "net -S" failures in certain 2.6 kernels that failed with
             "net: cannot determine what an inet_sock structure is" message;
             shows embedded sock structure instead of failing. (anonymous donor)
           - Fix for erroneous "net -s" source/destination address and port
             values in certain 2.6 kernels; added "net -s" source/destination 
             address and port values for IPv6 sockets.  (anderson(a)redhat.com)
             (12/16/05)
  4.0-2.16 - Fix for the x86_64 backtrace code to search all of the exception
             stacks for the origin of the active tasks' backtrace when the
             information is not available in the dumpfile header.  Up until now,
             the search was made in the process stack, the per-cpu IRQ stack,
             and the per-cpu NMI exception stack; this patch looks at all 3 
             exception stacks in 2.4 kernels (NMI, STACKFAULT and DOUBLEFAULT), 
             and all 5 exception stacks in 2.6 kernels (NMI, STACKFAULT, 
             DOUBLEFAULT, DEBUG and MCE).
           - Fix to remove erroneous warning message re: the task cpu not being 
             the same as the IRQ or exception stack cpu, which was displayed when 
             doing a non-context-sensitive "bt -E" on an x86_64.
             (12/12/05)
  4.0-2.15 - Applied Kurt Rader's (kdrader(a)us.ibm.com) patch for SUSE SLES 9 
             "bigsmp" kernel LKCD dumpfiles, to fix "conflicting page" abort
             caused by a dumpfile header that is larger than the formerly 
             hard-wired header size.
           - Fix for ppc64-only segmentation violation when running "bt" on the
             panic task when run against a dumpfile created by the diskdump 
             facility's new compressed format. 
             (12/02/05)
Perhaps upgrading to the latest version (4.0-7.7) will help?
Dave
 
> thanks in advance.
> 
> regards,
> Seetharam
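
A minimal Python sketch of comparing crash version strings such as those in the changelog excerpt above, assuming versions order numerically field by field (an assumption about the numbering scheme). The function name is hypothetical.

```python
def parse_crash_version(text):
    """Turn a string like '4.0-2.15' into a tuple such as (4, 0, 2, 15)."""
    base, _, release = text.partition("-")
    return tuple(int(part) for part in base.split(".") + release.split("."))


if __name__ == "__main__":
    installed = parse_crash_version("4.0-2.15")
    fixed_in = parse_crash_version("4.0-2.17")
    # True means the installed version predates the LKCD version 8 fix.
    print(installed < fixed_in)
```

Comparing tuples of integers avoids the pitfall of plain string comparison, where "4.0-2.9" would sort after "4.0-2.15".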

16 years, 7 months