fix_lkcd_address problem
by Alan Tyson

Hi,
I believe that there is an incorrect comparison in fix_lkcd_address:
059 ulonglong
060 fix_lkcd_address(ulonglong addr)
061 {
062     int i;
063     ulong offset;
064
065     for (i = 0; i < lkcd->fix_addr_num; i++) {
066         if ( (addr >=lkcd->fix_addr[i].task) &&
067                 (addr <= lkcd->fix_addr[i].task + STACKSIZE())){
                         ^^^^^- here
On Itanium, fix_addr[i].task + STACKSIZE() may be the address of an adjacent
task structure.  As it stands, both parts of the comparison pass if addr is
the address in the fix_addr[i].task field or the address of the task
structure which follows it.  The result is that it is not possible to read
the task structure of the task that follows a task in this fixup list;
zeroes are returned instead.
Regards,
Alan Tyson, HP.
--- lkcd_common.c.orig  2007-08-27 16:51:11.000000000 +0100
+++ lkcd_common.c       2007-09-19 16:46:07.000000000 +0100
@@ -64,7 +64,7 @@ fix_lkcd_address(ulonglong addr)
     for (i = 0; i < lkcd->fix_addr_num; i++) {
        if ( (addr >=lkcd->fix_addr[i].task) &&
-               (addr <= lkcd->fix_addr[i].task + STACKSIZE())){
+               (addr < lkcd->fix_addr[i].task + STACKSIZE())){
            offset = addr - lkcd->fix_addr[i].task;
            addr = lkcd->fix_addr[i].saddr + offset;
crash 4.0-3.14 and SLES 10
by reagen_jeff@emc.com

I am trying to look at a live SLES 10 system using crash. Crash fails to
start successfully. It returns:
	crash: /boot/vmlinux: no debugging data available
The vmlinux file in question is not stripped.
Has anyone else been able to get this to work? If so, what did you do?
Jeff
Re: [PATCH 0/2] vmcoreinfo support for dump filtering #2
by Vivek Goyal

On Mon, Sep 10, 2007 at 11:35:21AM -0700, Randy Dunlap wrote:
> On Fri, 7 Sep 2007 17:57:46 +0900 Ken'ichi Ohmichi wrote:
> 
> > Hi,
> 
> > I released a new makedumpfile (version 1.2.0) with vmcoreinfo support.
> > I updated the patches for linux and kexec-tools.
> > 
> > PATCH SET:
> > [1/2] [linux-2.6.22] Add vmcoreinfo
> >   The patch is for linux-2.6.22.
> >   The patch adds the vmcoreinfo data. Its address and size are output
> >   to /sys/kernel/vmcoreinfo.
> > 
> > [2/2] [kexec-tools] Pass vmcoreinfo's address and size
> >   The patch is for kexec-tools-testing-20070330.
> >   (http://www.kernel.org/pub/linux/kernel/people/horms/kexec-tools/)
> >   kexec command gets the address and size of the vmcoreinfo data from
> >   /sys/kernel/vmcoreinfo, and passes them to the second kernel through
> >   ELF header of /proc/vmcore. When the second kernel is booting, the
> >   kernel gets them from the ELF header and creates vmcoreinfo's PT_NOTE
> >   segment into /proc/vmcore.
> 
> Hi,
> When using the vmcoreinfo patches, what tool(s) are available for
> analyzing the vmcore (dump) file?  E.g., lkcd or crash or just gdb?
> 
> gdb works for me, but I tried to use crash (4.0-4.6 from
> http://people.redhat.com/anderson/) and crash complained:
> 
> crash: invalid kernel virtual address: 0  type: "cpu_pda entry"
> 
> Should crash work, or does it need to be modified?
> 
Hi Randy,
Crash should just work, but it might be broken on the latest kernel. I am
copying this to the crash-utility mailing list; Dave will be able to tell
us better.
> This is on a 2.6.23-rc3 kernel with vmcoreinfo patches and a dump file
> with -l 31 (dump level 31, omitting all possible pages).
> 
Thanks
Vivek
Re: Re: Re: crash and sles 9 dumps (Dave Anderson)
by Daniel Li

Hey Dave,
When you said this was something you had never seen before, did you mean
that you had never tried to use crash on a dump of a SLES 9 guest in the
nonstandard ELF format, or that this scenario was working for you, so you
never saw this type of error message?
If it is the first one, do you have any plans to support SLES guest dumps
(with the new ELF format you incorporated in the first half of this year
to get crash working with Red Hat guest dumps)?
Later,
Daniel
Virtual Iron Software, Inc
www.virtualiron.com
crash-utility-request(a)redhat.com wrote:
>
> Today's Topics:
>
>    1. Re: Re: crash and sles 9 dumps (Dave Anderson)
>    2. Re: crash and sles 9 GUEST dumps (Dave Anderson)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 20 Aug 2007 15:49:47 -0400
> From: Dave Anderson <anderson(a)redhat.com>
> Subject: Re: [Crash-utility] Re: crash and sles 9 dumps
> To: holzheu(a)linux.vnet.ibm.com, "Discussion list for crash utility
> 	usage,	maintenance and development" <crash-utility(a)redhat.com>
> Message-ID: <46C9F05B.2090800(a)redhat.com>
> Content-Type: text/plain; charset=us-ascii; format=flowed
>
> Michael Holzheu wrote:
>   
>> Hi Cliff
>>
>> On Mon, 2007-08-13 at 11:33 -0500, Cliff Wickman wrote:
>>
>>     
>>> On Fri, Aug 10, 2007 at 05:19:10PM +0200, Bernhard Walle wrote:
>>>       
>>     
>>> The kerntypes file that crash can use is built by the LKCD dwarfextract
>>> command.  Types are extracted from a -g kernel and modules, and dwarfextract
>>> writes a magic ELF e_version that crash uses to distinguish a kerntypes file
>>> from a vmlinux.  So only such a kerntypes file will work.
>>>       
>> The standard -g compiled lkcd Kerntypes file also seems to work if you
>> set the KERNTYPES flag. This can be useful if you don't want to build a
>> full -g compiled vmlinux.
>>
>> I used the following simple patch, which adds a "-k" option to force
>> crash to use the kerntypes code path.
>>
>> diff -Naurp crash-4.0-4.5/main.c crash-4.0-4.5-kerntypes/main.c
>> --- crash-4.0-4.5/main.c	2007-08-13 15:07:20.000000000 +0200
>> +++ crash-4.0-4.5-kerntypes/main.c	2007-08-13 15:06:51.000000000 +0200
>> @@ -70,7 +70,7 @@ main(int argc, char **argv)
>>  	 */
>>  	opterr = 0;
>>  	optind = 0;
>> -	while((c = getopt_long(argc, argv, "Lgh::e:i:sSvc:d:tfp:m:",
>> +	while((c = getopt_long(argc, argv, "Lkgh::e:i:sSvc:d:tfp:m:",
>>         		long_options, &option_index)) != -1) {
>>  		switch (c)
>>  		{
>> @@ -222,6 +222,9 @@ main(int argc, char **argv)
>>  			else
>>  				program_usage(LONG_FORM);
>>  			clean_exit(0);
>> +		case 'k':
>> +			pc->flags |= KERNTYPES;
>> +			break;
>>  			
>>  		case 'e':
>>  			if (STREQ(optarg, "vi"))
>>
>>
>>     
>
> This simple "-k" fix looks fine to me, presuming that there's
> nothing else obvious in the lkcd kerntypes file that distinguishes
> it -- e.g., the unique ELF e_version that dwarfextract uses
> (EV_DWARFEXTRACT  101010101).
>
> So unless anybody objects, or has a better idea, I'll put this -k
> option in the next release.
>
> Thanks,
>   Dave
>
>
>   
>> I attached the kerntypes file, which works for s390:
>>
>>
>>
>>
>>
>> ------------------------------------------------------------------------
>>
>> /*
>>  * kerntypes.c
>>  *
>>  * Dummy module that includes headers for all kernel types of interest.
>>  * The kernel type information is used by the lcrash utility when
>>  * analyzing system crash dumps or the live system. Using the type
>>  * information for the running system, rather than kernel header files,
>>  * makes for a more flexible and robust analysis tool.
>>  *
>>  * This source code is released under the GNU GPL.
>>  */
>>
>> /* generate version for this file */
>> typedef char *COMPILE_VERSION;
>>
>> /* General linux types */
>>
>> #include <linux/autoconf.h>
>> #include <linux/compile.h>
>> #include <linux/utsname.h>
>> #include <linux/module.h>
>> #include <linux/sched.h>
>> #include <linux/mm.h>
>> #include <linux/slab_def.h>
>> #include <linux/slab.h>
>> #include <linux/bio.h>
>> #include <linux/bitmap.h>
>> #include <linux/bitops.h>
>> #include <linux/bitrev.h>
>> #include <linux/blkdev.h>
>> #include <linux/blkpg.h>
>> #include <linux/bootmem.h>
>> #include <linux/buffer_head.h>
>> #include <linux/cache.h>
>> #include <linux/cdev.h>
>> #include <linux/cpu.h>
>> #include <linux/cpumask.h>
>> #include <linux/cpuset.h>
>> #include <linux/dcache.h>
>> #include <linux/debugfs.h>
>> #include <linux/elevator.h>
>> #include <linux/fd.h>
>> #include <linux/file.h>
>> #include <linux/fs.h>
>> #include <linux/futex.h>
>> #include <linux/genhd.h>
>> #include <linux/highmem.h>
>> #include <linux/if.h>
>> #include <linux/if_addr.h>
>> #include <linux/if_arp.h>
>> #include <linux/if_bonding.h>
>> #include <linux/if_ether.h>
>> #include <linux/if_tr.h>
>> #include <linux/if_tun.h>
>> #include <linux/if_vlan.h>
>> #include <linux/in.h>
>> #include <linux/in6.h>
>> #include <linux/in_route.h>
>> #include <linux/inet.h>
>> #include <linux/inet_diag.h>
>> #include <linux/inetdevice.h>
>> #include <linux/init.h>
>> #include <linux/initrd.h>
>> #include <linux/inotify.h>
>> #include <linux/interrupt.h>
>> #include <linux/ioctl.h>
>> #include <linux/ip.h>
>> #include <linux/ipsec.h>
>> #include <linux/ipv6.h>
>> #include <linux/ipv6_route.h>
>> #include <linux/irq.h>
>> #include <linux/irqflags.h>
>> #include <linux/irqreturn.h>
>> #include <linux/jbd.h>
>> #include <linux/jbd2.h>
>> #include <linux/jffs2.h>
>> #include <linux/jhash.h>
>> #include <linux/jiffies.h>
>> #include <linux/kallsyms.h>
>> #include <linux/kernel.h>
>> #include <linux/kernel_stat.h>
>> #include <linux/kexec.h>
>> #include <linux/kobject.h>
>> #include <linux/kthread.h>
>> #include <linux/ktime.h>
>> #include <linux/list.h>
>> #include <linux/memory.h>
>> #include <linux/miscdevice.h>
>> #include <linux/mm.h>
>> #include <linux/mm_inline.h>
>> #include <linux/mm_types.h>
>> #include <linux/mman.h>
>> #include <linux/mmtimer.h>
>> #include <linux/mmzone.h>
>> #include <linux/mnt_namespace.h>
>> #include <linux/module.h>
>> #include <linux/moduleloader.h>
>> #include <linux/moduleparam.h>
>> #include <linux/mount.h>
>> #include <linux/mpage.h>
>> #include <linux/mqueue.h>
>> #include <linux/mtio.h>
>> #include <linux/mutex.h>
>> #include <linux/namei.h>
>> #include <linux/neighbour.h>
>> #include <linux/net.h>
>> #include <linux/netdevice.h>
>> #include <linux/netfilter.h>
>> #include <linux/netfilter_arp.h>
>> #include <linux/netfilter_bridge.h>
>> #include <linux/netfilter_decnet.h>
>> #include <linux/netfilter_ipv4.h>
>> #include <linux/netfilter_ipv6.h>
>> #include <linux/netlink.h>
>> #include <linux/netpoll.h>
>> #include <linux/pagemap.h>
>> #include <linux/param.h>
>> #include <linux/percpu.h>
>> #include <linux/percpu_counter.h>
>> #include <linux/pfn.h>
>> #include <linux/pid.h>
>> #include <linux/pid_namespace.h>
>> #include <linux/poll.h>
>> #include <linux/posix-timers.h>
>> #include <linux/posix_acl.h>
>> #include <linux/posix_acl_xattr.h>
>> #include <linux/posix_types.h>
>> #include <linux/preempt.h>
>> #include <linux/prio_tree.h>
>> #include <linux/proc_fs.h>
>> #include <linux/profile.h>
>> #include <linux/ptrace.h>
>> #include <linux/radix-tree.h>
>> #include <linux/ramfs.h>
>> #include <linux/raw.h>
>> #include <linux/rbtree.h>
>> #include <linux/rcupdate.h>
>> #include <linux/reboot.h>
>> #include <linux/relay.h>
>> #include <linux/resource.h>
>> #include <linux/romfs_fs.h>
>> #include <linux/root_dev.h>
>> #include <linux/route.h>
>> #include <linux/rwsem.h>
>> #include <linux/sched.h>
>> #include <linux/sem.h>
>> #include <linux/seq_file.h>
>> #include <linux/seqlock.h>
>> #include <linux/shm.h>
>> #include <linux/shmem_fs.h>
>> #include <linux/signal.h>
>> #include <linux/signalfd.h>
>> #include <linux/skbuff.h>
>> #include <linux/smp.h>
>> #include <linux/smp_lock.h>
>> #include <linux/socket.h>
>> #include <linux/sockios.h>
>> #include <linux/spinlock.h>
>> #include <linux/stat.h>
>> #include <linux/statfs.h>
>> #include <linux/stddef.h>
>> #include <linux/swap.h>
>> #include <linux/swapops.h>
>> #include <linux/sys.h>
>> #include <linux/syscalls.h>
>> #include <linux/sysctl.h>
>> #include <linux/sysdev.h>
>> #include <linux/sysfs.h>
>> #include <linux/sysrq.h>
>> #include <linux/tc.h>
>> #include <linux/tcp.h>
>> #include <linux/thread_info.h>
>> #include <linux/threads.h>
>> #include <linux/tick.h>
>> #include <linux/time.h>
>> #include <linux/timer.h>
>> #include <linux/timerfd.h>
>> #include <linux/times.h>
>> #include <linux/timex.h>
>> #include <linux/topology.h>
>> #include <linux/transport_class.h>
>> #include <linux/tty.h>
>> #include <linux/tty_driver.h>
>> #include <linux/tty_flip.h>
>> #include <linux/tty_ldisc.h>
>> #include <linux/types.h>
>> #include <linux/uaccess.h>
>> #include <linux/unistd.h>
>> #include <linux/utime.h>
>> #include <linux/uts.h>
>> #include <linux/utsname.h>
>> #include <linux/utsrelease.h>
>> #include <linux/version.h>
>> #include <linux/vfs.h>
>> #include <linux/vmalloc.h>
>> #include <linux/vmstat.h>
>> #include <linux/wait.h>
>> #include <linux/watchdog.h>
>> #include <linux/workqueue.h>
>> #include <linux/zconf.h>
>> #include <linux/zlib.h>
>>
>> /*
>>  * s390 specific includes
>>  */
>>
>> #include <asm/lowcore.h>
>> #include <asm/debug.h>
>> #include <asm/ccwdev.h>
>> #include <asm/ccwgroup.h>
>> #include <asm/qdio.h>
>> #include <asm/zcrypt.h>
>> #include <asm/etr.h>
>> #include <asm/ipl.h>
>> #include <asm/setup.h>
>>
>> /* channel subsystem driver */
>> #include "drivers/s390/cio/cio.h"
>> #include "drivers/s390/cio/chsc.h"
>> #include "drivers/s390/cio/css.h"
>> #include "drivers/s390/cio/device.h"
>> #include "drivers/s390/cio/qdio.h"
>>
>> /* dasd device driver */
>> #include "drivers/s390/block/dasd_int.h"
>> #include "drivers/s390/block/dasd_diag.h"
>> #include "drivers/s390/block/dasd_eckd.h"
>> #include "drivers/s390/block/dasd_fba.h"
>>
>> /* networking drivers */
>> #include "drivers/s390/net/fsm.h"
>> #include "include/net/iucv/iucv.h"
>> #include "drivers/s390/net/lcs.h"
>> #include "drivers/s390/net/qeth.h"
>>
>> /* zfcp device driver */
>> #include "drivers/s390/scsi/zfcp_def.h"
>> #include "drivers/s390/scsi/zfcp_fsf.h"
>>
>> /* crypto device driver */
>> #include "drivers/s390/crypto/ap_bus.h"
>> #include "drivers/s390/crypto/zcrypt_api.h"
>> #include "drivers/s390/crypto/zcrypt_cca_key.h"
>> #include "drivers/s390/crypto/zcrypt_pcica.h"
>> #include "drivers/s390/crypto/zcrypt_pcicc.h"
>> #include "drivers/s390/crypto/zcrypt_pcixcc.h"
>> #include "drivers/s390/crypto/zcrypt_cex2a.h"
>>
>> /* sclp device driver */
>> #include "drivers/s390/char/sclp.h"
>> #include "drivers/s390/char/sclp_rw.h"
>> #include "drivers/s390/char/sclp_tty.h"
>>
>> /* vmur device driver */
>> #include "drivers/s390/char/vmur.h"
>>
>> /*
>>  * include sched.c for types:
>>  *    - struct prio_array
>>  *    - struct runqueue
>>  */
>> #include "kernel/sched.c"
>> /*
>>  * include slab.c for struct kmem_cache
>>  */
>> #include "mm/slab.c"
>>
>>
>> ------------------------------------------------------------------------
>>
>> --
>> Crash-utility mailing list
>> Crash-utility(a)redhat.com
>> https://www.redhat.com/mailman/listinfo/crash-utility
>>     
>
>
>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 20 Aug 2007 16:25:01 -0400
> From: Dave Anderson <anderson(a)redhat.com>
> Subject: Re: [Crash-utility] crash and sles 9 GUEST dumps
> To: "Discussion list for crash utility usage,	maintenance and
> 	development" <crash-utility(a)redhat.com>
> Message-ID: <46C9F89D.1000301(a)redhat.com>
> Content-Type: text/plain; charset=us-ascii; format=flowed
>
> Daniel Li wrote:
>   
>> After finding out how to get crash working with native SLES 9 LKCD
>> format dumps -- namely, build a debug vmlinux with the appropriate
>> flags and feed it to crash -- I started looking into using crash on
>> kernel dumps created for SLES 9 guest domains.
>>
>> As compared to the LKCD format of native SLES 9 dumps, those dumps are
>> created using the new nonstandard ELF format with section headers
>> instead of program headers, which is what the xenctrl library now
>> produces. That format works for RHAS4U4 64-bit guests, and I had to
>> make minor modifications to make it work for RHAS4U4 32-bit guests as
>> well. However, when it comes to SLES 9 guests, crash seems to have
>> trouble locating the stacks for each thread, with the exception of the
>> CURRENT thread. (see below)
>>
>> It may well be that the stack pointers were not saved properly for SLES
>> 9 guests by the Xen library in the dump. I'll look into the dump and
>> the Xen library code to see if that is the case... Or is this a case
>> of crash not looking in the right places for those stack pointers?
>>
>>     
>
> Looking at the data below, it is hard to decipher what's going on.
>
> The "ps" list -- except for the current task at ffffffff803d2800 -- shows
> seemingly legitimate tasks, because the COMM ("task_struct.comm[16]")
> strings look OK.  But the state (ST) fields and the PPID values are
> bogus.
>
>  > crash> ps
>  >   PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
>  >  >     0      0   0  ffffffff803d2800  RU   0.0 4399648058624
>  > 4389578663200  [<80>^L]
>  >      0      0   0  ffffffff803d2808  ??   0.0       0      0  [swapper]
>  >      1      0   0     10017e1f2c8    ??   0.1     640    304  init
>  >      2     -1   0     10017e1e9a8    ??   0.0       0      0  [migration/0]
>  >      3     -1   0     10017e1e088    ??   0.0       0      0  [ksoftirqd/0]
> ...
>
>
> And that's all confirmed when you ran the "task 10015180208" command,
> which simply has gdb print the task_struct at that address:
>
>  > crash> bt 10015180208
>  > PID: 3696   TASK: 10015180208       CPU: 0   COMMAND: "klogd"
>  > *bt: invalid kernel virtual address: 12  type: "stack contents"*
>  > bt: read of stack at 12 failed
>  > crash>  task 10015180208
>  > PID: 3696   TASK: 10015180208       CPU: 0   COMMAND: "klogd"
>  > struct task_struct {
>  >  *state = 1099873050624,*
>  > *  thread_info = 0x12,*
>  >  usage = {
>  >    counter = 320
>  >  },
>  >  flags = 0,
> ...
>  >  comm = "klogd\000roc\000\000\000\000\000\000",
> ...
>
> The "state" and "thread_info" (i.e., the stack page pointer) fields
> make no sense, while the "comm" field, and many of the others (upon
> a quick examination) do seem correct.
>
> It's interesting that all of the task_struct addresses end in "8",
> though.  If you were to enter "task_struct 10015180200", would those
> two fields look right and, perhaps due to structure padding (?),
> would you still see the "klogd" string in the correct place?
>
> I'm sure this is something I've never seen before, so I'm afraid I
> can't offer any answers or suggestions...
>
> Dave
>
>
>
>   
>> Thanks,
>> Daniel
>>
>> /dumps/new/sles/64bit$ /home/dli/bin/crash vmlinux-2.6.5-7.244-smp 
>> vmlinux.dbg DUMP10.1.230.112
>>
>> crash 4.0-4.5
>> Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007  Red Hat, Inc.
>> Copyright (C) 2004, 2005, 2006  IBM Corporation
>> Copyright (C) 1999-2006  Hewlett-Packard Co
>> Copyright (C) 2005, 2006  Fujitsu Limited
>> Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
>> Copyright (C) 2005  NEC Corporation
>> Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
>> Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
>> This program is free software, covered by the GNU General Public License,
>> and you are welcome to change it and/or distribute copies of it under
>> certain conditions.  Enter "help copying" to see the conditions.
>> This program has absolutely no warranty.  Enter "help warranty" for 
>> details.
>>
>> GNU gdb 6.1
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you 
>> are
>> welcome to change it and/or distribute copies of it under certain 
>> conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
>> This GDB was configured as "x86_64-unknown-linux-gnu"...
>>
>> WARNING: could not find MAGIC_START!
>> please wait... (gathering task table data)
>> crash: invalid kernel virtual address: 13  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: f  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: 6  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: c  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: 13  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: c  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: 18  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: f  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: 13  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: 13  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: 12  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: 11  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: 15  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: f  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: 12  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: 6e  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: 22  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: 13  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: f  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: c  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: c  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: 13  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: f  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: f  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: 11  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: 10  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: c  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: 14  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: 13  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: 18  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: f  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: e  type: "fill_thread_info"
>>
>> crash: invalid kernel virtual address: f  type: "fill_thread_info"
>> please wait... (determining panic task)
>> bt: invalid kernel virtual address: 13  type: "stack contents"
>>
>> bt: read of stack at 13 failed
>>
>>
>> bt: invalid kernel virtual address: f  type: "stack contents"
>>
>> bt: read of stack at f failed
>>
>>
>> bt: invalid kernel virtual address: 6  type: "stack contents"
>>
>> bt: read of stack at 6 failed
>>
>>
>> bt: invalid kernel virtual address: c  type: "stack contents"
>>
>> bt: read of stack at c failed
>>
>>
>> bt: invalid kernel virtual address: 13  type: "stack contents"
>>
>> bt: read of stack at 13 failed
>>
>>
>> bt: invalid kernel virtual address: c  type: "stack contents"
>>
>> bt: read of stack at c failed
>>
>>
>> bt: invalid kernel virtual address: 18  type: "stack contents"
>>
>> bt: read of stack at 18 failed
>>
>>
>> bt: invalid kernel virtual address: f  type: "stack contents"
>>
>> bt: read of stack at f failed
>>
>>
>> bt: invalid kernel virtual address: 13  type: "stack contents"
>>
>> bt: read of stack at 13 failed
>>
>>
>> bt: invalid kernel virtual address: 13  type: "stack contents"
>>
>> bt: read of stack at 13 failed
>>
>>
>> bt: invalid kernel virtual address: 12  type: "stack contents"
>>
>> bt: read of stack at 12 failed
>>
>>
>> bt: invalid kernel virtual address: 11  type: "stack contents"
>>
>> bt: read of stack at 11 failed
>>
>>
>> bt: invalid kernel virtual address: 15  type: "stack contents"
>>
>> bt: read of stack at 15 failed
>>
>>
>> bt: invalid kernel virtual address: f  type: "stack contents"
>>
>> bt: read of stack at f failed
>>
>>
>> bt: invalid kernel virtual address: 12  type: "stack contents"
>>
>> bt: read of stack at 12 failed
>>
>>
>> bt: invalid kernel virtual address: 6e  type: "stack contents"
>>
>> bt: read of stack at 6e failed
>>
>>
>> bt: invalid kernel virtual address: 22  type: "stack contents"
>>
>> bt: read of stack at 22 failed
>>
>>
>> bt: invalid kernel virtual address: 13  type: "stack contents"
>>
>> bt: read of stack at 13 failed
>>
>>
>> bt: invalid kernel virtual address: f  type: "stack contents"
>>
>> bt: read of stack at f failed
>>
>>
>> bt: invalid kernel virtual address: c  type: "stack contents"
>>
>> bt: read of stack at c failed
>>
>>
>> bt: invalid kernel virtual address: c  type: "stack contents"
>>
>> bt: read of stack at c failed
>>
>>
>> bt: invalid kernel virtual address: 13  type: "stack contents"
>>
>> bt: read of stack at 13 failed
>>
>>
>> bt: invalid kernel virtual address: f  type: "stack contents"
>>
>> bt: read of stack at f failed
>>
>>
>> bt: invalid kernel virtual address: f  type: "stack contents"
>>
>> bt: read of stack at f failed
>>
>>
>> bt: invalid kernel virtual address: 11  type: "stack contents"
>>
>> bt: read of stack at 11 failed
>>
>>
>> bt: invalid kernel virtual address: 10  type: "stack contents"
>>
>> bt: read of stack at 10 failed
>>
>>
>> bt: invalid kernel virtual address: c  type: "stack contents"
>>
>> bt: read of stack at c failed
>>
>>
>> bt: invalid kernel virtual address: 14  type: "stack contents"
>>
>> bt: read of stack at 14 failed
>>
>>
>> bt: invalid kernel virtual address: 13  type: "stack contents"
>>
>> bt: read of stack at 13 failed
>>
>>
>> bt: invalid kernel virtual address: 18  type: "stack contents"
>>
>> bt: read of stack at 18 failed
>>
>>
>> bt: invalid kernel virtual address: f  type: "stack contents"
>>
>> bt: read of stack at f failed
>>
>>
>> bt: invalid kernel virtual address: e  type: "stack contents"
>>
>> bt: read of stack at e failed
>>
>>
>> bt: invalid kernel virtual address: f  type: "stack contents"
>>
>> bt: read of stack at f failed
>>
>>      KERNEL: vmlinux-2.6.5-7.244-smp
>> DEBUG KERNEL: vmlinux.dbg (2.6.5-7.244-default)
>>    DUMPFILE: DUMP10.1.230.112
>>        CPUS: 1
>>        DATE: Thu Jul 26 14:34:46 2007
>>      UPTIME: 213503982284 days, 21:34:00
>> LOAD AVERAGE: 0.01, 0.12, 0.07
>>       TASKS: 34
>>    NODENAME: linux
>>     RELEASE: 2.6.5-7.244-smp
>>     VERSION: #1 SMP Mon Dec 12 18:32:25 UTC 2005
>>     MACHINE: x86_64  (2793 Mhz)
>>      MEMORY: 1015808 GB
>>       PANIC: ""
>>         PID: 0
>>     COMMAND: "
>>               "
>>        TASK: ffffffff803d2800  (1 of 2)  [THREAD_INFO: ffffffff80590000]
>>         CPU: 0
>>       STATE: TASK_RUNNING (ACTIVE)
>>     WARNING: panic task not found
>>
>> crash> bt
>> PID: 0      TASK: ffffffff803d2800  CPU: 0   COMMAND: "<80>^L"
>> #0 [ffffffff80591ef0] schedule at ffffffff801394e4
>> #1 [ffffffff80591f98] default_idle at ffffffff8010f1c0
>> #2 [ffffffff80591fc8] cpu_idle at ffffffff8010f65a
>> crash> ps
>>   PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
>>  >     0      0   0  ffffffff803d2800  RU   0.0 4399648058624 
>> 4389578663200  [<80>^L]
>>      0      0   0  ffffffff803d2808  ??   0.0       0      0  [swapper]
>>      1      0   0     10017e1f2c8    ??   0.1     640    304  init
>>      2     -1   0     10017e1e9a8    ??   0.0       0      0  [migration/0]
>>      3     -1   0     10017e1e088    ??   0.0       0      0  [ksoftirqd/0]
>>      4     -1   0     10001b712d8    ??   0.0       0      0  [events/0]
>>      5     -1   0     10001b709b8    ??   0.0       0      0  [khelper]
>>      6     -1   0     10001b70098    ??   0.0       0      0  [kacpid]
>>     25     -1   0     10017dd72e8    ??   0.0       0      0  [kblockd/0]
>>     47     -1   0     10017dd69c8    ??   0.0       0      0  [pdflush]
>>     48     -1   0     10017dd60a8    ??   0.0       0      0  [pdflush]
>>     49     -1   0     100178272f8    ??   0.0       0      0  [kswapd0]
>>     50     -1   0     100178269d8    ??   0.0       0      0  [aio/0]
>>   1295     -1   0     100178260b8    ??   0.0       0      0  [kseriod]
>>   2077     -1   0     10017897308    ??   0.0       0      0  [reiserfs/0]
>>   2744     -1   0     10014de9488    ??   0.0       0      0  [khubd]
>>   3077     -1   0     10015aa13c8    ??   0.2    2560    608  hwscand
>>   3693     -1   0     100164e1348    ??   0.2    3568    816  syslogd
>>   3696     -1   0     10015180208    ??   0.3    2744   1112  klogd
>>   3721     -1   0     10015b0eab8    ??   0.2    3536    628  resmgrd
>>   3722     -1   0     10015e6e1c8    ??   0.2    4564    640  portmap
>>   3803     -1   0     10015d49368    ??   0.6   20036   2340  master
>>   3814     -1   0     10015daea58    ??   0.6   20100   2312  pickup
>>   3815     -1   0     10016c5a0d8    ??   0.6   20144   2364  qmgr
>>   3861     -1   0     10016ca2a08    ??   0.7   26800   2932  sshd
>>   4022     -1   0     10014c42b48    ??   0.2    6804    924  cron
>>   4057     -1   0     100178960c8    ??   0.2    2484    612  agetty
>>   4058     -1   0     10016c5b318    ??   0.5   21864   1772  login
>>   4059     -1   0     10016ca3328    ??   0.2    7012    936  mingetty
>>   4060     -1   0     10015fb5398    ??   0.2    7012    936  mingetty
>>   4061     -1   0     10014cc6238    ??   0.2    7012    936  mingetty
>>   4062     -1   0     10015b0f3d8    ??   0.2    7012    936  mingetty
>>   4063     -1   0     100151e7458    ??   0.2    7012    936  mingetty
>>   4152     -1   0     10016a180f8    ??   0.8   12716   2992  bash
>> crash> bt 10015180208
>> PID: 3696   TASK: 10015180208       CPU: 0   COMMAND: "klogd"
>> *bt: invalid kernel virtual address: 12  type: "stack contents"*
>> bt: read of stack at 12 failed
>> crash>  task 10015180208
>> PID: 3696   TASK: 10015180208       CPU: 0   COMMAND: "klogd"
>> struct task_struct {
>>  *state = 1099873050624,*
>> *  thread_info = 0x12,*
>>  usage = {
>>    counter = 320
>>  },
>>  flags = 0,
>>  ptrace = 502511173631,
>>  lock_depth = 120,
>>  prio = 0,
>>  static_prio = 1048832,
>>  run_list = {
>>    next = 0x200200,
>>    prev = 0x0
>>  },
>>  array = 0x50fe72e6,
>>  sleep_avg = 1,
>>  interactive_credit = 67616128664,
>>  timestamp = 67616128664,
>>  last_ran = 0,
>>  activated = 0,
>>  policy = 18446744073709551615,
>>  cpus_allowed = 18446744073709551615,
>>  time_slice = 150,
>>  first_time_slice = 0,
>>  tasks = {
>>    next = 0x10015b0eb48,
>>    prev = 0x100164e13d8
>>  },
>>  ptrace_children = {
>>    next = 0x100151802a8,
>>    prev = 0x100151802a8
>>  },
>>  ptrace_list = {
>>    next = 0x100151802b8,
>>    prev = 0x100151802b8
>>  },
>>  mm = 0x1001546c500,
>>  active_mm = 0x1001546c500,
>>  binfmt = 0xffffffff803e70c0,
>>  exit_state = 0,
>>  exit_code = 0,
>>  exit_signal = 17,
>>  pdeath_signal = 0,
>>  personality = 0,
>>  did_exec = 0,
>>  pid = 3696,
>>  tgid = 3696,
>>  real_parent = 0x10017e1f2c0,
>>  parent = 0x10017e1f2c0,
>>  children = {
>>    next = 0x10015180320,
>>    prev = 0x10015180320
>>  },
>>  sibling = {
>>    next = 0x10015b0ebe0,
>>    prev = 0x100164e1470
>>  },
>>  group_leader = 0x10015180200,
>>  pids = {{
>>      pid_chain = {
>>        next = 0x10015180370,
>>        prev = 0x10015180370
>>      },
>>      pidptr = 0x10015180360,
>>      pid = {
>>        nr = 3696,
>>        count = {
>>          counter = 1
>>        },
>>        task = 0x10015180200,
>>        task_list = {
>>          next = 0x10015180348,
>>          prev = 0x10015180348
>>        },
>>        hash_chain = {
>>          next = 0x10017827470,
>>          prev = 0x10016ca2b80
>>        }
>>      }
>>    }, {
>>      pid_chain = {
>>        next = 0x100151803b8,
>>        prev = 0x100151803b8
>>      },
>>      pidptr = 0x100151803a8,
>>      pid = {
>>        nr = 3696,
>>        count = {
>>          counter = 1
>>        },
>>        task = 0x10015180200,
>>        task_list = {
>>          next = 0x10015180390,
>>          prev = 0x10015180390
>>        },
>>        hash_chain = {
>>          next = 0x100178274b8,
>>          prev = 0x10016ca2bc8
>>        }
>>      }
>>    }, {
>>      pid_chain = {
>>        next = 0x10015180400,
>>        prev = 0x10015180400
>>      },
>>      pidptr = 0x100151803f0,
>>      pid = {
>>        nr = 3696,
>>        count = {
>>          counter = 1
>>        },
>>        task = 0x10015180200,
>>        task_list = {
>>          next = 0x100151803d8,
>>          prev = 0x100151803d8
>>        },
>>        hash_chain = {
>>          next = 0x10001949240,
>>          prev = 0x10016ca2c10
>>        }
>>      }
>>    }, {
>>      pid_chain = {
>>        next = 0x10015180448,
>>        prev = 0x10015180448
>>      },
>>      pidptr = 0x10015180438,
>>      pid = {
>>        nr = 3696,
>>        count = {
>>          counter = 1
>>        },
>>        task = 0x10015180200,
>>        task_list = {
>>          next = 0x10015180420,
>>          prev = 0x10015180420
>>        },
>>        hash_chain = {
>>          next = 0x10001949340,
>>          prev = 0x10016ca2c58
>>        }
>>      }
>>    }},
>>  wait_chldexit = {
>>    lock = {
>>      lock = 1
>>    },
>>    task_list = {
>>      next = 0x10015180470,
>>      prev = 0x10015180470
>>    }
>>  },
>>  vfork_done = 0x0,
>>  set_child_tid = 0x2a95894b90,
>>  clear_child_tid = 0x2a95894b90,
>>  rt_priority = 0,
>>  it_real_value = 0,
>>  it_prof_value = 0,
>>  it_virt_value = 0,
>>  it_real_incr = 0,
>>  it_prof_incr = 0,
>>  it_virt_incr = 0,
>>  real_timer = {
>>    entry = {
>>      next = 0x100100,
>>      prev = 0x200200
>>    },
>>    expires = 29143,
>>    lock = {
>>      lock = 1
>>    },
>>    magic = 1267182958,
>>    function = 0xffffffff80141b50 <it_real_fn>,
>>    data = 1099865522688,
>>    base = 0x0
>>  },
>>  utime = 0,
>>  stime = 4,
>>  cutime = 0,
>>  cstime = 0,
>>  nvcsw = 13,
>>  nivcsw = 2,
>>  cnvcsw = 0,
>>  cnivcsw = 0,
>>  start_time = 53888910424,
>>  min_flt = 105,
>>  maj_flt = 0,
>>  cmin_flt = 0,
>>  cmaj_flt = 0,
>>  uid = 0,
>>  euid = 0,
>>  suid = 0,
>>  fsuid = 0,
>>  gid = 0,
>>  egid = 0,
>>  sgid = 0,
>>  fsgid = 0,
>>  group_info = 0xffffffff803e2a00,
>>  cap_effective = 4294967039,
>>  cap_inheritable = 0,
>>  cap_permitted = 4294967039,
>>  keep_capabilities = 0,
>>  user = 0xffffffff803e29a0,
>>  rlim = {{
>>      rlim_cur = 18446744073709551615,
>>      rlim_max = 18446744073709551615
>>    }, {
>>      rlim_cur = 18446744073709551615,
>>      rlim_max = 18446744073709551615
>>    }, {
>>      rlim_cur = 18446744073709551615,
>>      rlim_max = 18446744073709551615
>>    }, {
>>      rlim_cur = 8388608,
>>      rlim_max = 18446744073709551615
>>    }, {
>>      rlim_cur = 0,
>>      rlim_max = 18446744073709551615
>>    }, {
>>      rlim_cur = 18446744073709551615,
>>      rlim_max = 18446744073709551615
>>    }, {
>>      rlim_cur = 3071,
>>      rlim_max = 3071
>>    }, {
>>      rlim_cur = 1024,
>>      rlim_max = 1024
>>    }, {
>>      rlim_cur = 18446744073709551615,
>>      rlim_max = 18446744073709551615
>>    }, {
>>      rlim_cur = 18446744073709551615,
>>      rlim_max = 18446744073709551615
>>    }, {
>>      rlim_cur = 18446744073709551615,
>>      rlim_max = 18446744073709551615
>>    }, {
>>      rlim_cur = 1024,
>>      rlim_max = 1024
>>    }, {
>>      rlim_cur = 819200,
>>      rlim_max = 819200
>>    }},
>>  used_math = 0,
>>  rcvd_sigterm = 0,
>>  oomkilladj = 0,
>>  comm = "klogd\000roc\000\000\000\000\000\000",
>>  link_count = 0,
>>  total_link_count = 0,
>>  sysvsem = {
>>    undo_list = 0x0
>>  },
>>  thread = {
>>    rsp0 = 1099873058120,
>>    rsp = 548682070920,
>>    userrsp = 182897429248,
>>    fs = 0,
>>    gs = 0,
>>    es = 0,
>>    ds = 0,
>>    fsindex = 0,
>>    gsindex = 0,
>>    debugreg0 = 0,
>>    debugreg1 = 0,
>>    debugreg2 = 0,
>>    debugreg3 = 0,
>>    debugreg6 = 0,
>>    debugreg7 = 0,
>>    cr2 = 0,
>>    trap_no = 0,
>>    error_code = 0,
>>    i387 = {
>>      fxsave = {
>>        cwd = 0,
>>        swd = 0,
>>        twd = 0,
>>        fop = 0,
>>        rip = 0,
>>        rdp = 281470681751424,
>>        mxcsr = 0,
>>        mxcsr_mask = 0,
>>        st_space = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
>> 0, 0, 0, 0, 0, 0, 0, 0, 0,
>> 0, 0, 0, 0, 0},
>>        xmm_space = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
>> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
>> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
>> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
>> , 0, 0, 0},
>>        padding = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
>> 0, 0, 0, 0, 0, 0}
>>      }
>>    },
>>    ioperm = 0,
>>    io_bitmap_ptr = 0x0,
>>    tls_array = {0, 0, 0}
>>  },
>>  fs = 0x10014c7a180,
>>  files = 0x10001a114c0,
>>  namespace = 0x100154fb900,
>>  signal = 0x10015184600,
>>  sighand = 0x0,
>>  blocked = {
>>    sig = {0}
>>  },
>>  real_blocked = {
>>    sig = {1099865524632}
>>  },
>>  pending = {
>>    list = {
>>      next = 0x10015180998,
>>      prev = 0x0
>>    },
>>    signal = {
>>      sig = {0}
>>    }
>>  },
>>  sas_ss_sp = 0,
>>  sas_ss_size = 0,
>>  notifier = 0,
>>  notifier_data = 0x0,
>>  notifier_mask = 0x0,
>>  security = 0x600000005,
>>  parent_exec_id = 1,
>>  self_exec_id = 1,
>>  alloc_lock = {
>>    lock = 1
>>  },
>>  proc_lock = {
>>    lock = 0
>>  },
>>  switch_lock = {
>>    lock = 0
>>  },
>>  journal_info = 0x0,
>>  reclaim_state = 0x10015469180,
>>  proc_dentry = 0x0,
>>  backing_dev_info = 0x10015b40940,
>>  io_context = 0x0,
>>  ptrace_message = 0,
>>  last_siginfo = 0x0,
>>  io_wait = 0xac9,
>>  rchar = 2292,
>>  wchar = 3,
>>  syscr = 32,
>>  syscw = 475,
>>  acct_rss_mem1 = 2743,
>>  acct_vm_mem1 = 4,
>>  acct_stimexpd = 4294967297,
>>  ckrm_tsklock = {
>>    lock = 0
>>  },
>>  ckrm_celock = {
>>    lock = 0
>>  },
>>  ce_data = 0xffffffff804f3f20,
>>  taskclass = 0x100164e1bc8,
>>  taskclass_link = {
>>    next = 0x10015b0f338,
>>    prev = 0xffffffff80537940
>>  },
>>  cpu_class = 0x0,
>>  demand_stat = {
>>    run = 0,
>>    total = 61218488692,
>>    last_sleep = 32000000,
>>    recalc_interval = 0,
>>    cpu_demand = 105133020
>>  },
>>  delays = {
>>    waitcpu_total = 3647587,
>>    runcpu_total = 23603870,
>>    iowait_total = 0,
>>    mem_iowait_total = 4294967311,
>>    runs = 0,
>>    num_iowaits = 0,
>>    num_memwaits = 0,
>>    splpar_total = 1431654400
>>  },
>>  map_base = 0,
>>  mempolicy = 0x0,
>>  il_next = 0,
>>  audit = 0x10
>> }
>>
>> crash>                
>>
>>
>> -- 
>> Crash-utility mailing list
>> Crash-utility(a)redhat.com
>> https://www.redhat.com/mailman/listinfo/crash-utility
>>     
>
>
>
>
                                
                         
                        
                                
                        
                        
                 
         
 
        
            
        
        
        
                
                        
                                
                                
                                        
                                
                         
                        
                                
                                
                                        
                                                
                                        
                                        
                                        Re: Re: [Crash-utility] User Stack back trace of  the process
                                
                                
                                
                                    
                                        by Rajesh
                                    
                                
                                
                                          
Maybe I'm posting to the wrong mailing list; if so, kindly guide me.
I have modified the elf_core_dump functionality to dump only the text, data and stack segments. I'm not interested in the dynamically allocated memory of the process.
Below is the modification I have made in the "binfmt_elf.c" file.
In maydump() I check whether the VMA maps the dynamically allocated memory of the process:
-------------------------------------------------------
if ((vma->vm_file == NULL) &&
    (!((current->mm->start_stack) < vma->vm_end)))
        return 0;
-------------------------------------------------------
It works fine for single-threaded processes, but when I take a core dump of a multi-threaded process, the core covers only the thread I kill, and in gdb I'm not able to switch between the threads.
Please let me know whether these modifications are correct.
--Regards,
rajesh
On Wed, 05 Sep 2007 Dave Anderson wrote :
>Rajesh wrote:
>>Sorry in my previous e-mail I mistyped.
>>
>>I want to dump only code and stack segments of a process.
>>
>>--Regards,
>>rajesh
>
>stack segments would have: (vma->vm_flags & VM_GROWSDOWN)
>
>
>>
>>
>>On Wed, 05 Sep 2007 Rajesh wrote :
>>  >Hi,
>>  >
>>  >Is there any way to find using kernel data structure, the VMA of a process belongs to stack or heap. It is easy to distinguish the VMA  belongs to code segment or not from vm_area_struct structure, using "vm_flags" variable.
>>  >
>>  >In "elf_core_dump()" function I'm planning to dump only code and data segments.
>>  >
>>  >Can any body please guide me...
>>  >
>>  >--Regards,
>>  >rajesh
>>  >
>>  >On Wed, 05 Sep 2007 Dave Anderson wrote :
>>  > >Rajesh wrote:
>>  > >>Dave,
>>  > >>
>>  > >>Thanks for your explanation.
>>  > >>
>>  > >>Well the reason behind my questions is, we have an application running on customer site and the application consumes around 60GB of system memory.
>>  > >>When this process receives the segmentation fault or signal abort, the kernel will start to take the process core dump. Here is the problem. Kernel takes at least  1hr (60-minutes) to come out from core dump. During this time the system is unresponsive (hung), and I feel it is because the system is entering into thrashing due to huge memory usage by the process. This long down time is not acceptable by the customer.
>>  > >>
>>  > >>So I started to find the better way or tackling the problem.
>>  > >>
>>  > >>1>First thing we thought is changing the system page size from 4KB to 8KB. Since this change could not be done on our x86_64 architecture, since x86_64 architecture doesnt support multi-page size option.
>>  > >>
>>  > >>2>We wrote a program using libbfd APIs and used with in our application. Whenever the SIGSEGV or SIGABRT is received by the process it will log the stack trace of all the threads within that process. This feature is not so effective or flexible as compared to process core dump.
>>  > >>
>>  > >>3>Last we thought of using kcore/vmcore to analyze the cause for SIGSEGV or SIGABRT.
>>  > >>
>>  > >>4>I have one more thought, making the elf_core_dump() function SMP. This function is responsible for dumping the core, and the function is present in /usr/src/linux/fs/binfmt_elf.c
>>  > >>
>>  > >>
>>  > >>Any comments/ideas are welcome.
>>  > >>
>>  > >>--Regards,
>>  > >>rajesh
>>  > >
>>  > >Maybe tinker with maydump()?
>>  > >
>>  > >If you know that the core dump contains the VMA's that are
>>  > >not necessary to dump, such as large shared memory segments,
>>  > >and you can identify them from the VMA, you can prevent
>>  > >them from being copied to the core dump.  There's this
>>  > >patch floating around, which may have been updated:
>>  > >
>>  > >  http://lkml.org/lkml/2007/2/16/149
>>  > >
>>  > >Dave
>>  > >
>>  > >
>>  > >
>>  > >
>>  >--
>>  >Crash-utility mailing list
>>  >Crash-utility(a)redhat.com
>>  >https://www.redhat.com/mailman/listinfo/crash-utility
>>
>>
>>
>>
>
>
                                
                         
                        
                                
                        
                        
                 
         
 
        
            
        
        
        
                
                        
                        
                                
                                
                                        
                                                
                                        
                                        
                                        crash doesn't give complete backtrace
                                
                                
                                
                                    
                                        by Adhiraj Joshi
                                    
                                
                                
Hi All,
I get a kernel panic on my system and see a long backtrace on the console
before the system freezes. I have kdump set up and analyse the generated
vmcore using crash, but the backtrace from crash is too short and doesn't
give any relevant information. I want the backtrace that appeared on the
console before the freeze.
Any ideas on this?
Regards,
Adhiraj.
                                
                         
                        
                                
                        
                        
                 
         
 
        
            
        
        
        
                
                        
                        
                                
                                
                                        
                                                
                                        
                                        
                                        [PATCH] fix loop index
                                
                                
                                
                                    
                                        by Daisuke Nishimura
                                    
                                
                                
Hi.
In x86_xen_kdump_p2m_create(), the same variable (i) is used
as the index of two for-loops, one nested inside the other.
As a result, if the debug level is 7 or higher, the outer
for-loop is executed only once.
This patch fixes the bug.
Thanks.
Daisuke Nishimura.
diff -uprN crash-4.0-4.6.org/x86.c crash-4.0-4.6/x86.c
--- crash-4.0-4.6.org/x86.c	2007-08-28 00:51:11.000000000 +0900
+++ crash-4.0-4.6/x86.c	2007-09-06 10:13:37.000000000 +0900
@@ -4141,9 +4141,9 @@ x86_xen_kdump_p2m_create(struct xen_kdum
 		
 	        if (CRASHDEBUG(7)) {
 	                up = (ulong *)xkd->page;
-	                for (i = 0; i < 256; i++) {
+	                for (j = 0; j < 256; j++) {
 	                        fprintf(fp, "%08lx: %08lx %08lx %08lx %08lx\n",
-	                                (ulong)((i * 4) * sizeof(ulong)),
+	                                (ulong)((j * 4) * sizeof(ulong)),
 	                                *up, *(up+1), *(up+2), *(up+3));
 	                        up += 4;
 	                }
                                
                         
                        
                                
                        
                        
                 
         
 
        
            
        
        
        
                
                        
                                
                                
                                        
                                
                         
                        
                                
                                
                                        
                                                
                                        
                                        
                                        Re: Re: Re: [Crash-utility] User Stack back trace of  the process
                                
                                
                                
                                    
                                        by Rajesh
                                    
                                
                                
Sorry, in my previous e-mail I mistyped.
I want to dump only the code and stack segments of a process.
 --Regards,
rajesh
On Wed, 05 Sep 2007 Rajesh wrote :
>Hi,
>
>Is there any way to find using kernel data structure, the VMA of a process belongs to stack or heap. It is easy to distinguish the VMA  belongs to code segment or not from vm_area_struct structure, using "vm_flags" variable.
>
>In "elf_core_dump()" function I'm planning to dump only code and data segments.
>
>Can any body please guide me...
>
>--Regards,
>rajesh
>
>On Wed, 05 Sep 2007 Dave Anderson wrote :
> >Rajesh wrote:
> >>Dave,
> >>
> >>Thanks for your explanation.
> >>
> >>Well the reason behind my questions is, we have an application running on customer site and the application consumes around 60GB of system memory.
> >>When this process receives the segmentation fault or signal abort, the kernel will start to take the process core dump. Here is the problem. Kernel takes at least  1hr (60-minutes) to come out from core dump. During this time the system is unresponsive (hung), and I feel it is because the system is entering into thrashing due to huge memory usage by the process. This long down time is not acceptable by the customer.
> >>
> >>So I started to find the better way or tackling the problem.
> >>
> >>1>First thing we thought is changing the system page size from 4KB to 8KB. Since this change could not be done on our x86_64 architecture, since x86_64 architecture doesnt support multi-page size option.
> >>
> >>2>We wrote a program using libbfd APIs and used with in our application. Whenever the SIGSEGV or SIGABRT is received by the process it will log the stack trace of all the threads within that process. This feature is not so effective or flexible as compared to process core dump.
> >>
> >>3>Last we thought of using kcore/vmcore to analyze the cause for SIGSEGV or SIGABRT.
> >>
> >>4>I have one more thought, making the elf_core_dump() function SMP. This function is responsible for dumping the core, and the function is present in /usr/src/linux/fs/binfmt_elf.c
> >>
> >>
> >>Any comments/ideas are welcome.
> >>
> >>--Regards,
> >>rajesh
> >
> >Maybe tinker with maydump()?
> >
> >If you know that the core dump contains the VMA's that are
> >not necessary to dump, such as large shared memory segments,
> >and you can identify them from the VMA, you can prevent
> >them from being copied to the core dump.  There's this
> >patch floating around, which may have been updated:
> >
> >  http://lkml.org/lkml/2007/2/16/149
> >
> >Dave
> >
> >
> >
> >
>--
>Crash-utility mailing list
>Crash-utility(a)redhat.com
>https://www.redhat.com/mailman/listinfo/crash-utility
                                
                         
                        
                                
                        
                        
                 
         
 
        
            
        
        
        
                
                        
                                
                                
                                        
                                
                         
                        
                                
                                
                                        
                                                
                                        
                                        
                                        Re: Re: [Crash-utility] User Stack back trace of  the process
                                
                                
                                
                                    
                                        by Rajesh
                                    
                                
                                
Hi,
Is there any way, using kernel data structures, to tell whether a VMA of a process belongs to the stack or the heap? It is easy to tell whether a VMA belongs to the code segment from the "vm_flags" field of vm_area_struct.
In elf_core_dump() I'm planning to dump only the code and data segments.
Can anybody please guide me...
--Regards,
rajesh
On Wed, 05 Sep 2007 Dave Anderson wrote :
>Rajesh wrote:
>>Dave,
>>
>>Thanks for your explanation.
>>
>>Well the reason behind my questions is, we have an application running on customer site and the application consumes around 60GB of system memory.
>>When this process receives the segmentation fault or signal abort, the kernel will start to take the process core dump. Here is the problem. Kernel takes at least  1hr (60-minutes) to come out from core dump. During this time the system is unresponsive (hung), and I feel it is because the system is entering into thrashing due to huge memory usage by the process. This long down time is not acceptable by the customer.
>>
>>So I started to find the better way or tackling the problem.
>>
>>1>First thing we thought is changing the system page size from 4KB to 8KB. Since this change could not be done on our x86_64 architecture, since x86_64 architecture doesnt support multi-page size option.
>>
>>2>We wrote a program using libbfd APIs and used with in our application. Whenever the SIGSEGV or SIGABRT is received by the process it will log the stack trace of all the threads within that process. This feature is not so effective or flexible as compared to process core dump.
>>
>>3>Last we thought of using kcore/vmcore to analyze the cause for SIGSEGV or SIGABRT.
>>
>>4>I have one more thought, making the elf_core_dump() function SMP. This function is responsible for dumping the core, and the function is present in /usr/src/linux/fs/binfmt_elf.c
>>
>>
>>Any comments/ideas are welcome.
>>
>>--Regards,
>>rajesh
>
>Maybe tinker with maydump()?
>
>If you know that the core dump contains the VMA's that are
>not necessary to dump, such as large shared memory segments,
>and you can identify them from the VMA, you can prevent
>them from being copied to the core dump.  There's this
>patch floating around, which may have been updated:
>
>  http://lkml.org/lkml/2007/2/16/149
>
>Dave
>
>
>
>
                                
                         
                        
                                
                        
                        
                 
         
 
        
            
        
        
        
                
                        
                                
                                
                                        
                                
                         
                        
                                
                                
                                        
                                                
                                        
                                        
                                        Re: Re: [Crash-utility] User Stack back trace of  the process
                                
                                
                                
                                    
                                        by Rajesh
                                    
                                
                                
                                        Dave,
Thanks for your explanation.
Well, the reason behind my questions is that we have an application running at a customer site which consumes around 60GB of system memory.
When this process receives a segmentation fault or an abort signal, the kernel starts to take the process core dump. Here is the problem: the kernel takes at least 1 hour (60 minutes) to finish the core dump. During this time the system is unresponsive (hung), and I believe this is because the system is thrashing due to the huge memory usage of the process. This long downtime is not acceptable to the customer.
So I started looking for a better way of tackling the problem.
1> The first thing we thought of was changing the system page size from 4KB to 8KB. This could not be done, because our x86_64 architecture doesn't support a multi-page-size option.
2> We wrote a program using libbfd APIs and used it within our application. Whenever SIGSEGV or SIGABRT is received by the process, it logs the stack trace of all the threads within that process. This is not as effective or flexible as a process core dump.
3> Last, we thought of using kcore/vmcore to analyze the cause of the SIGSEGV or SIGABRT.
4> I have one more thought: making the elf_core_dump() function SMP. This function is responsible for dumping the core and is in /usr/src/linux/fs/binfmt_elf.c.
Any comments/ideas are welcome.
--Regards,
rajesh
  
>
>Rajesh,
>
>Castor's patch/suggestion is the best/only option you have
>for this kind of thing.  I've not tried it, but since the
>crash utility's "vm -p" option delineates where each
>instantiated page of a given task is located, it's potentially
>possible to recreate an ELF core file of the specified
>task.  (Any swapped-out pages won't be in the vmcore...)
>
>The embedded gdb module inside of crash is invoked internally
>as "gdb vmlinux", and has no clue about any other user-space
>program.
>
>That being said, you can execute the gdb "add-symbol-file"
>command to load the debuginfo data from a user space
>program, and then examine user-space data from the context
>of that program.
>
>For example, when you run the crash utility on a live system,
>the default context is that of the "crash" utility itself:
>
>   $ ./crash
>
>   crash 4.0-4.6
>   Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007  Red Hat, Inc.
>   Copyright (C) 2004, 2005, 2006  IBM Corporation
>   Copyright (C) 1999-2006  Hewlett-Packard Co
>   Copyright (C) 2005, 2006  Fujitsu Limited
>   Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
>   Copyright (C) 2005  NEC Corporation
>   Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
>   Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
>   This program is free software, covered by the GNU General Public License,
>   and you are welcome to change it and/or distribute copies of it under
>   certain conditions.  Enter "help copying" to see the conditions.
>   This program has absolutely no warranty.  Enter "help warranty" for details.
>
>   GNU gdb 6.1
>   Copyright 2004 Free Software Foundation, Inc.
>   GDB is free software, covered by the GNU General Public License, and you are
>   welcome to change it and/or distribute copies of it under certain conditions.
>   Type "show copying" to see the conditions.
>   There is absolutely no warranty for GDB.  Type "show warranty" for details.
>   This GDB was configured as "i686-pc-linux-gnu"...
>
>         KERNEL: /boot/vmlinux-2.4.21-37.ELsmp
>      DEBUGINFO: /usr/lib/debug/boot/vmlinux-2.4.21-37.ELsmp.debug
>       DUMPFILE: /dev/mem
>           CPUS: 2
>           DATE: Tue Sep  4 16:36:53 2007
>         UPTIME: 15 days, 08:15:06
>   LOAD AVERAGE: 0.14, 0.06, 0.01
>          TASKS: 87
>       NODENAME: crash.boston.redhat.com
>        RELEASE: 2.4.21-37.ELsmp
>        VERSION: #1 SMP Wed Sep 7 13:28:55 EDT 2005
>        MACHINE: i686  (1993 Mhz)
>         MEMORY: 511.5 MB
>            PID: 9381
>        COMMAND: "crash"
>           TASK: dd63c000
>            CPU: 1
>          STATE: TASK_RUNNING (ACTIVE)
>   crash>
>
>Verify the current context:
>
>   crash> set
>       PID: 9381
>   COMMAND: "crash"
>      TASK: dd63c000
>       CPU: 0
>     STATE: TASK_RUNNING (ACTIVE)
>   crash>
>
>So, for example, the crash utility has a program_context
>data structure that starts like this:
>
>   struct program_context {
>           char *program_name;             /* this program's name */
>           char *program_path;             /* unadulterated argv[0] */
>           char *program_version;          /* this program's version */
>           char *gdb_version;              /* embedded gdb version */
>           char *prompt;                   /* this program's prompt */
>           unsigned long long flags;       /* flags from above */
>           char *namelist;                 /* linux namelist */
>           ...
>
>And it declares a data variable with the same name:
>
>   struct program_context program_context = { 0 };
>
>If I want to see a gdb-style dump of its contents, I can
>do this:
>
>   crash> add-symbol-file ./crash
>   add symbol table from file "./crash" at
>   Reading symbols from ./crash...done.
>   crash>
>
>Now the embedded gdb has the debuginfo data from the crash
>object file (which was compiled with -g), and it knows where
>the program_context structure is located in user space:
>
>   crash> p &program_context
>   $1 = (struct program_context *) 0x8391ea0
>   crash>
>
>Since 0x8391ea0 is not a kernel address, the "p" command cannot
>be used to display the data structure.  However, the crash
>utility's "struct" command has a little-used "-u" option, which
>indicates that the address that follows is a user-space address
>from the current context:
>
>   crash> struct program_context -u 0x8391ea0
>   struct program_context {
>     program_name = 0xbffff9b0 "crash",
>     program_path = 0xbffff9ae "./crash",
>     program_version = 0x82e9c12 "4.0-4.6",
>     gdb_version = 0x834ecdf "6.1",
>     prompt = 0x8400438 "crash> ",
>     flags = 844424965983303,
>     namelist = 0x83f5940 "/boot/vmlinux-2.4.21-37.ELsmp",
>     ...
>
>That all being said, this capability cannot be used to generate
>any kind of user-space backtrace.  You can do raw reads of the
>user-space stack, say from the point at which it entered kernel
>space, but whether that's of any help depends upon what you're
>looking for.
>
>Dave
>