 
                                        
                                
                         
                        
                                
                                
                                        
                                                
                                        
                                        
Re: [Crash-utility] [PATCH/RFC] Fix relocation address
by Dave Anderson

----- "Simon Kagstrom" <simon.kagstrom(a)netinsight.net> wrote:
> On Thu, 17 Dec 2009 11:17:56 -0500 (EST)
> Dave Anderson <anderson(a)redhat.com> wrote:
> 
> > > > So I started looking into the code and found something which looks like
> > > > a typo in relocate() (patch below). Changing this makes crash work for me.
> > > 
> > > Actually it's not a typo -- your patch would presumably break with all kernels
> > > that have a CONFIG_PHYSICAL_START is greater than CONFIG_PHYSICAL_ALIGN, which
> > > is what the patch was written to handle.
> > > 
> > > What are your kernel's CONFIG_PHYSICAL_START and CONFIG_PHYSICAL_ALIGN
> > > values?  Does crash work with your kernel on the live system?  
> 
> You are right. I had problems with getting things working, so I've
> played around with various settings. I had CONFIG_PHYSICAL_START set to
> 0 and CONFIG_PHYSICAL_ALIGN set to 0x100000. Setting these to e.g.,
> 0x100000 and 0x100000 unbreaks things again.
> 
> I don't need to supply --reloc either then, not sure what I did wrong
> before. I'm sticking with sane settings from now on.
> 
> > > Anyway, I believe that the fix would require support for supplying a 
> > > negative --reloc value.
> > 
> > On the other hand, if the config values were the other way around, the 
> > problem didn't use to show up -- at least according to list item "1)"
> > below in the changelog:
> > 
> >             1) Configure the kernel with CONFIG_PHYSICAL_START less than
> >                or equal to CONFIG_PHYSICAL_ALIGN.  Having done that, there
> >                is no problem; the resultant vmlinux file will be loaded at
> >                the address for which it was compiled, which has always
> >                been the case.
> 
> > I wonder if you can use the unpatched crash, but supply a --reloc value that
> > will cause a wrap-around to the correct value?
> 
> Well, I suppose that would work if it was possible to supply a negative
> --reloc value, but I'm not sure it's really worth it. What would be
> nice would be to get a more descriptive error message.
Yeah, the problem is that the "do not match" errors can result from
a multitude of error scenarios.  Entering "-d <number>" on the command
line (the higher the debug number, the more verbose the output) usually
makes the cause of the failure evident.
> 
> Thanks for the help, please ignore the patch.
OK for now -- and thanks for posting.  It's only a matter of time before
somebody else runs into the same thing.
Thanks,
  Dave
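On the wrap-around idea discussed above: with a 32-bit ulong, the unpatched "symval - kt->relocate" behaves like an addition if the --reloc value is the two's complement of the desired offset. The sketch below only illustrates that arithmetic. The helper names relocate32 and wraparound_reloc are hypothetical, 0x100000 is simply the value from the command lines in this thread, and whether crash's --reloc parsing would accept such a large value is not established here.

```c
#include <stdint.h>

/* Same expression as the unpatched relocate(), on a 32-bit value
 * (hypothetical stand-in; not the crash source). */
uint32_t relocate32(uint32_t symval, uint32_t reloc)
{
    return symval - reloc;          /* unsigned arithmetic wraps modulo 2^32 */
}

/* The value that makes the subtraction above act like "symval + offset". */
uint32_t wraparound_reloc(uint32_t offset)
{
    return (uint32_t)0 - offset;    /* two's complement: 2^32 - offset */
}
```

For an offset of 0x100000 this gives 0xfff00000, and relocate32(0xc01a2108, 0xfff00000) comes out as 0xc02a2108.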
> 
> // Simon
                                
                         
                        
                                
15 years, 10 months

Re: [Crash-utility] [PATCH/RFC] Fix relocation address
by Dave Anderson

----- "Simon Kagstrom" <simon.kagstrom(a)netinsight.net> wrote:
> Hi!
> 
> I'm having problems getting kdumps from my relocatable kernel (2.6.31-8)
> working with crash on a IA-32 board. I use makedumpfile to generate a
> compressed dump, and when I try to load it with crash I get
> 
>   ./crash vmlinux vmcore --reloc=0x100000
>   crash: invalid kernel virtual address: 98  type: "present"
>   WARNING: cannot read cpu_present_map
>   crash: invalid kernel virtual address: 908bd975  type: "online"
>   WARNING: cannot read cpu_online_map
>   crash: cannot determine base kernel version
>   crash: vmlinux and vmcore do not match!
> 
> specifying --reloc also fails:
> 
>   ./crash vmlinux vmcore --reloc=0x100000
>   crash: seek error: kernel virtual address: c01a2108  type:
> "cpu_possible_mask"
> 
> 
> So I started looking into the code and found something which looks like
> a typo in relocate() (patch below). Changing this makes crash work for me.
Actually it's not a typo -- your patch would presumably break with all kernels
that have a CONFIG_PHYSICAL_START greater than CONFIG_PHYSICAL_ALIGN, which
is what the patch was written to handle.
What are your kernel's CONFIG_PHYSICAL_START and CONFIG_PHYSICAL_ALIGN
values?  Does crash work with your kernel on the live system?  
Anyway, I believe that the fix would require support for supplying a 
negative --reloc value.
> 
> Great tool by the way, leaves you longing for the next kernel panic
> ;-)
> 
> // Simon
> 
> --- orig-crash-4.1.2/symbols.c	2009-12-09 21:37:40.000000000 +0100
> +++ crash-4.1.2/symbols.c	2009-12-17 16:03:24.000000000 +0100
> @@ -671,7 +671,7 @@ relocate(ulong symval, char *symname, in
>  		break;
>  	}
>  
> -	return (symval - kt->relocate);
> +	return (symval + kt->relocate);
>  }
>  
>  /*
                                
                         
                        
                                
15 years, 10 months

[PATCH/RFC] Fix relocation address
by Simon Kagstrom

Hi!

I'm having problems getting kdumps from my relocatable kernel (2.6.31-8)
working with crash on an IA-32 board. I use makedumpfile to generate a
compressed dump, and when I try to load it with crash I get

  ./crash vmlinux vmcore --reloc=0x100000
  crash: invalid kernel virtual address: 98  type: "present"
  WARNING: cannot read cpu_present_map
  crash: invalid kernel virtual address: 908bd975  type: "online"
  WARNING: cannot read cpu_online_map
  crash: cannot determine base kernel version
  crash: vmlinux and vmcore do not match!

specifying --reloc also fails:

  ./crash vmlinux vmcore --reloc=0x100000
  crash: seek error: kernel virtual address: c01a2108  type: "cpu_possible_mask"

So I started looking into the code and found something which looks like
a typo in relocate() (patch below). Changing this makes crash work for
me.

Great tool by the way, leaves you longing for the next kernel panic ;-)

// Simon

--- orig-crash-4.1.2/symbols.c	2009-12-09 21:37:40.000000000 +0100
+++ crash-4.1.2/symbols.c	2009-12-17 16:03:24.000000000 +0100
@@ -671,7 +671,7 @@ relocate(ulong symval, char *symname, in
 		break;
 	}
 
-	return (symval - kt->relocate);
+	return (symval + kt->relocate);
 }
 
 /*
                                
                         
                        
                                
15 years, 10 months

Re: Request for ppc64 help from IBM
by Dave Anderson

----- "Dave Anderson" <anderson(a)redhat.com> wrote:
> Somewhere between the RHEL5 (2.6.18-based) and RHEL6 timeframe,
> the ppc64 architecture has started using a virtual memmap scheme
> for the arrays of page structures used to describe/handle
> each physical page of memory.
... [ snip ] ...
> So my speculation (guess?) is that the ppc64.c ppc64_vtop()
> function needs updating to properly translate these addresses.
> 
> Since the ppc64 stuff in the crash utility was written by, and
> has been maintained by IBM (and since I am ppc64-challenged),
> can you guys take a look at what needs to be done?
[ sound of crickets... ]
Well that request apparently fell on deaf ears...
Here's my understanding of the situation.
In 2.6.26 the ppc64 architecture started using a new kernel virtual
memory region to map the kernel's page structure array(s), so that
now there are three kernel virtual memory regions:
  
  KERNEL   0xc000000000000000
  VMALLOC  0xd000000000000000
  VMEMMAP  0xf000000000000000
The KERNEL region is the unity-mapped region, where the underlying
physical address can be determined by manipulating the virtual address
itself.  
The VMALLOC region requires a page-table walk-through to find
the underlying physical address in a PTE.
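A minimal sketch of the first two cases, assuming the region can be picked out from the top four bits of the address, as the three base values suggest. The names classify_region and unity_vtop are made up for illustration and are not crash or kernel functions; the VMALLOC page-table walk itself is not shown.

```c
#include <stdint.h>

#define KERNEL_BASE 0xc000000000000000ULL   /* unity-mapped region base */

enum region { REGION_OTHER, REGION_KERNEL, REGION_VMALLOC, REGION_VMEMMAP };

/* Assumption: the top nibble of the address identifies the region. */
enum region classify_region(uint64_t vaddr)
{
    switch (vaddr >> 60) {
    case 0xc: return REGION_KERNEL;
    case 0xd: return REGION_VMALLOC;
    case 0xf: return REGION_VMEMMAP;
    default:  return REGION_OTHER;
    }
}

/* Unity-mapped translation: strip the region bits from the address. */
uint64_t unity_vtop(uint64_t vaddr)
{
    return vaddr & ~KERNEL_BASE;
}
```

With these definitions, unity_vtop(0xc000000003000000) yields 0x3000000, while an 0xf...-based address is only classified, not translated.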
The new VMEMMAP region is mapped in ppc64 firmware, where a
physical address of a given size is mapped to a VMEMMAP virtual 
address.  So for example, the page structure for physical page 0 
is at VMEMMAP address 0xf000000000000000, the page for physical 
page 1 is at 0xf000000000000068, and so on.  Once mapped in the
firmware TLB (?) the virtual-to-physical translation is done
automatically while running in kernel mode.
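The address arithmetic in that example can be sketched as follows, assuming a struct page size of 0x68 bytes (inferred from the two addresses quoted above, not taken from kernel headers). The helpers vmemmap_page_addr and vmemmap_addr_to_pfn are hypothetical. Note that this only yields the virtual address of a page structure; it says nothing about where that address is backed physically.

```c
#include <stdint.h>

#define VMEMMAP_BASE     0xf000000000000000ULL
#define PAGE_STRUCT_SIZE 0x68ULL   /* assumption: inferred from the example */

/* Virtual address of the page structure describing physical page "pfn". */
uint64_t vmemmap_page_addr(uint64_t pfn)
{
    return VMEMMAP_BASE + pfn * PAGE_STRUCT_SIZE;
}

/* Inverse: which physical page a vmemmap page-structure address describes. */
uint64_t vmemmap_addr_to_pfn(uint64_t page_addr)
{
    return (page_addr - VMEMMAP_BASE) / PAGE_STRUCT_SIZE;
}
```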
The problem is that the physical-to-vmemmap address/size mapping
information is not stored in the kernel proper, so there is
no way for the crash utility to make the translation.  That
being the case, any crash command that needs to read the contents
of any page structure will fail.
The kernel mapping is performed here in 2.6.26 through 2.6.31:
  int __meminit vmemmap_populate(struct page *start_page,
                                 unsigned long nr_pages, int node)
  {
          unsigned long start = (unsigned long)start_page;
          unsigned long end = (unsigned long)(start_page + nr_pages);
          unsigned long page_size = 1 << mmu_psize_defs[mmu_vmemmap_psize].shift;
  
          /* Align to the page size of the linear mapping. */
          start = _ALIGN_DOWN(start, page_size);
  
          for (; start < end; start += page_size) {
                  int mapped;
                  void *p;
  
                  if (vmemmap_populated(start, page_size))
                          continue;
  
                  p = vmemmap_alloc_block(page_size, node);
                  if (!p)
                          return -ENOMEM;
  
                  pr_debug("vmemmap %08lx allocated at %p, physical %08lx.\n",
                          start, p, __pa(p));
  
                  mapped = htab_bolt_mapping(start, start + page_size, __pa(p),
                                             pgprot_val(PAGE_KERNEL),
                                             mmu_vmemmap_psize, mmu_kernel_ssize);
                  BUG_ON(mapped < 0);
          }
  
          return 0;
  } 
  
So if the pr_debug() statement is turned on, it shows on my test system:
  vmemmap f000000000000000 allocated at c000000003000000, physical 03000000
This would make for an extremely simple virtual-to-physical translation
for the crash utility, but note that neither the unity-mapped virtual address
of 0xc000000003000000 nor its associated physical address of 0x3000000 is
stored anywhere, since "p" is a stack variable.  The htab_bolt_mapping()
function does not store the mapping information in the kernel either; it
just uses temporary stack variables before calling the ppc_md.hpte_insert()
function, which eventually leads to a machine-dependent (directly to firmware)
function.  
So unless I'm missing something, nowhere along the vmemmap call-chain are the 
VTOP address/size particulars stored -- say for example, in a 
/proc/iomem-like "resource" data structure.
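To make the missing piece concrete, here is a sketch of the kind of record that would be enough for the translation, had the kernel kept one per vmemmap_populate() mapping. Everything in it is hypothetical: struct vmemmap_map_entry and vmemmap_vtop do not exist in the kernels discussed here, and the 16 MB block size used in the note after the code is an assumption, since the pr_debug() output does not show page_size.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one vmemmap mapping (virt -> phys, size bytes). */
struct vmemmap_map_entry {
    uint64_t virt;   /* start of the vmemmap block                 */
    uint64_t phys;   /* physical address backing it, i.e. __pa(p)  */
    uint64_t size;   /* page_size used for the mapping             */
};

/* Returns 1 and stores the physical address if vaddr falls inside one
 * of the recorded blocks, 0 if no block covers it. */
int vmemmap_vtop(const struct vmemmap_map_entry *tab, size_t n,
                 uint64_t vaddr, uint64_t *paddr)
{
    for (size_t i = 0; i < n; i++) {
        if (vaddr >= tab[i].virt && vaddr - tab[i].virt < tab[i].size) {
            *paddr = tab[i].phys + (vaddr - tab[i].virt);
            return 1;
        }
    }
    return 0;
}
```

With a single entry built from the pr_debug() line (virt f000000000000000, phys 03000000, size assumed to be 0x1000000), the address f000000000000068 would translate to 03000068.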
(FWIW, I note that in 2.6.32, CONFIG_PPC_BOOK3E arches still use the normal page
tables to map the memmap array(s).  I don't know whether BOOK3E arch is the
most common or not...)
In any case, not being able to read the page structure contents has a
significant effect on the crash utility.  About the only thing that can
be done for these kernels is to print a warning during initialization;
any command that attempts to read a page structure will subsequently
fail:
  
  # crash vmlinux vmcore
  
  crash 4.1.2p1
  Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009  Red Hat, Inc.
  Copyright (C) 2004, 2005, 2006  IBM Corporation
  Copyright (C) 1999-2006  Hewlett-Packard Co
  Copyright (C) 2005, 2006  Fujitsu Limited
  Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
  Copyright (C) 2005  NEC Corporation
  Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
  Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
  This program is free software, covered by the GNU General Public License,
  and you are welcome to change it and/or distribute copies of it under
  certain conditions.  Enter "help copying" to see the conditions.
  This program has absolutely no warranty.  Enter "help warranty" for details.
   
  GNU gdb 6.1
  Copyright 2004 Free Software Foundation, Inc.
  GDB is free software, covered by the GNU General Public License, and you are
  welcome to change it and/or distribute copies of it under certain conditions.
  Type "show copying" to see the conditions.
  There is absolutely no warranty for GDB.  Type "show warranty" for details.
  This GDB was configured as "powerpc64-unknown-linux-gnu"...
  
  WARNING: cannot translate vmemmap kernel virtual addresses:
           commands requiring page structure contents will fail
  
        KERNEL: vmlinux                        
      DUMPFILE: vmcore
          CPUS: 2
          DATE: Thu Dec 10 05:40:35 2009
        UPTIME: 21:44:59
  LOAD AVERAGE: 0.11, 0.03, 0.01
         TASKS: 196
      NODENAME: ibm-js20-04.lab.bos.redhat.com
       RELEASE: 2.6.31-38.el6.ppc64
       VERSION: #1 SMP Sun Nov 22 08:15:30 EST 2009
       MACHINE: ppc64  (unknown Mhz)
        MEMORY: 2 GB
         PANIC: "Oops: Kernel access of bad area, sig: 11 [#1]" (check log for details)
           PID: 10656
       COMMAND: "runtest.sh"
          TASK: c000000072156420  [THREAD_INFO: c000000072058000]
           CPU: 0
         STATE: TASK_RUNNING (PANIC)
  
  crash> kmem -i
  kmem: cannot translate vmemmap address: f000000000000000
  crash> kmem -p
        PAGE       PHYSICAL      MAPPING       INDEX CNT FLAGS
  kmem: cannot translate vmemmap address: f000000000000000
  crash> kmem -s
  CACHE            NAME                 OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE
  kmem: cannot translate vmemmap address: f00000000030db44
  crash> 
  
Can any of the IBM engineers on this list (or any ppc64 user)
confirm my findings?  Maybe I'm missing something, but I don't
see it.
And if you agree, perhaps you can work on an upstream solution to
store the vmemmap-to-physical data information?
Dave
                                
                         
                        
                                
15 years, 10 months

calling crash from another program (or vice versa)
by James Washer

Often, I'd like to be able to run one crash command, massage the data
produced, and run follow-up commands using the massaged data.

A (possibly crazy) example: run the mount command, collect the
superblock addresses, for each super_block get the s_inodes list head,
traverse each list to the inodes, for each inode find its i_data
(address_space) and get the number of pages.  Now sum these up and
print a table of filesystem mount points and the number of cached pages
for each.  Perhaps I'd even traverse the struct pages to provide a
count of clean and dirty pages for each file system.

I do do this by hand (i.e. mount > mount.file; perlscript mount.file >
crash-script-step-1; then, back in crash, I do ". crash-script-step-1 >
data-file-2", and repeat with more massaging).  This is gross, prone to
error, and not terribly fast.

I'd love to start crash as a child of perl and either use expect (which
is a bit of a hack) or, better yet, have some machine interface to crash
(a la gdbmi)...

I know, it's open source, I should write it myself. I just don't want
to reinvent the wheel if someone else has already done something like
this.

Perhaps I need to learn sial. But what little sial I've looked at seems
a bit low level for my needs.

Has anyone had much luck using expect with crash?
thanks
 - jim
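For the first step of a pipeline like the one described above, a wrapper program mostly needs to turn lines of crash output into values it can feed back in. A minimal sketch in C, assuming the "mount" output has five whitespace-separated columns (VFSMOUNT, SUPERBLK, TYPE, DEVNAME, DIRNAME) with the first two in hex; that layout is an assumption and should be checked against the crash version in use. The names struct mount_entry and parse_mount_line are hypothetical. How the lines reach the program (expect, a pipe, a saved file) is left open.

```c
#include <stdio.h>

/* One parsed row of (assumed) "mount" output. */
struct mount_entry {
    unsigned long long vfsmount;     /* first column, hex  */
    unsigned long long superblock;   /* second column, hex */
    char type[32];
    char devname[128];
    char dirname[128];
};

/* Returns 1 if the line parsed as a mount entry, 0 otherwise
 * (for example the column-header line). */
int parse_mount_line(const char *line, struct mount_entry *out)
{
    return sscanf(line, "%llx %llx %31s %127s %127s",
                  &out->vfsmount, &out->superblock,
                  out->type, out->devname, out->dirname) == 5;
}
```

Mount points containing spaces are not handled by this sketch; the superblock value is what the follow-up commands would be built from.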
                                
                         
                        
                                
15 years, 10 months

Request for ppc64 help from IBM
by Dave Anderson

Somewhere between the RHEL5 (2.6.18-based) and RHEL6 timeframe, 
the ppc64 architecture has started using a virtual memmap scheme
for the arrays of page structures used to describe/handle
each physical page of memory.
In RHEL5, the page structures in the memmap array were unity-mapped
(i.e., the physical address is or'd with c000000000000000), as 
"kmem -n" shows below in the sparsemem data breakdown under MEM_MAP:
  
  crash> kmem -n
  ... [ snip ] ...
  NR      SECTION        CODED_MEM_MAP        MEM_MAP       PFN
   0  c000000000750000  c000000000760000  c000000000760000  0               
   1  c000000000750008  c000000000760000  c000000000763800  256             
   2  c000000000750010  c000000000760000  c000000000767000  512             
   3  c000000000750018  c000000000760000  c00000000076a800  768             
   4  c000000000750020  c000000000760000  c00000000076e000  1024            
   5  c000000000750028  c000000000760000  c000000000771800  1280            
   6  c000000000750030  c000000000760000  c000000000775000  1536            
   7  c000000000750038  c000000000760000  c000000000778800  1792            
   8  c000000000750040  c000000000760000  c00000000077c000  2048            
   9  c000000000750048  c000000000760000  c00000000077f800  2304            
  10  c000000000750050  c000000000760000  c000000000783000  2560            
  11  c000000000750058  c000000000760000  c000000000786800  2816            
  12  c000000000750060  c000000000760000  c00000000078a000  3072            
  ...
also shown via the memmap page structure listing displayed by 
"kmem -p":
  
  crash> kmem -p
        PAGE       PHYSICAL      MAPPING       INDEX CNT FLAGS
  c000000000760000        0                0        0  1 400
  c000000000760038    10000                0        0  1 400
  c000000000760070    20000                0        0  1 400
  c0000000007600a8    30000                0        0  1 400
  c0000000007600e0    40000                0        0  1 400
  c000000000760118    50000                0        0  1 400
  c000000000760150    60000                0        0  1 400
  c000000000760188    70000                0        0  1 400
  c0000000007601c0    80000                0        0  1 400
  c0000000007601f8    90000                0        0  1 400
  ...
In RHEL6 (2.6.31-38.el6) the memmap page array is apparently
virtually mapped, using a heretofore-unseen virtual address range
starting at f000000000000000:
  
  crash> kmem -n
  ... [ snip ] ...
  NR      SECTION        CODED_MEM_MAP        MEM_MAP       PFN
   0  c000000002160000  f000000000000000  f000000000000000  0               
   1  c000000002160020  f000000000000000  f000000000006800  256             
   2  c000000002160040  f000000000000000  f00000000000d000  512             
   3  c000000002160060  f000000000000000  f000000000013800  768             
   4  c000000002160080  f000000000000000  f00000000001a000  1024            
   5  c0000000021600a0  f000000000000000  f000000000020800  1280            
   6  c0000000021600c0  f000000000000000  f000000000027000  1536            
   7  c0000000021600e0  f000000000000000  f00000000002d800  1792            
   8  c000000002160100  f000000000000000  f000000000034000  2048            
   9  c000000002160120  f000000000000000  f00000000003a800  2304            
  10  c000000002160140  f000000000000000  f000000000041000  2560            
  ... [ snip ] ...
  crash> kmem -p
        PAGE       PHYSICAL      MAPPING       INDEX CNT FLAGS
  f000000000000000        0                0        0  0 0
  f000000000000068    10000                0        0  0 0
  f0000000000000d0    20000                0        0  0 0
  f000000000000138    30000                0        0  0 0
  f0000000000001a0    40000                0        0  0 0
  f000000000000208    50000                0 -4611686016392006416  0 0
  f000000000000270    60000                0        0  0 0
  f0000000000002d8    70000                0        0  0 0
  f000000000000340    80000                0        0  0 0
  f0000000000003a8    90000                0 -4611686016730798344  0 0
  f000000000000410    a0000                0        0  0 0
  f000000000000478    b0000                0        0  0 0
  f0000000000004e0    c0000                0        0  0 c0000000651534e0
  f000000000000548    d0000                0        0  0 0
  ...
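As a cross-check on the two listings, the size of struct page can be derived from the MEM_MAP stride between consecutive sections. A small sketch (page_struct_size is a hypothetical helper, and the inputs are simply copied from the "kmem -n" rows above):

```c
#include <stdint.h>

/* Size of one page structure implied by two MEM_MAP entries and the
 * page frame numbers at which their sections start. */
uint64_t page_struct_size(uint64_t mem_map0, uint64_t mem_map1,
                          uint64_t pfn0, uint64_t pfn1)
{
    return (mem_map1 - mem_map0) / (pfn1 - pfn0);
}
```

Sections 0 and 1 of the RHEL5 listing give 0x38 bytes and the same rows of the RHEL6 listing give 0x68 bytes, which matches the step between consecutive PAGE addresses in the two "kmem -p" listings.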
But as can be seen in the "kmem -p" output, and when using other
commands that actually read the data in the page structure, the
data read is either bogus or the readmem() of the address just fails
the virtual address translation and indicates that the page is not mapped.
Because the page structures' virtual address is not unity-mapped, 
the page address gets translated via page table walk-through in the
same manner as vmalloc()'d addresses.  In the ppc64 architecture,
the vmalloc range starts at d000000000000000:
  crash> mach
  ...
  KERNEL VIRTUAL BASE: c000000000000000
  KERNEL VMALLOC BASE: d000000000000000
  ...
Since the ppc64 virtual-to-physical address translations of
these f000000000000000-based addresses either return a 
bogus physical address or fail entirely, this in turn causes 
bizarre errors in crash commands that actually read the contents
of page structures -- such as "kmem -s", where slub data is 
stored in the page structure.
So my speculation (guess?) is that the ppc64.c ppc64_vtop()
function needs updating to properly translate these addresses.
Since the ppc64 stuff in the crash utility was written by, and 
has been maintained by IBM (and since I am ppc64-challenged), 
can you guys take a look at what needs to be done?
Thanks,
  Dave
                                
                         
                        
                                
15 years, 10 months

[ANNOUNCE] crash version 4.1.2 is available
by Dave Anderson

 - Fix for 2.6.31 or later x86_64 CONFIG_NEED_MULTIPLE_NODES kernels
   running on systems that have multiple NUMA nodes.  By default, those
   kernels use the "page" (or "lpage") percpu memory allocators, which 
   utilize vmalloc space for percpu memory.  Without the patch, the
   crash session would fail during initialization with the error message
   "crash: cannot determine idle task addresses from init_tasks[] or
   runqueues[]", followed by "crash: cannot resolve init_task_union".
   (anderson(a)redhat.com)
 - Fix for the snap.c extension module to properly handle NUMA systems
   with multiple nodes, or single node systems whose first unity-mapped
   PT_LOAD segment starts on a non-zero physical address.  Without the 
   patch, a crash session on the resultant vmcore would fail with the 
   error message: "crash: vmlinux and <filename> do not match!"
   (anderson(a)redhat.com)
 - Added a defensive mechanism to handle corrupt Elf32_Phdr/Elf64_Phdr 
   structures in an ELF vmcore.  Without the patch, a hand-carved bogus
   p_offset field in an Elf32_Phdr/Elf64_Phdr structure could possibly 
   cause a segmentation violation during initialization.  With the fix, 
   if an invalid Elf32_Phdr or Elf64_Phdr p_offset field is encountered, 
   a warning message will be displayed, and the crash session will bail 
   out gracefully, or continue on if possible.
   (anderson(a)redhat.com)
 - Added a defensive mechanism to handle corrupt Elf32_Ehdr/Elf64_Ehdr
   structures in an ELF vmcore.  Without the patch, a hand-carved bogus
   e_phnum field in an Elf32_Ehdr/Elf64_Ehdr structure could possibly
   cause a segmentation violation during initialization.  With the fix,
   if an invalid Elf32_Ehdr or Elf64_Ehdr e_phnum field is encountered, 
   a warning message will be displayed and the crash session will bail
   out gracefully.
   (anderson(a)redhat.com)
 - More non-functional changes for future integration of gdb-7.0 and 
   for addressing Fedora packaging guidelines.
   (anderson(a)redhat.com)
 - Fix for the x86 "bt [-t|-T]" commands when the backtrace passes 
   through three stacks, which can happen when an interrupt is taken 
   while operating on a per-cpu soft IRQ stack, and the crash occurs
   while operating on the per-cpu hard IRQ stack.  Without the patch, 
   the "bt" command terminates after displaying the backtrace on the hard 
   IRQ stack; "bt -t" displays the stack contents of the hard IRQ stack
   but stops with the error message "bt: non-process stack address for 
   this task: <task-address>"; "bt -T" displays the same error
   message as "bt -t", but displays the stack contents of the process
   stack.  With the fix, all three "bt" invocations will display the
   backtraces or kernel text addresses on all three stacks, correctly
   transitioning from the hard IRQ stack to the soft IRQ stack to the
   process stack.
   (anderson(a)redhat.com)
 - When handcrafting the backtrace starting points for the "bt" command
   by using the -S options, and the starting stack address is not in 
   the task's process stack, a message gets displayed that indicates
   "non-process stack address for this task".  However, if the starting
   stack address is a legitimate non-process stack address, such as a
   hard or soft IRQ stack address, or an x86_64 exception stack address,
   the message is confusing, and has been removed.
   (anderson(a)redhat.com)
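The two Elf32/Elf64 defensive checks described above both come down to verifying that an offset and a length taken from the file stay inside the file. The sketch below shows one overflow-safe way to write such a test; it is not the code from the 4.1.2 patch, and range_in_file is a hypothetical name.

```c
#include <stdint.h>

/* Returns 1 if [offset, offset + length) lies within a file of
 * file_size bytes, 0 otherwise.  Written so that a huge bogus offset
 * or length cannot wrap around and slip through. */
int range_in_file(uint64_t file_size, uint64_t offset, uint64_t length)
{
    if (offset > file_size)
        return 0;
    return length <= file_size - offset;
}
```

For a segment this would be called with p_offset and p_filesz; for the program header table, with e_phoff and e_phentsize multiplied by e_phnum. In a very large vmcore a moderately wrong value can still pass such a test, so it is a first line of defense only.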
 Download from: http://people.redhat.com/anderson
                                
                         
                        
                                
15 years, 10 months

Re: [Crash-utility] fuzzing crash(8)
by Dave Anderson

----- "Dave Anderson" <anderson(a)redhat.com> wrote:
> ----- "Adrien Kunysz" <adk(a)redhat.com> wrote:
> 
> > Adrien Kunysz wrote:
> > > Actually that patch fixes all the crashes I found with my previous round
> > > of black box fuzzing on x86_64 (using zzuf if anyone is interested).  I
> > > am currently playing with bunny 
> > > (http://code.google.com/p/bunny-the-fuzzer/) but I am a bit doubtful it
> > > will find anything useful in any decent amount of time without some
> > > manual work, oh well CPU time is cheap :)
> >
> > I wasn't expecting Bunny to find anything for a few days but it only took
> > about three hours :)
> >
> > If we take the same x86_64 vmcore again:
> >
> > 00000000  7f 45 4c 46 02 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
> > 00000010  04 00 3e 00 01 00 00 00  00 00 00 00 00 00 00 00  |..>.............|
> > 00000020  40 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |@...............|
> > 00000030  00 00 00 00 40 00 38 00  03 80 00 00 00 00 00 00  |....@.8.........|
> >
> > and mess a bit with byte 0x39:
> >
> > 00000000  7f 45 4c 46 02 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
> > 00000010  04 00 3e 00 01 00 00 00  00 00 00 00 00 00 00 00  |..>.............|
> > 00000020  40 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |@...............|
> > 00000030  00 00 00 00 40 00 38 00  03 00 00 00 00 00 00 00  |....@.8.........|
You've got the two dumps above backwards, but as it turns out, a manual corruption
of the ELF header's e_phnum field should be pretty easy to handle -- try the attached
patch.
Thanks,
  Dave
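For reference, e_phnum in a little-endian Elf64_Ehdr is the 16-bit field at byte offsets 0x38 and 0x39, so changing byte 0x39 between 00 and 80 moves the count between 3 and 0x8003. A small sketch that decodes the field from raw header bytes (elf64_phnum_le is a hypothetical helper, and this is not the attached patch):

```c
#include <stdint.h>

#define ELF64_E_PHNUM_OFFSET 0x38   /* offset of e_phnum in Elf64_Ehdr */

/* Decode the little-endian 16-bit e_phnum; "ehdr" must point to at
 * least 0x3a bytes of ELF header. */
uint16_t elf64_phnum_le(const unsigned char *ehdr)
{
    const unsigned char *p = ehdr + ELF64_E_PHNUM_OFFSET;

    return (uint16_t)(p[0] | (p[1] << 8));
}
```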
                                
                         
                        
                                
15 years, 10 months

Re: [Crash-utility] fuzzing crash(8)
by Dave Anderson

----- "Dave Anderson" <anderson(a)redhat.com> wrote:
I did the same thing to a vmcore (i.e. handcrafting the PT_NOTE
segment's p_offset field like you did), and was able to get the
crash session up with the attached patch.
Does it work for you?
Dave
                                
                         
                        
                                
                                15 years, 11 months