Re: [Crash-utility] Cell/B.E. SPU commands extension
by Lucio Correia

> Lucio Correia wrote:
> > Hi,
> >
> > I've developed this crash extension to analyze SPU specific data for
> > Cell/B.E. processor. This extension makes use of some important data
> > saved by this kernel patch (that is not mainline yet)
> > http://ozlabs.org/pipermail/cbe-oss-dev/2007-May/001848.html during the
> > crash dump.
> >
> > I would like to check if there is any issue with the code.
> >
> 
> Functionally it looks fine.
> 
> I changed the _init() function to just do an error(INFO, ...)
> so I could load the extensions, and the only suggestion
> I can make is purely aesthetic, which would be to make
> the "help" messages 80 characters or less like the regular
> commands are.  In other words, the "DESCRIPTION" section
> outputs, and the sentences in the "spuctx" EXAMPLE
> section are kind of ugly the way that they run on with
> no linefeeds.
> 
> But like I said before, I don't see any issues/problems
> with the code -- pretty nifty extension...
> 
> Dave
> 
Thanks for the comments, Dave. I'm correcting these issues.
Regards,
-- 
Lucio Correia
Software Engineer
IBM LTC Brazil

18 years, 1 month

[RFC] Crash extension for SystemTap
by Satoru MORIYA

Hi,
Here is an extension (shared object) for crash that retrieves the trace
data of SystemTap scripts.
I'd like to use SystemTap to analyze what caused a kernel panic.
However, SystemTap's trace data currently can't be retrieved easily from a
dumped image. So I developed a crash extension that retrieves
the data recorded by SystemTap from the dumped image.
Here is a brief document for this extension. It supports the new
utt-based buffer as well as the bulk-mode buffer of the old SystemTap module.
I have tested this extension on the following systems:
  * FC6, i386, kernel-2.6.21, systemtap-0.5.14, crash-4.0-1.1
  * FC6, i386, kernel-2.6.20, systemtap-0.5.13/14, crash-4.0-1.1
  * RHEL5, i386, kernel-2.6.18-8.el5, systemtap-0.5.12, crash-4.0-3.14
Preparation
==============
(A) Build the shared object (stplog.so).
1. Put Makefile and stplog.c into a directory ($DIR)
    $ cd $DIR
2. Make a symbolic link to the crash source code directory
    $ ln -s $WHERE_CRASH_PLACED crash
3. Build
    $ make
(B) Make a crash dump that includes SystemTap trace data.
    (*) If you analyze live system memory, skip this section.
1. Install kdump
     If you use FC6, see the following URL:
     http://fedoraproject.org/wiki/FC6KdumpKexecHowTo?highlight=%28kdump%29
2. Use SystemTap
    $ stap foo.stp
3. Panic
    $ echo c > /proc/sysrq-trigger
How to use
==============
1. start crash
    $ crash vmlinux vmcore
    (*) If you analyze the live system memory, you don't need "vmcore".
         $ crash vmlinux
2. load the shared-object
    crash> extend $(WHERE_OBJ_PLACED)/stplog.so
3. retrieve the data
    crash> stplog -m <mod_name>
    (*) <mod_name> is the name of trace module from which you retrieve data.
4. You can get output files under the directory whose name is <mod_name>.
Output
==============
The stplog command creates one file per relayfs channel buffer (equivalent to one per CPU).
It also removes padding bytes.
I believe this command will be very useful for system administrators
who monitor their systems with SystemTap.
Best Regards,
---
Satoru MORIYA
Linux Technology Center
Hitachi, Ltd., Systems Development Laboratory
E-mail: satoru.moriya.br(a)hitachi.com

18 years, 2 months

[Patch] linux-2.6.22 on i386
by Ken'ichi Ohmichi

Hi Dave,
I found a problem where the crash utility fails during initialization.
It happens on i386 linux-2.6.22, as follows:
  $ crash vmlinux vmcore
  
  crash 4.0-4.4
  Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007  Red Hat, Inc.
  Copyright (C) 2004, 2005, 2006  IBM Corporation
  Copyright (C) 1999-2006  Hewlett-Packard Co
  Copyright (C) 2005, 2006  Fujitsu Limited
  Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
  Copyright (C) 2005  NEC Corporation
  Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
  Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
  This program is free software, covered by the GNU General Public License,
  and you are welcome to change it and/or distribute copies of it under
  certain conditions.  Enter "help copying" to see the conditions.
  This program has absolutely no warranty.  Enter "help warranty" for details.
  
  GNU gdb 6.1
  Copyright 2004 Free Software Foundation, Inc.
  GDB is free software, covered by the GNU General Public License, and you are
  welcome to change it and/or distribute copies of it under certain conditions.
  Type "show copying" to see the conditions.
  There is absolutely no warranty for GDB.  Type "show warranty" for details.
  This GDB was configured as "i686-pc-linux-gnu"...
  
  crash: invalid size request: 0  type: "__per_cpu_offset"
  $ 
The cause is that the number of elements in the "__per_cpu_offset"
array cannot be determined. In linux-2.6.21, i386's "__per_cpu_offset" is
declared in include/asm-generic/percpu.h as follows:
  extern unsigned long __per_cpu_offset[NR_CPUS];
But in linux-2.6.22, it is declared in include/asm-i386/percpu.h as
follows, and the array length is not recorded in the debugging
information:
  extern unsigned long __per_cpu_offset[];
Here is a patch that solves the problem. If the array length cannot be
determined, the crash utility assumes that it is the defined value NR_CPUS.
Or should get_array_length() be fixed to get the array length from
init/main.c?
Thanks
Ken'ichi Ohmichi
--- crash-4.0-4.4.org/kernel.c	2007-07-21 04:19:23.000000000 +0900
+++ crash-4.0-4.4/kernel.c	2007-07-24 19:51:55.000000000 +0900
@@ -170,7 +170,7 @@ kernel_init()
 		else
 			i = get_array_length("__per_cpu_offset", NULL, 0);
 		get_symbol_data("__per_cpu_offset",
-			sizeof(long)*(i <= NR_CPUS ? i : NR_CPUS),
+			sizeof(long)*((i && (i <= NR_CPUS)) ? i : NR_CPUS),
 			&kt->__per_cpu_offset[0]);
                 kt->flags |= PER_CPU_OFF;
 	}
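The effect of the one-line change can be looked at in isolation. The following is a minimal sketch, not crash's own code: per_cpu_offset_read_size() is a hypothetical helper, and the NR_CPUS value of 32 is an assumption chosen only for illustration. It is also slightly stricter than the patch, since it treats a negative length as unknown too.

```c
#include <stddef.h>

/* Assumption: illustrative value only; crash defines NR_CPUS per architecture. */
#define NR_CPUS 32

/*
 * Hypothetical helper: how many bytes to read for __per_cpu_offset, given
 * the array length reported by the debug info (0 when it is not recorded).
 */
size_t per_cpu_offset_read_size(int array_len)
{
    /* Fall back to NR_CPUS when the length is unknown or out of range. */
    int count = (array_len > 0 && array_len <= NR_CPUS) ? array_len : NR_CPUS;

    return sizeof(long) * (size_t)count;
}
```

With a reported length of 0, which is what the 2.6.22 debug info yields, the helper asks for NR_CPUS entries. The original expression would have requested 0 bytes, which is the "invalid size request: 0" failure shown above.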

18 years, 3 months

ATA scsi driver misbehavior under kdump capture kernel
by Cliff Wickman

I've run into a problem with the ATA SCSI disk driver when running in a
kdump dump-capture kernel.
I'm running on a 2-processor x86_64 box.  It has 2 SCSI disks, /dev/sda and
/dev/sdb.
My kernel is 2.6.22, built as a dump-capture kernel to be loaded by kexec.
When I boot this kernel by itself, it finds both sda and sdb.
But when it is loaded by kexec and booted on a panic it only finds sda.
Any ideas from those familiar with the ATA driver?
-Cliff Wickman
 SGI
I put some printk's into it and get this:
Standalone:
                                                   [nv_adma_error_handler]
cpw: ata_host_register probe port 1 (error_handler:ffffffff81348625)
cpw: ata_host_register call ata_port_probe
cpw: ata_host_register call ata_port_schedule
cpw: ata_host_register call ata_port_wait_eh
cpw: ata_port_wait_eh entered
cpw: ata_port_wait_eh, preparing to wait
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
cpw: ata_dev_configure entered
cpw: ata_dev_configure testing class
cpw: ata_dev_configure class is ATA_DEV_ATA
ata2.00: ATA-6: ST3200822AS, 3.01, max UDMA/133
ata2.00: 390721968 sectors, multi 16: LBA48
cpw: ata_dev_configure exiting
cpw: ata_dev_configure entered
cpw: ata_dev_configure testing class
cpw: ata_dev_configure class is ATA_DEV_ATA
cpw: ata_dev_configure exiting
cpw: ata_dev_set_mode printing:
ata2.00: configured for UDMA/133
cpw: ata_port_wait_eh, finished wait
cpw: ata_port_wait_eh exiting
cpw: ata_host_register done with probe port 1
When loaded with kexec and booted on a panic:
cpw: ata_host_register probe port 1 (error_handler:ffffffff81348625)
cpw: ata_host_register call ata_port_probe
cpw: ata_host_register call ata_port_schedule
cpw: ata_host_register call ata_port_wait_eh
cpw: ata_port_wait_eh entered
cpw: ata_port_wait_eh, preparing to wait
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
cpw: ata_port_wait_eh, finished wait
cpw: ata_port_wait_eh exiting
cpw: ata_host_register done with probe port 1

18 years, 3 months

crash version 4.0-4.5 is available
by Dave Anderson

  - Addresses FC7/upstream x86 kernels that have been configured such
    that the vmlinux symbol values do not match their relocated values
    when loaded.  If CONFIG_PHYSICAL_START contains a value that is
    greater than CONFIG_PHYSICAL_ALIGN, then this mismatch occurs.
    Since the crash utility and its embedded gdb have always expected
    that the compiled-in kernel symbol addresses are "real", the virtual
    to physical translation fails, leading to an initialization-time
    failure with the message: "crash: vmlinux and /dev/crash do not
    match!" (/dev/mem or the dumpfile name may replace /dev/crash).
    To deal with this issue, there are several alternatives:
     1) Configure the kernel with CONFIG_PHYSICAL_START less than
        or equal to CONFIG_PHYSICAL_ALIGN.  Having done that, there
        is no problem; the resultant vmlinux file will be loaded at
        the address for which it was compiled, which has always
        been the case.
     2) Since /proc/kallsyms uses the same format as a System.map file,
        and since it reflects the relocated symbol addresses, it
        can be placed on the crash command line as if it were
        a System.map file.  (Note that the System.map file created
        by these relocated kernels contains the same "wrong" symbol
        values as the vmlinux file from which it was created.)
     3) On a live system that has /proc/kallsyms (i.e., the kernel was
        configured with CONFIG_KALLSYMS), this version of the crash
        utility will replace/patch the vmlinux symbol values with those
        seen in /proc/kallsyms.  The relocation value will be displayed
        as a WARNING message during initialization.
     4) On a dumpfile, the relocation will not be performed automatically
        as on a live system.  It will require the addition of the
        /proc/kallsyms on the command line, or if run on a different
        host, a copy of the crashed system's /proc/kallsyms may be
        used.
     5) Alternatively on a dumpfile, a new command line option has been
        created to specify the relocation amount.  For example, if a
        kernel was configured with a CONFIG_PHYSICAL_START value of 16MB
        and a CONFIG_PHYSICAL_ALIGN of 4MB, that results in a relocation
        of 12MB.  To specify that, enter "crash --reloc=12m ..." on the
        command line.  (Recall that if crash is run on the live system,
        a WARNING message will specify the relocation amount.)
    Using /proc/kallsyms or a --reloc=[size] as a command line argument
    is similar to using a System.map file, in that it results in the loss
    of the use of line number debug data.  (anderson(a)redhat.com)
  - Fix for x86 2.6.22 kernel initialization-time failure indicating:
    "crash: invalid size request: 0  type: __per_cpu_offset"
    (oomichi(a)mxs.nes.nec.co.jp)
  - Fix to recognize the 2.6.22 kernel's replacement of the kmalloc slab
    subsystem in "./mm/slab.c" with the infrastructure in "./mm/slub.c"
    on CONFIG_SLUB-configured kernels.  Without this
    fix, crash sessions would fail during initialization with the message
    "crash: invalid structure member offset: kmem_cache_s_c_num".
    (anderson(a)redhat.com)
  - Cliff Wickman sent an additional patch for the LKCD kerntypes
    support he introduced in version 4.0-4.4, which addresses this
    message that is seen during initialization on 2.6.22 kernels:
    "WARNING: cannot determine pgdat list for this kernel/architecture".
    (cpw(a)sgi.com)
  - NOTE: The CONFIG_SLUB change in the 2.6.22 kernel will require a
    significant update in the crash utility in order for "kmem -[sS]"
    options to work again.
  - NOTE: 2.6.22 kernels have replaced the O(1) scheduler with the
    new CFS scheduler.  As a result, the "runq" command fails, which
    will require a crash utility update to recognize and display the
    contents of each cpu's run queue.
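The relocation arithmetic described in the first item (16MB start, 4MB alignment, 12MB relocation) can be written out as a small sketch. reloc_amount() is a hypothetical helper that models only what the announcement states: a mismatch exists when CONFIG_PHYSICAL_START is greater than CONFIG_PHYSICAL_ALIGN, and its size is the difference. It is not taken from the crash sources and does not cover how the kernel actually picks its load address.

```c
#include <stdint.h>

/*
 * Hypothetical helper: relocation amount implied by the two config values,
 * following the rule stated in the announcement.  Returns 0 when
 * CONFIG_PHYSICAL_START <= CONFIG_PHYSICAL_ALIGN (no mismatch expected).
 */
uint64_t reloc_amount(uint64_t physical_start, uint64_t physical_align)
{
    return physical_start > physical_align ? physical_start - physical_align : 0;
}
```

For the example values, reloc_amount(0x1000000, 0x400000) evaluates to 0xc00000, i.e. 12MB, which is the amount passed as "--reloc=12m".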
Download from: http://people.redhat.com/anderson

18 years, 3 months

More 2.6.22 shifting sands...
by Dave Anderson

Just a heads-up...
While tinkering around with a 2.6.22-8 kernel, after applying
Ken'ichi's patch to handle the initialization-time fatal
error:
   crash: invalid size request: 0  type: "__per_cpu_offset"
it gets past that problem, but then fails like this:
   crash: invalid structure member offset: kmem_cache_s_c_num
          FILE: memory.c  LINE: 7042  FUNCTION: kmem_cache_init()
   [./crash] error trace: 8084dda => 80983a8 => 80ac45e => 8132074
     8132074: OFFSET_verify+118
     80ac45e: kmem_cache_init+416
     80983a8: vm_init+10365
     8084dda: main_loop+144
This problem is due to the replacement of the venerable kmalloc
slab subsystem in mm/slab.c with the new CONFIG_SLUB in mm/slub.c.
There's also the potential of CONFIG_SLOB in mm/slob.c, but it
appears that CONFIG_SLUB will be the kmalloc subsystem of the
future.
This initialization-time failure can be worked around by using
the "--no_kmem_cache" command line option.  In the short term,
I will fix it such that CONFIG_SLUB kernels will be recognized,
and kmem_cache_init() will not be called.  This obviously breaks
"kmem -s", and that command option will need to be completely
re-done to deal with this new subsystem.
Also, now that the CFS scheduler has been put into place, the
"runq" command no longer works.  Like "kmem -s" for CONFIG_SLUB,
that command will require a new implementation.
Dave

18 years, 3 months

kdump on x86_64 - why not discover a scsi device?
by Cliff Wickman

Hi all,
  Can anyone tell me why a dump capture kernel cannot discover my
root device?
  I'm using an x86_64 box.
  My 2.6.16 kernel can load either a 2.6.16 or 2.6.22 capture kernel
and it can successfully find the root device.
  The system has /dev/sda and /dev/sdb.  I'm using a root on sdb.
  But when my 2.6.22 kernel loads either a 2.6.16 or 2.6.22 capture kernel
that capture kernel cannot find /dev/sdb.
  Must be a configuration issue.
  Maybe involving some combination of CONFIG_KEXEC and CONFIG_CRASH_DUMP.
  But I don't see how those can affect disk discovery.
  Any ideas?
-Cliff

18 years, 3 months

Re: [Crash-utility] relocatable FC7/upstream x86 kernels
by Dave Anderson

Dave Anderson wrote:
 > So, whereas the vmlinux file shows these symbol values:
 >
 >   $ nm -Bn vmlinux
 >   ...
 >   c1000000 T _text
 >   c1000000 T startup_32
 >   c1001000 T startup_32_smp
 >   c1001080 t checkCPUtype
 >   c1001101 t is486
 >   c1001108 t is386
 >   c1001175 t check_x87
 >   c10011a0 T setup_pda
 >   c10011c2 t setup_idt
 >   c10011df t rp_sidt
 >   c1001262 t early_divide_err
 >   c1001268 t early_illegal_opcode
 >   c1001271 t early_protection_fault
 >   c1001278 t early_page_fault
 >   c100127f t early_fault
 >   c10012a7 t hlt_loop
 >   c10012ac t ignore_int
 >   c10012f0 T _stext
 >   c10012f0 t run_init_process
 >   c10012f0 T stext
 >   c1001304 t init_post
 >   ...
 >
 > But when loaded into memory, they are all changed to reflect that
 > the kernel was loaded at 4MB physical instead of 16MB:
 >
 >   $ cat /proc/kallsyms
 >   c0400000 T _text
 >   c0400000 T startup_32
 >   c0401000 T startup_32_smp
 >   c0401080 t checkCPUtype
 >   c0401101 t is486
 >   c0401108 t is386
 >   c0401175 t check_x87
Interesting -- I never thought /proc/kallsyms would have ever
come in so handy...
crash fails on my FC7 kernel in the "do not match" manner:
   # crash /vmlinux-2.6.21-1.3194.fc7
   crash 4.0-4.3
   Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007  Red Hat, Inc.
   Copyright (C) 2004, 2005, 2006  IBM Corporation
   Copyright (C) 1999-2006  Hewlett-Packard Co
   Copyright (C) 2005, 2006  Fujitsu Limited
   Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
   Copyright (C) 2005  NEC Corporation
   Copyright (C) 1999, 2002  Silicon Graphics, Inc.
   Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
   This program is free software, covered by the GNU General Public License,
   and you are welcome to change it and/or distribute copies of it under
   certain conditions.  Enter "help copying" to see the conditions.
   This program has absolutely no warranty.  Enter "help warranty" for details.
   GNU gdb 6.1
   Copyright 2004 Free Software Foundation, Inc.
   GDB is free software, covered by the GNU General Public License, and you are
   welcome to change it and/or distribute copies of it under certain conditions.
   Type "show copying" to see the conditions.
   There is absolutely no warranty for GDB.  Type "show warranty" for details.
   This GDB was configured as "i686-pc-linux-gnu"...
   crash: /vmlinux-2.6.21-1.3194.fc7 and /dev/crash do not match!
   Usage:
     crash [-h [opt]][-v][-s][-i file][-d num] [-S] [mapfile] [namelist] [dumpfile]
   Enter "crash -h" for details.
   #
But since /proc/kallsyms has a format similar to a System.map
file, if you throw it on the crash command line, crash thinks
that it is a System.map, and things just work:
   # crash /vmlinux-2.6.21-1.3194.fc7 /proc/kallsyms
   crash 4.0-4.3
   Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007  Red Hat, Inc.
   Copyright (C) 2004, 2005, 2006  IBM Corporation
   Copyright (C) 1999-2006  Hewlett-Packard Co
   Copyright (C) 2005, 2006  Fujitsu Limited
   Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
   Copyright (C) 2005  NEC Corporation
   Copyright (C) 1999, 2002  Silicon Graphics, Inc.
   Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
   This program is free software, covered by the GNU General Public License,
   and you are welcome to change it and/or distribute copies of it under
   certain conditions.  Enter "help copying" to see the conditions.
   This program has absolutely no warranty.  Enter "help warranty" for details.
   GNU gdb 6.1
   Copyright 2004 Free Software Foundation, Inc.
   GDB is free software, covered by the GNU General Public License, and you are
   welcome to change it and/or distribute copies of it under certain conditions.
   Type "show copying" to see the conditions.
   There is absolutely no warranty for GDB.  Type "show warranty" for details.
   This GDB was configured as "i686-pc-linux-gnu"...
   please wait... (patching 28475 gdb minimal_symbol values)
     SYSTEM MAP: /proc/kallsyms
   DEBUG KERNEL: /vmlinux-2.6.21-1.3194.fc7 (2.6.21-1.3194.fc7)
       DUMPFILE: /dev/crash
           CPUS: 4
           DATE: Wed Jul 25 13:47:55 2007
         UPTIME: 23:10:09
   LOAD AVERAGE: 0.08, 0.03, 0.01
          TASKS: 111
       NODENAME: ibm-crichton-01.lab.boston.redhat.com
        RELEASE: 2.6.21-1.3194.fc7
        VERSION: #1 SMP Wed May 23 22:35:01 EDT 2007
        MACHINE: i686  (2666 Mhz)
         MEMORY: 1 GB
            PID: 958
        COMMAND: "crash"
           TASK: c1934d30  [THREAD_INFO: ce64b000]
            CPU: 3
          STATE: TASK_RUNNING (ACTIVE)
   crash> q
Of course if you're running with 2.6.22, you'd then probably bump
into the issue reported by Ken'ichi:
   [Crash-utility] [Patch] linux-2.6.22 on i386
   https://www.redhat.com/archives/crash-utility/2007-July/msg00061.html
So applying Ken'ichi's patch, and then using /proc/kallsyms, should get
you up and running for now...  (presuming you've got CONFIG_KALLSYMS
configured)
Dave

18 years, 3 months

Re: [Crash-utility] - creeping schizophrenia?
by Dave Anderson

Amit Kale wrote:
> Is vmlinux a relocatable object file in FC7? If it is, we can make gdb
> load it at a different address easily by using "add-symbol-file"
> command with offsets.
> 
It is, as well as in RHEL5 for that matter.  But until these
kernels, the typical kernel configuration has always created
kernel symbols that were unity-mapped.  (at least for the primary
kernel -- we don't care about supporting crash running on
the secondary kdump kernel).
The embedded gdb module inside of crash is invoked essentially
as "gdb vmlinux".  The "add-symbol-file" mechanics are currently
used for loading kernel module objects, but I've never even
considered using it on the same file.  Interesting concept
though, doing it on the vmlinux file itself a "second" time,
but using a base address.  I'm certainly not sure how that
would work.
The good news is that there's always been the capability in
crash to use a non-matching vmlinux, but then override the
symbol values in the vmlinux by adding the System.map file from
the "real" kernel as a command line argument.  The override
is done early on by back-patching all of the symbols that gdb
read from the non-matching vmlinux.  Unfortunately the System.map
file for these new kernels also has the "wrong" symbol values,
since they are just pulled from the vmlinux file.  I didn't notice
until yesterday that /proc/kallsyms contains the "resolved" symbol
values.  So with some additional hackery, the same back-patching
concept can be used; but I don't see how to do it "automatically",
except for live systems that have /proc/kallsyms.
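To make the back-patching idea concrete, here is a rough sketch under stated assumptions. Both functions are hypothetical and are not how crash implements it: parse_symbol_line() reads one "address type name" line in the System.map or /proc/kallsyms format, and backpatch() shifts a vmlinux symbol value by the distance between a reference symbol's run-time address and its vmlinux address.

```c
#include <stdio.h>

/*
 * Parse one "address type name" line (System.map / /proc/kallsyms format).
 * Returns 1 on success, 0 if the line does not have that shape.
 * The name buffer must hold at least 128 bytes.
 */
int parse_symbol_line(const char *line, unsigned long long *addr,
                      char *type, char *name)
{
    return sscanf(line, "%llx %c %127s", addr, type, name) == 3;
}

/*
 * Hypothetical back-patch step: move a symbol value read from vmlinux by
 * the offset between where a reference symbol (for example _text) lives
 * at run time and where vmlinux says it lives.
 */
unsigned long long backpatch(unsigned long long vmlinux_value,
                             unsigned long long vmlinux_ref,
                             unsigned long long runtime_ref)
{
    return vmlinux_value - vmlinux_ref + runtime_ref;
}
```

As a usage example with values posted elsewhere in this thread: taking _text as the reference (c1000000 in vmlinux, c0400000 in /proc/kallsyms), backpatch() maps checkCPUtype from c1001080 to c0401080.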
> Unfortunately AFAIK core files only contain data sections and there is
> no way of specifying relocated code section addresses.
> 
Are you still talking about the vmlinux file or the kdump vmcore?
The core files aren't a problem for crash, since it only pulls
out physical memory information from the PT_LOAD segments.
And the gdb module inside of crash doesn't know anything
about core files -- it just knows that it was invoked
as "gdb vmlinux", regardless whether crash itself is running
against a dumpfile or a live system.
> Perhaps we can create a /proc/maps file for kernel just like
> /proc/self/maps for processes.
Whatever -- although that won't help when running crash on dumpfiles
on systems other than the host where the crash took place, right?
Like I said, there may be something in the vmlinux file, but
I couldn't find anything obvious.
Dave

18 years, 3 months

relocatable FC7/upstream x86 kernels (was: crash-4.0-4.3 and linux-2.6.22.1-20.fc7)
by Dave Anderson

In the "heading-down-a-slippery-slope" department, I've hacked
up a version of crash that is capable of dealing with the
relocatable x86 FC7/upstream kernels, whose kernel symbol values
in the vmlinux file do not match up with their counterparts
when the kernel is actually loaded.
In the "vmlinux and /dev/crash do not match" FC7/upstream
scenario, the kernel gets compiled with:
   CONFIG_PHYSICAL_START=0x1000000   (16MB)
   CONFIG_PHYSICAL_ALIGN=0x400000    (4MB)
In that case, the kernel symbols start at PAGE_OFFSET (c0000000)
plus the CONFIG_PHYSICAL_START value, or c1000000.  However, despite
its name of "CONFIG_PHYSICAL_START", the kernel is actually loaded at
4MB physical, so the real "physical start" location looks to be
controlled by the CONFIG_PHYSICAL_ALIGN value.  (Vivek, correct
me if I'm wrong...)
So, whereas the vmlinux file shows these symbol values:
   $ nm -Bn vmlinux
   ...
   c1000000 T _text
   c1000000 T startup_32
   c1001000 T startup_32_smp
   c1001080 t checkCPUtype
   c1001101 t is486
   c1001108 t is386
   c1001175 t check_x87
   c10011a0 T setup_pda
   c10011c2 t setup_idt
   c10011df t rp_sidt
   c1001262 t early_divide_err
   c1001268 t early_illegal_opcode
   c1001271 t early_protection_fault
   c1001278 t early_page_fault
   c100127f t early_fault
   c10012a7 t hlt_loop
   c10012ac t ignore_int
   c10012f0 T _stext
   c10012f0 t run_init_process
   c10012f0 T stext
   c1001304 t init_post
   ...
But when loaded into memory, they are all changed to reflect that
the kernel was loaded at 4MB physical instead of 16MB:
   $ cat /proc/kallsyms
   c0400000 T _text
   c0400000 T startup_32
   c0401000 T startup_32_smp
   c0401080 t checkCPUtype
   c0401101 t is486
   c0401108 t is386
   c0401175 t check_x87
   c04011a0 T setup_pda
   c04011c2 t setup_idt
   c04011df t rp_sidt
   c0401262 t early_divide_err
   c0401268 t early_illegal_opcode
   c0401271 t early_protection_fault
   c0401278 t early_page_fault
   c040127f t early_fault
   c04012a7 t hlt_loop
   c04012ac t ignore_int
   c04012f0 T _stext
   c04012f0 t run_init_process
   c04012f0 T stext
   c0401304 t init_post
   ...
So in the case above, it amounts to a 12MB relocation
from the compiled-in value to the loaded value.  And so
if I:
  (1) hack in the relocation value when reading/storing
      the vmlinux symbols, and later on
  (2) back-patch all of the "incorrect" symbols stored by gdb from
      the vmlinux file -- in the same manner as when a System.map
      file is used,
then crash comes up fine, and everything seems to work OK.
(Although, as is the case when a System.map file is
used to back-patch gdb's notion of symbol values, line
numbers from gdb are unavailable.)
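To make step (1) concrete, here is a minimal sketch of the arithmetic, assuming the symbols come in as `nm -Bn`-style text.  The function names (parse_nm, relocate) are made up for illustration and are not crash internals; the real change lives in crash's C symbol-handling code.

```python
def parse_nm(text):
    """Parse `nm -Bn`-style lines ("c1000000 T _text") into (addr, type, name)."""
    syms = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip "..." separators and blank lines
        addr, typ, name = parts
        syms.append((int(addr, 16), typ, name))
    return syms


def relocate(syms, reloc):
    """Subtract the relocation value from every compiled-in symbol value."""
    return [(addr - reloc, typ, name) for addr, typ, name in syms]


# A few of the vmlinux symbols quoted above.
VMLINUX = """\
c1000000 T _text
c1001000 T startup_32_smp
c10012f0 T _stext
c1001304 t init_post
"""

# 16MB compiled-in physical start minus 4MB actual load address = 12MB.
RELOC = 0x1000000 - 0x400000

for addr, typ, name in relocate(parse_nm(VMLINUX), RELOC):
    print(f"{addr:08x} {typ} {name}")
```

The adjusted values line up with the /proc/kallsyms listing above (c0400000 for _text, c04012f0 for _stext, and so on).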
Anyway, I can't find anything obvious in the vmlinux file
that indicates what the relocation value would be.  On a
live system, the vmlinux symbols can be matched with
/proc/kallsyms if it exists.  If /proc/kallsyms doesn't
exist, or if running against a dumpfile, the only option
I can think of is adding a crash command line "relocation"
argument.
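For the live-system case, the matching could look roughly like the sketch below: compare a couple of well-known symbols from the vmlinux listing against /proc/kallsyms and accept the difference only if they all agree.  The helper names and the choice of _text/_stext as anchor symbols are my assumptions for illustration, not what crash actually does.

```python
def symbol_map(text):
    """Map symbol name -> address from "addr type name [module]" lines."""
    out = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        try:
            addr = int(parts[0], 16)
        except ValueError:
            continue  # not an address line (e.g. "...")
        out.setdefault(parts[2], addr)  # keep the first occurrence of a name
    return out


def guess_relocation(vmlinux_text, kallsyms_text, anchors=("_text", "_stext")):
    """Return compiled-minus-loaded offset, or None if it cannot be determined."""
    compiled = symbol_map(vmlinux_text)
    loaded = symbol_map(kallsyms_text)
    deltas = {compiled[n] - loaded[n]
              for n in anchors if n in compiled and n in loaded}
    if len(deltas) != 1:
        return None  # no common anchors, or the anchors disagree
    return deltas.pop()


VMLINUX = "c1000000 T _text\nc10012f0 T _stext\n"
KALLSYMS = "c0400000 T _text\nc04012f0 T _stext\n"

reloc = guess_relocation(VMLINUX, KALLSYMS)
print(hex(reloc))
```

A result of None is where the dumpfile case ends up as well, since there is no /proc/kallsyms to compare against; that is the situation the command line argument would have to cover.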
On the other hand, it's preferable to configure the kernel
such that the virtual address for which it is compiled results
in "unity-mapped" kernel virtual addresses.  That has always
been the case: the kernel is compiled with a base virtual
address of c0100000 or c0400000 and gets loaded at a base physical
address of 1MB or 4MB respectively, so that a virtual-to-physical
translation can be done by subtracting the c0000000
(PAGE_OFFSET) unity-map identifier.  To make that happen with
the FC7/upstream kernels, the CONFIG_PHYSICAL_START address
needs to be equal to or less than the CONFIG_PHYSICAL_ALIGN
value.  In other words, I've rebuilt with these two
combinations:
   CONFIG_PHYSICAL_START=0x100000    (1MB)
   CONFIG_PHYSICAL_ALIGN=0x400000    (4MB)
or
   CONFIG_PHYSICAL_START=0x400000    (4MB)
   CONFIG_PHYSICAL_ALIGN=0x400000    (4MB)
and in both cases the kernel gets compiled for c0400000 as
a base virtual address.
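The three configurations mentioned in this message are all consistent with the compiled-in base being PAGE_OFFSET plus CONFIG_PHYSICAL_START rounded up to a multiple of CONFIG_PHYSICAL_ALIGN.  The sketch below encodes that rule as an assumption (I have only observed the results, not confirmed the rule here), along with the unity-map translation; the function names are illustrative only.

```python
PAGE_OFFSET = 0xc0000000
MB = 1 << 20


def compiled_base(physical_start, physical_align):
    """Assumed rule: round START up to a multiple of ALIGN, add PAGE_OFFSET."""
    aligned = (physical_start + physical_align - 1) & ~(physical_align - 1)
    return PAGE_OFFSET + aligned


def vtop_unity(vaddr):
    """Unity-mapped virtual-to-physical: subtract PAGE_OFFSET."""
    return vaddr - PAGE_OFFSET


for start in (1 * MB, 4 * MB, 16 * MB):
    base = compiled_base(start, 4 * MB)
    print(f"START={start:#x} ALIGN={4 * MB:#x} -> compiled base {base:08x}")
```

With a 4MB alignment, the 1MB and 4MB start values both give c0400000, and the 16MB start gives c1000000.  The subtraction in vtop_unity() is only correct when the kernel is really loaded where it was compiled for, which is exactly what breaks in the 16MB case.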
In any case, there is hope for handling such kernels.
Dave
                                
                         
                        
                                
                                18 years, 3 months