Dave,

Thank you so much for the advice to use the "--reloc" command line option.  That was exactly what I needed.  My system now works with the command:

crash --reloc=15m -S System.map vmlinux vmcore

I looked into my Linux kernel configuration file and saw the following two lines:

CONFIG_PHYSICAL_START=0x1000000
CONFIG_PHYSICAL_ALIGN=0x100000

The difference between the two values is 15 MBytes.  The change log documentation that you pointed me to states that I can change these values in my Linux kernel configuration file so that the START value is less than or equal to the ALIGN value.  As I start researching how this update will change system behavior, can you think of any negative consequences that I should be aware of before I make this change?
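
As a sanity check on the 15 MByte figure, here is a minimal shell sketch of the arithmetic. The helper name reloc_mbytes is only for illustration and is not part of crash or the kernel build:

```shell
# Hypothetical helper: difference between CONFIG_PHYSICAL_START ($1) and
# CONFIG_PHYSICAL_ALIGN ($2), expressed in MBytes (1 MByte = 1048576 bytes).
reloc_mbytes() {
    echo $(( ($1 - $2) / 1048576 ))
}

# Values taken from the two config lines quoted above.
reloc_mbytes 0x1000000 0x100000
```

With those two values the helper prints 15, which matches the "--reloc=15m" argument used above.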

Thank you again,
Patrick


On Thu, Aug 29, 2013 at 4:49 PM, Dave Anderson <anderson@redhat.com> wrote:


----- Original Message -----
> Hello Crash Utility Community,
>
> I am hoping that someone in the Crash Analysis community can provide some
> assistance with a problem that I am having analyzing vmcore files gathered
> from our 32-bit machines. I am working to add kexec to our systems so that
> we can run the crash utility (version 7.0.1) on our appliances and I am
> having trouble with our 32-bit systems. Fortunately my 64-bit systems are
> working fine so I know that I can make the technology work. I believe that
> the crash analysis tool does not like the System.map file and I am trying to
> get to the root cause of this problem.

If the vmlinux file that you're using matches the vmcore, then please don't
use any -S or System.map argument -- just enter: "crash vmlinux vmcore"

System.map files are only required if the symbol values in the vmlinux
file are different from those in the running kernel.  It doesn't sound
like that's the case in your environment.

Secondly, if the session doesn't start that way, please provide the
debug output generated by entering:

 $ crash -d8 vmlinux vmcore

> My problem originally manifests itself when I try to decode the vmcore file.
> After intentionally creating an oops panic event I upload the vmcore file to
> my build machine and run crash on that system. While the vmcore file is
> generated on an appliance, I run the crash analysis program on the build system
> that produced the Linux kernel since the appliances are meant to be deployed
> into the field and will not be accessible for running crash analysis events.
>
> build# crash -S System.map vmlinux vmcore
> crash 7.0.1
> ...
> crash: read error: kernel virtual address: c1363c5c type: "cpu_possible_mask"
>
> So I then tried to find what this symbol is within the map:
>
> build# crash --minimal -S System.map vmlinux vmcore
> ...
> crash> sym cpu_possible_mask
> c1363c5c (R) cpu_possible_mask
> crash>

When you were in the "minimal" session, were you able to "rd" the cpu_possible_mask
address?  i.e.

  crash> rd c1363c5c

or what did this show:

  crash> rd linux_banner 10

> From this I can only see that the addresses match up. So I then decided to
> run the crash utility on the appliance itself to see what happens. I copied
> the crash utility to the appliance and the uncompressed kernel image to the
> appliance as well. The appliance boots from a "bzImage" file and the crash
> utility can't use the bzImage file for processing so I needed to manually
> copy the uncompressed kernel image to the box.

Right, crash is only interested in the vmlinux ELF file from which the
bzImage file was generated.

> I then run the following commands on our appliance for data gathering
> purposes:
>
> root@appliance:/var/crash# crash -S /boot/System.map
> vmlinux-2.6.32.24-sf.pentM-37
> ...
> WARNING: cannot read linux_banner string
> crash: /boot/System.map and /dev/mem do not match!
>
> root@appliance:/var/crash# ls -l /boot/System.map
> lrwxrwxrwx 1 root root 32 Aug 28 22:32 /boot/System.map ->
> System.map-2.6.32.24-sf.pentM-37
> root@appliance:/var/crash#
>
> root@appliance:/var/crash# cat /proc/version
> Linux version 2.6.32.24sf.pentM-37 (build@ajax) (gcc version 4.7.1 (GCC) ) #1
> PREEMPT Mon Aug 26 22:26:34 UTC 2013
> root@appliance:/var/crash#
>
> So from everything I can see the Linux kernel and the System.map file are in
> version agreement but the crash utility disagrees with me. The crash utility
> is the judge so something is wrong. My goal is to find out how I can get the
> information that is needed to determine the problem.

OK, while running on the appliance itself, again, try running without
the System.map argument.  It will presumably still fail as shown above.
On that appliance, what is the output from these commands:

 $ cat  /proc/kallsyms | grep cpu_possible_mask
 $ nm -Bn /usr/lib/debug/lib/modules/3.9.10-100.fc17.x86_64/vmlinux | grep cpu_possible_mask
 $ grep cpu_possible_mask /boot/System.map

If they are not the same, it is possible you may need to use the "--reloc <size>"
command line argument.  That is required for 32-bit x86 kernels that are configured
as described here:

 http://people.redhat.com/anderson/crash.changelog.html#4_0_4_5
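
If the addresses do differ, the gap between them is a candidate relocation size. That assumes the whole kernel is shifted by one constant offset, which is an assumption of this sketch rather than something confirmed above. The helper name and the sample addresses below are made up for illustration:

```shell
# Hypothetical helper: absolute difference between two hex symbol
# addresses (no 0x prefix, as printed by nm or /proc/kallsyms), in MBytes.
addr_delta_mbytes() {
    a=$(( 0x$1 ))
    b=$(( 0x$2 ))
    if [ "$a" -ge "$b" ]; then
        d=$(( a - b ))
    else
        d=$(( b - a ))
    fi
    echo $(( d / 1048576 ))
}

# Made-up pair of addresses that happen to be 15 MBytes apart.
addr_delta_mbytes c1363c5c c0463c5c
```

A nonzero result is only a starting point for the "--reloc <size>" value; the changelog entry above remains the authority on when the option applies.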

Dave
