Hello Crash Utility Community,

I am hoping that someone in the crash analysis community can help with a problem I am having analyzing vmcore files gathered from our 32-bit machines.  I am working to add kexec to our systems so that we can use the crash utility (version 7.0.1) on dumps from our appliances, and I am having trouble with our 32-bit systems.  Fortunately my 64-bit systems are working fine, so I know that I can make the technology work.  I believe that the crash utility does not like the System.map file and I am trying to get to the root cause of this problem.

The problem first shows up when I try to decode the vmcore file.  After intentionally triggering an oops/panic event, I upload the vmcore file to my build machine and run crash there.  The vmcore file is generated on an appliance, but I run the crash utility on the build system that produced the Linux kernel, since the appliances are meant to be deployed in the field and will not be accessible for crash analysis.

  build# crash -S System.map vmlinux vmcore
  crash 7.0.1
  ...
  crash: read error: kernel virtual address: c1363c5c  type: "cpu_possible_mask"

So I then tried to find what this symbol is within the map:

  build# crash --minimal -S System.map vmlinux vmcore
  ...
  crash> sym cpu_possible_mask
  c1363c5c (R) cpu_possible_mask
  crash> 

From this I can only see that the addresses match up.  So I then decided to run the crash utility on the appliance itself to see what happens.  The appliance boots from a "bzImage" file, which the crash utility cannot use, so I manually copied both the crash utility and the uncompressed kernel image to the box.

I then ran the following commands on our appliance to gather data:

  root@appliance:/var/crash# crash -S /boot/System.map vmlinux-2.6.32.24-sf.pentM-37 
  ...
  WARNING: cannot read linux_banner string                               
  crash: /boot/System.map and /dev/mem do not match!

  root@appliance:/var/crash# ls -l /boot/System.map
  lrwxrwxrwx 1 root root 32 Aug 28 22:32 /boot/System.map -> System.map-2.6.32.24-sf.pentM-37
  root@appliance:/var/crash# 

  root@appliance:/var/crash# cat /proc/version 
  Linux version 2.6.32.24sf.pentM-37 (build@ajax) (gcc version 4.7.1 (GCC) ) #1 PREEMPT Mon Aug 26 22:26:34 UTC 2013
  root@appliance:/var/crash# 

From everything I can see, the Linux kernel and the System.map file agree on version, but the crash utility disagrees with me.  The crash utility is the judge, so something is wrong.  My goal is to find out how to gather the information needed to pin down the problem.
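One way to test that version agreement more directly might be to pull the banner string out of the uncompressed vmlinux image and compare it with /proc/version.  The following is only a rough sketch in Python, under assumptions: the function names `find_linux_banner` and `banners_match` are hypothetical, and it assumes the banner is stored in the image as a plain "Linux version <release> ..." string ending in a newline or NUL byte.

```python
def find_linux_banner(image_bytes):
    """Return the first 'Linux version <digit>...' string in a kernel image, or None."""
    marker = b"Linux version "
    start = image_bytes.find(marker)
    while start != -1:
        # Read up to the next NUL (0) or newline (10) byte.
        end = start
        while end < len(image_bytes) and image_bytes[end] not in (0, 10):
            end += 1
        candidate = image_bytes[start:end]
        # Assumption: the real banner continues with a release number,
        # which filters out format strings such as "Linux version %s".
        if candidate[len(marker):len(marker) + 1].isdigit():
            return candidate.decode("ascii", errors="replace")
        start = image_bytes.find(marker, start + 1)
    return None


def banners_match(image_bytes, proc_version_text):
    """True if the banner found in the image equals the /proc/version text."""
    banner = find_linux_banner(image_bytes)
    return banner is not None and banner.strip() == proc_version_text.strip()
```

With the vmlinux file read in binary mode and the contents of /proc/version read as text, `banners_match(...)` returning False would suggest that the vmlinux being given to crash is not the image the appliance actually booted.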

We have build machines that produce Linux kernels for multiple appliances, each based on a different processor.  Some are 32-bit and some are 64-bit.  I have made identical changes to the Linux kernel for all of our systems; the 64-bit versions work well while the 32-bit versions do not.  All of our appliances are able to gather the vmcore file after an oops event, so the analysis is the only problem.
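Since the only difference I can point to is 32-bit versus 64-bit, one sanity check might be to read the ELF identification bytes of both the vmlinux and the vmcore file and confirm that they describe the machine I expect.  This is a minimal sketch, assuming both files are ELF; `describe_elf` is a hypothetical helper name and the tables list only the two x86 machine types relevant here.  I do not know whether the ELF class of the vmcore is required to match that of the vmlinux, so I would compare the machine type first.

```python
import struct

ELF_CLASSES = {1: "ELF32", 2: "ELF64"}          # e_ident[EI_CLASS]
ELF_MACHINES = {3: "EM_386", 62: "EM_X86_64"}   # e_machine


def describe_elf(header_bytes):
    """Return (class, machine) for the start of an ELF file, or None if not ELF."""
    if len(header_bytes) < 20 or header_bytes[:4] != b"\x7fELF":
        return None
    elf_class = header_bytes[4]
    # e_ident[EI_DATA]: 1 is little-endian, 2 is big-endian.
    byte_order = "<" if header_bytes[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18 in both ELF32 and ELF64.
    (machine,) = struct.unpack_from(byte_order + "H", header_bytes, 18)
    return (ELF_CLASSES.get(elf_class, "unknown class"),
            ELF_MACHINES.get(machine, "machine %d" % machine))
```

Reading the first 20 bytes of each file in binary mode and passing them to `describe_elf` should be enough to compare the two.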

Is there a way to determine whether a System.map file matches the kernel running on a system?  Maybe there is some other utility that I can run to analyze a System.map file to see if it agrees with the running system?  My system boots from a compressed bzImage file, and so far I have not been able to extract an uncompressed Linux kernel from that bzImage.  For my testing I need to copy the uncompressed vmlinux file from the build machine, since the build machine uses that file as an artifact to create the final bzImage file.
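To make the question concrete, the kind of check I have in mind would compare the addresses in System.map against those the running kernel reports in /proc/kallsyms.  Below is a rough Python sketch, not a verified tool.  It assumes the kernel was built with CONFIG_KALLSYMS, that both files use the "address type name" line format, and that /proc/kallsyms may list only a subset of the symbols in System.map, so only symbols present in both are compared.  The function names are hypothetical.

```python
def parse_symbols(text):
    """Parse 'address type name' lines into {name: set of addresses}."""
    symbols = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip malformed lines, and kallsyms module entries,
        # which carry a trailing "[module]" field.
        if len(fields) != 3:
            continue
        try:
            address = int(fields[0], 16)
        except ValueError:
            continue
        # Static symbols can share a name, so keep every address per name.
        symbols.setdefault(fields[2], set()).add(address)
    return symbols


def find_mismatches(map_text, kallsyms_text):
    """Return {name: (map_addrs, live_addrs)} where a live address is absent from the map."""
    in_map = parse_symbols(map_text)
    live = parse_symbols(kallsyms_text)
    mismatches = {}
    for name in in_map.keys() & live.keys():
        if not live[name] <= in_map[name]:
            mismatches[name] = (sorted(in_map[name]), sorted(live[name]))
    return mismatches


def compare_files(map_path, kallsyms_path):
    """Convenience wrapper that reads both files and compares them."""
    with open(map_path) as map_file, open(kallsyms_path) as live_file:
        return find_mismatches(map_file.read(), live_file.read())
```

Something like `compare_files("/boot/System.map", "/proc/kallsyms")` could be run on the appliance, or on the build machine against a copy of /proc/kallsyms taken from the appliance.  An empty result would suggest that the System.map itself is fine and that the problem lies elsewhere.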

Any ideas for gathering better information would be greatly appreciated.

Thank you,
Patrick