Re: [Crash-utility] crash seek error, failed to read vmcore file

Thursday, 22 April 2010

On Wed, 2010-04-21 at 09:58 -0400, Dave Anderson wrote:
...
 ----- "Pavan Naregundi" <pavan(a)linux.vnet.ibm.com> wrote:

 > On Tue, 2010-04-20 at 09:14 -0400, Dave Anderson wrote:
 > > ----- "Pavan Naregundi" <pavan(a)linux.vnet.ibm.com> wrote:
 > > 
 > > The cause for seek errors depends upon the type
 > > of dumpfile.
 > > 
 > > You didn't mention which type of dumpfile the vmcore
 > > is, so I'll presume that it's either an ELF-format
 > > kdump or a compressed kdump created by makedumpfile.
 > > 
 > > So presuming that it's a compressed kdump, the seek error 
 > > most likely comes from here in read_diskdump() in diskdump.c:
 > > 
 > >         if ((pfn >= dd->header->max_mapnr) || !page_is_ram(pfn))
 > >                 return SEEK_ERROR;
 > > 
 > > where the requested physical address pfn values are larger
 > > than the max_mapnr value advertised in the header.
 > > 
 > > When you do any "crash -d# ...", the dumpfile header will
 > > be dumped first.  What does that show?
 > > 
 > > Dave
 > 
 > 
 > Dave,
 > 
 > Dumpfile is compressed kdump created by makedumpfile.
 > 
 > header shows the following values: 
 > max_mapnr: 32768
 > block_shift: 16
 > 
 > Yes. Adding some debug printf's shows me that the
 > (pfn >= dd->header->max_mapnr) check is what triggers the error.
 > 
 > For example: in the first seek error,
 > crash: seek error: kernel virtual address: c0000000af715480  type: "kmem_cache buffer"
 > 
 > paddr: af715480 => pfn=44913
 > 
 > crash -d8 log: http://pastebin.com/qrCvyPfR
 > 
 > Thanks..Pavan

 OK, so the compressed dumpfile has exactly 32768 pages of physical
 memory, or exactly 2GB.  That being the case, the crash utility
 will fail all readmem attempts above that value, and obviously 
 there is critical data above the artificial 2GB threshold.  
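
 The arithmetic behind that reading can be checked with a small C
 sketch. It assumes that the page frame number is simply the physical
 address shifted right by block_shift, and it takes the physical
 address af715480 as reported earlier in the thread. The struct and
 function names are hypothetical and are not taken from diskdump.c,
 and only the first half of the quoted check is modeled
 (page_is_ram() is left out).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two header fields quoted in this thread. */
struct dump_header_sketch {
    uint64_t max_mapnr;   /* number of page frames the dumpfile advertises */
    unsigned block_shift; /* log2 of the page size; 16 means 64KB pages */
};

/* Assumption: pfn is the physical address shifted right by block_shift. */
uint64_t paddr_to_pfn_sketch(uint64_t paddr, unsigned block_shift)
{
    assert(block_shift < 64);
    return paddr >> block_shift;
}

/* Bytes of physical memory covered by the advertised max_mapnr. */
uint64_t covered_bytes(const struct dump_header_sketch *h)
{
    assert(h->block_shift < 64);
    return h->max_mapnr << h->block_shift;
}

/* True when the address lies at or beyond max_mapnr, i.e. the case in
 * which the quoted check would return SEEK_ERROR. */
bool pfn_out_of_range(const struct dump_header_sketch *h, uint64_t paddr)
{
    return paddr_to_pfn_sketch(paddr, h->block_shift) >= h->max_mapnr;
}
```

 With max_mapnr = 32768 and block_shift = 16, covered_bytes() gives
 2147483648 bytes (2GB), and paddr af715480 maps to pfn 44913, which
 is above 32768, so the read is rejected.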

 The question at hand is why kdump is creating a truncated dumpfile
 with a max_mapnr of 32768:

 (1) makedumpfile determines the "max_mapnr" value based upon the 
     highest physical address found in any of the PT_LOAD segments
     of the /proc/vmcore file on the secondary kernel.
 (2) the /proc/vmcore PT_LOAD segments were pre-calculated during
     the primary kernel's kdump initialization phase, based upon
     the values found in the set of "/proc/device-tree/memory@xxx/reg"
     files existing in the primary kernel, where the "xxx" is the
     starting physical address of the memory region, and the "reg"
     file in that directory contains the size of the memory region. 
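
 Step (1) can be sketched in C as follows. This is only an
 illustration of the idea, not makedumpfile's actual code: the
 struct, the function name, and the truncating shift are assumptions
 made for the example, and a real implementation would read the
 segments from the ELF program headers of /proc/vmcore.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified PT_LOAD descriptor: start and size only. */
struct load_segment_sketch {
    uint64_t paddr; /* starting physical address of the segment */
    uint64_t memsz; /* size of the segment in bytes */
};

/* Find the highest end address over all segments and convert it to a
 * page count. A partial trailing page is dropped by the shift. */
uint64_t max_mapnr_from_segments(const struct load_segment_sketch *segs,
                                 size_t nsegs, unsigned page_shift)
{
    uint64_t max_end = 0;

    assert(page_shift < 64);
    for (size_t i = 0; i < nsegs; i++) {
        uint64_t end = segs[i].paddr + segs[i].memsz;
        if (end > max_end)
            max_end = end;
    }
    return max_end >> page_shift;
}
```

 A single segment starting at 0 with a size of 2GB and 64KB pages
 yields 32768, matching the header value reported above. If a second
 2GB segment starting at 2GB were also present, the result would be
 65536.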

 For whatever reason, those files showed a maximum of 2GB of
 physical memory.  (If you do not use makedumpfile, and then do
 a "readelf -a" of the resultant vmcore file, you will see 
 the PT_LOAD segment values.)

 Does the SLES11 vmlinux-2.6.32.10-0.4.99.25.62005-ppc64 kernel
 contain this patch?:

http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.gi...

 I ask because we also have an outstanding bugzilla that exhibits similar
 behavior, where an abnormally small ppc64 vmcore file gets created
 because there was only a single /proc/device-tree/memory@0 directory
 file that showed just a small subset of the total physical memory.
 Typically there are many of those "memory@xxx" directories, but in
 the failing scenario, there was only one /proc/device-tree/memory@0
 directory.

 Anyway, there's (unproven) speculation that the kernel patch above
 is related to the problem.

 In any case, unfortunately, there's nothing that can be done from the
 crash utility's perspective.

 Dave

Thank you Dave.

Our SLES11 kernel does not have the patch you mentioned, but at the
same time the system is not AMS-enabled, and CONFIG_CMM is not set in
the config file.

This system also has only a /proc/device-tree/memory@0 directory.

Regards..Pavan
