Crash-utility January 2008

devel@lists.crash-utility.osci.io

11 participants
13 discussions

crash versions 4.0-4.9 and later will not work with SLES9 IA64 dumps

by Alan Tyson

Hello and Happy New Year,

Changes that were made in 4.0-4.9 to get information out of the LKCD header of the dump file now prevent crash from opening SLES9 IA64 lkcd dumps. The reason is that the LKCD header as defined in crash does not match that used in SLES9. Now, I'm not familiar with any other distros that use LKCD v9 so don't know if this problem is unique to SLES9 or if it affects others.

On SLES9, the element dha_kernel_addr of struct _dump_header_asm_s is not at the end of the structure. It's in between dha_header_size and dha_pt_regs so lkcd_dump_init_v8_arch() ends up with a zero for the load address instead of 0x04000000 (in the case I'm looking at).

Bernard, do you know if this is the case with all LKCD v9 distros? If so I don't mind creating lkcd*v9* functions. Otherwise I'd be open to suggestions as to how we get round this. Perhaps I just need to keep a different version of crash for IA64 SLES9....

Thanks,
Alan Tyson, HP.
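The field-ordering problem described above can be illustrated with a minimal sketch. It is not the real LKCD header: `regs_placeholder`, `leading_placeholder`, both struct names and `read_kernel_addr()` are hypothetical, and only `dha_header_size`, `dha_pt_regs` and `dha_kernel_addr` come from the report. It shows why reading a header through the wrong layout can yield zero for the load address.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder for the saved register block; the real IA64 pt_regs is larger. */
struct regs_placeholder { uint64_t r[8]; };

/* Assumed layout: dha_kernel_addr is the last member. */
struct asm_hdr_addr_last {
    uint32_t leading_placeholder;   /* stands in for fields not shown in the report */
    uint32_t dha_header_size;
    struct regs_placeholder dha_pt_regs;
    uint64_t dha_kernel_addr;
};

/* Layout described for SLES9: dha_kernel_addr sits between
 * dha_header_size and dha_pt_regs. */
struct asm_hdr_addr_middle {
    uint32_t leading_placeholder;
    uint32_t dha_header_size;
    uint64_t dha_kernel_addr;
    struct regs_placeholder dha_pt_regs;
};

/* Read the load address from raw header bytes, using the offset that
 * belongs to the selected layout (hypothetical helper). */
uint64_t read_kernel_addr(const unsigned char *raw, int addr_in_middle)
{
    size_t off = addr_in_middle
        ? offsetof(struct asm_hdr_addr_middle, dha_kernel_addr)
        : offsetof(struct asm_hdr_addr_last, dha_kernel_addr);
    uint64_t addr;

    memcpy(&addr, raw + off, sizeof addr);
    return addr;
}
```

If a header written in the "middle" layout is read with the "last" offset, the bytes fetched belong to the register block, so a zeroed register area reads back as a zero load address.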

17 years, 6 months


Question for LKCD maintainers

by Dave Anderson

Long after I stopped tinkering with the LKCD code in crash, changes were contributed to support physical memory zones in the LKCD dumpfile format. Specifically there is this piece of save_offset() in lkcd_common.c:

    /* find the zone */
    for (ii=0; ii < lkcd->num_zones; ii++) {
        if (lkcd->zones[ii].start == zone) {
            if (lkcd->zones[ii].pages[page].offset != 0) {
                if (lkcd->zones[ii].pages[page].offset != off) {
                    error(INFO, "conflicting page: zone %lld, "
                        "page %lld: %lld, %lld != %lld\n",
                        (unsigned long long)zone,
                        (unsigned long long)page,
                        (unsigned long long)paddr,
                        (unsigned long long)off,
                        (unsigned long long) \
                            lkcd->zones[ii].pages[page].offset);
                    abort();
                }
                ret = 0;
            } else {
                lkcd->zones[ii].pages[page].offset = off;
                ret = 1;
            }
            break;
        }
    }

The call to abort() above kills the crash session, which is both annoying and unnecessary. I am seeing it in a customer dumpfile, who have their own dumping scheme that is based upon LKCD version 7. I understand that this may be a problem with their LKCD port, but nonetheless, it's the only place in the crash utility that doesn't recover gracefully from dumpfile access errors.

Anyway, I would like to either:

 1. change the error(INFO...) to error(FATAL...) so that run-time commands encountering this error will just fail, and the session will return to the crash> prompt, or
 2. return 0, so that a "seek error" can be subsequently displayed by the readmem() command.

Number 2 is preferable, because it yields more clues as to where the readmem() came from, but since I don't know much about the LKCD physical memory zones stuff, is there any reason that shouldn't be done?

Thanks,
Dave
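Option 2 above might look roughly like the following standalone sketch. It is not the code in lkcd_common.c: the `page_slot`, `zone_slot` and `zone_table` types and the name `save_offset_sketch()` are hypothetical stand-ins, `fprintf()` replaces crash's `error(INFO, ...)`, and the `-1` return for an unmatched zone is an assumption, since the excerpt does not show what the original does in that case.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the lkcd zone bookkeeping. */
struct page_slot  { uint64_t offset; };
struct zone_slot  { uint64_t start; struct page_slot *pages; };
struct zone_table { int num_zones; struct zone_slot *zones; };

/* Returns 1 if the offset was newly recorded, 0 if the page was already
 * recorded (including the conflicting case, which is reported but no
 * longer aborts), and -1 if no zone matched (assumption). */
int save_offset_sketch(struct zone_table *t, uint64_t zone,
                       uint64_t page, uint64_t off)
{
    for (int ii = 0; ii < t->num_zones; ii++) {
        struct page_slot *slot;

        if (t->zones[ii].start != zone)
            continue;

        slot = &t->zones[ii].pages[page];
        if (slot->offset == 0) {
            slot->offset = off;         /* first sighting of this page */
            return 1;
        }
        if (slot->offset != off)        /* conflict: report, do not abort() */
            fprintf(stderr,
                "conflicting page: zone %llu, page %llu: %llu != %llu\n",
                (unsigned long long)zone, (unsigned long long)page,
                (unsigned long long)off,
                (unsigned long long)slot->offset);
        return 0;
    }
    return -1;
}
```

In this sketch a conflicting page leaves the previously stored offset untouched, so the caller decides how to surface the failure instead of the whole session being killed.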

17 years, 6 months


Re: [Crash-utility] [PATCH] Improve error handling when architecture doesn't match

by Dave Anderson

> Bernhard Walle wrote:
> > Dave Anderson wrote:
> >
> > Actually this patch has just turned up different issues
> > that have to be handled, because the e_type and e_phnum
> > get deferred until after the e_machine and endianness
> > are checked.
> >
> > Among them the fact that an i386 xen guest core file
> > taken by an x86_64 host has the e_machine type set
> > to x86_64 (don't ask me why they did that...), and has
> > an e_phnum of 0. Anyway, that requires the e_phnum
> > to be checked *before* the machine type and endianness.
> >
> > And another, since the e_type doesn't get checked
> > until *after* the machine type and endianness,
> > it allows the vmlinux file (ET_EXEC) to get passed
> > through, which can generate a bogus error message
> > about the vmlinux file!
> >
> > And there's probably others...
> >
> > There was a method to my madness in the way it's written
> > now. I'm going to have to spend some more time with
> > this because I don't want to introduce false alarms
> > or print error messages that don't make any sense...
>
> *Arrrg*, sorry for not taking all this into account. I only tested
> with a few Kdump dumps from different architectures, but not with Xen
> dumps.
>
> You're right, and I'll send a new patch that tries to handle all this.
> But probably next year ...

No need -- I've got it all in place. Prior to the generic "not a supported file format" fatal error message, the following mismatches will be explicitly reported:

 1. Machine type mismatches in netdump, kdump, diskdump and xendump ELF dumpfiles.
 2. Machine type mismatches in compressed diskdump and compressed kdump (via makedumpfile) dumpfiles.
 3. Machine type mismatches in vmlinux files.
 4. Endian mismatches in netdump, kdump, diskdump and xendump ELF core dumpfiles.
 5. Endian mismatches in vmlinux files.

This was long overdue -- thanks a lot for getting the ball rolling.

Dave
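The ordering constraints discussed in the quoted text can be sketched as below. This is not the patch Dave refers to, whose contents are not shown in this thread. The `hdr_verdict` enum and the functions `host_elf_data()`, `file16()` and `classify_header()` are hypothetical, and the sketch only handles a 64-bit ELF header from `<elf.h>`. It checks e_type and e_phnum before endianness and machine type, so that an ET_EXEC vmlinux or a core with an e_phnum of 0 is never reported as an architecture mismatch.

```c
#include <elf.h>
#include <stdint.h>
#include <string.h>

enum hdr_verdict {
    HDR_OK, HDR_NOT_ELF, HDR_NOT_CORE, HDR_NO_PHDRS,
    HDR_ENDIAN_MISMATCH, HDR_MACHINE_MISMATCH
};

/* ELFDATA2LSB or ELFDATA2MSB, depending on the machine running this code. */
unsigned char host_elf_data(void)
{
    const uint16_t probe = 1;

    return *(const unsigned char *)&probe ? ELFDATA2LSB : ELFDATA2MSB;
}

/* Interpret a 16-bit header field in the file's own byte order. */
static uint16_t file16(const Elf64_Ehdr *eh, uint16_t v)
{
    if (eh->e_ident[EI_DATA] == host_elf_data())
        return v;
    return (uint16_t)((v >> 8) | (v << 8));
}

/* want_machine / want_data describe the architecture crash was built for. */
enum hdr_verdict classify_header(const Elf64_Ehdr *eh,
                                 uint16_t want_machine,
                                 unsigned char want_data)
{
    if (memcmp(eh->e_ident, ELFMAG, SELFMAG) != 0)
        return HDR_NOT_ELF;
    if (file16(eh, eh->e_type) != ET_CORE)      /* e.g. a vmlinux (ET_EXEC) */
        return HDR_NOT_CORE;
    if (file16(eh, eh->e_phnum) == 0)           /* e.g. the xen case above */
        return HDR_NO_PHDRS;
    if (eh->e_ident[EI_DATA] != want_data)
        return HDR_ENDIAN_MISMATCH;
    if (file16(eh, eh->e_machine) != want_machine)
        return HDR_MACHINE_MISMATCH;
    return HDR_OK;
}
```

Only the last two verdicts would correspond to the explicit mismatch messages; the earlier ones let the caller fall through to other format handlers without printing a misleading error.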

17 years, 6 months

