Re: [Crash-utility] [RFI] Support Fujitsu's sadump dump format
by tachibana@mxm.nes.nec.co.jp
Hi Hatayama-san,
On 2011/06/29 12:12:18 +0900, HATAYAMA Daisuke <d.hatayama(a)jp.fujitsu.com> wrote:
> From: Dave Anderson <anderson(a)redhat.com>
> Subject: Re: [Crash-utility] [RFI] Support Fujitsu's sadump dump format
> Date: Tue, 28 Jun 2011 08:57:42 -0400 (EDT)
>
> >
> >
> > ----- Original Message -----
> >> Fujitsu has a stand-alone dump mechanism based on firmware-level
> >> functionality, which we call SADUMP for short.
> >>
> >> We've maintained utility tools internally, but we now think the best
> >> approach is for the crash utility and makedumpfile to support the
> >> sadump format, from the viewpoint of both portability and
> >> maintainability.
> >>
> >> We will of course be responsible for its ongoing maintenance. The
> >> sadump dump format is very similar to the diskdump format, and
> >> therefore to the kdump (compressed) format, so we estimate the patch
> >> set would be relatively small.
> >>
> >> Could you tell me whether the crash utility and makedumpfile can
> >> support the sadump format? If so, we'll start working on a patch set.
I think supporting sadump in makedumpfile is a reasonable idea. However,
I have several questions.
- Do you want to use makedumpfile to shrink an existing dump file that
sadump has produced?
- It isn't possible now to support the same format as the
kdump-compressed format, is it?
- When an extension to makedumpfile adds to the information that it reads
from a note in /proc/vmcore (or from a kdump-compressed format header),
will sadump need to be modified?
Thanks
tachibana
> >
> > Sure, yes, the crash utility can always support another dumpfile format.
> >
>
> Thanks. It helps a lot.
>
> > It's unclear to me how similar SADUMP is to diskdump/compressed-kdump.
> > Does your internal version patch diskdump.c, or do you maintain your
> > own "sadump.c"? I ask because if your patchset is at all intrusive,
> > I'd prefer it be kept in its own file, primarily for maintainability,
> > but also because SADUMP is essentially a black-box to anybody outside
> > Fujitsu.
>
> By ``similar'' I meant both literally and logically. The format
> consists of a diskdump-like header, two kinds of bitmaps used for the
> same purpose as those in the diskdump format, and memory data. They can
> be handled in common with the existing data structure, diskdump_data,
> non-intrusively, so I hope they can be placed in diskdump.c.
>
> On the other hand, some code belongs in a sadump-specific place.
> sadump is triggered depending on kdump's progress, so the register
> values contained in the vmcore vary according to that progress: if
> crash_notes has been initialized when sadump is triggered, sadump
> packs the register values in crash_notes; if not, it packs the
> registers gathered by firmware. This is sadump-specific processing,
> so I think putting it in a separate sadump.c file is a natural and
> reasonable choice.
>
> Anyway, I have not made any patch set for this yet. I'll post one
> when it is complete.
>
> Again, thanks a lot for the positive answer.
>
> Thanks.
> HATAYAMA, Daisuke
1 year, 1 month
Re: [Crash-utility] Crash dump of RHEL5, what does "SHARED" memory represent in kmem -i output?
by anderson@prospeed.net
> From: James Washer <washer(a)trlp.com>
> To: Crash-utility(a)redhat.com
> Subject: [Crash-utility] Crash dump of RHEL5, what does "SHARED"
> memory represent in kmem -i output?
> Message-ID: <E06C76EC-884A-4C02-9566-D647E0B575CA(a)trlp.com>
> Content-Type: text/plain; charset=us-ascii
>
> In the data below, what memory usage is reported as SHARED? Does this
> imply SysV shared memory, or pages shared between processes, or something
> else altogether?
>
>
> kmem -i
>
>               PAGES      TOTAL  PERCENTAGE
>  TOTAL MEM  9228936    35.2 GB  ----
>       FREE    20260    79.1 MB  0% of TOTAL MEM
>       USED  9208676    35.1 GB  99% of TOTAL MEM
>     SHARED  9051322    34.5 GB  98% of TOTAL MEM
>    BUFFERS      151     604 KB  0% of TOTAL MEM
>     CACHED     2455     9.6 MB  0% of TOTAL MEM
>       SLAB    27419   107.1 MB  0% of TOTAL MEM
>
> TOTAL HIGH        0          0  0% of TOTAL MEM
>  FREE HIGH        0          0  0% of TOTAL HIGH
>  TOTAL LOW  9228936    35.2 GB  100% of TOTAL MEM
>   FREE LOW    20260    79.1 MB  0% of TOTAL LOW
>
> TOTAL SWAP  4194302      16 GB  ----
>  SWAP USED  2294571     8.8 GB  54% of TOTAL SWAP
>  SWAP FREE  1899731     7.2 GB  45% of TOTAL SWAP
>
>
>
> - jim
Hi Jim,
Purportedly it displays -- as the comment states:
/*
* Get shared pages from dump_mem_map(). Note that this is done
* differently than the kernel -- it just tallies the non-reserved
* pages that have a count of greater than 1.
*/
It has a check in there for the different usage of page->count,
where in certain kernel versions, -1 means 0.
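
As a rough illustration of the tally that comment describes (this is not the actual crash source; the struct, the count_is_biased flag, and the function names are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a kernel page descriptor. */
struct page_info {
    int count;     /* raw reference count as read from the dump */
    int reserved;  /* non-zero if the page is marked reserved */
};

/*
 * Normalize the raw count. Assumption: when count_is_biased is set,
 * the kernel version stores the count offset by one, so -1 means 0.
 */
static int effective_count(int raw, int count_is_biased)
{
    return count_is_biased ? raw + 1 : raw;
}

/* Tally the non-reserved pages whose effective count is greater than 1. */
static unsigned long tally_shared(const struct page_info *pages, size_t n,
                                  int count_is_biased)
{
    unsigned long shared = 0;

    assert(pages != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (pages[i].reserved)
            continue;
        if (effective_count(pages[i].count, count_is_biased) > 1)
            shared++;
    }
    return shared;
}
```

Under this definition any page with more than one reference is counted, whatever the reason for the extra references, so the figure is not limited to SysV shared memory.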
Dave Anderson (on vacation)
13 years, 5 months
Crash dump of RHEL5, what does "SHARED" memory represent in kmem -i output?
by James Washer
In the data below, what memory usage is reported as SHARED? Does this imply SysV shared memory, or pages shared between processes, or something else altogether?
kmem -i
              PAGES      TOTAL  PERCENTAGE
 TOTAL MEM  9228936    35.2 GB  ----
      FREE    20260    79.1 MB  0% of TOTAL MEM
      USED  9208676    35.1 GB  99% of TOTAL MEM
    SHARED  9051322    34.5 GB  98% of TOTAL MEM
   BUFFERS      151     604 KB  0% of TOTAL MEM
    CACHED     2455     9.6 MB  0% of TOTAL MEM
      SLAB    27419   107.1 MB  0% of TOTAL MEM
TOTAL HIGH        0          0  0% of TOTAL MEM
 FREE HIGH        0          0  0% of TOTAL HIGH
 TOTAL LOW  9228936    35.2 GB  100% of TOTAL MEM
  FREE LOW    20260    79.1 MB  0% of TOTAL LOW
TOTAL SWAP  4194302      16 GB  ----
 SWAP USED  2294571     8.8 GB  54% of TOTAL SWAP
 SWAP FREE  1899731     7.2 GB  45% of TOTAL SWAP
- jim
13 years, 5 months
[ANNOUNCE] crash version 5.1.7 is available
by Dave Anderson
Download from: http://people.redhat.com/anderson
Changelog:
- Fix for the x86_64 "bt" command in the highly-unlikely event that
a non-crashing CPU receives an NMI immediately after receiving an
interrupt from another source in a 2.6.29 and later kernel. In
those kernels, the IRQ entry-point symbols "IRQ0x00_interrupt"
through "IRQ0x##_interrupt" no longer exist, but the entry points
exist as memory locations starting at the symbol "irq_entries_start".
Without the patch, if a shutdown NMI interrupt gets received while in
one of the entry point stubs, "bt" will fail with the error message
"bt: cannot transition from exception stack to current process stack".
(anderson(a)redhat.com)
- The x86 and x86_64 "bt -e" and "bt -E" commands will display symbolic
translations of kernel-mode exception RIP values.
(anderson(a)redhat.com)
- Clarified two initialization-time CRASHDEBUG(1) messages to make it
obvious that the two linux_banner strings being compared originate
from the memory source or the kernel namelist file.
(anderson(a)redhat.com)
- Fix for the x86 "bt" command to handle cases where the shutdown NMI
was received when a task had just completed an exception, interrupt,
or signal handler, and was about to return to user-space. Without
the patch, the backtrace would be preceded by the error message
"bt: cannot resolve stack trace", display the trace without the
kernel-entry exception frame, and then dump the text symbols found
on the stack and all possible exception frames.
(anderson(a)redhat.com)
- Fix for 2.6.33 and later kernels that are not configured CONFIG_SMP.
Without the patch, they fail during initialization with the error
message "crash: invalid structure member offset: module_percpu".
(nakayama.ts(a)ncos.nec.co.jp)
- Prepare for the imminent change in size of the vm_flags member of
the vm_area_struct to be 64-bits in size for all architectures now
that 32 bits have been consumed. The crash utility code had been
handling the older change of the vm_flags member from a short to a
long, but that would not account for the future change to a 64-bit
member on 32-bit architectures.
(anderson(a)redhat.com)
- Update of the "vm -f <flags>" option to the current upstream
state. Without the patch, only 23 of the currently-existing 32
bit flags were being translated.
(anderson(a)redhat.com)
- Fix for the "kmem -s", "kmem -S", "kmem -s <address>" and
"kmem <address>" command options if none of the NUMA nodes in
a multi-node CONFIG_SLAB system have a node ID of 0. Without
the patch, "kmem -s" and "kmem -S" show all slab caches as if they
contain no slabs; if an <address> is specified, the correct slab
cache is found, but the command indicates "kmem: <slab-cache-name>:
address not found in cache: <address>".
(anderson(a)redhat.com)
- Cosmetic fix for the "kmem -[sS]" options if a CONFIG_SLAB kernel
slab cache contains 100000 or more slabs, or uses a slab size of
1 or more megabytes. Without the patch, the output utilizes more
than 80 columns.
(anderson(a)redhat.com)
- If a task was in user-space when a crash occurred, the user-space
registers are saved in per-cpu NT_PRSTATUS ELF notes in either
version 4 compressed kdump headers, or in dumpfile headers created
by the Fujitsu "sadump" facility. In that case, the "bt" command
will dump the x86 or x86_64 user-space register set.
(wency(a)cn.fujitsu.com)
- Fix for the x86 "bt" command to handle cases where the shutdown NMI
was received when a task had just received an interrupt, but before
it had created a full exception frame on the kernel stack and called
the interrupt handler. Without the patch, the backtrace would be
preceded by the error message "bt: cannot resolve stack trace",
display the trace without the kernel-entry exception frame, and then
dump the text symbols found on the stack and all possible exception
frames.
(anderson(a)redhat.com)
- Fix for the x86 "bt" command to handle cases where the shutdown NMI
was received when a task was in the act of being switched to.
Without the patch, the backtrace would be preceded by the error
message "bt: cannot resolve stack trace", display the trace without
the kernel-entry exception frame, and then dump the text symbols
found on the stack and all possible exception frames.
(anderson(a)redhat.com)
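
To illustrate the vm_flags item in the changelog above: a reader that does not hard-code the member's width might look roughly like the sketch below. This is an assumption-laden illustration, not code from crash. The function name is hypothetical, the member size is assumed to have been looked up from debug information beforehand, and the dump is assumed to share the host's byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Read a vm_flags value of the given width from a raw copy of the
 * structure member. Widths of 2 (short), 4 (long on 32-bit) and
 * 8 bytes are handled, and the result is widened to 64 bits so that
 * callers never depend on the member's size.
 */
static uint64_t read_vm_flags(const unsigned char *buf, size_t member_size)
{
    assert(buf != NULL);
    assert(member_size == 2 || member_size == 4 || member_size == 8);

    switch (member_size) {
    case 2: {
        uint16_t v;
        memcpy(&v, buf, sizeof v);
        return v;
    }
    case 4: {
        uint32_t v;
        memcpy(&v, buf, sizeof v);
        return v;
    }
    default: {
        uint64_t v;
        memcpy(&v, buf, sizeof v);
        return v;
    }
    }
}
```

Widening every value to 64 bits means the flag-translation code can test bits above bit 31 without caring which architecture produced the dump.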
13 years, 6 months
[PATCH] output regs if sp is in user stack
by Wen Congyang
We have sadump, which can work even when the OS is out of control
(for example, in an infinite loop or a deadlock). When sadump is used,
some user application may be running, so the sp/ip may point into user
space. We should handle this case the same way as a kvm dump.
13 years, 6 months