Hi,
From: HATAYAMA Daisuke <d.hatayama(a)jp.fujitsu.com>
Subject: Re: [Crash-utility] [RFC] gcore subcommand: a process coredump feature
Date: Tue, 03 Aug 2010 15:17:00 +0900 (Tokyo Standard Time)
Hello Iguchi-san,
Thanks for your comments.
From: "S.Iguchi" <iguchi.sg(a)ncos.nec.co.jp>
Subject: Re: [Crash-utility] [RFC] gcore subcommand: a process coredump feature
Date: Tue, 03 Aug 2010 13:10:09 +0900 (JST)
> Hi, Hatayama-san
>
> I have an extension with mostly the same purpose as your patch.
> But your patch is great, because it supports the latest kernel and
> also dump filter masking.
>
> My current extension file is attached.
> Yes, compared to yours, my code is quite buggy, ugly and not up to
> date against the latest kernel.
> (Sigh... I did not know about fill_vma_cache(), so I do "vm -p" every
> time before dumping.)
>
> BTW, I have some comments.
> I'd like to add the features below to yours,
> or if you would do it, that would make me happy. :)
>
> - support i386
> - support elf32 binary on x86-64
> - support old kernel (before 2.6.17)
>
> As Dave said, if your patch is committed as an extension,
> I could submit some patches to it.
>
> How about this?
As I wrote in the first entry, I plan to support RHEL4, RHEL5 and
RHEL6 on i386, x86_64 and IA64, as well as the latest upstream
kernel. The following table shows the corresponding community kernel
versions.
  RHEL4    RHEL5    RHEL6    upstream
  ------------------------------------
  2.6.9    2.6.18   2.6.32   2.6.35
So it should be enough to cover your first and third requests.
Ugh, I didn't check RHEL4... sorry.
Thank you for your explanation.
On the other hand, I have no plan to support ia32 emulation on
either x86_64 or ia64.
OK.
Supporting ia32 emulation on x86-64 would be enough for me...
Once your extension is applied, I'll think about it.
Thanks.
Regards,
Seigo Iguchi
>
> Best regards,
> Seigo Iguchi
>
>
> From: HATAYAMA Daisuke <d.hatayama(a)jp.fujitsu.com>
> Subject: [Crash-utility] [RFC] gcore subcommand: a process coredump feature
> Date: Mon, 02 Aug 2010 18:00:02 +0900 (Tokyo Standard Time)
>
>> Hello,
>>
>> For some weeks I've been developing a gcore subcommand for the crash
>> utility which provides a process coredump feature for crash kernel
>> dumps, strongly demanded by users who want to investigate the
>> user-space applications contained in a crash kernel dump.
>>
>> I've now finished making a prototype version of gcore and have
>> found out which issues need to be addressed intensively. Could you
>> give me any comments and suggestions on this work?
>>
>>
>> Motivation
>> ==========
>>
>> It's a relatively familiar technique that, in a cluster system, a
>> currently running node triggers the crash kernel dump mechanism when
>> it detects a kind of critical error, in order for the running,
>> error-detecting server to cease as soon as possible. Consequently,
>> the residual crash kernel dump contains a process image of the
>> erroneous user application. In that case, developers are interested
>> in user space rather than kernel space.
>>
>> Another merit of gcore is that it allows us to use several userland
>> debugging tools, such as GDB and binutils, in order to analyze
>> user-space memory.
>>
>>
>> Current Status
>> ==============
>>
>> I have confirmed that the prototype version runs on the following
>> configuration:
>>
>> Linux Kernel Version: 2.6.34
>> Supporting Architecture: x86_64
>> Crash Version: 5.0.5
>> Dump Format: ELF
>>
>> I'm planning to widen the range of support as follows:
>>
>> Linux Kernel Version: Any
>> Supporting Architecture: i386, x86_64 and IA64
>> Dump Format: Any
>>
>>
>> Issues
>> ======
>>
>> Currently, I have the following issues.
>>
>> 1) Retrieval of appropriate register values
>>
>> The prototype version retrieves register values from a _wrong_
>> location: the top of the kernel stack, into which register values
>> are saved at any preemption context switch. The register values that
>> should be included here are instead the ones saved at the
>> user-to-kernel context switch on any interrupt event.
>>
>> I've yet to implement this. Specifically, I need to do the following
>> tasks from now on:
>>
>> (1) List all entry points on the user-to-kernel execution paths.
>>
>> (2) Classify the entries according to where and how the register
>> values from the user-space context are saved.
>>
>> (3) Compose a program that retrieves the saved register values from
>> the appropriate locations identified by means of (1) and (2).
>>
>> Ideally, I think it would be best if the crash library provided a
>> means of retrieving this kind of register values, that is, the ones
>> saved on various stack frames. Is there a plan to do so?
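>>
>> As an illustration of (3), here is a minimal sketch, assuming a
>> pt_regs location has already been identified by (1) and (2).
>> readmem(), KVADDR and RETURN_ON_ERROR are existing crash interfaces;
>> everything named *_sketch is hypothetical:
>>
>>   /* Sketch only: read saved user-mode register values from a
>>    * location identified by the per-entry-path analysis above.
>>    * The struct mirrors the x86_64 kernel's pt_regs layout. */
>>   struct pt_regs_sketch {
>>           ulong r15, r14, r13, r12, bp, bx, r11, r10, r9, r8;
>>           ulong ax, cx, dx, si, di, orig_ax, ip, cs, flags, sp, ss;
>>   };
>>
>>   static int
>>   read_saved_regs_sketch(ulong addr, struct pt_regs_sketch *regs)
>>   {
>>           return readmem(addr, KVADDR, regs,
>>                          sizeof(struct pt_regs_sketch),
>>                          "saved user registers", RETURN_ON_ERROR);
>>   }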
>>
>>
>> 2) Getting the signal number for a task that was in the middle of
>> the core dump process at the time of the kernel crash
>>
>> If a target task was halfway through the core dump process, it's
>> better to know the signal number in order to understand why the task
>> was about to be core dumped.
>>
>> Unfortunately, I have no choice but to backtrace the kernel stack to
>> retrieve the signal number saved there as an argument of, for
>> example, do_coredump().
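>>
>> A rough sketch of that backtrace heuristic is below, built only on
>> existing crash interfaces (GET_STACKBASE(), STACKSIZE(), readmem(),
>> is_kernel_text(), closest_symbol(), STREQ()); the helper name is
>> hypothetical, and extracting the actual argument from the located
>> frame remains arch-specific:
>>
>>   /* Sketch only: scan the raw kernel stack for a return address
>>    * that falls inside do_coredump(), as a starting point for
>>    * digging out the signal-number argument saved in that frame. */
>>   static ulong
>>   find_do_coredump_frame_sketch(ulong task)
>>   {
>>           ulong base, addr, word;
>>           int i, nwords;
>>
>>           base = GET_STACKBASE(task);
>>           nwords = STACKSIZE() / sizeof(ulong);
>>
>>           for (i = 0; i < nwords; i++) {
>>                   addr = base + i * sizeof(ulong);
>>                   if (!readmem(addr, KVADDR, &word, sizeof(word),
>>                                "stack word", RETURN_ON_ERROR))
>>                           continue;
>>                   if (is_kernel_text(word) &&
>>                       STREQ(closest_symbol(word), "do_coredump"))
>>                           return addr;  /* candidate frame */
>>           }
>>           return 0;
>>   }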
>>
>>
>> 3) Kernel version compatibility
>>
>> crash's policy is to support all kernel versions with the latest
>> crash package. On the other hand, the prototype is based on kernel
>> 2.6.34. This means more kernel versions need to be supported.
>>
>> Well, the question is: which versions do I really need to test
>> against, in addition to the latest upstream kernel? I think it's
>> practically enough to support RHEL4, RHEL5 and RHEL6.
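>>
>> In code, such version dependencies could be branched on with crash's
>> existing THIS_KERNEL_VERSION and LINUX() macros, where RHEL4, RHEL5
>> and RHEL6 correspond to kernels 2.6.9, 2.6.18 and 2.6.32; the
>> handler names below are hypothetical:
>>
>>   /* Sketch only: dispatch to per-version code paths. */
>>   static void
>>   gcore_dispatch_sketch(void)
>>   {
>>           if (THIS_KERNEL_VERSION >= LINUX(2,6,32))
>>                   handle_v2_6_32_sketch();  /* RHEL6 and upstream */
>>           else if (THIS_KERNEL_VERSION >= LINUX(2,6,18))
>>                   handle_v2_6_18_sketch();  /* RHEL5 */
>>           else
>>                   handle_v2_6_9_sketch();   /* RHEL4 */
>>   }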
>>
>>
>> Build Instruction
>> =================
>>
>> $ tar xf crash-5.0.5.tar.gz
>> $ cd crash-5.0.5/
>> $ patch -p 1 < gcore.patch
>> $ make
>>
>>
>> Usage
>> =====
>>
>> Use the help subcommand of the crash utility: ``help gcore''.
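>>
>> For example (``help gcore'' is the definitive reference; the second
>> command below is only illustrative):
>>
>>   crash> help gcore
>>   crash> gcore <pid>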
>>
>>
>> Attached File
>> =============
>>
>> * gcore.patch
>>
>> A patch implementing gcore subcommand for crash-5.0.5.
>>
>> The diffstat output is as follows.
>>
>> $ diffstat gcore.patch
>> Makefile | 10 +-
>> defs.h | 15 +
>> gcore.c | 1858 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> gcore.h | 639 ++++++++++++++++++++
>> global_data.c | 3 +
>> help.c | 28 +
>> netdump.c | 27 +
>> tools.c | 37 ++
>> 8 files changed, 2615 insertions(+), 2 deletions(-)
>>
>> --
>> HATAYAMA Daisuke
>> d.hatayama(a)jp.fujitsu.com