----- Original Message -----
The gcore extension module provides a means to create an ELF core dump for
a user-mode process contained within a crash kernel dump. I designed it
to behave like the kernel's ELF core dumper.
For previous discussion, see:
https://www.redhat.com/archives/crash-utility/2010-August/msg00001.html
A few observations...
I'll fix unwind_x86_64.h to prevent this build warning:
# make extensions
...
gcc -Wall -I.. -I./libgcore -fPIC -DX86_64 -c -o libgcore/gcore_x86.o
libgcore/gcore_x86.c
In file included from libgcore/gcore_x86.c:19:
../unwind_x86_64.h:61:1: warning: "offsetof" redefined
In file included from libgcore/gcore_x86.c:17:
../defs.h:60:1: warning: this is the location of the previous definition
...
But the gcore.mk file should gracefully fail to build on non-supported
architectures. It ends up spewing ~200 lines of error messages when
attempted, for example, on a ppc64 machine:
# make extensions
gcc -m64 -Wall -I.. -I./libgcore -fPIC -DPPC64 -c -o libgcore/gcore_coredump.o
libgcore/gcore_coredump.c
In file included from libgcore/gcore_coredump.c:17:
./libgcore/gcore_defs.h:355:1: warning: "ELF_NGREG" redefined
In file included from /usr/include/asm/sigcontext.h:13,
from /usr/include/bits/sigcontext.h:28,
from /usr/include/signal.h:339,
from ../defs.h:38,
from libgcore/gcore_coredump.c:16:
/usr/include/asm/elf.h:92:1: warning: this is the location of the previous definition
In file included from libgcore/gcore_coredump.c:17:
./libgcore/gcore_defs.h:356: error: invalid application of ‘sizeof’ to incomplete
type ‘struct user_regs_struct’
./libgcore/gcore_defs.h:356: error: conflicting types for ‘elf_gregset_t’
/usr/include/asm/elf.h:124: note: previous declaration of ‘elf_gregset_t’ was here
./libgcore/gcore_defs.h:490: error: conflicting types for ‘__kernel_old_uid_t’
/usr/include/asm/posix_types.h:28: note: previous declaration of
‘__kernel_old_uid_t’ was here
./libgcore/gcore_defs.h:491: error: conflicting types for ‘__kernel_old_gid_t’
/usr/include/asm/posix_types.h:29: note: previous declaration of
‘__kernel_old_gid_t’ was here
libgcore/gcore_coredump.c:25: error: expected ‘)’ before ‘*’ token
libgcore/gcore_coredump.c:33: error: expected declaration specifiers or ‘...’ before
‘Elf_Ehdr’
... [ cut ] ...
./libgcore/gcore_defs.h:490: error: conflicting types for ‘__kernel_old_uid_t’
/usr/include/asm/posix_types.h:28: note: previous declaration of
‘__kernel_old_uid_t’ was here
./libgcore/gcore_defs.h:491: error: conflicting types for ‘__kernel_old_gid_t’
/usr/include/asm/posix_types.h:29: note: previous declaration of
‘__kernel_old_gid_t’ was here
make[3]: [gcore.so] Error 1 (ignored)
#
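As one possible way to keep the error output short, the architecture-specific
code could sit behind a preprocessor guard keyed on the -DX86_64 style defines
seen in the build lines above (X86 is assumed to follow the same pattern).
This is only a sketch: GCORE_ARCH_SUPPORTED and gcore_arch_check() are
hypothetical names, not taken from the posted sources, and a guard like this
would complement rather than replace a check in gcore.mk itself.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical guard: 1 only when the build passed -DX86 or -DX86_64.
 * Architecture-specific headers and code would be compiled only when
 * this is 1, so other targets never see the conflicting definitions.
 */
#if defined(X86) || defined(X86_64)
#define GCORE_ARCH_SUPPORTED 1
#else
#define GCORE_ARCH_SUPPORTED 0
#endif

/*
 * Returns 0 on a supported architecture. Otherwise writes a one-line
 * explanation into msg and returns -1, so the caller can print it and
 * decline to register the command.
 */
int gcore_arch_check(char *msg, size_t len)
{
    assert(msg != NULL && len > 0);

    msg[0] = '\0';
    if (GCORE_ARCH_SUPPORTED)
        return 0;

    snprintf(msg, len,
             "gcore: this architecture is not supported (X86 and X86_64 only)");
    return -1;
}
```

Built with -DPPC64, as in the log above, gcore_arch_check() would return -1 and
leave a single message for the caller to print.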
Your documentation implies that the command would only work on
certain kernel versions:
Compared with the previous version, this release:
- supports more kernel versions, and
- collects register values more accurately (but still not perfect).
Support Range
=============
|----------------+----------------------------------------------|
| ARCH           | X86, X86_64                                  |
|----------------+----------------------------------------------|
| Kernel Version | RHEL4.8, RHEL5.5, RHEL6.0 and Vanilla 2.6.36 |
|----------------+----------------------------------------------|
But, for example, on a 2.6.34-2.fc14 kernel (presumably unsupported),
it seems to work OK on some tasks, but on others it doesn't work so well.
Here, the "less" command can be dumped OK:
crash> sys | grep RELEASE
RELEASE: 2.6.34-2.fc14.x86_64
crash> ps
... [ cut ] ...
2080 1490 0 ffff880079ed2480 RU 7.6 289900 159684 crash
2084 1 0 ffff880077a7a480 IN 0.1 248592 1936 rsyslogd
2090 2080 5 ffff880079ed4900 IN 0.0 105432 828 less
crash> gcore -v0 2090
Saved core.2090.less
crash>
But with the same (full) 2.6.34-2.fc14 dumpfile, it can't seem to handle
dumping the crash utility itself, and just hangs:
crash> swap
FILENAME TYPE SIZE USED PCT PRIORITY
/dev/dm-1 PARTITION 18579452k 0k 0% -1
crash> ps
... [ cut ] ...
2080 1490 0 ffff880079ed2480 RU 7.6 289900 159684 crash
2084 1 0 ffff880077a7a480 IN 0.1 248592 1936 rsyslogd
2090 2080 5 ffff880079ed4900 IN 0.0 105432 828 less
crash> gcore -v1 2080
gcore: Restoring the thread group ...
gcore: done.
gcore: Retrieving note information ...
< hangs forever >
...
I would have thought that it would either work-for-all or work-for-none
with respect to a particular kernel version?
In any case, if it's going to fail, perhaps there should be some mechanism
in place that would prevent it from hanging, and instead print a message
that the kernel version is not supported? Or if a particular data structure
is different than the "supported" versions, it should fail immediately?
Just a thought...
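A minimal sketch of the first suggestion: refuse early, with a message, when
the RELEASE string is outside the supported range. The function name
gcore_release_supported() is hypothetical, and the base versions used for the
RHEL releases (2.6.9, 2.6.18, 2.6.32) are an assumption about how the support
table maps to version numbers. Matching on the base version alone is also
cruder than probing the data structures themselves.

```c
#include <assert.h>
#include <stdio.h>

struct kver {
    int major, minor, patch;
};

/*
 * Assumed base versions for the "Support Range" table:
 * RHEL4.8 -> 2.6.9, RHEL5.5 -> 2.6.18, RHEL6.0 -> 2.6.32, vanilla 2.6.36.
 */
static const struct kver gcore_supported[] = {
    { 2, 6, 9 }, { 2, 6, 18 }, { 2, 6, 32 }, { 2, 6, 36 },
};

/*
 * Returns 1 if the leading "major.minor.patch" of a release string such as
 * "2.6.34-2.fc14.x86_64" matches a supported base version, 0 otherwise
 * (including when the string cannot be parsed).
 */
int gcore_release_supported(const char *release)
{
    struct kver v;
    size_t i;

    assert(release != NULL);

    if (sscanf(release, "%d.%d.%d", &v.major, &v.minor, &v.patch) != 3)
        return 0;

    for (i = 0; i < sizeof(gcore_supported) / sizeof(gcore_supported[0]); i++) {
        if (v.major == gcore_supported[i].major &&
            v.minor == gcore_supported[i].minor &&
            v.patch == gcore_supported[i].patch)
            return 1;
    }
    return 0;
}
```

With this check the 2.6.34-2.fc14.x86_64 dumpfile above would be rejected
up front instead of being attempted.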
Also I note that "gcore -v7" fails -- shouldn't it be accepted as an
argument?
crash> gcore -v7 2080
gcore: invalid vlevel: 7.
crash>
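Since the help message documents vlevel 7 as the combination of all three
message classes, the check presumably wants to treat vlevel as a bitmask. A
sketch of such a check follows, with hypothetical names (gcore_parse_vlevel,
VERBOSE_*) and assuming the bit values 1, 2 and 4 from the help table. Note
that it would also accept combinations such as 3 or 5, which the table does
not list.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Bit values assumed from the -v table in the help message. */
#define VERBOSE_PROGRESS   0x1UL
#define VERBOSE_LIBERR     0x2UL   /* "library error" column */
#define VERBOSE_PAGEFAULT  0x4UL
#define VERBOSE_ALL (VERBOSE_PROGRESS | VERBOSE_LIBERR | VERBOSE_PAGEFAULT)

/*
 * Parse the argument of -v as a decimal bitmask.
 * Returns 0 and stores the value in *vlevel when only known bits are set;
 * returns -1 for non-numeric input or unknown bits.
 */
int gcore_parse_vlevel(const char *arg, unsigned long *vlevel)
{
    char *end;
    unsigned long v;

    assert(arg != NULL && vlevel != NULL);

    errno = 0;
    v = strtoul(arg, &end, 10);
    if (errno != 0 || end == arg || *end != '\0')
        return -1;          /* empty, trailing junk, or out of range */
    if (v & ~VERBOSE_ALL)
        return -1;          /* a bit outside 1|2|4 is set */

    *vlevel = v;
    return 0;
}
```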
Thanks,
Dave
TODO
====
I still have remaining tasks to do:
- Improvement on register collection for active tasks
- Improvement on callee-saved register collection on x86_64
- Support core dump for tasks running in x86_32 compatibility mode
Usage
=====
1) Expand the source files under the extensions directory.
Arrange the attached source files as shown below:
./extensions/gcore.c
./extensions/gcore.mk
./extensions/libgcore/gcore_coredump.c
./extensions/libgcore/gcore_coredump_table.c
./extensions/libgcore/gcore_defs.h
./extensions/libgcore/gcore_dumpfilter.c
./extensions/libgcore/gcore_global_data.c
./extensions/libgcore/gcore_regset.c
./extensions/libgcore/gcore_verbose.c
./extensions/libgcore/gcore_x86.c
2) Type ``make extensions''; ``gcore.so'' is then generated under the
extensions directory.
3) Type ``extend gcore.so'' to load the gcore extension module.
See the help message for actual usage; I attach it at the end of
this mail.
4) Type ``extend -u gcore.so'' to unload the gcore extension module.
Help Message
============
NAME
gcore - retrieve a process image as a core dump
SYNOPSIS
gcore
gcore [-v vlevel] [-f filter] [pid | taskp]*
This command retrieves a process image as a core dump.
DESCRIPTION
-v Display verbose information according to vlevel:
     progress  library error  page fault
     ---------------------------------------
  0
  1     x
  2                 x
  4                               x  (default)
  7     x           x             x
-f Specify the kinds of memory to be written into core dumps, as a
bitwise OR of the filter flags:
      AP  AS  FP  FS  ELF HP  HS
      ------------------------------
    0
    1 x
    2     x
    4         x
    8             x
   16         x       x
   32                     x
   64                         x
  127 x   x   x   x   x   x   x
AP Anonymous Private Memory
AS Anonymous Shared Memory
FP File-Backed Private Memory
FS File-Backed Shared Memory
ELF ELF header pages in file-backed private memory areas
HP Hugetlb Private Memory
HS Hugetlb Shared Memory
If no pid or taskp is specified, gcore tries to retrieve the process image
of the current task context.
The file name of a generated core dump is core.<pid>, where pid is the PID
of the specified process.
For a multi-threaded process, gcore generates a core dump containing
information for all threads, which is similar to the behaviour of the ELF
core dumper in the Linux kernel.
Note that PID means different things in crash and in Linux: the ps command
in the crash utility displays the LWP, while the ps command in Linux
displays the thread group ID, that is, the PID of the thread group leader.
gcore provides a core dump filtering facility that lets users select which
kinds of memory maps are included in the resulting core dump. There are
7 kinds of memory maps in total, and you can set this up with the set command.
For more detailed information, please see the help command message.
EXAMPLES
Specify the process you want to retrieve as a core dump. Here assume the
process with PID 12345.
crash> gcore 12345
Saved core.12345
crash>
Next, specify by TASK. Here assume the process located at the address
f9d78000 with PID 32323.
crash> gcore f9d78000
Saved core.32323
crash>
If multiple arguments are given, gcore dumps the processes in the order
the arguments are given.
crash> gcore 5217 ffff880136d72040 23299 24459 ffff880136420040
Saved core.5217
Saved core.1130
Saved core.1130
Saved core.24459
Saved core.30102
crash>
If no argument is given, gcore tries to retrieve the process of the current
task context.
crash> set
PID: 54321
COMMAND: "bash"
TASK: e0000040f80c0000
CPU: 0
STATE: TASK_INTERRUPTIBLE
crash> gcore
Saved core.54321
When a multi-threaded process is specified, the generated core file name has
the thread leader's PID; here it is assumed to be 12340.
crash> gcore 12345
Saved core.12340
Specifying the same option twice is not allowed.
crash> gcore -v 1 1234 -v 1
Usage: gcore
gcore [-v vlevel] [-f filter] [pid | taskp]*
gcore -d
Enter "help gcore" for details.
The -v and -f options may be specified in any order.
crash> gcore -v 2 5201 -f 21 ffff880126ff9520 5205
Saved core.5174
Saved core.5217
Saved core.5167
crash> gcore 5201 ffff880126ff9520 -f 21 5205 -v 2
Saved core.5174
Saved core.5217
Saved core.5167
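To illustrate how the -f values in the help message above combine, here is a
small sketch that expands a filter value into the abbreviations used in the
table. It assumes each kind occupies one bit in the order listed (AP = 1 up to
HS = 64); gcore_filter_describe() is a hypothetical helper, not part of the
posted sources.

```c
#include <assert.h>
#include <string.h>

#define GCORE_FILTER_NBITS 7
#define GCORE_FILTER_ALL   ((1UL << GCORE_FILTER_NBITS) - 1)   /* 127 */

/* Abbreviations in bit order, as assumed from the help table. */
static const char *const gcore_filter_names[GCORE_FILTER_NBITS] = {
    "AP", "AS", "FP", "FS", "ELF", "HP", "HS"
};

/*
 * Write the space-separated names of the kinds selected by filter into buf.
 * Returns the number of kinds selected, or -1 if filter has bits outside
 * 0..127 or buf is too small.
 */
int gcore_filter_describe(unsigned long filter, char *buf, size_t len)
{
    int i, n = 0;

    assert(buf != NULL && len > 0);

    buf[0] = '\0';
    if (filter & ~GCORE_FILTER_ALL)
        return -1;

    for (i = 0; i < GCORE_FILTER_NBITS; i++) {
        const char *name = gcore_filter_names[i];

        if (!(filter & (1UL << i)))
            continue;
        /* room for an optional separator, the name and the final NUL */
        if (strlen(buf) + strlen(name) + 2 > len)
            return -1;
        if (n > 0)
            strcat(buf, " ");
        strcat(buf, name);
        n++;
    }
    return n;
}
```

Under these assumptions, the -f 21 used in the last example is 16 + 4 + 1 and
would expand to "AP FP ELF".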
Signed-off-by: HATAYAMA Daisuke <d.hatayama(a)jp.fujitsu.com>
[Text File:gcore.c]
[Text File:gcore.mk]
[Text File:gcore_coredump.c]
[Text File:gcore_coredump_table.c]
[Text File:gcore_defs.h]
[Text File:gcore_dumpfilter.c]
[Text File:gcore_global_data.c]
[Text File:gcore_regset.c]
[Text File:gcore_verbose.c]
[Text File:gcore_x86.c]