Dave,
Thanks for your explanation.
The reason behind my questions is that we have an application running at a customer
site, and the application consumes around 60GB of system memory.
When this process receives a segmentation fault or an abort signal, the kernel starts
to write the process core dump. Here is the problem: the kernel takes at least one hour
(60 minutes) to finish the core dump. During this time the system is unresponsive
(hung), and I suspect this is because the system starts thrashing due to the huge memory
usage of the process. Such a long downtime is not acceptable to the customer.
So I started looking for a better way of tackling the problem.
1>The first thing we considered was changing the system page size from 4KB to 8KB. This
change could not be made on our x86_64 architecture, since x86_64 doesn't
support a multiple-page-size option.
2>We wrote a program using the libbfd APIs and used it within our application. Whenever
the process receives SIGSEGV or SIGABRT, it logs the stack trace of all the
threads within that process. This feature is not as effective or flexible as a
process core dump.
3>Next we thought of using kcore/vmcore to analyze the cause of the SIGSEGV or SIGABRT.
4>I have one more thought: making the elf_core_dump() function SMP-aware. This function
is responsible for dumping the core, and it is defined in
/usr/src/linux/fs/binfmt_elf.c
Any comments/ideas are welcome.
--Regards,
rajesh
Rajesh,
Castor's patch/suggestion is the best/only option you have
for this kind of thing. I've not tried it, but since the
crash utility's "vm -p" option delineates where each
instantiated page of a given task is located, it's potentially
possible to recreate an ELF core file of the specified
task. (Any swapped-out pages won't be in the vmcore...)
The embedded gdb module inside of crash is invoked internally
as "gdb vmlinux", and has no clue about any other user-space
program.
That being said, you can execute the gdb "add-symbol-file"
command to load the debuginfo data from a user space
program, and then examine user-space data from the context
of that program.
For example, when you run the crash utility on a live system,
the default context is that of the "crash" utility itself:
$ ./crash
crash 4.0-4.6
Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
KERNEL: /boot/vmlinux-2.4.21-37.ELsmp
DEBUGINFO: /usr/lib/debug/boot/vmlinux-2.4.21-37.ELsmp.debug
DUMPFILE: /dev/mem
CPUS: 2
DATE: Tue Sep 4 16:36:53 2007
UPTIME: 15 days, 08:15:06
LOAD AVERAGE: 0.14, 0.06, 0.01
TASKS: 87
NODENAME: crash.boston.redhat.com
RELEASE: 2.4.21-37.ELsmp
VERSION: #1 SMP Wed Sep 7 13:28:55 EDT 2005
MACHINE: i686 (1993 Mhz)
MEMORY: 511.5 MB
PID: 9381
COMMAND: "crash"
TASK: dd63c000
CPU: 1
STATE: TASK_RUNNING (ACTIVE)
crash>
Verify the current context:
crash> set
PID: 9381
COMMAND: "crash"
TASK: dd63c000
CPU: 0
STATE: TASK_RUNNING (ACTIVE)
crash>
So, for example, the crash utility has a program_context
data structure that starts like this:
struct program_context {
char *program_name; /* this program's name */
char *program_path; /* unadulterated argv[0] */
char *program_version; /* this program's version */
char *gdb_version; /* embedded gdb version */
char *prompt; /* this program's prompt */
unsigned long long flags; /* flags from above */
char *namelist; /* linux namelist */
...
And it declares a data variable with the same name:
struct program_context program_context = { 0 };
If I want to see a gdb-style dump of its contents, I can
do this:
crash> add-symbol-file ./crash
add symbol table from file "./crash" at
Reading symbols from ./crash...done.
crash>
Now the embedded gdb has the debuginfo data from the crash
object file (which was compiled with -g), and it knows where
the program_context structure is located in user space:
crash> p &program_context
$1 = (struct program_context *) 0x8391ea0
crash>
Since 0x8391ea0 is not a kernel address, the "p" command cannot
be used to display the data structure. However, the crash
utility's "struct" command has a little-used "-u" option, which
indicates that the address that follows is a user-space address
from the current context:
crash> struct program_context -u 0x8391ea0
struct program_context {
program_name = 0xbffff9b0 "crash",
program_path = 0xbffff9ae "./crash",
program_version = 0x82e9c12 "4.0-4.6",
gdb_version = 0x834ecdf "6.1",
prompt = 0x8400438 "crash> ",
flags = 844424965983303,
namelist = 0x83f5940 "/boot/vmlinux-2.4.21-37.ELsmp",
...
All that being said, this capability cannot be used to generate
any kind of user-space backtrace. You can do raw reads of the
user-space stack, say from the point at which it entered kernel
space, but whether that's of any help depends upon what you're
looking for.
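As a loose analogy for such a raw read outside of crash, a user-space
address of a live process can also be read through /proc/<pid>/mem. The
sketch below is an illustration only, not how crash implements "-u";
read_user_memory is a hypothetical helper name, and reading a process
other than your own requires ptrace-level permission:

```c
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy len bytes starting at user-space virtual
 * address addr of process pid into buf.
 * Returns 0 on success, -1 on any failure or short read. */
int read_user_memory(pid_t pid, uintptr_t addr, void *buf, size_t len)
{
    char path[64];
    ssize_t n;
    int fd;

    snprintf(path, sizeof(path), "/proc/%ld/mem", (long)pid);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* The file offset is interpreted as a virtual address in the
     * target process's address space. */
    n = pread(fd, buf, len, (off_t)addr);
    close(fd);

    return (n == (ssize_t)len) ? 0 : -1;
}
```

The address passed in plays the same role as the 0x8391ea0 value given to
"struct -u" above: it is only meaningful in the context of one particular
process.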
Dave