Re: [Crash-utility] gdb on KDUMP files
by Pete Delaney
> Nowadays it is enough to pass this option to configure:
> --enable-64-bit-bfd
I tried
configure --enable-64-bit-bfd --enable-largefile
And gdb still has problems accessing memory in the KDUMP that the crash-utility can read.
For example, crash can walk the task list, but when the gdb macro tries
to access the memory of the second task, gdb says it can't access memory.
-piet
--
Pete/Piet Delaney
O: +1 408 935-1813
C: +1 408 646-8557
H: +1 408 243-8872
Home Email: piet.delaney(a)gmail.com
-----Original Message-----
From: gdb-owner(a)sourceware.org [mailto:gdb-owner@sourceware.org] On Behalf Of Jan Kratochvil
Sent: Friday, October 17, 2014 4:56 AM
To: Andreas Arnez
Cc: Discussion list for crash utility usage, maintenance and development; GDB Development
Subject: Re: gdb on KDUMP files
On Fri, 17 Oct 2014 13:24:01 +0200, Andreas Arnez wrote:
> > 4. Ability to use 64-bit files on 32-bit platforms (to handle PAE)
This was:
https://bugzilla.redhat.com/show_bug.cgi?id=457187
Nowadays it is enough to pass this option to configure:
--enable-64-bit-bfd
Additionally, Fedora carries this patch for Linux kernel support:
http://pkgs.fedoraproject.org/cgit/gdb.git/tree/gdb-6.5-bz203661-emit-rel...
discussed in the thread:
https://sourceware.org/ml/gdb/2006-08/msg00137.html
Jan
[ANNOUNCE] crash gcore command, version 1.3.0-rc2 is released
by HATAYAMA Daisuke
This is the release of crash gcore command, version 1.3.0-rc2.
Version 1.3.0 will add ARM64 support, including compat mode, and
PPC64 support. The purpose of this series of rc releases is to allow
verification by the maintainers of the other architectures. Please
send me your verification results as a reply to this mail.
The remaining changes are all bugfixes.
# The changes include those that appeared in v1.3.0-rc.
ChangeLog:
[new features]
- Add ARM64 support. In addition to a native ARM64 build, an x86_64
executable of the crash gcore command for ARM64 crash dumps can be
built with make target=ARM64, just like the crash utility.
(anderson(a)redhat.com)
- Add ARM64 compat mode support. This allows gcore to create
corefiles for tasks running in 32-bit compatible mode on ARM64.
(weishu(a)marvell.com)
- Add PPC64 support. This includes both big-endian and little-endian
formats.
(mtoman(a)redhat.com, anderson(a)redhat.com)
[bugfixes]
- Correct the read buffer size for NT_FPREGSET to sizeof(struct
user_i387_struct). So far we had incorrectly used sizeof(union
thread_xstate) as the read buffer size, which happened to be equal to
sizeof(struct user_i387_struct). However, the following patch
extended union thread_xstate, and sizeof(union thread_xstate) became
larger than sizeof(struct user_i387_struct):
commit e7d820a5e549b3eb6c3f9467507566565646a669
Author: Qiaowei Ren <qiaowei.ren(a)intel.com>
Date: Thu Dec 5 17:15:34 2013 +0800
x86, xsave: Support eager-only xsave features, add MPX support
Some features, like Intel MPX, work only if the kernel uses eagerfpu
model. So we should force eagerfpu on unless the user has explicitly
disabled it.
Add definitions for Intel MPX and add it to the supported list.
[ hpa: renamed XSTATE_FLEXIBLE to XSTATE_LAZY and added comments ]
Signed-off-by: Qiaowei Ren <qiaowei.ren(a)intel.com>
Link: http://lkml.kernel.org/r/9E0BE1322F2F2246BD820DA9FC397ADE014A6115@SHSMSX1...
Signed-off-by: H. Peter Anvin <hpa(a)linux.intel.com>
Without this fix, for vmcores whose kernel versions are v3.14 or
later, gcore results in a segmentation fault due to a buffer overrun
of NT_FPREGSET.
(d.hatayama(a)jp.fujitsu.com)
- Although ELF_DATA is defined in gcore_defs.h, ELFDATA2LSB is used
directly in elf{64,32}_fill_elf_header(). This has caused no problem
so far since the existing supported architectures are all
little-endian systems. Fix this to support PPC64, which can also use
the big-endian format.
(anderson(a)redhat.com)
- Fix a bug that the registers in the NT_PRSTATUS note information are
broken. The bug has been present since v1.2.2, when O(1) note
information collection was added. Without this fix, we can never get
reliable register values for failure analysis.
(weishu(a)marvell.com)
- Fix a bug that NT_386_IOPERM note information is not collected. So
far, ioperm_get() had always returned 1. As a result, NT_386_IOPERM
note information had never been included in a generated core
file even if it was available for a given task in a given crash
dump.
(d.hatayama(a)jp.fujitsu.com)
- Add new member offset initialization for struct
nsproxy::pid_ns_for_children. In upstream, the following patch
renamed struct nsproxy::pid_ns to struct
nsproxy::pid_ns_for_children.
$ git log -1 c2b1df2e
commit c2b1df2eb42978073ec27c99cc199d20ae48b849
Author: Andy Lutomirski <luto(a)amacapital.net>
Date: Thu Aug 22 11:39:16 2013 -0700
Rename nsproxy.pid_ns to nsproxy.pid_ns_for_children
nsproxy.pid_ns is *not* the task's pid namespace. The name
should clarify that.
This makes it more obvious that setns on a pid namespace is weird --
it won't change the pid namespace shown in procfs.
Signed-off-by: Andy Lutomirski <luto(a)amacapital.net>
Reviewed-by: "Eric W. Biederman" <ebiederm(a)xmission.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Without this fix, gcore exited abnormally during its initialization,
so a core file was never generated.
(d.hatayama(a)jp.fujitsu.com)
- Fix a wrong check of the return value of fopen(). fopen() returns
NULL in case of error, but gcore had treated it as returning a
negative integer. As a result, gcore continued execution after the
check even in case of error and then exited abnormally at the first
call of fwrite() with the invalid file pointer.
From the users' viewpoint, this bug appears when trying to overwrite an
existing corefile that requires more privileged permission, which
results in an EPERM failure.
(d.hatayama(a)jp.fujitsu.com)
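The fopen() check described in the last item can be sketched as follows.
This is only a minimal illustration and is not taken from the gcore
sources; the helper name open_corefile() and the error message text are
hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the gcore sources): open the output
 * core file for writing. fopen() reports failure by returning NULL and
 * setting errno; it never returns a negative value, so a "< 0" style
 * test cannot detect the error.
 */
static FILE *open_corefile(const char *path)
{
        FILE *fp = fopen(path, "w");

        if (fp == NULL) {
                fprintf(stderr, "gcore: cannot open %s: %s\n",
                        path, strerror(errno));
                return NULL;
        }
        return fp;
}
```

A caller would stop as soon as open_corefile() returns NULL, so fwrite()
is never reached with an invalid file pointer.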
MD5 CheckSum:
$ md5sum ./crash-gcore-command-1.3.0-rc2.tar.gz
07757d2ee044b19cac6b652de0d757fc ./crash-gcore-command-1.3.0-rc2.tar.gz
--
Thanks.
HATAYAMA, Daisuke
Regarding crash-gcore-command
by Subramanian Karunanithi
Hi,
I am using crash-gcore-command 1.2.0.
I am trying to cross-compile this tool for the PPC arch. However, it looks
like gcore_defs.h only supports x86, x86_64 and ARM.
Is there any plan to support this tool for PPC?
Regards,
Subramanian. K
Kernel dump file access library
by Petr Tesarik
Hi all,
during this year's SUSE HackWeek, my colleague started work on enabling
kernel core files in gdb. I realized that there would be at least four
different programs implementing read access to kernel dump files:
1. the crash utility
2. makedumpfile (when re-filtering)
3. kdumpid (my project to get kernel version from a dump file)
4. gdb-kdump (started by my colleague during HackWeek)
At this point, I felt that's too much re-inventing the wheel again and
again, so I took my current code from kdumpid and adapted it as a
library that can be used by everybody:
https://github.com/ptesarik/libkdumpfile
In its current shape, it's usable, but far from complete.
Things that work already:
- identify kdump file format
- parse meta-information from the header
- open ELF, diskdump, makedumpfile, LKCD
- read data by physical address (incl. Xen Dom0)
- read data by Xen machine address
Things still on my TODO list:
- more formats: sadump, kvmdump, libvirt, xc_core, xc_save
- determine phys_base in ELF files
- determine kernel release if not found in headers
Ideally, I would like to replace all current implementations with this
library, so if a new file format appears, or a new feature is added to
one of the files, it can be immediately used by all kdump-related tools.
Please let me know what you think.
Oh, and if you're developing such a tool, let me know which features
should be added.
Regards,
Petr Tesarik
Crash in crash
by Karlsson, Jan
Hi Dave
I have a vmcore file for ARM64 that crashes Crash during startup. The core file was created by a hardware watchdog (I believe), so there is no panic message or anything similar in the log.
This is the printout from Crash running under gdb, after the copyrights and config information:
please wait... (determining panic task)
Program received signal SIGSEGV, Segmentation fault.
0x000000000047ed40 in tgid_quick_search (tgid=5040) at memory.c:4114
4114 if (tgid == last->tgid) {
(gdb) bt
#0 0x000000000047ed40 in tgid_quick_search (tgid=5040) at memory.c:4114
#1 0x000000000047f046 in get_task_mem_usage (task=18446743799318107136, tm=0x7fffffff6f40)
at memory.c:4186
#2 0x000000000047c679 in vm_area_dump (task=18446743799318107136, flag=10, vaddr=0, ref=0x0)
at memory.c:3671
#3 0x000000000047ec08 in in_user_stack (task=18446743799318107136, vaddr=0) at memory.c:4063
#4 0x00000000004fd9fe in arm64_get_dumpfile_stackframe (frame=<synthetic pointer>,
bt=<optimized out>) at arm64.c:1077
#5 arm64_get_stack_frame (bt=0x7fffffffc690, pcp=0x7fffffff9560, spp=0x7fffffff9568)
at arm64.c:1103
#6 0x00000000004de409 in back_trace (bt=0x7fffffffc690) at kernel.c:2533
#7 0x00000000004d1563 in foreach (fd=0x7fffffffc7c0) at task.c:6161
#8 0x00000000004d2bbd in panic_search () at task.c:6425
#9 0x00000000004d4454 in get_panic_context () at task.c:5364
#10 task_init () at task.c:491
#11 0x000000000046146e in main_loop () at main.c:801
#12 0x00000000006467a3 in captured_command_loop (data=<optimized out>) at main.c:258
#13 0x000000000064535b in catch_errors (func=0x646790 <captured_command_loop>, func_args=0x0,
errstring=0x873235 "", mask=6) at exceptions.c:557
#14 0x0000000000647726 in captured_main (data=<optimized out>) at main.c:1064
#15 0x000000000064535b in catch_errors (func=0x646aa0 <captured_main>, func_args=0x7fffffffe030,
errstring=0x873235 "", mask=6) at exceptions.c:557
#16 0x0000000000647a84 in gdb_main (args=<optimized out>) at main.c:1079
#17 0x0000000000647abe in gdb_main_entry (argc=<optimized out>, argv=<optimized out>)
at main.c:1099
#18 0x000000000045f61f in main (argc=3, argv=0x7fffffffe188) at main.c:758
(gdb) p tt->last_tgid
$1 = (struct tgid_context *) 0x0
Source code for tgid_quick_search:
static struct tgid_context *
tgid_quick_search(ulong tgid)
{
        struct tgid_context *last, *next;

        tt->tgid_searches++;

        last = tt->last_tgid;
        if (tgid == last->tgid) {
                tt->tgid_cache_hits++;
                return last;
        }
        ....
}
So 'last' is NULL, and dereferencing it in the comparison causes the crash.
After some more investigation I have seen that "tt->last_tgid" is initialized in function sort_tgid_array in task.c, but that function seems to be called at a later stage.
By adding a line in tgid_quick_search:
static struct tgid_context *
tgid_quick_search(ulong tgid)
{
        struct tgid_context *last, *next;

        tt->tgid_searches++;

        if (tt->last_tgid == 0) sort_tgid_array(); // added line

        last = tt->last_tgid;
        if (tgid == last->tgid) {
                tt->tgid_cache_hits++;
                return last;
        }
        ...
I can run Crash on this core file. However, I do not know if this is the best way to fix the problem.
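For comparison, a different way to guard the cached pointer can be
sketched as a small standalone model. This is not crash's code: the
struct layouts, the names task_table_model and tgid_lookup_model(), and
the binary search are assumptions made for illustration. The cache is
consulted only once it has been primed, and an unprimed or empty table
yields NULL instead of a NULL dereference.

```c
#include <stddef.h>

/* Simplified stand-ins for crash's structures; the layout is assumed. */
struct tgid_context {
        unsigned long tgid;
        unsigned long task;
};

struct task_table_model {
        struct tgid_context *tgid_array;  /* sorted by tgid, may be NULL */
        size_t tgid_count;
        struct tgid_context *last_tgid;   /* NULL until the cache is primed */
        unsigned long tgid_searches;
        unsigned long tgid_cache_hits;
};

static struct tgid_context *
tgid_lookup_model(struct task_table_model *tt, unsigned long tgid)
{
        size_t lo = 0, hi = tt->tgid_count;

        tt->tgid_searches++;

        /* Only consult the cache once it has been initialized. */
        if (tt->last_tgid != NULL && tt->last_tgid->tgid == tgid) {
                tt->tgid_cache_hits++;
                return tt->last_tgid;
        }

        /* Binary search over the sorted array; skipped when it is empty. */
        while (lo < hi) {
                size_t mid = lo + (hi - lo) / 2;

                if (tt->tgid_array[mid].tgid == tgid) {
                        tt->last_tgid = &tt->tgid_array[mid];
                        return tt->last_tgid;
                }
                if (tt->tgid_array[mid].tgid < tgid)
                        lo = mid + 1;
                else
                        hi = mid;
        }
        return NULL;
}
```

With this shape the callers have to cope with a NULL result, so whether
it fits better than calling sort_tgid_array() early depends on what the
callers in memory.c expect.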
Jan
Jan Karlsson
Senior Software Engineer
System Assurance
Sony Mobile Communications
Tel: +46 703 062 174
jan.karlsson(a)sonymobile.com
sonymobile.com
Re: [Crash-utility] gdb on KDUMP files
by Pete Delaney
I'm glad to see this discussion today....
> Nowadays it is enough to pass this option to configure:
> --enable-64-bit-bfd
I'll give it a try. I passed O_LARGEFILE to the gdb configure but
I didn't know about this option. With everything going 64-bit these
days, why isn't it the default? I'm running gdb on a 64-bit machine
and having trouble reading 64-bit core files. It seems like this should
work correctly without any additional configure options.
About 8 years ago I could read a 32-bit KDUMP with gdb
and, as I recall, each CPU looked like a thread, just like kgdb
displayed CPUs as threads. I also think embedded JTAG setups
should do the same.
Are you implying that with:
--enable-64-bit-bfd
I should be able to do that now on a 64-bit machine looking
at 64-bit core dumps and see the back trace for the current
CPUs at the time of the KDUMP?
I found the macros in Documentation/kdump/gdbmacros.txt out of date
and had to fix them to work. :(
-piet
On Fri, 17 Oct 2014 13:24:01 +0200, Andreas Arnez wrote:
> > 4. Ability to use 64-bit files on 32-bit platforms (to handle PAE)
This was:
https://bugzilla.redhat.com/show_bug.cgi?id=457187
Nowadays it is enough to pass this option to configure:
--enable-64-bit-bfd
Additionally, Fedora carries this patch for Linux kernel support:
http://pkgs.fedoraproject.org/cgit/gdb.git/tree/gdb-6.5-bz203661-emit-rel...
discussed in the thread:
https://sourceware.org/ml/gdb/2006-08/msg00137.html
Jan
gdb on KDUMP files
by Pete Delaney
Hi:
Six years ago Dave and I were discussing using gdb on KDUMP files:
http://www.redhat.com/archives/crash-utility/2008-March/msg00039.html
At the time you weren't sure if gdb could read 64-bit ELF headers.
I'm trying to look at KDUMP files with gdb and seeing similar problems.
I configured and built gdb with --enable-largefile but it didn't help.
I also tried uncommenting _LARGE_FILE in gdb/config.h, though I doubt that's correct.
I'm wondering what's been done on this in the last six years. The kernel I'm
running is only using 4GB but is running in 64-bit mode.
Looking at the kexec source, it seems kexec used to support:
# KEXEC_ARGS="--elf32-core-headers"
This was used to force the dump to be ELF32 just so gdb could read the core file.
It appears that this support was dropped for 64-bit machines by Vivek Goyal,
who seems to have been concerned for the health of the crash utility:
http://lse.sourceforge.net/kdump/patches/1.101-kdump10/broken-out/x86_64-...
The Linux kernel Documentation/kdump/kdump.txt was last
updated on July 4th of this year and clearly says that gdb can
read KDUMP files, but says the crash-dumping kernel should
be started with the --elf32-core-headers kernel option:
Analysis
========

Before analyzing the dump image, you should reboot into a stable kernel.

You can do limited analysis using GDB on the dump file copied out of
/proc/vmcore. Use the debug vmlinux built with -g and run the following
command:

    gdb vmlinux <dump-file>

Stack trace for the task on processor 0, register display, and memory
display work fine.

Note: GDB cannot analyze core files generated in ELF64 format for x86.
On systems with a maximum of 4GB of memory, you can generate
ELF32-format headers using the --elf32-core-headers kernel option on the
dump kernel.

But I can't find the string elf32-core-headers in the kernel source code.
Looking at the gdb Bugzilla page:
https://sourceware.org/bugzilla/buglist.cgi?quicksearch=elf64
Reading a few bug reports seems to indicate that gdb supports 64-bit ELF.
I'm just trying to get the normal stack back-trace to work,
with formal and local variables, from a crash dump, as well
as mapping the normal kernel memory so I can follow lists like
the one for the system tasks.
I can read init_task with both gdb and crash but can only
follow the list with crash.
Does anyone know what's going on?
[PATCH] crash-gcore-command extension module: PPC64 support
by Dave Anderson
Hello Daisuke,
Attached is a patch that introduces support for the PPC64 architecture.
The patch was written by Michal Toman (mtoman(a)redhat.com). It is based
upon crash-gcore-command-1.3.0-rc.
The patch supports both big-endian and little-endian formats. However,
it does require the ELF_DATA fix to elf64_fill_elf_header() that I reported
yesterday. I have attached a separate patch to fix elf64_fill_elf_header
and elf32_fill_elf_header().
Please include these two patches in crash-gcore-command-1.3.0.
Thanks,
Dave
FW: Number of cpus on ARM
by Karlsson, Jan
Hi
Unfortunately, I found another, older example where my patch below did not work.
In that one only cpu 0 was online but cpus 0,1,2,3 were active. So maybe:
return MAX(get_cpus_active(), get_highest_cpu_online()+1);
might work better. Someone with better knowledge about this than I have should look at the problem.
Jan
From: Karlsson, Jan
Sent: den 15 oktober 2014 10:49
To: Discussion list for crash utility usage, maintenance and development
Subject:
Hi
I have seen a problem with the number of cpus reported for ARM (32-bit).
static int
arm_get_smp_cpus(void)
{
        return MAX(get_cpus_active(), get_cpus_online());
}
In one of my examples, "help -k" gives me:
cpu_possible_map: 0 1 2 3
cpu_present_map: 0 1 2 3
cpu_online_map: 0 3
cpu_active_map: 3
So the number of cpus will become 2. However, there is code in a number of places that will then only accept cpu 0 and 1 as cpus to handle.
When I changed the code to be the same as for ARM64, things worked as expected:
static int
arm_get_smp_cpus(void)
{
        return MAX(get_cpus_online(), get_highest_cpu_online()+1);
}
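The difference between the two variants can be checked with a small
standalone model of the maps shown above. This is only a sketch: the
bitmask representation and the helper names count_cpus(), highest_cpu(),
smp_cpus_by_count() and smp_cpus_by_highest() are assumptions for
illustration and are not crash functions.

```c
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Count the CPUs set in a bitmask (bit n set means CPU n is in the map). */
static int count_cpus(unsigned long mask)
{
        int n = 0;

        for (; mask != 0; mask >>= 1)
                n += (int)(mask & 1UL);
        return n;
}

/* Index of the highest CPU set in the mask, or -1 if the mask is empty. */
static int highest_cpu(unsigned long mask)
{
        int idx = -1;
        int i;

        for (i = 0; mask != 0; mask >>= 1, i++)
                if (mask & 1UL)
                        idx = i;
        return idx;
}

/* Models the original ARM logic: only the sizes of the maps are used. */
static int smp_cpus_by_count(unsigned long online, unsigned long active)
{
        return MAX(count_cpus(active), count_cpus(online));
}

/* Models the ARM64-style logic: the highest online cpu index also counts. */
static int smp_cpus_by_highest(unsigned long online)
{
        return MAX(count_cpus(online), highest_cpu(online) + 1);
}
```

For cpu_online_map "0 3" (mask 0x9) and cpu_active_map "3" (mask 0x8),
smp_cpus_by_count(0x9, 0x8) gives 2 while smp_cpus_by_highest(0x9) gives
4, so cpu 3 is no longer out of range.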
Jan