Re: [Crash-utility] Cell/B.E. SPU commands extension
by Lucio Correia
> Lucio Correia wrote:
> > Hi,
> >
> > I've developed this crash extension to analyze SPU specific data for
> > Cell/B.E. processor. This extension makes use of some important data
> > saved by this kernel patch (that is not mainline yet)
> > http://ozlabs.org/pipermail/cbe-oss-dev/2007-May/001848.html during the
> > crash dump.
> >
> > I would like to check if there is any issue with the code.
> >
>
> Functionally it looks fine.
>
> I changed the _init() function to just do an error(INFO, ...)
> so I could load the extensions, and the only suggestion
> I can make is purely aesthetic, which would be to make
> the "help" messages 80 characters or less like the regular
> commands are. In other words, the "DESCRIPTION" section
> outputs, and the sentences in the "spuctx" EXAMPLE
> section are kind of ugly the way that they run on with
> no linefeeds.
>
> But like I said before, I don't see any issues/problems
> with the code -- pretty nifty extension...
>
> Dave
>
Thanks for the comments, Dave. I'm correcting these issues.
Regards,
--
Lucio Correia
Software Engineer
IBM LTC Brazil
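The 80-column suggestion in the review above can be checked mechanically. The sketch below is a standalone helper, not part of crash or of the SPU extension. It assumes the help text is held in a NULL-terminated array of strings; the function name `max_help_width` is made up for this example. It returns the widest line found, so anything over 80 can be rewrapped.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the length of the longest line found in a NULL-terminated
 * array of help strings. An embedded '\n' starts a new line.
 */
size_t max_help_width(const char *const *help)
{
    size_t widest = 0;

    for (size_t i = 0; help[i] != NULL; i++) {
        const char *line = help[i];

        while (*line != '\0') {
            size_t len = strcspn(line, "\n");  /* up to newline or end */

            if (len > widest)
                widest = len;
            line += len;
            if (*line == '\n')
                line++;                        /* skip the newline itself */
        }
    }
    return widest;
}
```

A caller would compare the result against 80 and flag the help array if it is larger.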
17 years, 2 months
[RFC] Crash extension for SystemTap
by Satoru MORIYA
Hi,
Here is a crash extension (a shared object) that retrieves the trace
data of SystemTap scripts.
I'd like to use SystemTap to analyze what caused a kernel panic.
However, SystemTap's trace data currently can't be retrieved easily from a
dumped image, so I developed a crash extension that retrieves
the data recorded by SystemTap from the dumped image.
Here is a brief description of this extension. It supports the new
utt-based buffer as well as the bulk-mode buffer of the old systemtap module.
I have tested this extension on the following systems.
* FC6, i386, kernel-2.6.21, systemtap-0.5.14, crash-4.0-1.1
* FC6, i386, kernel-2.6.20, systemtap-0.5.13/14, crash-4.0-1.1
* RHEL5, i386, kernel-2.6.18-8.el5, systemtap-0.5.12, crash-4.0-3.14
Preparation
==============
(A) Build the shared-object(stplog.so).
1. Put Makefile and stplog.c into a directory ($DIR)
$ cd $DIR
2. Make the symbolic link to the crash source code directory
$ ln -s $WHERE_CRASH_PLACED crash
3. Build
$ make
(B) Make the crash dump which includes SystemTap trace data.
(*) If you are analyzing live system memory, skip this section.
1. Install kdump
If you use FC6, see the following URL.
http://fedoraproject.org/wiki/FC6KdumpKexecHowTo?highlight=%28kdump%29
2. Use SystemTap
$ stap foo.stp
3. Panic
$ echo c > /proc/sysrq-trigger
How to use
==============
1. start crash
$ crash vmlinux vmcore
(*) If you analyze the live system memory, you don't need "vmcore".
$ crash vmlinux
2. load the shared-object
crash> extend $(WHERE_OBJ_PLACED)/stplog.so
3. retrieve the data
crash> stplog -m <mod_name>
(*) <mod_name> is the name of the trace module from which to retrieve data.
4. The output files are written to a directory named <mod_name>.
Output
==============
The stplog command creates one file per relayfs channel buffer (that is, one per CPU).
It also removes padding bytes.
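The padding removal mentioned above can be pictured as follows. This is only a sketch under stated assumptions, not the code in stplog.c: it assumes a channel buffer made of equally sized sub-buffers, each ending in a known number of padding bytes. The names `strip_padding`, `padding`, and `subbuf_size` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the payload of each sub-buffer into 'out', skipping the padding
 * bytes at the end of each one. Returns the number of bytes written.
 * 'out' must have room for n_subbufs * subbuf_size bytes.
 *
 * Assumed (hypothetical) layout: sub-buffer i occupies
 * buf[i * subbuf_size .. (i + 1) * subbuf_size - 1], and its last
 * padding[i] bytes carry no data.
 */
size_t strip_padding(const unsigned char *buf, size_t subbuf_size,
                     size_t n_subbufs, const size_t *padding,
                     unsigned char *out)
{
    size_t written = 0;

    for (size_t i = 0; i < n_subbufs; i++) {
        if (padding[i] > subbuf_size)
            continue;                   /* inconsistent entry: skip it */

        size_t payload = subbuf_size - padding[i];

        memcpy(out + written, buf + i * subbuf_size, payload);
        written += payload;
    }
    return written;
}
```

Writing `out` to one file per channel would then give the per-CPU files described above.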
I believe this command is very useful for system administrators
if they monitor their systems with SystemTap.
Best Regards,
---
Satoru MORIYA
Linux Technology Center
Hitachi, Ltd., Systems Development Laboratory
E-mail: satoru.moriya.br(a)hitachi.com
17 years, 3 months
Re: [Crash-utility] Scripting infrastructure in crash
by Luc Chouinard
I would very much like to get the sial interpreter into the crash tool.
It's a matter of mating the apiops (defined in sial_api.h) to the gdb framework.
Things are a bit hectic right now, but I should be able to look at it in the next few weeks.
It's definitely something I'll look at in the short term.
Now, if anyone wants to start looking, please do. Keep me in the loop.
Luc
----- Original Message ----
From: Dave Anderson <anderson(a)redhat.com>
To: sachinp(a)in.ibm.com; "Discussion list for crash utility usage, maintenance and development" <crash-utility(a)redhat.com>
Sent: Wednesday, May 2, 2007 9:44:20 AM
Subject: Re: [Crash-utility] Scripting infrastructure in crash
"Sachin P. Sant" wrote:
> Dave, i came across one of the Crash TODO list items about having
> a scripting infrastructure in crash.
>
> I was trying to evaluate the Alicia utility [mentioned in todo list].
> Here are some of my observations about Alicia.
>
> 1] It is a wrapper on top of crash.
> 2] One can write perl based scripts to extract information from dumps.
> 3] It has a nice report generation functionality which presents
> the data in text as well as html format.
> 4] Provides functions which could be used to read data from crash
> dumps.
> 5] It is easy to use and effective too.
>
> Attached here is a sample script which i tried using Alicia to
> display block and character devices. [ I know dev command already
> does this stuff .. but for the sake of trying out the Alicia i chose
> to write such a script ].
>
> I also encountered a few problems while trying out Alicia.
> 1] On the PPC64 arch I came across a data type overflow problem while
> executing the attached script.
> 2] On the s390/s390x architectures, the class function provided by Alicia
> seems broken.
> 3] Alicia is a wrapper on top of crash.
>
> Other dump solutions [ lkcd ] have sial scripting, which is C-like
> and very effective. Not sure how difficult it will be to implement
> something like sial in crash.
>
> Do you have any plans of having scripting infrastructure in
> crash ? If yes your thoughts on Alicia / sial / < any other stuff >
>
I personally do not.
But, a few days ago, Luc Chouinard started looking into what it
would take to support sial scripting.
Dave
--
Crash-utility mailing list
Crash-utility(a)redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility
17 years, 4 months
Broken vtop on kernel 2.6.20?
by Alex Sidorenko
Hi Dave,
When I try to use 'vtop' for process pages on a 2.6.20 kernel (Ubuntu/Feisty) on
the x86 architecture, I get error messages about the page table. The easiest way to
reproduce is to run 'ps -a' on a live kernel:
PID: 0 TASK: c03a2440 CPU: 0 COMMAND: "swapper"
ps: no user stack
PID: 0 TASK: df838560 CPU: 1 COMMAND: "swapper"
ps: no user stack
PID: 1 TASK: df838a90 CPU: 1 COMMAND: "init"
ps: read error: physical address: 7f2f0000 type: "page table"
Running crash with -d8:
PID: 1 TASK: df838a90 CPU: 1 COMMAND: "init"
<readmem: df838a90, KVADDR, "fill_task_struct", 1328, (ROE|Q), 8d2eac0>
<readmem: dfb71e40, KVADDR, "fill_mm_struct", 432, (ROE|Q), 8d8bf80>
GETBUF(128 -> 1)
FREEBUF(1)
GETBUF(128 -> 1)
FREEBUF(1)
GETBUF(128 -> 1)
FREEBUF(1)
GETBUF(128 -> 1)
FREEBUF(1)
arg_start: bf991ecf arg_end: bf991ee1 (18)
env_start: bf991ee1 env_end: bf991ff1 (272)
GETBUF(291 -> 1)
<readmem: dfb6f000, KVADDR, "pgd page", 4096, (FOE), 843cf90>
<readmem: dfb6f000, KVADDR, "pmd page", 4096, (FOE), 843cf90>
<readmem: 7f2f0000, PHYSADDR, "page table", 4096, (FOE), 843efa0>
ps: read error: physical address: 7f2f0000 type: "page table"
The same crash-4.0-4.1 works fine on a live 2.6.15 kernel. Did the page table
layout change between 2.6.15 and 2.6.20?
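For readers following the -d8 trace, the sketch below shows how a 32-bit x86 virtual address splits into table indexes. It assumes classic non-PAE two-level paging (10-bit directory index, 10-bit table index, 12-bit page offset); whether the kernel in this report uses that layout is an assumption. The struct and function names are made up for the example.

```c
#include <stdint.h>

/* Pieces of a 32-bit virtual address under non-PAE two-level paging. */
struct x86_vaddr_parts {
    uint32_t pgd_index;   /* bits 31..22: entry in the page directory */
    uint32_t pte_index;   /* bits 21..12: entry in the page table */
    uint32_t offset;      /* bits 11..0:  byte offset within the 4KB page */
};

struct x86_vaddr_parts x86_split_vaddr(uint32_t vaddr)
{
    struct x86_vaddr_parts p;

    p.pgd_index = vaddr >> 22;
    p.pte_index = (vaddr >> 12) & 0x3ff;
    p.offset    = vaddr & 0xfff;
    return p;
}
```

For the arg_start value bf991ecf from the trace, this gives directory index 0x2fe, table index 0x191, and offset 0xecf. The failing read is the third step, where the page table itself has to be fetched by physical address.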
Regards,
Alex
--
------------------------------------------------------------------
Alexandre Sidorenko email: alexs(a)hplinux.canada.hp.com
Global Solutions Engineering: Unix Networking
Hewlett-Packard (Canada)
------------------------------------------------------------------
17 years, 6 months
[PATCH] Add TARGET_CFLAGS to extension compilation
by Lucio Correia
There is a problem when compiling extensions on PPC64 and Cell
platforms. It occurs because TARGET_CFLAGS is not passed to the extension
compilation command, which causes the extensions to be compiled as
32-bit. This patch corrects the issue.
Signed-off-by: Lucio Correia <luciojhc(a)br.ibm.com>
diff -Nurp crash-4.0-4.1.orig/extensions/Makefile crash-4.0-4.1/extensions/Makefile
--- crash-4.0-4.1.orig/extensions/Makefile 2007-04-26 17:45:59.000000000 -0300
+++ crash-4.0-4.1/extensions/Makefile 2007-05-25 15:09:02.000000000 -0300
@@ -34,8 +34,9 @@ link_defs:
ln -s ../defs.h; fi
echo.so: ../defs.h echo.c
- gcc -nostartfiles -shared -rdynamic -o echo.so echo.c -fPIC -D$(TARGET)
+ gcc -nostartfiles -shared -rdynamic -o echo.so echo.c -fPIC \
+ -D$(TARGET) $(TARGET_CFLAGS)
dminfo.so: ../defs.h dminfo.c
- gcc -nostartfiles -shared -rdynamic -o dminfo.so dminfo.c -fPIC -D$(TARGET)
-
+ gcc -nostartfiles -shared -rdynamic -o dminfo.so dminfo.c -fPIC \
+ -D$(TARGET) $(TARGET_CFLAGS)
diff -Nurp crash-4.0-4.1.orig/Makefile crash-4.0-4.1/Makefile
--- crash-4.0-4.1.orig/Makefile 2007-04-26 17:45:59.000000000 -0300
+++ crash-4.0-4.1/Makefile 2007-05-25 14:53:22.000000000 -0300
@@ -543,4 +543,5 @@ extensions: make_configure
@make --no-print-directory do_extensions
do_extensions:
- @(cd extensions; make -i OBJECTS="$(EXTENSION_OBJECT_FILES)" TARGET=$(TARGET))
+ @(cd extensions; make -i OBJECTS="$(EXTENSION_OBJECT_FILES)" \
+ TARGET=$(TARGET) TARGET_CFLAGS=$(TARGET_CFLAGS))
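One quick way to confirm the effect of a change like this is to have an extension report the pointer width it was compiled with. The helper below is a generic sketch, not part of the patch, and `ext_pointer_bits` is a hypothetical name.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Width in bits of a pointer in this compilation unit. On the usual
 * targets this is 32 or 64, depending on the flags the compiler was
 * given (for example -m32 or -m64 with gcc).
 */
unsigned int ext_pointer_bits(void)
{
    return (unsigned int)(sizeof(void *) * CHAR_BIT);
}
```

If the value printed by a freshly built extension does not match the crash binary it is loaded into, the target flags did not reach the extension build.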
17 years, 6 months
2.6.22 breaks crash
by Troy Heber
There was a recent commit [1] to rename the "thread_info" member of the
task_struct to "stack":
- struct thread_info *thread_info;
+ void *stack;
To resolve it we simply need to change the hardcoded value for the offset
lookup in task.c:
--- task.c.ori 2007-05-24 09:54:43.000000000 -0600
+++ task.c 2007-05-24 10:41:50.000000000 -0600
@@ -161,7 +161,7 @@ task_init(void)
}
MEMBER_OFFSET_INIT(task_struct_thread_info, "task_struct",
- "thread_info");
+ "stack");
if (VALID_MEMBER(task_struct_thread_info)) {
MEMBER_OFFSET_INIT(thread_info_task, "thread_info", "task");
MEMBER_OFFSET_INIT(thread_info_cpu, "thread_info", "cpu");
However, I'm not sure what the best way is to keep backward compatibility with
kernels < 2.6.22.
Troy
[1] f7e4217b007d1f73e7e3cf10ba4fea4a608c603f
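One possible shape for the backward-compatibility question raised above is to try the old member name first and fall back to the new one. The sketch below is standalone and only illustrates that control flow: `lookup_member_offset`, `task_struct_stack_offset`, and the `member_info` table are stand-ins invented here for the debuginfo lookup, not crash functions.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a structure layout read from kernel debuginfo. */
struct member_info {
    const char *name;
    long offset;
};

/* Return the offset of 'name' in the table, or -1 if it is absent. */
long lookup_member_offset(const struct member_info *members, size_t n,
                          const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(members[i].name, name) == 0)
            return members[i].offset;
    return -1;
}

/*
 * Offset of the task_struct member in question: "thread_info" on
 * kernels before the rename, "stack" afterwards.
 * Returns -1 if neither name exists.
 */
long task_struct_stack_offset(const struct member_info *members, size_t n)
{
    long off = lookup_member_offset(members, n, "thread_info");

    if (off < 0)
        off = lookup_member_offset(members, n, "stack");
    return off;
}
```

Because the old name is tried first, dumps from kernels older than 2.6.22 would behave as before, and only kernels lacking "thread_info" take the new path.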
17 years, 6 months
Re: [Crash-utility] [RFC] Crash extension for SystemTap
by Dave Anderson
> Hi Satoru,
>
> I think you also meant to attach the stplog.c (and its own Makefile?)
> to your post?
>
> Is the format of the trace data used by systemtap always the same?
> I.e., is it always a kernel buffer filled with ASCII data?
>
> You mention that the command makes a file in a subdirectory of the
> running crash session. Wouldn't it be more flexible to dump the
> output to the terminal? And then if you want to save it to a file,
> just do something like this:
>
> crash> stplog -m mod_name > outputfile
In a subsequently-posted Red Hat bugzilla, it's been clarified that
the systemtap data is of an undetermined format, so terminal output
would not be advisable. So ignore that suggestion...
Also, my understanding is that this extension will be part of the
systemtap package, given that the manner of accessing the data
may change over time based upon the version of systemtap.
Thanks,
Dave
17 years, 6 months
dom0 analysis for IA64
by Itsuro ODA
Hi Dave,
The attached patch makes it possible to analyze dom0 Linux from a
whole-memory dump on IA64 (for crash-4.0-4.1).
It is just a quick hack.
(IA64 Xen developers asked me for it.)
On IA64, each domain manages its own machine memory through
domain.arch.mm.pgd, which is a 3-level page table.
I thought the mfn of domain.arch.mm.pgd could be regarded as
p2m_mfn.
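To illustrate what a lookup through a 3-level table involves, here is a toy walk. It is a sketch only: the index width, the in-memory representation, and all names (`p2m_lookup`, `LEVEL_BITS`, `NOT_PRESENT`, the table types) are assumptions made for the example and do not reflect the real IA64 Xen layout or the attached patch.

```c
#include <stddef.h>
#include <stdint.h>

#define LEVEL_BITS   4                     /* toy value: 16 entries per level */
#define LEVEL_SIZE   (1u << LEVEL_BITS)
#define LEVEL_MASK   (LEVEL_SIZE - 1)
#define NOT_PRESENT  UINT64_MAX

typedef struct { uint64_t mfn[LEVEL_SIZE]; }   pte_table;
typedef struct { pte_table *pte[LEVEL_SIZE]; } pmd_table;
typedef struct { pmd_table *pmd[LEVEL_SIZE]; } pgd_table;

/* Translate a pseudo-physical frame number to a machine frame number. */
uint64_t p2m_lookup(const pgd_table *pgd, uint64_t pfn)
{
    unsigned int i_pgd, i_pmd, i_pte;
    const pmd_table *pmd;
    const pte_table *pte;

    if (pfn >> (3 * LEVEL_BITS))
        return NOT_PRESENT;                /* beyond what three levels cover */

    i_pgd = (pfn >> (2 * LEVEL_BITS)) & LEVEL_MASK;
    i_pmd = (pfn >> LEVEL_BITS) & LEVEL_MASK;
    i_pte = pfn & LEVEL_MASK;

    pmd = pgd->pmd[i_pgd];
    if (pmd == NULL)
        return NOT_PRESENT;                /* middle level not populated */
    pte = pmd->pte[i_pmd];
    if (pte == NULL)
        return NOT_PRESENT;                /* leaf table not populated */
    return pte->mfn[i_pte];
}
```

In a dump-analysis setting each pointer dereference would instead be a read of one table page from the dump file at the machine address held in the entry above it.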
I tried to modify as little existing code as possible,
but this patch is a bit tricky, and the memory usage is
large if the machine memory layout is sparse.
(Maybe xen_kdump_p2m should be prepared for each arch?)
Would you consider supporting dom0 analysis for IA64?
I have prepared two sample dumps. Please find them at the following
URLs.
1) http://people.valinux.co.jp/~oda/20070510-sample-dump-1.tar
contents:
- vmcore.gz
This was taken by a hard assist dump. It is a netdump-style ELF vmcore,
so XEN_ELFNOTE_CRASH_INFO does not exist.
- vmcore.ka.gz
This was converted to kdump style, with XEN_ELFNOTE_CRASH_INFO added
manually.
- vmlinux.debug.gz
for dom0 analysis
- xen-syms-2.6.18-8.el5.gz
for xencrash
To get p2m_mfn, xencrash's doms command is useful.
--------------------------------------------------------------------------
# crash xen-syms-2.6.18-8.el5 vmcore
...
crash> doms
     DID DOMAIN           ST T MAXPAGE TOTPAGE VCPU SHARED_I         P2M_MFN
   32753 f000000007ac8080 RU O       0       0    0 0                ----
   32754 f000000007acc080 RU X       0       0    0 0                ----
>  32767 f000000007ff8080 RU I       0       0    4 0                ----
       0 f000000007aa4080 RU 0   10000    fc28    1 f000000007a88000 1abb7
>*     1 f000000007a78080 RU U   10603   10603    3 f000000007a5c000 1a909
crash>
----------------------------------------------------------------------------
Then start a normal crash session with the --p2m_mfn option.
----------------------------------------------------------------------------
# crash --p2m_mfn=1abb7 vmlinux.debug vmcore
...
----------------------------------------------------------------------------
vmcore.ka has XEN_ELFNOTE_CRASH_INFO, so the --p2m_mfn option is not needed.
----------------------------------------------------------------------------
# crash vmlinux.debug vmcore.ka
...
----------------------------------------------------------------------------
Currently the --p2m_mfn option takes effect only if the vmcore has
XEN_ELFNOTE_CRASH_INFO.
I think specifying the --p2m_mfn option should mean that the vmcore is
regarded as XEN_CORE_DUMPFILE(). The patch supports this.
I think it is necessary for dumps that do not have
XEN_ELFNOTE_CRASH_INFO, such as the sample above.
2) http://people.valinux.co.jp/~oda/20070510-sample-dump-2.tar
contents:
- vmcore.tiger.iomem_machine.gz
taken by Xen kdump
- vmlinux-xen-ia64.bz2
- xen-syms-ia64.bz2
I asked the Xen kdump developer (simon@valinux) to add "p2m_mfn" to
XEN_ELFNOTE_CRASH_INFO.
This change to Xen kdump has not been published yet.
If it is OK for crash, it will be committed.
Thanks.
--
Itsuro ODA <oda(a)valinux.co.jp>
17 years, 6 months
Seek error type: "tss_struct ist array" problem on 8-CPU AMD system
by Jansen, Frank
Looking through the changelog, I saw that the 'tss_struct ist array'
problem on 8-CPU systems had been addressed previously. However, I'm
running into this issue on an AMD server with crash 4.0-4.1 and RHEL4
Update 5 (2.6.9-55.ELsmp).
The output from the crash invocation is the following:
+++
[root@well-rhel4564-ps3 dump]# /fpj/crash System_map.2.6.9-55.ELsmp
vmlinux.debug.2.6.9-55.ELsmp ap3.1178895173.dmp
crash 4.0-4.1
Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public
License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for
details.
GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you
are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for
details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
crash: seek error: kernel virtual address: 10408119e84 type:
"tss_struct ist array"
---
The server has four dual-core AMD CPUs (2.8GHz) and 64GB of memory.
Any insights into how best to troubleshoot this are much appreciated.
Thanks,
Frank Jansen
"The difference between genius and stupidity is that genius has its
limits." Albert Einstein
17 years, 6 months