Re: [Crash-utility] Crash-utility Digest, Vol 52, Issue 1
by Dave Anderson
----- "d00fy" <d00fy(a)163.com> wrote:
> > When I analyze the crash dump file (/proc/vmcore), the crash utility shows
> > me this error message. What does it mean exactly?
> > Is it a bug or a configuration error?
> > grub.conf
> > kernel /boot/vmlinuz-2.6.23 console=ttyS0,9600 root=/dev/sda2 ro quiet
> > crashkernel=256M@16M
> >
> > load cmd:
> > /sbin/kexec -p --command-line="root=/dev/sda2 irqpoll maxcpus=1 nousb"
> > --initrd=/boot/initrd.img-2.6.23 /boot/vmlinuz-2.6.23
> > crash:
> > 4.1.1
> > box:
> > Intel(R) Pentium(R) 4 CPU 3.00GHz
> > 4G RAM
>
> > At what point did it happen?
>
> > Dave
>
> I inserted a module that triggers a kernel crash. In this module I use
> request_irq(irq, myinterrupt, ...) to register an interrupt handler on the
> NIC's IRQ, and in myinterrupt() I simply write to a bogus address
> (int *p = (int *)00202020; *p = 10;). The kernel crashes soon afterwards,
> but I can't open /proc/vmcore with the crash utility after the warm reboot.
> The error looks like:
> --> crash: invalid kernel virtual address: 5fb001 type: "pgd page"
Neither the kexec invocation details nor how you made the kernel crash is
of much consequence; what I was asking about is when the crash utility
itself failed. In other words, was it during crash invocation, or perhaps
during a particular crash command? I'm guessing it was during invocation,
since you say you "can't open the /proc/vmcore"?
Anyway, if you still have a copy of the /proc/vmcore file and the relevant
vmlinux, please post the output of "crash -d4 vmlinux vmcore", and perhaps
I can help. If you rebooted without saving the /proc/vmcore file to disk,
then there's not much I can do.
Thanks,
Dave
Degradation with crash 5.0.0 on x86
by Shahar Luxenberg
Hi,
Environment: Red Hat Enterprise Linux Server release 5.2 (Tikanga), x86, 2.6.18-92.el5
I've installed crash 5.0.0 and noticed lots of error messages during startup of the form:
'crash: input string too large: "804328c4:" (9 vs 8)'
This doesn't happen with crash 4.1.2.
While debugging it a little, I've noticed that BUG_x86 is calling gdb with the x/i command:
sprintf(buf1, "x/%ldi 0x%lx", spn->value - sp->value, sp->value);
The return buffer (buf2) is: 0x80430800: push %ebp
On 4.1.2, the return buffer (buf2) is: 0x80430800 <do_exit>: push %ebp
This explains the problem, since parse_line() splits the two lines differently: with crash 5.0.0 it returns '0x80430800:' in arglist[0] and the expected mnemonic is no longer in arglist[2], while with crash 4.1.2 it returns '0x80430800' in arglist[0] and 'push' in arglist[2].
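For illustration, here is a standalone sketch of the whitespace tokenization involved (this is not the actual parse_line() from the crash sources, just a demonstration of how the two output formats land in different argument slots):

/* tokenize.c - show how the old and new x/i output lines split up */
#include <stdio.h>
#include <string.h>

static int split(char *line, char **arglist, int max)
{
        int argc = 0;
        char *tok = strtok(line, " \t");

        while (tok && argc < max) {
                arglist[argc++] = tok;
                tok = strtok(NULL, " \t");
        }
        return argc;
}

int main(void)
{
        char new_fmt[] = "0x80430800: push %ebp";            /* crash 5.0.0 (gdb-7.0) output */
        char old_fmt[] = "0x80430800 <do_exit>: push %ebp";  /* crash 4.1.2 output */
        char *arglist[8];
        int i, argc;

        argc = split(new_fmt, arglist, 8);
        for (i = 0; i < argc; i++)
                printf("5.0.0 arglist[%d] = \"%s\"\n", i, arglist[i]);

        argc = split(old_fmt, arglist, 8);
        for (i = 0; i < argc; i++)
                printf("4.1.2 arglist[%d] = \"%s\"\n", i, arglist[i]);

        return 0;
}

With the new format the address token keeps its trailing colon and the mnemonic shifts from arglist[2] to arglist[1]; the extra colon is presumably also what makes "804328c4:" nine characters where eight were expected.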
Have you noticed this kind of problem?
Thanks,
Shahar.
Re: [Crash-utility] Crash-utility Digest, Vol 52, Issue 1
by d00fy
> When I analyze the crash dump file (/proc/vmcore), the crash utility shows
> me this error message. What does it mean exactly?
> Is it a bug or a configuration error?
> grub.conf
> kernel /boot/vmlinuz-2.6.23 console=ttyS0,9600 root=/dev/sda2 ro quiet
> crashkernel=256M@16M
>
> load cmd:
> /sbin/kexec -p --command-line="root=/dev/sda2 irqpoll maxcpus=1 nousb"
> --initrd=/boot/initrd.img-2.6.23 /boot/vmlinuz-2.6.23
> crash:
> 4.1.1
> box:
> Intel(R) Pentium(R) 4 CPU 3.00GHz
> 4G RAM
> At what point did it happen?
> Dave
I inserted a module that triggers a kernel crash. In this module I use
request_irq(irq, myinterrupt, ...) to register an interrupt handler on the NIC's
IRQ, and in myinterrupt() I simply write to a bogus address
(int *p = (int *)00202020; *p = 10;). The kernel crashes soon afterwards, but I
can't open /proc/vmcore with the crash utility after the warm reboot. The error
looks like:
--> crash: invalid kernel virtual address: 5fb001 type: "pgd page"
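A minimal sketch of the kind of test module described above; the IRQ number,
the dev_id cookie and the module boilerplate are illustrative assumptions, not
details taken from the report:

/*
 * crashmod.c - deliberately oops the kernel from an interrupt handler.
 * TEST_IRQ, the "crashmod" name and the cookie are illustrative only.
 */
#include <linux/module.h>
#include <linux/interrupt.h>

#define TEST_IRQ 10                  /* assumption: the NIC's IRQ line */

static int crashmod_cookie;          /* unique dev_id for the shared IRQ */

static irqreturn_t myinterrupt(int irq, void *dev_id)
{
        int *p = (int *)00202020;    /* bogus kernel address */

        *p = 10;                     /* invalid write -> oops/panic */
        return IRQ_HANDLED;
}

static int __init crashmod_init(void)
{
        /* share the NIC's interrupt line so the handler actually fires */
        return request_irq(TEST_IRQ, myinterrupt, IRQF_SHARED,
                           "crashmod", &crashmod_cookie);
}

static void __exit crashmod_exit(void)
{
        free_irq(TEST_IRQ, &crashmod_cookie);
}

module_init(crashmod_init);
module_exit(crashmod_exit);
MODULE_LICENSE("GPL");

Once loaded with insmod, the first interrupt on that line runs myinterrupt()
and the bad write should oops the kernel almost immediately.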
[ANNOUNCE] crash version 5.0.0 is available
by Dave Anderson
Changelog:
- Updated embedded gdb version to FSF gdb-7.0.
(anderson(a)redhat.com)
- Fix for the ppc64 "irq" command where the "irq_desc_t" is no longer
recognized as a typedef for "struct irq_desc". Without the patch,
the command fails with the error message: "irq: invalid structure
size: irqdesc".
(anderson(a)redhat.com)
- Fix for 2.6.26 and later ppc64 CONFIG_SPARSEMEM_VMEMMAP kernels to
recognize VMEMMAP_REGION virtual addresses. The kernel's memmap page
structure array(s) are mapped in that region, and without the fix,
the vmemmap virtual addresses were being erroneously translated using
the kernel's page tables that map the VMALLOC_REGION. This in turn
led to bogus data being read for all page structure content requests,
resulting in invalid error messages for commands such as "kmem -s",
"kmem -p", "kmem -f", etc. A secondary issue is that there is no
current manner for the crash utility to be able to translate vmemmap
addresses because there is no record of the mapping stored in the
kernel. That being the case, any command that needs to read the
contents of a page structure will fail. During initialization, the
message "WARNING: cannot translate vmemmap kernel virtual addresses:
commands requiring page structure contents will fail" will alert the
user of the problem. During runtime, an attempt to read the contents
of a vmemmap'd page structure will fail with the error message
"<command>: cannot translate vmemmap address: <vmemmap address>".
(anderson(a)redhat.com)
- Fix for segmentation violation when running the "ps -r" command
option on 2.6.25 or later kernels.
(anderson(a)redhat.com)
- Fix for the "mount" command on 2.6.32 and later kernels. Without the
patch, the command would fail immediately with the error message
"mount: invalid structure member offset: super_block_s_dirty". Also,
the "mount -i" option will no longer be supported in 2.6.32 and later
kernels because the super_block.s_dirty linked list no longer exists.
(anderson(a)redhat.com)
- Fix for the "bt" command on 2.6.29 and later x86_64 kernels to
always recognize and display BUG()-induced exception frames. Without
the patch, the backtrace would potentially not display the exception
frame.
(anderson(a)redhat.com)
- Fix for the "rd" and "kmem" commands to prevent the unnecessary
"WARNING: sparsemem: invalid section number: <number>" message
when testing whether an address is represented by a page structure
in CONFIG_SPARSEMEM_EXTREME kernels.
(anderson(a)redhat.com)
- Fix for a 4.0-8.11 regression that introduced a bug in determining
the number of cpus in ppc64 kernels when the cpu_possible_[map/mask]
has more cpus than the cpu_online_[map/mask]. In that case, the
kernel contains per-cpu runqueue data and "swapper" tasks for the
extra cpus. Without the patch, on systems with a possible cpu count
that is larger than its online cpu count:
(1) the "sys" command will reflect the possible cpu count.
(2) the "ps" command will show the existent-but-unused "swapper"
tasks as active on the extra cpus.
(3) the "set" command will allow the current context to be set to
any of the existent-but-unused "swapper" tasks.
(4) the "runq" command will display existent-but-unused runqueue
data for the extra cpus.
(5) the "bt" command on the existent-but-unused "swapper" tasks will
indicate: "bt: cannot determine NT_PRSTATUS ELF note for active
task: <task>" on dumpfiles, and "(active)" on live systems.
(anderson(a)redhat.com)
Download from: http://people.redhat.com/anderson
Identity of a Xen Dom0 vmcore file
by Louis Bouchard
Hello and a very happy new year to everyone!
I'm still at work on crashdc and making good progress. I have a working
version for the standard i386 kernels of RHEL5, SLES10 and SLES11.
I'm currently working on making sure that all types of kernels delivered
by the distros work fine with crashdc, and I'm hitting a snag with Xen on
RHEL5 (for Dom0). No big deal, but when there is more than 4 GB of memory,
kdump loads the PAE kernel even if a Xen kernel was previously running, so
I cannot use 'uname' to identify the kernel context and pick the
appropriate debuginfo kernel.
So if I want to choose the right kernel from the debuginfo package, I must
rely on the vmcore file that was produced to identify which one (i.e. PAE,
Xen or standard) to use.
It looks like the first few bytes of the Xen vmcore file might hold the
information I'm looking for. Running the following 'od' on a vmcore file
created from a Xen kernel gives me this:
od -N2000 -S3 vmcore.xen | more
0000544 CORE
0001010 Xen
0001050 Xen
0001140 VMCOREINFO_XEN
The same on a PAE or standard vmcore brings back this:
od -N2000 -S3 vmcore.pae | more
0000544 CORE
0001010 VMCOREINFO
I might search the web for a description of the vmcore file format, but
I'm sure that many of you here can point me in the right direction and,
maybe, tell me whether I'm making the right assumption in thinking that
those values can safely identify a Xen kernel.
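For what it's worth, here is a minimal sketch of that kind of check; it simply
scans the first couple of kilobytes of the dump for the strings seen in the od
output above. Treating "VMCOREINFO_XEN" as the Xen marker is an assumption
drawn from that output, not from a documented ELF note layout:

/* xencheck.c - crude Xen-vs-non-Xen vmcore classifier based on note strings */
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
        char buf[2048];
        const char *marker = "VMCOREINFO_XEN";
        size_t mlen = strlen(marker);
        size_t n, i;
        int is_xen = 0;
        FILE *fp;

        if (argc != 2) {
                fprintf(stderr, "usage: %s <vmcore>\n", argv[0]);
                return 1;
        }

        fp = fopen(argv[1], "r");
        if (!fp) {
                perror(argv[1]);
                return 1;
        }
        n = fread(buf, 1, sizeof(buf), fp);
        fclose(fp);

        /* naive substring search over the raw bytes read */
        for (i = 0; i + mlen <= n; i++) {
                if (memcmp(buf + i, marker, mlen) == 0) {
                        is_xen = 1;
                        break;
                }
        }

        printf("%s: %s\n", argv[1],
               is_xen ? "Xen vmcore" : "non-Xen (PAE or standard) vmcore");
        return 0;
}

If the assumption holds, something like this could be enough for crashdc to
decide whether to pull the Xen or the PAE/standard vmlinux from the debuginfo
package.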
I've looked at xendump.[c|h] from the crash utility sources and it looks
like there is a Xen-specific signature, but I might be interpreting it in
the wrong way.
Would someone care to comment?
TIA and Kind Regards,
--
Louis Bouchard, Linux Support Engineer
Team lead, EMEA Linux Competency Center, Linux Ambassador, HP
HP Services, HP France
1 Ave du Canada, Z.A. de Courtaboeuf, 91 947 Les Ulis, France
louis.bouchard(a)hp.com
http://www.hp.com/go/linux
http://www.hp.com/fr
crash: invalid kernel virtual address: 5fb001 type: "pgd page"
by d00fy
When I analyze the crash dump file (/proc/vmcore), the crash utility shows me this error message. What does it mean exactly?
Is it a bug or a configuration error?
grub.conf
kernel /boot/vmlinuz-2.6.23 console=ttyS0,9600 root=/dev/sda2 ro quiet crashkernel=256M@16M
load cmd:
/sbin/kexec -p --command-line="root=/dev/sda2 irqpoll maxcpus=1 nousb" --initrd=/boot/initrd.img-2.6.23 /boot/vmlinuz-2.6.23
crash:
4.1.1
box:
Intel(R) Pentium(R) 4 CPU 3.00GHz
4G RAM