On Thu, Mar 07, 2013 at 10:46:14PM +0200, Mika Westerberg wrote:
On Thu, Mar 07, 2013 at 08:46:52PM +0100, Per Fransson wrote:
> Hi,
>
> >> Are you able to test with the same GDB version that is embedded in crash?
> >>
> >> I'm running out of ideas (I sure hope Per figures out what is going on
> >> :-)).
> >
> > Thanks for that vote of confidence =o)
> >
> > Alas, I must confess,
> > all I have is a guess
> > that "dot text" is partly a mess
> > ...no less
> >
> > /Per
> >
> >> Are you able to make the vmlinux/vmcore pair available to us?
> >>
>
> Short version:
> ##############
> It's the compressed kernel, including decompression stub, occupying
> the first 0x3ffe00 bytes of .text
>
> Long version:
> #############
> I extracted the .text parts of the vmlinux and dump and diffed:
>
> vmlinux
> -------
> .text:
> start: 0x0081c0 = 33216
> size: 0x4a85d4 = 4883924
> dd bs=1 count=4883924 skip=33216 if=vmlinux of=vmlinux_text
>
> dump
> ----
> .text:
> start: 0xc01081c0 - 0xc0000000 + 0x94 = 1081940
> size: 0x4a85d4 = 4883924
> dd bs=1 count=4883924 skip=1081940 if=233_0128.dump of=dump_text
>
> First I thought I had made a mistake, because, to begin with, it's all
> different, but then, after 0x3ffe00 bytes (that'd correspond to
> virtual address 0xc01081c0+0x3ffe00=0xc0507fc0 in crash), it's all
> identical (except for __v7_setup_stack, which for some reason is
> placed in .text.) This made me run 'strings' on the data in those first
> 0x3ffe00 bytes of .text and they looked kind of familiar, for example:
>
> 3baf -- System halted
> 3bc1 Attempting division by 0!
> 3bdb Uncompressing Linux...
> 3bf2 decompressor returned an error
> 3c11 done, booting the kernel.
>
> and that's when I realized it was the image from
> arch/arm/boot/compressed. I think you've loaded a new (capture?)
> kernel at the location where the original one was loaded before
> pulling out the dump.
Nice finding!

If you look at the command line, it says crashkernel=4k@0x8140000, which might
explain why the dump capture kernel has overwritten the memory. However, there
should be some check in the kexec code that prevents this.
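
To make the numbers in that option explicit, here is a minimal Python sketch
that splits a size@address value into a byte count and a base address. It only
approximates the kernel's own parsing (plain size@address form, K/M/G
suffixes), and parse_crashkernel_arg is a name made up for this example.

```python
def parse_crashkernel_arg(value):
    """Split 'size@address' into (size_in_bytes, base_address).

    Hypothetical helper: a simplified stand-in for the kernel's parser,
    handling only the plain size@address form with K/M/G suffixes.
    """
    suffixes = {"k": 1 << 10, "m": 1 << 20, "g": 1 << 30}
    size_text, _, base_text = value.partition("@")
    multiplier = 1
    if size_text and size_text[-1].lower() in suffixes:
        multiplier = suffixes[size_text[-1].lower()]
        size_text = size_text[:-1]
    # Base 0 lets int() accept both decimal and 0x-prefixed values.
    size = int(size_text, 0) * multiplier
    base = int(base_text, 0) if base_text else None
    return size, base


size, base = parse_crashkernel_arg("4k@0x8140000")
print(f"size={size} bytes, base={base:#x}")
```

With the value from this command line it reports a 4096-byte reservation at
0x8140000, which is much smaller than a kernel image.
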
I can verify your analysis:
crash> rd 0xc0108000 1000
c0108000: e1a00000 e1a00000 e1a00000 e1a00000 ................
c0108010: e1a00000 e1a00000 e1a00000 e1a00000 ................
c0108020: ea000002 016f2818
....
This matches the decompressor code, as you said. The word 0x016f2818 at
0xc0108024 is the zImage magic number.
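
The same check can be scripted against a blob extracted from the dump. This is
only a sketch: it assumes a little-endian image and that the magic sits 0x24
bytes from the start, as in the rd output above. The function name
looks_like_arm_zimage is hypothetical.

```python
import struct

ZIMAGE_MAGIC = 0x016F2818   # value seen at c0108024 in the rd output
MAGIC_OFFSET = 0x24         # byte offset of that word from the image start


def looks_like_arm_zimage(data):
    """Return True if the little-endian word at offset 0x24 is the magic."""
    if len(data) < MAGIC_OFFSET + 4:
        return False
    (word,) = struct.unpack_from("<I", data, MAGIC_OFFSET)
    return word == ZIMAGE_MAGIC


# Rebuild the first 0x28 bytes shown by 'rd': eight NOPs, a branch, the magic.
words = [0xE1A00000] * 8 + [0xEA000002, 0x016F2818]
sample = struct.pack("<10I", *words)
print(looks_like_arm_zimage(sample))
```
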
Also the point where the decompressor code ends can be seen at 0xc0507fc0 (as
you already pointed out):
c0507fb0: 00000000 00000000 00000000 00000000 ................
c0507fc0: e1a00005 e59f1110 ebf75bab e3500000 .........[....P.
c0507fd0: 13e08000 1a000007 e59d3038 e1a00004 ........80......
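
The address arithmetic in this thread can be double-checked with a few lines
of Python. The constants are taken from Per's message; the names (PAGE_OFFSET,
DUMP_DATA_START and the two helpers) are labels chosen for this sketch, and
the formula is simply the one Per used, not a general vmcore parser.

```python
PAGE_OFFSET = 0xC0000000      # virtual base subtracted in Per's calculation
DUMP_DATA_START = 0x94        # file offset added in Per's calculation
TEXT_VADDR = 0xC01081C0       # virtual start address of .text


def dump_file_offset(vaddr):
    """File offset in the dump for a kernel virtual address (Per's formula)."""
    return vaddr - PAGE_OFFSET + DUMP_DATA_START


def text_offset_to_vaddr(offset):
    """Virtual address corresponding to a byte offset into .text."""
    return TEXT_VADDR + offset


print(dump_file_offset(TEXT_VADDR))
print(hex(text_offset_to_vaddr(0x3FFE00)))
```

It prints 1081940 and 0xc0507fc0, matching the dd skip value and the address
where the rd output changes.
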

Lei, the last time I used ARM kdump, one needed to load the dump capture kernel
at a different address (the one specified in the crashkernel=size@address option).
There was an arch/arm/mach-<yourmach>/Makefile.boot where this load address could
be configured. I'm not sure how it is handled at present.