Hi Dave,
On Wed, Feb 28, 2018 at 8:48 PM, Dave Anderson <anderson(a)redhat.com> wrote:
Hi Bhupesh,
Thanks -- mystery solved!
I kind of hate to simply use the file name string as the qualifier, though.
Looking at the ELF headers of the vmlinux.o and vmlinux file, there are
several notable differences:
$ readelf -a vmlinux
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: AArch64
Version: 0x1
Entry point address: 0xffff000008080000
Start of program headers: 64 (bytes into file)
Start of section headers: 186677936 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 4
Size of section headers: 64 (bytes)
Number of section headers: 42
Section header string table index: 41
...
$ readelf -a vmlinux.o
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: REL (Relocatable file)
Machine: AArch64
Version: 0x1
Entry point address: 0x0
Start of program headers: 0 (bytes into file)
Start of section headers: 478638376 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 0 (bytes)
Number of program headers: 0
Size of section headers: 64 (bytes)
Number of section headers: 34816
Section header string table index: 34815
...
There is the "Type", the "Entry point address", and perhaps most notably,
the vmlinux.o's "Number of program headers:" count of 0, which obviously
would preclude the file from being usable.
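
As a rough illustration of the header differences listed above, the sketch below rejects a relocatable object by looking only at the ELF header fields that differ in the two readelf dumps. It is not crash's is_kernel() or is_elf_file(); the names ehdr_is_linked_image() and file_is_linked_image() are hypothetical. It assumes a 64-bit ELF file whose byte order matches the host, and it accepts ET_DYN alongside ET_EXEC on the assumption that some relocatable kernel builds are linked that way.

```c
#include <assert.h>
#include <elf.h>
#include <fcntl.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of crash): returns true if the ELF64
 * header looks like a linked image rather than a relocatable object
 * such as vmlinux.o.
 */
bool ehdr_is_linked_image(const Elf64_Ehdr *eh)
{
	assert(eh != NULL);

	/* Must carry the ELF magic and be a 64-bit file. */
	if (memcmp(eh->e_ident, ELFMAG, SELFMAG) != 0)
		return false;
	if (eh->e_ident[EI_CLASS] != ELFCLASS64)
		return false;

	/* vmlinux.o reports "REL (Relocatable file)". */
	if (eh->e_type != ET_EXEC && eh->e_type != ET_DYN)
		return false;

	/* vmlinux.o reports zero program headers. */
	if (eh->e_phnum == 0)
		return false;

	return true;
}

/* Hypothetical wrapper: read the header from a file and apply the check. */
bool file_is_linked_image(const char *path)
{
	Elf64_Ehdr eh;
	ssize_t n;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return false;

	n = read(fd, &eh, sizeof(eh));
	close(fd);

	return n == (ssize_t)sizeof(eh) && ehdr_is_linked_image(&eh);
}
```

With the values from the two dumps, a header with type EXEC and 4 program headers passes, while one with type REL and 0 program headers is rejected. The entry point address is deliberately left out of the test, since the type and program header count already separate the two files.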
Indeed. I was also loath to use the string 'vmlinux.o' directly, but I
was worried that modifying the checks in 'find_booted_kernel()' would
break existing systems ...
I would prefer to beef up the currently-existing is_kernel() function,
and use it instead of is_elf_file(). Let me take a look at what works.
... sure, let me know if I can help in any way with the same.
Regards,
Bhupesh
----- Original Message -----
> I found this particular issue while compiling and installing
> a kernel from source on a RHEL arm64 machine. I used the following
> commands to build and install the kernel from source:
>
> # make olddefconfig
> # make
> # make modules_install INSTALL_MOD_STRIP=1 && make install
>
> This compiles and installs both 'vmlinux' and 'vmlinux.o' at
> the following standard location: '/lib/modules/<kernel>/build/'.
>
> Now the problem can be reproduced using the following steps:
> 1. Update grub to boot the freshly compiled kernel.
> 2. Reboot machine.
> 3. Now run crash on the live system; it errors out with the following
> messages:
>
> crash: invalid kernel virtual address: 8470 type: "possible"
> WARNING: cannot read cpu_possible_map
> crash: invalid kernel virtual address: 8270 type: "present"
> WARNING: cannot read cpu_present_map
> crash: invalid kernel virtual address: 8070 type: "online"
> WARNING: cannot read cpu_online_map
> crash: invalid kernel virtual address: 8670 type: "active"
> WARNING: cannot read cpu_active_map
>
> crash: cannot resolve "_stext"
>
> 4. Enabling some debug messages in the 'find_booted_kernel()' function
> tells us that it finds 'vmlinux.o' earlier than 'vmlinux' and accepts
> that as the booted kernel:
>
> find_booted_kernel: check: /lib/modules/4.14.0/build/vmlinux.o
> find_booted_kernel: found: /lib/modules/4.14.0/build/vmlinux.o
>
> 5. The problem is caused by the following check inside the
> 'find_booted_kernel()' function:
>
> if (mount_point(kernel) ||
> !file_readable(kernel) ||
> !is_elf_file(kernel))
> continue;
>
> 6. Since 'vmlinux.o' is an ELF file, is readable and is not a
> mount point, none of the conditions in point 5 is true, so the
> 'continue' is not taken and we incorrectly accept it as the
> booted kernel, even though a 'vmlinux' is present inside
> '/lib/modules/<kernel>/build'.
>
> 7. Now, later when crash tries to access symbols (like _stext)
> using this kernel symbol file, we see errors.
>
> Fix this by making the check in point 5 also skip 'vmlinux.o',
> so that we can select 'vmlinux' correctly as the kernel file.
>
> After this fix, crash no longer errors out and we can use
> full functionality on the crash prompt.
>
> Signed-off-by: Bhupesh Sharma <bhsharma(a)redhat.com>
> ---
> filesys.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/filesys.c b/filesys.c
> index 1b44ad5cefa8..9f240d6115a1 100644
> --- a/filesys.c
> +++ b/filesys.c
> @@ -623,9 +623,21 @@ find_booted_kernel(void)
>
> sprintf(kernel, "%s%s", searchdirs[i], dp->d_name);
>
> + /* We may run into a special case with
> + * 'vmlinux.o' being present inside
> + * '/lib/modules/<kernel>/build/',
> + * which has some special characteristics -
> + * it is an ELF file, is readable and is not
> + * a mount point.
> + *
> + * So we need to handle this properly here to
> + * make sure that we get to the real 'vmlinux'
> + * rather than the 'vmlinux.o'
> + */
> if (mount_point(kernel) ||
> !file_readable(kernel) ||
> - !is_elf_file(kernel))
> + !is_elf_file(kernel) ||
> + !(strcmp(dp->d_name, "vmlinux.o")))
> continue;
>
> if (CRASHDEBUG(1))
> --
> 2.7.4
>
>
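
For readers unsure how the added '!(strcmp(...))' term behaves, here is a minimal sketch of the patched condition in isolation. The function skip_candidate() is hypothetical, and its three boolean parameters stand in for the results of crash's mount_point(), file_readable() and is_elf_file(), so the logic can be tried without touching the filesystem.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical stand-in for the patched test in find_booted_kernel().
 * Returns true when the directory entry should be skipped ("continue").
 */
bool skip_candidate(const char *d_name, bool is_mount_point,
		    bool readable, bool is_elf)
{
	return is_mount_point ||
	       !readable ||
	       !is_elf ||
	       /* strcmp() returns 0 on a match, so this term is true
		* only for the exact name "vmlinux.o"; it is equivalent
		* to the patch's !(strcmp(dp->d_name, "vmlinux.o")). */
	       strcmp(d_name, "vmlinux.o") == 0;
}
```

A readable, non-mount-point ELF entry named "vmlinux.o" is now skipped, while an otherwise identical entry named "vmlinux" is still accepted. The comparison is on the exact file name only, which is the limitation raised in the reply above about using the name string as the qualifier.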