Hi Bhupesh,
Thanks -- mystery solved!
I kind of hate to simply use the file name string as the qualifier, though.
Looking at the ELF headers of the vmlinux.o and vmlinux files, there are
several notable differences:
$ readelf -a vmlinux
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: AArch64
Version: 0x1
Entry point address: 0xffff000008080000
Start of program headers: 64 (bytes into file)
Start of section headers: 186677936 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 4
Size of section headers: 64 (bytes)
Number of section headers: 42
Section header string table index: 41
...
$ readelf -a vmlinux.o
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: REL (Relocatable file)
Machine: AArch64
Version: 0x1
Entry point address: 0x0
Start of program headers: 0 (bytes into file)
Start of section headers: 478638376 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 0 (bytes)
Number of program headers: 0
Size of section headers: 64 (bytes)
Number of section headers: 34816
Section header string table index: 34815
...
The differences are the "Type", the "Entry point address", and perhaps
most notably, vmlinux.o's "Number of program headers:" count of 0, which
obviously would preclude the file from being usable.
I would prefer to beef up the currently-existing is_kernel() function,
and use it instead of is_elf_file(). Let me take a look at what works.
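As a rough sketch of the kind of header test that could tell the two
files apart, something along these lines might work. The helper names
are made up for illustration and are not existing crash functions. It
assumes a Linux host with <elf.h>, an ELF64 file in host byte order, and
that a usable kernel is always ET_EXEC, which is only what the readelf
output above shows and may not hold for every kernel configuration:

```c
#include <elf.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of crash: decide from an ELF64 header
 * alone whether the file could be a fully linked kernel image.
 */
static int
ehdr_is_linked_executable(const Elf64_Ehdr *eh)
{
	if (memcmp(eh->e_ident, ELFMAG, SELFMAG) != 0)
		return 0;	/* not an ELF file at all */
	if (eh->e_ident[EI_CLASS] != ELFCLASS64)
		return 0;	/* this sketch only handles ELF64 */
	if (eh->e_type != ET_EXEC)
		return 0;	/* vmlinux.o is ET_REL */
	if (eh->e_phnum == 0)
		return 0;	/* no program headers: nothing loadable */
	return 1;
}

/* Hypothetical wrapper: read the header from a path and apply the test. */
static int
file_is_linked_executable(const char *path)
{
	Elf64_Ehdr eh;
	ssize_t n;
	int fd;

	if ((fd = open(path, O_RDONLY)) < 0)
		return 0;
	n = read(fd, &eh, sizeof(eh));
	close(fd);
	if (n != (ssize_t)sizeof(eh))
		return 0;	/* short read: too small to be a kernel */
	return ehdr_is_linked_executable(&eh);
}
```

With the two headers shown above, vmlinux (EXEC, 4 program headers)
would pass and vmlinux.o (REL, 0 program headers) would be rejected,
without looking at the file name.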
Thanks,
Dave
----- Original Message -----
I found this particular issue while compiling and installing
a kernel from source on a RHEL arm64 machine. I used the following
commands to build and install the kernel from source:
# make olddefconfig
# make
# make modules_install INSTALL_MOD_STRIP=1 && make install
This compiles and installs both 'vmlinux' and 'vmlinux.o' at
the following standard location: '/lib/modules/<kernel>/build/'.
Now the problem can be reproduced using the following steps:
1. Update grub to boot the freshly compiled kernel.
2. Reboot machine.
3. Now, run crash on the live system; it errors out with the following
messages:
crash: invalid kernel virtual address: 8470 type: "possible"
WARNING: cannot read cpu_possible_map
crash: invalid kernel virtual address: 8270 type: "present"
WARNING: cannot read cpu_present_map
crash: invalid kernel virtual address: 8070 type: "online"
WARNING: cannot read cpu_online_map
crash: invalid kernel virtual address: 8670 type: "active"
WARNING: cannot read cpu_active_map
crash: cannot resolve "_stext"
4. Enabling some debug messages in the 'find_booted_kernel()' function
tells us that it finds 'vmlinux.o' earlier than 'vmlinux' and accepts
that as the booted kernel:
find_booted_kernel: check: /lib/modules/4.14.0/build/vmlinux.o
find_booted_kernel: found: /lib/modules/4.14.0/build/vmlinux.o
5. The problem happens due to the following check inside the
'find_booted_kernel()' function:
if (mount_point(kernel) ||
!file_readable(kernel) ||
!is_elf_file(kernel))
continue;
6. Since 'vmlinux.o' is an ELF file, is readable, and is not a
mount point, none of the conditions in the check in point 5 is true,
so the 'continue' is not taken and we incorrectly accept this file
as the booted kernel, even though there was a 'vmlinux' present inside
'/lib/modules/<kernel>/build'.
7. Now, later when crash tries to access symbols (like _stext)
using this kernel symbol file, we see errors.
Fix this by skipping 'vmlinux.o' from the check in point 5,
so that we can select 'vmlinux' correctly as the kernel file.
After this fix, crash no longer errors out and we can use
full functionality on the crash prompt.
Signed-off-by: Bhupesh Sharma <bhsharma(a)redhat.com>
---
filesys.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/filesys.c b/filesys.c
index 1b44ad5cefa8..9f240d6115a1 100644
--- a/filesys.c
+++ b/filesys.c
@@ -623,9 +623,21 @@ find_booted_kernel(void)
sprintf(kernel, "%s%s", searchdirs[i], dp->d_name);
+ /* We may run into a special case with
+ * 'vmlinux.o' being present inside
+ * '/lib/modules/<kernel>/build/',
+ * which has some special characteristics -
+ * this is an ELF file, is readable and is not
+ * a mount point.
+ *
+ * So we need to handle this properly here to
+ * make sure that we get to the real 'vmlinux'
+ * rather than the 'vmlinux.o'
+ */
if (mount_point(kernel) ||
!file_readable(kernel) ||
- !is_elf_file(kernel))
+ !is_elf_file(kernel) ||
+ !(strcmp(dp->d_name, "vmlinux.o")))
continue;
if (CRASHDEBUG(1))
--
2.7.4