Hello Vinayak,
This patch works and tests well, although I wish I had a more
useful ARM64 sample dumpfile whose kernel has come up farther
than the last one, and that either has kernel modules or at
least one user-space task so I could verify the VTOP of those
types of virtual addresses.
Here are a few minor issues with ramdump.c:
(1) It #includes <libelf.h>, which would require a new package to be
installed:
$ rpm -qf /usr/include/libelf.h
elfutils-libelf-devel
$
I changed it to #include <elf.h> (from glibc-headers) and it
compiles just fine.
(2) The ramdump_def structure:
struct ramdump_def {
char *path;
int rfd;
ulong start_paddr;
ulong end_paddr;
};
The start_paddr and end_paddr fields have to be 64-bits in
length to account for ARM kernels that have RAM beyond 4GB.
They start_paddr member gets assigned to a 64-bit p_paddr field
and the end_paddr field would overflow in a greater-than-4GB
scenario. They should be ulonglong or physaddr_t types.
Related to that, the start_paddr field should be intitialized
with htoll() instead of htol() just in case a 32-bit node
starts above the 32-bit threshold.
(3) The "Skipping dumpfile size check for ramdump" is unnecessary
given that the end_paddr is calculated based upon the size
of the RAM dump file itself.
(4) The "converting ramdump to kdump" message is unnecessary, and
the "Creating kdump ELF <filename>" should be different
if only an ELF header is created vs. whether a completely new
ELF file is being created.
I've made changes above to your patch with no problems, so there's no
need for a re-post. I'll keep playing around to see if anything
else comes up.
Nice job!
Thanks,
Dave
----- Original Message -----
> In any case, if you use "htol()" instead of strtoul(),
the user won't
> have to enter the "0x", because it will presume it's a hexadecimal
> address.
>
Done.
> However, you cannot put *nothing* as the offset value:
> So that also needs fixing.
Done.
>
> A few other questions/comments...
>
> When 32-bit x86 kdumps are created, they typically default to using
> the 64-bit ELF format. That is required in case there is physical memory
> above the 4GB threshold, which cannot be described in a 32-bit ELF
> header. Since 32-bit ARM can also be PAE, would it make more sense
> to *always* use 64-bit ELF headers for both ARM and ARM64? I don't
> see why not, and it should simplify ramdump.c quite a bit.
That's right. I had to add a case in is_netdump() for EM_ARM.
>
> And thinking this through a bit more, to me it seems really wasteful
> to have to create a whole new dumpfile. Even if it's only a temporary
> file, it's still seemingly an unnecessary duplication of disk space.
>
I agree.
> Here's an idea, not fully thought through, but seems like it could
> work when the "temporary" dumpfile option is used:
>
> (1) Create a temporary file that *only* consists of the ELF header.
> (2) Set a new RAMDUMP flag in pc->flags2.
> (3) Pass that temporary ELF header file to is_kdump() as you do now.
> (4) is_kdump() passes it to is_netdump(), and I believe that is_netdump()
> should parse just the ELF header and accept it as a KDUMP_ELF32 or
> KDUMP_ELF64 file without even being aware that there's no physical
> memory data attached.
> (5) When kdump_init() is called, it passes through to netdump_init()
> which calls check_dumpfile_size() -- which would need to look
> at the RAMDUMP flag to do the right thing.
> (6) And instead of using read_kdump(), copy it to a new read_ramdump()
> function that reads from the original RAM dump image(s), figuring
> out the file offsets accordingly. Keep the new read_ramdump()
> function in netdump.c so it can continue to use the "nd" vmcore_data
> structure that is statically defined there. read_ramdump() may have
> to interact with function(s) in ramdump.c, or perhaps the ramdump_def
> structure can be shared with netdump.c somehow.
>
The attached patch contains these changes. Please review. I have tried
to keep read_ramdump()
within ramdump.c.
> There will presumably be a few other glitches that will require checking
> the new RAMDUMP flag, but I don't think that there would be anything that
> couldn't be overcome. That really would be the ideal way to handle these
> files.
I have done some tests using both arm and arm64 ramdumps, including
the ones you had pointed out in previous emails.
I will do more tests and will let you know if I find any more bugs.
Please let me know if there are any specific tests to be done, which
can bring out these glitches.
Thanks,
Vinayak
--
Crash-utility mailing list
Crash-utility(a)redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility