On 2023-05-10 03:15, HAGIO KAZUHITO(萩尾 一仁) wrote:
On 2023/05/10 14:02, HAGIO KAZUHITO(萩尾 一仁) wrote:
> On 2023/05/10 10:44, HAGIO KAZUHITO(萩尾 一仁) wrote:
>> On 2023/05/10 4:33, Luiz Capitulino wrote:
>>> On 2023-05-09 03:32, HAGIO KAZUHITO(萩尾 一仁) wrote:
>>>> On 2023/05/02 3:41, Luiz Capitulino wrote:
>>>>> Hi all,
>>>>>
>>>>> I'm trying to run latest crash (HEAD 2505a65ff54) against kernel
>>>>> 4.14.314 but I'm getting the error below on startup.
>>>>>
>>>>> Is this a known issue? If not, any suggestions on how to debug it?
>>>>
>>>> hmm, I tried the kernel version, but could not reproduce it.
>>>>
>>>> crash> sys
>>>> KERNEL: /lib/modules/4.14.314/build/vmlinux
>>>> DUMPFILE: /proc/kcore
>>>> CPUS: 4
>>>> DATE: Tue May 9 16:16:14 JST 2023
>>>> UPTIME: 00:07:02
>>>> LOAD AVERAGE: 0.07, 0.12, 0.07
>>>> TASKS: 174
>>>> NODENAME: rhel78b
>>>> RELEASE: 4.14.314
>>>> VERSION: #1 SMP Tue May 9 15:28:59 JST 2023
>>>> MACHINE: x86_64 (3408 Mhz)
>>>> MEMORY: 4 GB
>>>>
>>>> Could you upload a startup log with "crash -d 8" option?
>>>
>>> I'm attaching a file with this information, thanks a lot for looking
>>> into this.
>>
>> Thanks.
>>
>> -----
>> module: ffffffffa00f8f80
>> <readmem: ffffffffa00f8f80, KVADDR, "module struct", 896, (ROE|Q), 122f800>
>> <readmem: 200e000, PHYSADDR, "pud page", 4096, (FOE), 1c95e00>
>> <read_proc_kcore: addr: 200e000 paddr: 200e000 cnt: 4096>
>> crash: seek error: physical address: 200e000 type: "pud page"
>> -----
>>
>> It seems that the virt to phys conversion for ffffffffa00f8f80 fails
>> because the file offset of the pud page is not found in /proc/kcore.
>>
>> According to read_proc_kcore(), it does
>> 1. p2v for 200e000 i.e. phys:200e000 --> virt:???
>> 2. search /proc/kcore pt_loads for the corresponding file offset to the
>> virtual address. (as pc->curcmd_flags does not have MEMTYPE_KVADDR.)
>> 3. read the file offset.
>>
>> so, what is the converted virtual address? For example,
>>
>> --- a/netdump.c
>> +++ b/netdump.c
>> @@ -4362,6 +4362,8 @@ read_proc_kcore(int fd, void *bufptr, int cnt, ulong addr, physaddr_t paddr)
>>          else
>>                  kvaddr = PTOV((ulong)paddr);
>>
>> +        fprintf(fp, "kvaddr: %lx\n", kvaddr);
>> +
>>          offset = UNINITIALIZED;
>>          readcnt = cnt;
>>
>
> Ah, probably got it.
>
> The PTOV() above is defined like this:
>
> #define PTOV(X) ((unsigned long)(X)+(machdep->kvbase))
>
>>
>> Your kernel has the following pt_load information, probably it's out of
>> these vaddr ranges?
>>
>>       offset            vaddr              end            paddr         end  size
>> 7fffff604000 ffffffffff600000-ffffffffff601000 ffffffffffffffff-        0 (1000)
>> 7fff81004000 ffffffff81000000-ffffffff8377f000          1000000-  377f000 (277f000)
>> 490000004000 ffffc90000000000-ffffe90000000000 ffffffffffffffff-        0 (1fffffffffff)
>> 7fffa0004000 ffffffffa0000000-ffffffffff000000 ffffffffffffffff-        0 (5f000000)
>>  88000005000 ffff888000001000-ffff88800009f000             1000-    9f000 (9e000)
>> 6a0000004000 ffffea0000000000-ffffea0000003000 ffffffffffffffff-        0 (3000)
>>  88000104000 ffff888000100000-ffff8880bffe8000           100000- bffe8000 (bfee8000)
>> 6a0000008000 ffffea0000004000-ffffea0003000000 ffffffffffffffff-        0 (2ffc000)
>>  88100004000 ffff888100000000-ffff888fff000000        100000000-fff000000 (eff000000)
>> 6a0004004000 ffffea0004000000-ffffea003ffc0000 ffffffffffffffff-        0 (3bfc0000)
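
The pt_load search described in step 2 can be sketched roughly as below. This is only an illustration, not crash's actual code: struct pt_load_sketch, kvaddr_to_file_offset() and NOT_FOUND are hypothetical names, and the array holds just four rows copied from the table above, with "end" treated as exclusive.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one /proc/kcore PT_LOAD entry.
 * crash's real structures differ; values come from the table above. */
struct pt_load_sketch {
    uint64_t offset;     /* file offset of the segment in /proc/kcore */
    uint64_t vaddr;      /* first virtual address covered */
    uint64_t vaddr_end;  /* one past the last virtual address covered */
};

static const struct pt_load_sketch segments[] = {
    { 0x7fff81004000ULL, 0xffffffff81000000ULL, 0xffffffff8377f000ULL },
    { 0x088000005000ULL, 0xffff888000001000ULL, 0xffff88800009f000ULL },
    { 0x088000104000ULL, 0xffff888000100000ULL, 0xffff8880bffe8000ULL },
    { 0x088100004000ULL, 0xffff888100000000ULL, 0xffff888fff000000ULL },
};

/* Hypothetical sentinel meaning "no segment covers this address". */
#define NOT_FOUND UINT64_MAX

/* Return the file offset that backs kvaddr, or NOT_FOUND. */
uint64_t kvaddr_to_file_offset(uint64_t kvaddr)
{
    for (size_t i = 0; i < sizeof(segments) / sizeof(segments[0]); i++) {
        const struct pt_load_sketch *s = &segments[i];

        if (kvaddr >= s->vaddr && kvaddr < s->vaddr_end)
            return s->offset + (kvaddr - s->vaddr);
    }
    return NOT_FOUND;
}
```

With these rows, an address inside the ffff888000100000 segment resolves to a file offset, while any address outside every range yields NOT_FOUND, which is the situation the "seek error" in the log seems to correspond to.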
>
> Your kernel appears to be configured without CONFIG_RANDOMIZE_BASE. For
> such kernels, crash uses a hard-coded value for PAGE_OFFSET and kvbase,
> and Linux 4.14.84 and later 4.14.y kernels have the newer PAGE_OFFSET:
>
>         case POST_GDB:
>                 if (!(machdep->flags & RANDOMIZED) &&
>                     ((THIS_KERNEL_VERSION >= LINUX(4,19,5)) ||
>                     ((THIS_KERNEL_VERSION >= LINUX(4,14,84)) &&
>                     (THIS_KERNEL_VERSION < LINUX(4,15,0))))) {
>                         machdep->machspec->page_offset = machdep->flags & VM_5LEVEL ?
>                                 PAGE_OFFSET_5LEVEL_4_20 : PAGE_OFFSET_4LEVEL_4_20;
>                         machdep->kvbase = machdep->machspec->page_offset;
>
> #define PAGE_OFFSET_4LEVEL_4_20 0xffff888000000000
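
To see why the choice of kvbase matters for the failing pud page read, here is a small sketch of the PTOV() arithmetic for physical address 200e000. ptov_sketch() and in_directmap_segment() are hypothetical helpers, and PAGE_OFFSET_OLD_ASSUMED (0xffff880000000000) is an assumption about the older x86_64 4-level base; that value is not quoted anywhere in this thread.

```c
#include <stdint.h>

/* Value quoted in this thread. */
#define PAGE_OFFSET_4LEVEL_4_20  0xffff888000000000ULL
/* ASSUMPTION: older x86_64 4-level direct-map base, not quoted in this thread. */
#define PAGE_OFFSET_OLD_ASSUMED  0xffff880000000000ULL

/* Same arithmetic as the quoted PTOV(), with kvbase passed explicitly. */
uint64_t ptov_sketch(uint64_t paddr, uint64_t kvbase)
{
    return paddr + kvbase;
}

/* One direct-map row from the pt_load table in this thread:
 * ffff888000100000-ffff8880bffe8000 (end treated as exclusive). */
int in_directmap_segment(uint64_t kvaddr)
{
    return kvaddr >= 0xffff888000100000ULL &&
           kvaddr <  0xffff8880bffe8000ULL;
}
```

Under that assumption, ptov_sketch(0x200e000, PAGE_OFFSET_4LEVEL_4_20) gives ffff88800200e000, which lies inside the listed segment, while the older base gives ffff88000200e000, which does not.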
>
> But, the THIS_KERNEL_VERSION and LINUX() macros are defined like this:
>
> #define THIS_KERNEL_VERSION ((kt->kernel_version[0] << 16) + \
> (kt->kernel_version[1] << 8) + \
> (kt->kernel_version[2]))
> #define LINUX(x,y,z) (((uint)(x) << 16) + ((uint)(y) << 8) + (uint)(z))
>
> Since the patch level gets only 8 bits here, (THIS_KERNEL_VERSION <
> LINUX(4,15,0)) is false on Linux 4.14.256 and later, and the old
> PAGE_OFFSET will be used.
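
The overflow can be checked with a few lines of arithmetic. The sketch below re-implements the two packings as plain functions; version_8bit() and version_16bit() are hypothetical names, the first using the shifts from the macros quoted above and the second using wider shifts (24 and 16) that leave 16 bits for the patch level.

```c
#include <stdint.h>

/* Packing as in the quoted macros: 8 bits each for minor and patch level. */
uint32_t version_8bit(uint32_t x, uint32_t y, uint32_t z)
{
    return (x << 16) + (y << 8) + z;
}

/* Wider packing: 8 bits for the minor version, 16 bits for the patch level. */
uint32_t version_16bit(uint32_t x, uint32_t y, uint32_t z)
{
    return (x << 24) + (y << 16) + z;
}
```

With the 8-bit layout, 4.14.256 encodes to the same number as 4.15.0, and 4.14.314 encodes like 4.15.58, so the "< LINUX(4,15,0)" test fails. With the wider layout, 4.14.314 still compares below 4.15.0.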
>
> So does this patch work well?
I also confirmed that the issue can be reproduced without CONFIG_RANDOMIZE_BASE,
and that this patch fixes it. So I posted a formal patch; please try that.
Yes, will do!
- Luiz
Thanks,
Kazu
>
> --- a/defs.h
> +++ b/defs.h
> @@ -807,10 +807,10 @@ struct kernel_table { /* kernel data */
> } \
> }
>
> -#define THIS_KERNEL_VERSION ((kt->kernel_version[0] << 16) + \
> -                            (kt->kernel_version[1] << 8) + \
> +#define THIS_KERNEL_VERSION ((kt->kernel_version[0] << 24) + \
> +                            (kt->kernel_version[1] << 16) + \
>                              (kt->kernel_version[2]))
> -#define LINUX(x,y,z) (((uint)(x) << 16) + ((uint)(y) << 8) + (uint)(z))
> +#define LINUX(x,y,z) (((uint)(x) << 24) + ((uint)(y) << 16) + (uint)(z))
>
> #define THIS_GCC_VERSION ((kt->gcc_version[0] << 16) + \
> (kt->gcc_version[1] << 8) + \
>
> Thanks,
> Kazu
> --
> Crash-utility mailing list
> Crash-utility(a)redhat.com
> https://listman.redhat.com/mailman/listinfo/crash-utility
> Contribution Guidelines: https://github.com/crash-utility/crash/wiki