Re: [Crash-utility] EXT: RE: crash: read error on type: "memory section root table"

Friday, 22 July 2022

Hello,

Back to this topic.

I upgraded our system with the kexec-tools from CentOS Stream 8, based on kexec 2.0.24 and
makedumpfile 1.7.1.
We are still facing errors when using 'makedumpfile -c'.

Removing the '-c' gives a better success/failure ratio, but sometimes the crash file still
cannot be read by the crash tool.

Following Hagio's remark below concerning the sync, I added a sync operation before
the call to makedumpfile (just after the ext4 mount of the required partitions) and
added a second call to sync after makedumpfile returns.
In that configuration, the crash file can be read by the crash tool (in all cases so
far).
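
As a rough sketch of the ordering described above (not the actual init script, which is not shown in this thread): sync, then makedumpfile, then sync again. The helper names `dump_steps` and `run_steps`, and the destination path in the usage note, are hypothetical; the partitions are assumed to be mounted already.

```python
import subprocess
from typing import Callable, List, Sequence


def dump_steps(vmcore: str, dumpfile: str,
               extra_args: Sequence[str] = ()) -> List[List[str]]:
    """Return the commands in the order described: sync, makedumpfile, sync.

    makedumpfile is invoked as `makedumpfile [OPTION]... VMCORE DUMPFILE`.
    """
    return [
        ["sync"],                                         # flush before dumping
        ["makedumpfile", *extra_args, vmcore, dumpfile],  # write the dumpfile
        ["sync"],                                         # flush the dumpfile to disk
    ]


def run_steps(steps: List[List[str]],
              runner: Callable[..., object] = subprocess.run) -> None:
    """Run each command in turn; with subprocess.run, check=True raises on failure."""
    for cmd in steps:
        runner(cmd, check=True)
```

For example, `run_steps(dump_steps("/proc/vmcore", "/mnt/crash/dumpfile"))` would run the three commands in sequence, assuming the target partition is mounted at the hypothetical path `/mnt/crash`.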

Thanks for your help.
Best regards,
Patrick Agrain

-----Original Message-----
From: Crash-utility <crash-utility-bounces(a)redhat.com> On behalf of Agrain Patrick
Sent: Wednesday, 6 April 2022 17:48
To: Discussion list for crash utility usage, maintenance and development
<crash-utility(a)redhat.com>; kexec(a)lists.infradead.org
Subject: Re: [Crash-utility] EXT: RE: crash: read error on type: "memory section root
table"

-----Original Message-----
From: HAGIO KAZUHITO(萩尾　一仁) <k-hagio-ab(a)nec.com>
Sent: Wednesday, 6 April 2022 09:48
To: Agrain Patrick <patrick.agrain(a)al-enterprise.com>
Cc: Discussion list for crash utility usage, maintenance and development
<crash-utility(a)redhat.com>; kexec(a)lists.infradead.org
Subject: RE: EXT: RE: crash: read error on type: "memory section root table"

-----Original Message-----
...
 Hello,

 The trace suggested above gives the following information after a crash -d 8 command:
 <...>
 kernel NR_CPUS: 2
 <readmem: ffffffffa4925820, KVADDR, "high_memory", 8, (FOE), 56017b542648>
 <read_diskdump: addr: ffffffffa4925820 paddr: 12925820 cnt: 8>
 read_diskdump: paddr/pfn: 12925820/12925 -> cache physical page: 12925000
 GETBUF(328 -> 0)
 FREEBUF(0)
 GETBUF(328 -> 0)
 FREEBUF(0)
 PAGESIZE=4096
 mem_section_size = 16384
 NR_SECTION_ROOTS = 2048
 NR_MEM_SECTIONS = 524288
 SECTIONS_PER_ROOT = 256
 SECTION_ROOT_MASK = 0xff
 PAGES_PER_SECTION = 32768
 <readmem: ffffffffa4926db0, KVADDR, "mem_section", 8, (FOE), 7ffd1b6bb000>
 <read_diskdump: addr: ffffffffa4926db0 paddr: 12926db0 cnt: 8>
 read_diskdump: paddr/pfn: 12926db0/12926 -> cache physical page: 12926000
 <readmem: ffff904c7f7fc000, KVADDR, "memory section root table", 16384, (FOE), 56017da26fd0>
 <read_diskdump: addr: ffff904c7f7fc000 paddr: 3f7fc000 cnt: 4096>
 read_diskdump: paddr/pfn: 3f7fc000/3f7fc -> cache physical page: 3f7fc000
 crash: PAG3 - errno=2 r=0 pd.size=49
 read_diskdump: READ_ERROR: cannot cache page: 3f7fc000
 crash: read error: kernel virtual address: ffff904c7f7fc000  type: "memory section root table"

Hmm, r=0 means end of file. Can you check again whether pd.offset exceeds the dumpfile
size? If so, the dumpfile is somehow shorter than expected.

Indeed, the offset points outside the dumpfile. I get:
crash: PAG3 - errno=2 r=0 pd.size=52 pd.offset=168956485
with a dumpfile:
164820 -rw-r--r--.  1 root root  168775680  6 avril 17:23 crashdump--20220406-1713

And another one:
crash: PAG3 - errno=2 r=0 pd.size=49 pd.offset=215640649
with a dumpfile:
209984 -rw-r--r--.  1 root root  215023616  1 avril 10:58 crashdump-585.000-20220401-1054
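
The comparison above can be written out as a small check, using only the numbers reported in this message. This is an illustrative sketch: the function name `bytes_missing` is hypothetical, and it simply tests whether the page data at pd.offset (pd.size bytes long) fits inside the file.

```python
def bytes_missing(pd_offset: int, pd_size: int, file_size: int) -> int:
    """Bytes by which the file falls short of holding the page data.

    Returns 0 when [pd_offset, pd_offset + pd_size) lies entirely within the file.
    """
    return max(0, pd_offset + pd_size - file_size)


# (dumpfile name, pd.offset, pd.size, file size in bytes), as reported above
cases = [
    ("crashdump--20220406-1713", 168956485, 52, 168775680),
    ("crashdump-585.000-20220401-1054", 215640649, 49, 215023616),
]

for name, offset, size, file_size in cases:
    print(f"{name}: offset is {offset - file_size} bytes past end of file, "
          f"{bytes_missing(offset, size, file_size)} bytes missing")
```

In both cases the offset itself already lies beyond the end of the file, which is consistent with the read returning r=0.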

I think a RHEL-based kexec-tools does "sync" after makedumpfile, but can you
check?

Actually, we run makedumpfile from a script designated as the init file for the
second kernel. Therefore, we do not perform the sync as the core_collector path does.

Thanks,
Kazu

Best regards,
Patrick

--
Crash-utility mailing list
Crash-utility(a)redhat.com
https://listman.redhat.com/mailman/listinfo/crash-utility
Contribution Guidelines: https://github.com/crash-utility/crash/wiki
