Hi Guanyou,
On Wed, Dec 11, 2024 at 3:20 PM Guanyou Chen <chenguanyou9338(a)gmail.com> wrote:
Hi Tao
Thanks for your explanation.
Test env:
WARNING: cpu 0: cannot find NT_PRSTATUS note
WARNING: cpu 1: cannot find NT_PRSTATUS note
WARNING: cpu 2: cannot find NT_PRSTATUS note
WARNING: cpu 3: cannot find NT_PRSTATUS note
WARNING: cpu 4: cannot find NT_PRSTATUS note
WARNING: cpu 5: cannot find NT_PRSTATUS note
WARNING: cpu 6: cannot find NT_PRSTATUS note
WARNING: cpu 7: cannot find NT_PRSTATUS note
KERNEL: vmlinux [TAINTED]
DUMPFILES: /var/tmp/ramdump_elf_RabECD [temporary ELF header]
DDRCS0_0.BIN
DDRCS1_0.BIN
DDRCS1_1.BIN
DDRCS2_0.BIN
DDRCS2_1.BIN
DDRCS2_2.BIN
CPUS: 8
The non-elf-vmcore is composed of multiple DDR memory files.
Hmm I see... In this case the ELF header is constructed in
ramdump_to_elf(), so it is reasonable that the NT_PRSTATUS note is
missing. I don't have further questions for this patch, and Ack for
the code part.
As for the commit log, since it lacks of some useful info, how about
we re-draft it as:
----
When the ELF note does not contain CPU registers, attempting to
retrieve online CPU registers will cause a crash. This is likely to
happen for ramdump cases as follows, where the ELF header is created
by crash at runtime.
WARNING: cpu 0: cannot find NT_PRSTATUS note
...
WARNING: cpu 7: cannot find NT_PRSTATUS note
KERNEL: vmlinux [TAINTED]
DUMPFILES: /var/tmp/ramdump_elf_RabECD [temporary ELF header]
DDRCS0_0.BIN
...
DDRCS2_2.BIN
CPUS: 8
----
What do you think? @Lianbo Jiang
Thanks,
Tao Liu
Thanks
Guanyou
Tao Liu <ltao(a)redhat.com> 于2024年12月11日周三 07:12写道:
>
> Hi Guanyou,
>
> On Tue, Dec 10, 2024 at 7:35 PM lijiang <lijiang(a)redhat.com> wrote:
> >
> > On Wed, Nov 27, 2024 at 11:09 AM Guanyou Chen <chenguanyou9338(a)gmail.com>
wrote:
> >>
> >> Hi lianbo
> >>
> >> > Thanks for pointing out this. Can you help to try this one?
> >>
> >> test pass.
> >>
> >
> > Thank you for the confirmation, Guanyou.
> >
> > Lets pick up this one. so: Ack(with this correction)
> >
> > Thanks
> > Lianbo
> >
> >>
> >> Thanks
> >> Guanyou
> >>
> >> lijiang <lijiang(a)redhat.com> 于2024年11月27日周三 10:16写道:
> >>>
> >>> On Tue, Nov 26, 2024 at 11:46 AM Guanyou Chen
<chenguanyou9338(a)gmail.com> wrote:
> >>>>
> >>>> Hi lianbo
> >>>>
> >>>> test case is non-elf-vmcore, so all nt_prstatus_percpu invalid
pointer.
>
> I have a question, if your test case is non-elf-vmcore, then why did
> the code execution reach the function "display_regs_from_elf_notes()"
> here? It shouldn't pass the is_netdump() check for ELF header in the
> first place, am I right?
>
> How did you generate the non-elf-vmcore? If you generate it by some
> special method(those not generated by makedumpfile), please also point
> it out in your commit message at first, so we can know it is a special
> case.
>
> In addition, we are lack of none-standard vmcores(none makedumpfile
> generated) for testing, and I know in the arm world there are
> different methods for getting the kernel memory dumping, which by the
> way, are not our expertise. So please try to provide as much info as
> possible to help us understand your test case/environment in your
> commit message next time.
>
> Thanks,
> Tao Liu
>
>
> >>>>
> >>>
> >>> Thanks for pointing out this. Can you help to try this one?
> >>>
> >>> diff --git a/netdump.c b/netdump.c
> >>> index b4e2a5cb2037..b67bdad3c511 100644
> >>> --- a/netdump.c
> >>> +++ b/netdump.c
> >>> @@ -2768,7 +2768,8 @@ display_regs_from_elf_notes(int cpu, FILE *ofp)
> >>> }
> >>> }
> >>>
> >>> - if ((cpu - skipped_count) >= nd->num_prstatus_notes &&
> >>> + if (((cpu < 0 ) || (!nd->nt_prstatus_percpu[cpu]) ||
> >>> + (cpu - skipped_count) >= nd->num_prstatus_notes) &&
> >>> !machine_type("MIPS")) {
> >>> error(INFO, "registers not collected for cpu %d\n", cpu);
> >>> return;
> >>>
> >>> Lianbo
> >>>
> >>>>
> >>>> Thanks
> >>>> Guanyou.
> >>>>
> >>>> lijiang <lijiang(a)redhat.com> 于2024年11月26日周二 11:27写道:
> >>>>>
> >>>>> Hi, Guanyou
> >>>>> Thank you for the fix.
> >>>>> On Mon, Nov 4, 2024 at 4:13 PM
<devel-request(a)lists.crash-utility.osci.io> wrote:
> >>>>>>
> >>>>>> Date: Fri, 1 Nov 2024 18:01:27 +0800
> >>>>>> From: Guanyou Chen <chenguanyou9338(a)gmail.com>
> >>>>>> Subject: [Crash-utility] [PATCH] bugfix command "help
-r" segv fault
> >>>>>> To: Lianbo <lijiang(a)redhat.com>, Tao Liu
<ltao(a)redhat.com>,
> >>>>>> devel(a)lists.crash-utility.osci.io
> >>>>>> Message-ID:
> >>>>>>
<CAHS3RMU3nuiqW4z=Qo9RoufADrUxcaLhyjnxwMCuGODB_+37yQ(a)mail.gmail.com>
> >>>>>> Content-Type: multipart/mixed;
boundary="00000000000065fc530625d705b8"
> >>>>>>
> >>>>>> --00000000000065fc530625d705b8
> >>>>>> Content-Type: multipart/alternative;
boundary="00000000000065fc530625d705b6"
> >>>>>>
> >>>>>> --00000000000065fc530625d705b6
> >>>>>> Content-Type: text/plain; charset="UTF-8"
> >>>>>>
> >>>>>> Hi Lianbo, Tao
> >>>>>>
> >>>>>> When the ELF Note does not contain CPU registers,
> >>>>>> attempting to retrieve online CPU registers will cause a
crash.
> >>>>>>
> >>>>>> After:
> >>>>>> CPU 6:
> >>>>>> help: registers not collected for cpu 6
> >>>>>> ...
> >>>>>>
> >>>>>> Signed-off-by: Guanyou.Chen <chenguanyou(a)xiaomi.com>
> >>>>>> ---
> >>>>>> netdump.c | 16 ++++++++++++++++
> >>>>>> 1 file changed, 16 insertions(+)
> >>>>>>
> >>>>>> diff --git a/netdump.c b/netdump.c
> >>>>>> index 8ea5159..435793b 100644
> >>>>>> --- a/netdump.c
> >>>>>> +++ b/netdump.c
> >>>>>> @@ -2780,6 +2780,10 @@ display_regs_from_elf_notes(int cpu,
FILE *ofp)
> >>>>>
> >>>>>
> >>>>> I copied the code block here:
> >>>>> display_regs_from_elf_notes(int cpu, FILE *ofp)
> >>>>> {
> >>>>> Elf32_Nhdr *note32;
> >>>>> Elf64_Nhdr *note64;
> >>>>> size_t len;
> >>>>> char *user_regs;
> >>>>> int c, skipped_count;
> >>>>>
> >>>>> /*
> >>>>> * Kdump NT_PRSTATUS notes are only related to online
cpus,
> >>>>> * so offline cpus should be skipped.
> >>>>> */
> >>>>> if (pc->flags2 & QEMU_MEM_DUMP_ELF)
> >>>>> skipped_count = 0;
> >>>>> else {
> >>>>> for (c = skipped_count = 0; c < cpu; c++) {
> >>>>> if (check_offline_cpu(c))
> >>>>> skipped_count++;
> >>>>> }
> >>>>> }
> >>>>>
> >>>>> if ((cpu - skipped_count) >=
nd->num_prstatus_notes &&
> >>>>> !machine_type("MIPS")) {
> >>>>> error(INFO, "registers not collected for
cpu %d\n", cpu);
> >>>>> return;
> >>>>> }
> >>>>> ...
> >>>>> Could you please point out why the above check does not work?
> >>>>>
> >>>>> BTW: I'm not sure if it can work for you, can you help to
try this? Just a guess.
> >>>>>
> >>>>> if (((cpu < 0 ) || (!dd->nt_prstatus_percpu[cpu])
> >>>>> || (cpu - skipped_count) >=
nd->num_prstatus_notes) &&
> >>>>> !machine_type("MIPS")) {
> >>>>> error(INFO, "registers not collected for
cpu %d\n", cpu);
> >>>>> return;
> >>>>> }
> >>>>>
> >>>>> Thanks
> >>>>> Lianbo
> >>>>>
> >>>>>
> >>>>>> nd->nt_prstatus_percpu[cpu];
> >>>>>> else
> >>>>>> note64 = (Elf64_Nhdr
*)nd->nt_prstatus;
> >>>>>> + if (!note64) {
> >>>>>> + error(INFO, "registers not collected for
cpu %d\n", cpu);
> >>>>>> + return;
> >>>>>> + }
> >>>>>> len = sizeof(Elf64_Nhdr);
> >>>>>> len = roundup(len + note64->n_namesz, 4);
> >>>>>> len = roundup(len + note64->n_descsz, 4);
> >>>>>> @@ -2820,6 +2824,10 @@ display_regs_from_elf_notes(int cpu,
FILE *ofp)
> >>>>>> nd->nt_prstatus_percpu[cpu];
> >>>>>> else
> >>>>>> note32 = (Elf32_Nhdr
*)nd->nt_prstatus;
> >>>>>> + if (!note32) {
> >>>>>> + error(INFO, "registers not collected for
cpu %d\n", cpu);
> >>>>>> + return;
> >>>>>> + }
> >>>>>> len = sizeof(Elf32_Nhdr);
> >>>>>> len = roundup(len + note32->n_namesz, 4);
> >>>>>> len = roundup(len + note32->n_descsz, 4);
> >>>>>> @@ -2857,6 +2865,10 @@ display_regs_from_elf_notes(int cpu,
FILE *ofp)
> >>>>>> else
> >>>>>> note64 = (Elf64_Nhdr *)nd->nt_prstatus;
> >>>>>>
> >>>>>> + if (!note64) {
> >>>>>> + error(INFO, "registers not collected for
cpu %d\n", cpu);
> >>>>>> + return;
> >>>>>> + }
> >>>>>> prs = (struct ppc64_elf_prstatus *)
> >>>>>> ((char *)note64 + sizeof(Elf64_Nhdr) +
note64->n_namesz);
> >>>>>> prs = (struct ppc64_elf_prstatus
*)roundup((ulong)prs, 4);
> >>>>>> @@ -2903,6 +2915,10 @@ display_regs_from_elf_notes(int cpu,
FILE *ofp)
> >>>>>> nd->nt_prstatus_percpu[cpu];
> >>>>>> else
> >>>>>> note64 = (Elf64_Nhdr
*)nd->nt_prstatus;
> >>>>>> + if (!note64) {
> >>>>>> + error(INFO, "registers not collected for
cpu %d\n", cpu);
> >>>>>> + return;
> >>>>>> + }
> >>>>>> len = sizeof(Elf64_Nhdr);
> >>>>>> len = roundup(len + note64->n_namesz, 4);
> >>>>>> len = roundup(len + note64->n_descsz, 4);
> >>>>>> --
> >>>>>> 2.34.1
> >>>>>>
> >>>>>> Guanyou.
> >>>>>> Thanks
>