On Wed, Dec 11, 2024 at 1:28 PM Tao Liu <ltao(a)redhat.com> wrote:
 Hi Guanyou,
 On Wed, Dec 11, 2024 at 3:20 PM Guanyou Chen <chenguanyou9338(a)gmail.com>
 wrote:
 >
 > Hi Tao
 >
 > Thanks for your explanation.
 >
 > Test env:
 > WARNING: cpu 0: cannot find NT_PRSTATUS note
 > WARNING: cpu 1: cannot find NT_PRSTATUS note
 > WARNING: cpu 2: cannot find NT_PRSTATUS note
 > WARNING: cpu 3: cannot find NT_PRSTATUS note
 > WARNING: cpu 4: cannot find NT_PRSTATUS note
 > WARNING: cpu 5: cannot find NT_PRSTATUS note
 > WARNING: cpu 6: cannot find NT_PRSTATUS note
 > WARNING: cpu 7: cannot find NT_PRSTATUS note
 >       KERNEL: vmlinux  [TAINTED]
 >    DUMPFILES: /var/tmp/ramdump_elf_RabECD [temporary ELF header]
 >               DDRCS0_0.BIN
 >               DDRCS1_0.BIN
 >               DDRCS1_1.BIN
 >               DDRCS2_0.BIN
 >               DDRCS2_1.BIN
 >               DDRCS2_2.BIN
 >         CPUS: 8
 >
 > The non-elf-vmcore is composed of multiple DDR memory files.
 Hmm I see... In this case the ELF header is constructed in
 ramdump_to_elf(), so it is reasonable that the NT_PRSTATUS note is
 missing. I don't have further questions for this patch, and Ack for
 the code part.
 As for the commit log, since it lacks of some useful info, how about
 we re-draft it as:
 ----
 When the ELF note does not contain CPU registers, attempting to
 retrieve online CPU registers will cause a crash. This is likely to
 happen for ramdump cases as follows, where the ELF header is created
 by crash at runtime.
  WARNING: cpu 0: cannot find NT_PRSTATUS note
  ...
  WARNING: cpu 7: cannot find NT_PRSTATUS note
        KERNEL: vmlinux  [TAINTED]
     DUMPFILES: /var/tmp/ramdump_elf_RabECD [temporary ELF header]
                DDRCS0_0.BIN
                ...
                DDRCS2_2.BIN
          CPUS: 8
 ----
 What do you think? @Lianbo Jiang
 
Looks good to me. Applied:
https://github.com/crash-utility/crash/commit/aa9f7248075c701e565b59ec2eb...
 Thanks,
 Tao Liu
 >
 > Thanks
 > Guanyou
 >
 > Tao Liu <ltao(a)redhat.com> 于2024年12月11日周三 07:12写道:
 >>
 >> Hi Guanyou,
 >>
 >> On Tue, Dec 10, 2024 at 7:35 PM lijiang <lijiang(a)redhat.com> wrote:
 >> >
 >> > On Wed, Nov 27, 2024 at 11:09 AM Guanyou Chen <
 chenguanyou9338(a)gmail.com> wrote:
 >> >>
 >> >> Hi  lianbo
 >> >>
 >> >> > Thanks for pointing out this. Can you help to try this one?
 >> >>
 >> >> test pass.
 >> >>
 >> >
 >> > Thank you for the confirmation, Guanyou.
 >> >
 >> > Lets pick up this one. so: Ack(with this correction)
 >> >
 >> > Thanks
 >> > Lianbo
 >> >
 >> >>
 >> >> Thanks
 >> >> Guanyou
 >> >>
 >> >> lijiang <lijiang(a)redhat.com> 于2024年11月27日周三 10:16写道:
 >> >>>
 >> >>> On Tue, Nov 26, 2024 at 11:46 AM Guanyou Chen <
 chenguanyou9338(a)gmail.com> wrote:
 >> >>>>
 >> >>>> Hi  lianbo
 >> >>>>
 >> >>>> test case is non-elf-vmcore,  so all nt_prstatus_percpu
invalid
 pointer.
 >>
 >> I have a question, if your test case is non-elf-vmcore, then why did
 >> the code execution reach the function "display_regs_from_elf_notes()"
 >> here? It shouldn't pass the is_netdump() check for ELF header in the
 >> first place, am I right?
 >>
 >> How did you generate the non-elf-vmcore? If you generate it by some
 >> special method(those not generated by makedumpfile), please also point
 >> it out in your commit message at first, so we can know it is a special
 >> case.
 >>
 >> In addition, we are lack of none-standard vmcores(none makedumpfile
 >> generated) for testing, and I know in the arm world there are
 >> different methods for getting the kernel memory dumping, which by the
 >> way, are not our expertise. So please try to provide as much info as
 >> possible to help us understand your test case/environment in your
 >> commit message next time.
 >>
 >> Thanks,
 >> Tao Liu
 >>
 >>
 >> >>>>
 >> >>>
 >> >>> Thanks for pointing out this. Can you help to try this one?
 >> >>>
 >> >>> diff --git a/netdump.c b/netdump.c
 >> >>> index b4e2a5cb2037..b67bdad3c511 100644
 >> >>> --- a/netdump.c
 >> >>> +++ b/netdump.c
 >> >>> @@ -2768,7 +2768,8 @@ display_regs_from_elf_notes(int cpu, FILE
 *ofp)
 >> >>>   }
 >> >>>   }
 >> >>>
 >> >>> - if ((cpu - skipped_count) >= nd->num_prstatus_notes
&&
 >> >>> + if (((cpu < 0 ) || (!nd->nt_prstatus_percpu[cpu]) ||
 >> >>> + (cpu - skipped_count) >= nd->num_prstatus_notes)
&&
 >> >>>       !machine_type("MIPS")) {
 >> >>>   error(INFO, "registers not collected for cpu %d\n",
cpu);
 >> >>>   return;
 >> >>>
 >> >>> Lianbo
 >> >>>
 >> >>>>
 >> >>>> Thanks
 >> >>>> Guanyou.
 >> >>>>
 >> >>>> lijiang <lijiang(a)redhat.com> 于2024年11月26日周二 11:27写道:
 >> >>>>>
 >> >>>>> Hi, Guanyou
 >> >>>>> Thank you for the fix.
 >> >>>>> On Mon, Nov 4, 2024 at 4:13 PM <
 devel-request(a)lists.crash-utility.osci.io> wrote:
 >> >>>>>>
 >> >>>>>> Date: Fri, 1 Nov 2024 18:01:27 +0800
 >> >>>>>> From: Guanyou Chen <chenguanyou9338(a)gmail.com>
 >> >>>>>> Subject: [Crash-utility] [PATCH] bugfix command
"help -r" segv
 fault
 >> >>>>>> To: Lianbo <lijiang(a)redhat.com>, Tao Liu
<ltao(a)redhat.com>,
 >> >>>>>>         devel(a)lists.crash-utility.osci.io
 >> >>>>>> Message-ID:
 >> >>>>>>         <CAHS3RMU3nuiqW4z=
 Qo9RoufADrUxcaLhyjnxwMCuGODB_+37yQ(a)mail.gmail.com>
 >> >>>>>> Content-Type: multipart/mixed;
 boundary="00000000000065fc530625d705b8"
 >> >>>>>>
 >> >>>>>> --00000000000065fc530625d705b8
 >> >>>>>> Content-Type: multipart/alternative;
 boundary="00000000000065fc530625d705b6"
 >> >>>>>>
 >> >>>>>> --00000000000065fc530625d705b6
 >> >>>>>> Content-Type: text/plain; charset="UTF-8"
 >> >>>>>>
 >> >>>>>> Hi Lianbo, Tao
 >> >>>>>>
 >> >>>>>> When the ELF Note does not contain CPU registers,
 >> >>>>>> attempting to retrieve online CPU registers will cause
a crash.
 >> >>>>>>
 >> >>>>>> After:
 >> >>>>>> CPU 6:
 >> >>>>>> help: registers not collected for cpu 6
 >> >>>>>> ...
 >> >>>>>>
 >> >>>>>> Signed-off-by: Guanyou.Chen
<chenguanyou(a)xiaomi.com>
 >> >>>>>> ---
 >> >>>>>>  netdump.c | 16 ++++++++++++++++
 >> >>>>>>  1 file changed, 16 insertions(+)
 >> >>>>>>
 >> >>>>>> diff --git a/netdump.c b/netdump.c
 >> >>>>>> index 8ea5159..435793b 100644
 >> >>>>>> --- a/netdump.c
 >> >>>>>> +++ b/netdump.c
 >> >>>>>> @@ -2780,6 +2780,10 @@ display_regs_from_elf_notes(int
cpu, FILE
 *ofp)
 >> >>>>>
 >> >>>>>
 >> >>>>> I copied the code block here:
 >> >>>>> display_regs_from_elf_notes(int cpu, FILE *ofp)
 >> >>>>> {
 >> >>>>>         Elf32_Nhdr *note32;
 >> >>>>>         Elf64_Nhdr *note64;
 >> >>>>>         size_t len;
 >> >>>>>         char *user_regs;
 >> >>>>>         int c, skipped_count;
 >> >>>>>
 >> >>>>>         /*
 >> >>>>>          * Kdump NT_PRSTATUS notes are only related to
online
 cpus,
 >> >>>>>          * so offline cpus should be skipped.
 >> >>>>>          */
 >> >>>>>         if (pc->flags2 & QEMU_MEM_DUMP_ELF)
 >> >>>>>                 skipped_count = 0;
 >> >>>>>         else {
 >> >>>>>                 for (c = skipped_count = 0; c < cpu;
c++) {
 >> >>>>>                         if (check_offline_cpu(c))
 >> >>>>>                                 skipped_count++;
 >> >>>>>                 }
 >> >>>>>         }
 >> >>>>>
 >> >>>>>         if ((cpu - skipped_count) >=
nd->num_prstatus_notes &&
 >> >>>>>              !machine_type("MIPS")) {
 >> >>>>>                 error(INFO, "registers not collected
for cpu
 %d\n", cpu);
 >> >>>>>                 return;
 >> >>>>>         }
 >> >>>>> ...
 >> >>>>> Could you please point out why the above check does not
work?
 >> >>>>>
 >> >>>>> BTW: I'm not sure if it can work for you, can you help
to try
 this? Just a guess.
 >> >>>>>
 >> >>>>>         if (((cpu < 0 ) ||
(!dd->nt_prstatus_percpu[cpu])
 >> >>>>>              || (cpu - skipped_count) >=
nd->num_prstatus_notes)
 &&
 >> >>>>>              !machine_type("MIPS")) {
 >> >>>>>                 error(INFO, "registers not collected
for cpu
 %d\n", cpu);
 >> >>>>>                 return;
 >> >>>>>         }
 >> >>>>>
 >> >>>>> Thanks
 >> >>>>> Lianbo
 >> >>>>>
 >> >>>>>
 >> >>>>>>                 nd->nt_prstatus_percpu[cpu];
 >> >>>>>>         else
 >> >>>>>>                     note64 = (Elf64_Nhdr
*)nd->nt_prstatus;
 >> >>>>>> +       if (!note64) {
 >> >>>>>> +           error(INFO, "registers not collected
for cpu %d\n",
 cpu);
 >> >>>>>> +           return;
 >> >>>>>> +       }
 >> >>>>>>         len = sizeof(Elf64_Nhdr);
 >> >>>>>>         len = roundup(len + note64->n_namesz, 4);
 >> >>>>>>         len = roundup(len + note64->n_descsz, 4);
 >> >>>>>> @@ -2820,6 +2824,10 @@ display_regs_from_elf_notes(int
cpu, FILE
 *ofp)
 >> >>>>>>                 nd->nt_prstatus_percpu[cpu];
 >> >>>>>>         else
 >> >>>>>>                     note32 = (Elf32_Nhdr
*)nd->nt_prstatus;
 >> >>>>>> +       if (!note32) {
 >> >>>>>> +           error(INFO, "registers not collected
for cpu %d\n",
 cpu);
 >> >>>>>> +           return;
 >> >>>>>> +       }
 >> >>>>>>         len = sizeof(Elf32_Nhdr);
 >> >>>>>>         len = roundup(len + note32->n_namesz, 4);
 >> >>>>>>         len = roundup(len + note32->n_descsz, 4);
 >> >>>>>> @@ -2857,6 +2865,10 @@ display_regs_from_elf_notes(int
cpu, FILE
 *ofp)
 >> >>>>>>         else
 >> >>>>>>             note64 = (Elf64_Nhdr *)nd->nt_prstatus;
 >> >>>>>>
 >> >>>>>> +       if (!note64) {
 >> >>>>>> +           error(INFO, "registers not collected
for cpu %d\n",
 cpu);
 >> >>>>>> +           return;
 >> >>>>>> +       }
 >> >>>>>>         prs = (struct ppc64_elf_prstatus *)
 >> >>>>>>             ((char *)note64 + sizeof(Elf64_Nhdr) +
 note64->n_namesz);
 >> >>>>>>         prs = (struct ppc64_elf_prstatus
*)roundup((ulong)prs,
 4);
 >> >>>>>> @@ -2903,6 +2915,10 @@ display_regs_from_elf_notes(int
cpu, FILE
 *ofp)
 >> >>>>>>                 nd->nt_prstatus_percpu[cpu];
 >> >>>>>>         else
 >> >>>>>>                     note64 = (Elf64_Nhdr
*)nd->nt_prstatus;
 >> >>>>>> +       if (!note64) {
 >> >>>>>> +           error(INFO, "registers not collected
for cpu %d\n",
 cpu);
 >> >>>>>> +           return;
 >> >>>>>> +       }
 >> >>>>>>         len = sizeof(Elf64_Nhdr);
 >> >>>>>>         len = roundup(len + note64->n_namesz, 4);
 >> >>>>>>         len = roundup(len + note64->n_descsz, 4);
 >> >>>>>> --
 >> >>>>>> 2.34.1
 >> >>>>>>
 >> >>>>>> Guanyou.
 >> >>>>>> Thanks
 >>