November 2024 - Crash-utility - Crash Utility List Archives

by Guanyou Chen

Hi Lianbo, Tao,

Is there a plan to upgrade the version of GDB? GDB-10.2 doesn't seem to support "--with-zstd". Error:

BFD: /xxx/vmlinux: unable to initialize decompress status for section .debug_aranges

Thanks,
Guanyou
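For readers hitting the same BFD error, one way to confirm that a debug section really is zstd-compressed is to look at the compression header that the ELF gABI places at the start of every SHF_COMPRESSED section. The sketch below is only an illustration and is not crash or GDB code: the helper name section_compression() and the SKETCH_ constants are made up for this example, the structure mirrors Elf64_Chdr, and it assumes a 64-bit ELF file whose byte order matches the host.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Values defined by the ELF gABI (names prefixed to avoid clashing with <elf.h>). */
#define SKETCH_SHF_COMPRESSED   0x800u
#define SKETCH_ELFCOMPRESS_ZLIB 1u
#define SKETCH_ELFCOMPRESS_ZSTD 2u

/* Same layout as Elf64_Chdr; redefined so the sketch does not need <elf.h>. */
struct sketch_chdr64 {
    uint32_t ch_type;       /* compression algorithm */
    uint32_t ch_reserved;
    uint64_t ch_size;       /* uncompressed size */
    uint64_t ch_addralign;  /* uncompressed alignment */
};

/*
 * Hypothetical helper: given a section's sh_flags and its raw bytes,
 * report how the section is compressed.
 */
static const char *section_compression(uint64_t sh_flags,
                                       const unsigned char *data, size_t len)
{
    struct sketch_chdr64 hdr;

    if (!(sh_flags & SKETCH_SHF_COMPRESSED))
        return "none";
    if (data == NULL || len < sizeof(hdr))
        return "truncated";

    memcpy(&hdr, data, sizeof(hdr));    /* assumes host byte order */
    switch (hdr.ch_type) {
    case SKETCH_ELFCOMPRESS_ZLIB:
        return "zlib";
    case SKETCH_ELFCOMPRESS_ZSTD:
        return "zstd";
    default:
        return "unknown";
    }
}
```

If .debug_aranges reports "zstd" here, the section can only be read by a BFD/GDB that was built with zstd support, which is what the question above is about.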

1 year, 5 months


Re: [PATCH] bugfix command "help -r" segv fault

by lijiang

Hi, Guanyou

Thank you for the fix.

On Mon, Nov 4, 2024 at 4:13 PM <devel-request(a)lists.crash-utility.osci.io> wrote:

> Date: Fri, 1 Nov 2024 18:01:27 +0800
> From: Guanyou Chen <chenguanyou9338(a)gmail.com>
> Subject: [Crash-utility] [PATCH] bugfix command "help -r" segv fault
> To: Lianbo <lijiang(a)redhat.com>, Tao Liu <ltao(a)redhat.com>,
>     devel(a)lists.crash-utility.osci.io
> Message-ID: <CAHS3RMU3nuiqW4z=Qo9RoufADrUxcaLhyjnxwMCuGODB_+37yQ(a)mail.gmail.com>
>
> Hi Lianbo, Tao
>
> When the ELF Note does not contain CPU registers,
> attempting to retrieve online CPU registers will cause a crash.
>
> After:
> CPU 6:
> help: registers not collected for cpu 6
> ...
>
> Signed-off-by: Guanyou.Chen <chenguanyou(a)xiaomi.com>
> ---
>  netdump.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/netdump.c b/netdump.c
> index 8ea5159..435793b 100644
> --- a/netdump.c
> +++ b/netdump.c
> @@ -2780,6 +2780,10 @@ display_regs_from_elf_notes(int cpu, FILE *ofp)
>

I copied the code block here:

display_regs_from_elf_notes(int cpu, FILE *ofp)
{
    Elf32_Nhdr *note32;
    Elf64_Nhdr *note64;
    size_t len;
    char *user_regs;
    int c, skipped_count;

    /*
     * Kdump NT_PRSTATUS notes are only related to online cpus,
     * so offline cpus should be skipped.
     */
    if (pc->flags2 & QEMU_MEM_DUMP_ELF)
        skipped_count = 0;
    else {
        for (c = skipped_count = 0; c < cpu; c++) {
            if (check_offline_cpu(c))
                skipped_count++;
        }
    }

    if ((cpu - skipped_count) >= nd->num_prstatus_notes &&
        !machine_type("MIPS")) {
        error(INFO, "registers not collected for cpu %d\n", cpu);
        return;
    }
    ...

Could you please point out why the above check does not work?

BTW: I'm not sure if it can work for you, can you help to try this? Just a guess.

    if (((cpu < 0 ) || (!dd->nt_prstatus_percpu[cpu]) ||
        (cpu - skipped_count) >= nd->num_prstatus_notes) &&
        !machine_type("MIPS")) {
        error(INFO, "registers not collected for cpu %d\n", cpu);
        return;
    }

Thanks
Lianbo

>                  nd->nt_prstatus_percpu[cpu];
>          else
>              note64 = (Elf64_Nhdr *)nd->nt_prstatus;
> +        if (!note64) {
> +            error(INFO, "registers not collected for cpu %d\n", cpu);
> +            return;
> +        }
>          len = sizeof(Elf64_Nhdr);
>          len = roundup(len + note64->n_namesz, 4);
>          len = roundup(len + note64->n_descsz, 4);
> @@ -2820,6 +2824,10 @@ display_regs_from_elf_notes(int cpu, FILE *ofp)
>                  nd->nt_prstatus_percpu[cpu];
>          else
>              note32 = (Elf32_Nhdr *)nd->nt_prstatus;
> +        if (!note32) {
> +            error(INFO, "registers not collected for cpu %d\n", cpu);
> +            return;
> +        }
>          len = sizeof(Elf32_Nhdr);
>          len = roundup(len + note32->n_namesz, 4);
>          len = roundup(len + note32->n_descsz, 4);
> @@ -2857,6 +2865,10 @@ display_regs_from_elf_notes(int cpu, FILE *ofp)
>          else
>              note64 = (Elf64_Nhdr *)nd->nt_prstatus;
>
> +        if (!note64) {
> +            error(INFO, "registers not collected for cpu %d\n", cpu);
> +            return;
> +        }
>          prs = (struct ppc64_elf_prstatus *)
>              ((char *)note64 + sizeof(Elf64_Nhdr) + note64->n_namesz);
>          prs = (struct ppc64_elf_prstatus *)roundup((ulong)prs, 4);
> @@ -2903,6 +2915,10 @@ display_regs_from_elf_notes(int cpu, FILE *ofp)
>                  nd->nt_prstatus_percpu[cpu];
>          else
>              note64 = (Elf64_Nhdr *)nd->nt_prstatus;
> +        if (!note64) {
> +            error(INFO, "registers not collected for cpu %d\n", cpu);
> +            return;
> +        }
>          len = sizeof(Elf64_Nhdr);
>          len = roundup(len + note64->n_namesz, 4);
>          len = roundup(len + note64->n_descsz, 4);
> --
> 2.34.1
>
> Guanyou.
> Thanks
>
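One possible answer to the question above, sketched under assumptions: if num_prstatus_notes is a high-water mark (map_cpus_to_prstatus() sets it with MAX(..., i+1)), a CPU below that mark can still have an empty per-CPU slot, so the count-based check passes while the note pointer is NULL. The code below is a stand-alone mock, not crash code; struct mock_netdump, fill_with_gap() and both check helpers are hypothetical names, and the "CPU 6 has no note" layout is chosen only to match the "After" output in the patch description.

```c
#include <stddef.h>

#define MOCK_MAX_CPUS 8

/* Hypothetical stand-in for the two netdump fields involved. */
struct mock_netdump {
    void *nt_prstatus_percpu[MOCK_MAX_CPUS];  /* NULL: no note mapped */
    int num_prstatus_notes;                   /* highest mapped cpu + 1 */
};

static int mock_note;  /* any non-NULL address serves as a "note" */

/* Map a note for every CPU except missing_cpu (assumed scenario). */
static void fill_with_gap(struct mock_netdump *nd, int missing_cpu)
{
    int i;

    nd->num_prstatus_notes = 0;
    for (i = 0; i < MOCK_MAX_CPUS; i++) {
        nd->nt_prstatus_percpu[i] = NULL;
        if (i == missing_cpu)
            continue;
        nd->nt_prstatus_percpu[i] = &mock_note;
        if (i + 1 > nd->num_prstatus_notes)
            nd->num_prstatus_notes = i + 1;
    }
}

/* Count-based check from the quoted function: nonzero means "not collected". */
static int count_check_rejects(const struct mock_netdump *nd,
                               int cpu, int skipped_count)
{
    return (cpu - skipped_count) >= nd->num_prstatus_notes;
}

/* Pointer-based check in the spirit of the patch: nonzero means "not collected". */
static int pointer_check_rejects(const struct mock_netdump *nd, int cpu)
{
    return cpu < 0 || cpu >= MOCK_MAX_CPUS ||
           nd->nt_prstatus_percpu[cpu] == NULL;
}
```

With fill_with_gap(&nd, 6), num_prstatus_notes is 8, so count_check_rejects(&nd, 6, 0) lets CPU 6 through, while pointer_check_rejects(&nd, 6) catches the empty slot before it is dereferenced. Whether the real dump looks like this is an assumption, not something the thread confirms.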

1 year, 5 months


Re: [PATCH] remove offline status check for CPU register map

by lijiang

On Mon, Nov 4, 2024 at 4:13 PM <devel-request(a)lists.crash-utility.osci.io> wrote:

> Date: Fri, 1 Nov 2024 20:35:32 +0800
> From: Guanyou Chen <chenguanyou9338(a)gmail.com>
> Subject: [Crash-utility] [PATCH] remove offline status check for CPU register map
> To: Lianbo <lijiang(a)redhat.com>, Tao Liu <ltao(a)redhat.com>,
>     devel(a)lists.crash-utility.osci.io
> Message-ID: <CAHS3RMV5tzd2cHR+zniv-39QZE2idjQjXLytFXv5=mneizbw5Q(a)mail.gmail.com>
>
> Hi Lianbo, Tao
>
> Remove offline status check, We can query the registers of
> each CPU at any time and obtain their stack.
>
> CPU 0: [OFFLINE]
> X0: 0000000000000000 X1: 0000000000000000 X2: 0000000000000000
> X3: 000000000003fcbc X4: 0000000000000001 X5: 0000000000000000
> X6: 0000000000000000 X7: 0000000000000000 X8: 00000000ffffffff
> X9: ffffffc009e6ae48 X10: ffffffc009e6ae20 X11: 0000000000000000
> X12: 0000000000000002 X13: 0000000000000004 X14: 0000000000000000
> X15: 0000000000004000 X16: 00000000f90f05f6 X17: 00000000f90f05f6
> X18: 0000000000000000 X19: 0000000000000002 X20: ffffffc009e3b008
> X21: ffffffc00a01d020 X22: ffffffc009f798f0 X23: 0000000060001000
> X24: 0000000000000000 X25: 0000000000000000 X26: 0000000000000000
> X27: 0000000000000000 X28: ffffff8111eecb00 X29: ffffffc008003f50
> LR: ffffffc00802df88 SP: ffffffc008003f40 PC: ffffffc00802df94
> PSTATE: 024003c5 FPVALID: 00000000
>
> crash> bt -c 0
> PID: 1842 TASK: ffffff8111eecb00 CPU: 0 COMMAND: "android.bg"
> 00 [ffffffc008003f50] ipi_handler at ffffffc00802df90
> 01 [ffffffc008003f90] handle_percpu_devid_irq at ffffffc008146f50
> 02 [ffffffc008003fd0] generic_handle_domain_irq at ffffffc00813f484
> 03 [ffffffc008003fe0] gic_handle_irq at ffffffc008010140
> --- <IRQ stack> ---
> 04 [ffffffc019c3be20] call_on_irq_stack at ffffffc008016ed4
> 05 [ffffffc019c3be40] do_interrupt_handler at ffffffc008019cb4
> 06 [ffffffc019c3be60] el0_interrupt at ffffffc008f7b848
> 07 [ffffffc019c3be90] __el0_irq_handler_common at ffffffc008f7b368
> 08 [ffffffc019c3bea0] el0t_64_irq_handler at ffffffc008f7b344
> 09 [ffffffc019c3bfe0] el0t_64_irq at ffffffc008011720
> PC: 0000000072415108 LR: 00000000724150d0 SP: 0000007691d2bfa0
> X29: 00000000734f60e0 X28: 000000001a2fa678 X27: 0000000000000063
> X26: 000000001a2fa678 X25: 000000001a2fa678 X24: 000000001a7bb718
> X23: 000000001a7ba198 X22: 000000001a7ba190 X21: b4000076f9a828c8
> X20: 0000000000000000 X19: b4000076f9a82800 X18: 000000768d68a000
> X17: 00000000708f89f8 X16: 00000000000000f0 X15: 0000000000000000
> X14: 0000007691d2bca0 X13: 0000000080100000 X12: 0000000000000000
> X11: 0000000000000000 X10: 0000000000000000 X9: 9636716211228cd4
> X8: 9636716211228cd4 X7: 0000000000000010 X6: 000000001a7bb728
> X5: 0000000070845200 X4: 0000000018a40d38 X3: 00000000707e8f98
> X2: 000000001a2fa678 X1: 000000001a7ba198 X0: 0000000070847aa8
> ORIG_X0: 00000000ffffff9c SYSCALLNO: ffffffff PSTATE: 60001000
>
> Signed-off-by: Guanyou.Chen <chenguanyou(a)xiaomi.com>
> ---
>  netdump.c | 15 +++++----------
>  1 file changed, 5 insertions(+), 10 deletions(-)
>
> diff --git a/netdump.c b/netdump.c
> index 435793b..455f90e 100644
> --- a/netdump.c
> +++ b/netdump.c
> @@ -101,7 +101,7 @@ map_cpus_to_prstatus(void)
>      nrcpus = (kt->kernel_NR_CPUS ? kt->kernel_NR_CPUS : NR_CPUS);
>
>      for (i = 0; i < nrcpus; i++) {
> -        if (in_cpu_map(ONLINE_MAP, i) && machdep->is_cpu_prstatus_valid(i)) {
>

Checking online cpus is meaningful, the current modification seems unreasonable :-)

Please refer to this commit: d5b362edf7d5.

Thanks
Lianbo

> +        if (machdep->is_cpu_prstatus_valid(i)) {
>              nd->nt_prstatus_percpu[i] = nt_ptr[i];
>              nd->num_prstatus_notes =
>                  MAX(nd->num_prstatus_notes, i+1);
> @@ -2998,15 +2998,10 @@ dump_registers_for_elf_dumpfiles(void)
>          return;
>      }
>
> -    for (c = 0; c < kt->cpus; c++) {
> -        if (check_offline_cpu(c)) {
> -            fprintf(fp, "%sCPU %d: [OFFLINE]\n", c ? "\n" : "", c);
> -            continue;
> -        }
> -
> -        fprintf(fp, "%sCPU %d:\n", c ? "\n" : "", c);
> -        display_regs_from_elf_notes(c, fp);
> -    }
> +    for (c = 0; c < kt->cpus; c++) {
> +        fprintf(fp, "%sCPU %d: %s\n", c ? "\n" : "", c, check_offline_cpu(c) ? "[OFFLINE]" : "[ONLINE]");
> +        display_regs_from_elf_notes(c, fp);
> +    }
>  }
>
>  struct x86_64_user_regs_struct {
> --
> 2.34.1
>
> Guanyou.
> Thanks.
>

1 year, 6 months


[PATCH] bugfix map cpus register

by Guanyou Chen

Hi Lianbo, Tao

When CPUs are in an offline state, it can lead to mapping errors.
We need to map them to the correct positions one by one.

Before:
n_namesz: 5 ("CPU2")
n_descsz: 392
n_type: 1 (NT_PRSTATUS)
    si.signo: 0 si.code: 0 si.errno: 0
    cursig: 0 sigpend: 0 sighold: 0
    pid: 3 ppid: 0 pgrp: 0 sid:0
    utime: 0.000000 stime: 0.000000
    cutime: 0.000000 cstime: 0.000000
    X0: ffffffc000fc8818 X1: 0000000000000000 X2: ffffffc000fc84c8
    X3: 0000000000000000 X4: ffffffc0405e37bf X5: ffffffc00a07372f
    X6: 322e34323320205b X7: 545b5d3539383334 X8: ffffffc000fc2f0c
    X9: 89fece0a9ef8cb00 X10: c0000001001f75f4 X11: 00000001001f75f4
    X12: 0000000000000003 X13: 00000000000005f4 X14: ffffffc009eb1210
    X15: 0000000000000004 X16: 000000002a4cec24 X17: 000000002a4cec24
    X18: ffffffc009e7d140 X19: ffffffc00a04c670 X20: 0000000000000000
    X21: 0000000000000000 X22: ffffff8027f22280 X23: 0000000000000009
    X24: 0000000000000007 X25: ffffffc009f839c0 X26: ffffffc0090f87f8
    X27: 0000000000000000 X28: ffffff80454f3840 X29: ffffffc0405e3b60
    LR: ffffffc0080e57fc SP: ffffffc0405e3b60 PC: ffffffc000fc2f84

CPU 0: [OFFLINE]

CPU 1: [OFFLINE]

CPU 2:
X0: 0000000000000000 X1: 0000000000000000 X2: 0000000000000000
X3: 000000000003fcbc X4: 0000000000000001 X5: 0000000000000000
X6: 0000000000000000 X7: 0000000000000000 X8: 00000000ffffffff
X9: ffffffc009e6ae48 X10: ffffffc009e6ae20 X11: 0000000000000000
X12: 0000000000000002 X13: 0000000000000004 X14: 0000000000000000
X15: 0000000000004000 X16: 00000000f90f05f6 X17: 00000000f90f05f6
X18: 0000000000000000 X19: 0000000000000002 X20: ffffffc009e3b008
X21: ffffffc00a01d020 X22: ffffffc009f798f0 X23: 0000000060001000
X24: 0000000000000000 X25: 0000000000000000 X26: 0000000000000000
X27: 0000000000000000 X28: ffffff8111eecb00 X29: ffffffc008003f50
LR: ffffffc00802df88 SP: ffffffc008003f40 PC: ffffffc00802df94
PSTATE: 024003c5 FPVALID: 00000000

After:
CPU 2:
X0: ffffffc000fc8818 X1: 0000000000000000 X2: ffffffc000fc84c8
X3: 0000000000000000 X4: ffffffc0405e37bf X5: ffffffc00a07372f
X6: 322e34323320205b X7: 545b5d3539383334 X8: ffffffc000fc2f0c
X9: 89fece0a9ef8cb00 X10: c0000001001f75f4 X11: 00000001001f75f4
X12: 0000000000000003 X13: 00000000000005f4 X14: ffffffc009eb1210
X15: 0000000000000004 X16: 000000002a4cec24 X17: 000000002a4cec24
X18: ffffffc009e7d140 X19: ffffffc00a04c670 X20: 0000000000000000
X21: 0000000000000000 X22: ffffff8027f22280 X23: 0000000000000009
X24: 0000000000000007 X25: ffffffc009f839c0 X26: ffffffc0090f87f8
X27: 0000000000000000 X28: ffffff80454f3840 X29: ffffffc0405e3b60
LR: ffffffc0080e57fc SP: ffffffc0405e3b60 PC: ffffffc000fc2f84
PSTATE: 600000c5 FPVALID: 00000000

crash> bt
PID: 15959 TASK: ffffff80454f3840 CPU: 2 COMMAND: "AnrConsumer"
[ffffffc0405e3b60] ipanic at ffffffc000fc2f80 [mrdump]
[ffffffc0405e3b70] atomic_notifier_call_chain at ffffffc0080e57f8
[ffffffc0405e3c30] panic at ffffffc008f734d0
[ffffffc0405e3c80] sysrq_handle_crash at ffffffc0087f3c18
[ffffffc0405e3c90] __handle_sysrq at ffffffc0087f3798
[ffffffc0405e3ce0] write_sysrq_trigger at ffffffc0087f49c0
[ffffffc0405e3d00] proc_reg_write at ffffffc00842e4b8
[ffffffc0405e3d80] vfs_write at ffffffc008381eb4
[ffffffc0405e3dd0] ksys_write at ffffffc008382200
[ffffffc0405e3e10] __arm64_sys_write at ffffffc00838228c
[ffffffc0405e3e20] invoke_syscall at ffffffc00802efe0
[ffffffc0405e3e40] el0_svc_common at ffffffc00802eef4
[ffffffc0405e3e70] do_el0_svc at ffffffc00802ede8
[ffffffc0405e3e80] el0_svc at ffffffc008f7a7d0
[ffffffc0405e3ea0] el0t_64_sync_handler at ffffffc008f7a758
[ffffffc0405e3fe0] el0t_64_sync at ffffffc00801157c
PC: 00000077c798ca28 LR: 00000077a82e19f4 SP: 000000761c517af0
X29: 000000761c517b00 X28: 000000761c517db8 X27: 000000761c517c90
X26: 000000761c517c98 X25: 000000761c517bf9 X24: 000000761c519000
X23: 000000761c517be1 X22: 0000000000000001 X21: 00000000000003e3
X20: 000000761c517c11 X19: 000000761c517bf8 X18: 0000007568224000
X17: 00000077c798ca20 X16: 00000077c79b2ae0 X15: b4000077202cc480
X14: 0000000000000000 X13: 000000761c517a70 X12: ffffff80ffffffd0
X11: 000000761c517a40 X10: 0000000000000001 X9: 0000000000000000
X8: 0000000000000040 X7: 7f7f7f7f7f7f7f7f X6: 0000000000000010
X5: 000000761c517c0c X4: ffffffffffffffff X3: ffffffffffffffff
X2: 0000000000000001 X1: 000000761c517c11 X0: 00000000000003e3
ORIG_X0: 00000000000003e3 SYSCALLNO: 40 PSTATE: 00001000

Signed-off-by: Guanyou.Chen <chenguanyou(a)xiaomi.com>
---
 netdump.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/netdump.c b/netdump.c
index b4e2a5c..8ea5159 100644
--- a/netdump.c
+++ b/netdump.c
@@ -75,7 +75,7 @@ void
 map_cpus_to_prstatus(void)
 {
     void **nt_ptr;
-    int online, i, j, nrcpus;
+    int online, i, nrcpus;
     size_t size;

     if (pc->flags2 & QEMU_MEM_DUMP_ELF) /* notes exist for all cpus */
@@ -100,9 +100,9 @@ map_cpus_to_prstatus(void)
      */
     nrcpus = (kt->kernel_NR_CPUS ? kt->kernel_NR_CPUS : NR_CPUS);

-    for (i = 0, j = 0; i < nrcpus; i++) {
+    for (i = 0; i < nrcpus; i++) {
         if (in_cpu_map(ONLINE_MAP, i) && machdep->is_cpu_prstatus_valid(i)) {
-            nd->nt_prstatus_percpu[i] = nt_ptr[j++];
+            nd->nt_prstatus_percpu[i] = nt_ptr[i];
             nd->num_prstatus_notes =
                 MAX(nd->num_prstatus_notes, i+1);
         }
--
2.34.1

Guanyou.
Thanks.
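The effect of replacing nt_ptr[j++] with nt_ptr[i] can be reproduced in a few lines. The sketch below is a stand-alone mock rather than crash code: map_compact() and map_by_index() are hypothetical names for the old and new loop bodies, the is_cpu_prstatus_valid() test is left out, and it assumes (as the patch implies for this dump) that the note array has one slot per CPU number, including offline CPUs.

```c
#include <stddef.h>

#define SKETCH_NR_CPUS 4

/* Old loop: notes are handed out in order, so offline CPUs shift the mapping. */
static void map_compact(void *const nt_ptr[], const int online[],
                        void *percpu[], int nrcpus)
{
    int i, j;

    for (i = 0, j = 0; i < nrcpus; i++) {
        percpu[i] = NULL;
        if (online[i])
            percpu[i] = nt_ptr[j++];
    }
}

/* Patched loop: note i always belongs to CPU i. */
static void map_by_index(void *const nt_ptr[], const int online[],
                         void *percpu[], int nrcpus)
{
    int i;

    for (i = 0; i < nrcpus; i++)
        percpu[i] = online[i] ? nt_ptr[i] : NULL;
}
```

With CPUs 0 and 1 offline and CPU 2 online, map_compact() gives CPU 2 the note stored in slot 0, which matches the "Before" output where CPU 2 shows another CPU's registers; map_by_index() gives CPU 2 the note from slot 2.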

1 year, 6 months


[PATCH] remove offline status check for CPU register map

by Guanyou Chen

Hi Lianbo, Tao

Remove offline status check, We can query the registers of
each CPU at any time and obtain their stack.

CPU 0: [OFFLINE]
X0: 0000000000000000 X1: 0000000000000000 X2: 0000000000000000
X3: 000000000003fcbc X4: 0000000000000001 X5: 0000000000000000
X6: 0000000000000000 X7: 0000000000000000 X8: 00000000ffffffff
X9: ffffffc009e6ae48 X10: ffffffc009e6ae20 X11: 0000000000000000
X12: 0000000000000002 X13: 0000000000000004 X14: 0000000000000000
X15: 0000000000004000 X16: 00000000f90f05f6 X17: 00000000f90f05f6
X18: 0000000000000000 X19: 0000000000000002 X20: ffffffc009e3b008
X21: ffffffc00a01d020 X22: ffffffc009f798f0 X23: 0000000060001000
X24: 0000000000000000 X25: 0000000000000000 X26: 0000000000000000
X27: 0000000000000000 X28: ffffff8111eecb00 X29: ffffffc008003f50
LR: ffffffc00802df88 SP: ffffffc008003f40 PC: ffffffc00802df94
PSTATE: 024003c5 FPVALID: 00000000

crash> bt -c 0
PID: 1842 TASK: ffffff8111eecb00 CPU: 0 COMMAND: "android.bg"
00 [ffffffc008003f50] ipi_handler at ffffffc00802df90
01 [ffffffc008003f90] handle_percpu_devid_irq at ffffffc008146f50
02 [ffffffc008003fd0] generic_handle_domain_irq at ffffffc00813f484
03 [ffffffc008003fe0] gic_handle_irq at ffffffc008010140
--- <IRQ stack> ---
04 [ffffffc019c3be20] call_on_irq_stack at ffffffc008016ed4
05 [ffffffc019c3be40] do_interrupt_handler at ffffffc008019cb4
06 [ffffffc019c3be60] el0_interrupt at ffffffc008f7b848
07 [ffffffc019c3be90] __el0_irq_handler_common at ffffffc008f7b368
08 [ffffffc019c3bea0] el0t_64_irq_handler at ffffffc008f7b344
09 [ffffffc019c3bfe0] el0t_64_irq at ffffffc008011720
PC: 0000000072415108 LR: 00000000724150d0 SP: 0000007691d2bfa0
X29: 00000000734f60e0 X28: 000000001a2fa678 X27: 0000000000000063
X26: 000000001a2fa678 X25: 000000001a2fa678 X24: 000000001a7bb718
X23: 000000001a7ba198 X22: 000000001a7ba190 X21: b4000076f9a828c8
X20: 0000000000000000 X19: b4000076f9a82800 X18: 000000768d68a000
X17: 00000000708f89f8 X16: 00000000000000f0 X15: 0000000000000000
X14: 0000007691d2bca0 X13: 0000000080100000 X12: 0000000000000000
X11: 0000000000000000 X10: 0000000000000000 X9: 9636716211228cd4
X8: 9636716211228cd4 X7: 0000000000000010 X6: 000000001a7bb728
X5: 0000000070845200 X4: 0000000018a40d38 X3: 00000000707e8f98
X2: 000000001a2fa678 X1: 000000001a7ba198 X0: 0000000070847aa8
ORIG_X0: 00000000ffffff9c SYSCALLNO: ffffffff PSTATE: 60001000

Signed-off-by: Guanyou.Chen <chenguanyou(a)xiaomi.com>
---
 netdump.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/netdump.c b/netdump.c
index 435793b..455f90e 100644
--- a/netdump.c
+++ b/netdump.c
@@ -101,7 +101,7 @@ map_cpus_to_prstatus(void)
     nrcpus = (kt->kernel_NR_CPUS ? kt->kernel_NR_CPUS : NR_CPUS);

     for (i = 0; i < nrcpus; i++) {
-        if (in_cpu_map(ONLINE_MAP, i) && machdep->is_cpu_prstatus_valid(i)) {
+        if (machdep->is_cpu_prstatus_valid(i)) {
             nd->nt_prstatus_percpu[i] = nt_ptr[i];
             nd->num_prstatus_notes =
                 MAX(nd->num_prstatus_notes, i+1);
@@ -2998,15 +2998,10 @@ dump_registers_for_elf_dumpfiles(void)
         return;
     }

-    for (c = 0; c < kt->cpus; c++) {
-        if (check_offline_cpu(c)) {
-            fprintf(fp, "%sCPU %d: [OFFLINE]\n", c ? "\n" : "", c);
-            continue;
-        }
-
-        fprintf(fp, "%sCPU %d:\n", c ? "\n" : "", c);
-        display_regs_from_elf_notes(c, fp);
-    }
+    for (c = 0; c < kt->cpus; c++) {
+        fprintf(fp, "%sCPU %d: %s\n", c ? "\n" : "", c, check_offline_cpu(c) ? "[OFFLINE]" : "[ONLINE]");
+        display_regs_from_elf_notes(c, fp);
+    }
 }

 struct x86_64_user_regs_struct {
--
2.34.1

Guanyou.
Thanks.
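To make the output change in the second hunk concrete, the per-CPU header line that the proposed loop prints can be isolated as below. This is only a sketch: format_cpu_header() is a hypothetical helper that writes into a buffer instead of crash's fp, and the offline flag stands in for check_offline_cpu(c).

```c
#include <stdio.h>

/*
 * Builds the header line used by the proposed loop: every CPU after
 * the first is preceded by a blank line, and each CPU is tagged with
 * its state instead of being skipped when offline.
 */
static int format_cpu_header(char *buf, size_t size, int cpu, int offline)
{
    return snprintf(buf, size, "%sCPU %d: %s\n",
                    cpu ? "\n" : "", cpu,
                    offline ? "[OFFLINE]" : "[ONLINE]");
}
```

The difference from the existing loop is what follows the header: the proposal always calls display_regs_from_elf_notes() afterwards, while the current code prints "[OFFLINE]" and moves on to the next CPU.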

1 year, 6 months


Re: [PATCH] mod: introduce -v option to display modules with valid version

by lijiang

Hi, Sun Feng

Thank you for the patch.

On Mon, Oct 28, 2024 at 11:32 AM <devel-request(a)lists.crash-utility.osci.io> wrote:

> Date: Wed, 23 Oct 2024 08:53:58 +0800
> From: Sun Feng <loyou85(a)gmail.com>
> Subject: [Crash-utility] [PATCH] mod: introduce -v option to display modules with valid version
> To: devel(a)lists.crash-utility.osci.io
> Cc: Sun Feng <loyou85(a)gmail.com>
> Message-ID: <20241023005358.11328-1-loyou85(a)gmail.com>
>
> With this option, we can get module version easily in kdump,
> it's helpful when developing external modules.
>

It seems to be a specific case?

>
> crash> mod -v
> NAME VERSION
> ahci 3.0
> vxlan 0.1.2.1
> dca 1.12.1
> ...
>
> Signed-off-by: Sun Feng <loyou85(a)gmail.com>
> ---
>  defs.h | 3 +++
>  help.c | 12 +++++++++++-
>  kernel.c | 46 +++++++++++++++++++++++++++++++++++++++++++++-
>  symbols.c | 44 +++++++++++++++++++++++++++++++++++++++-----
>  4 files changed, 98 insertions(+), 7 deletions(-)
>
> diff --git a/defs.h b/defs.h
> index e2a9278..f14fcdf 100644
> --- a/defs.h
> +++ b/defs.h
> @@ -2244,6 +2244,7 @@ struct offset_table { /* stash of commonly-used offsets */
>      long rb_list_head;
>      long file_f_inode;
>      long page_page_type;
> +    long module_version;
>  };
>
>  struct size_table { /* stash of commonly-used sizes */
> @@ -2935,6 +2936,7 @@ struct symbol_table_data {
>
>  #define MAX_MOD_NAMELIST (256)
>  #define MAX_MOD_NAME (64)
> +#define MAX_MOD_VERSION (64)
>  #define MAX_MOD_SEC_NAME (64)
>
>  #define MOD_EXT_SYMS (0x1)
> @@ -2984,6 +2986,7 @@ struct load_module {
>      long mod_size;
>      char mod_namelist[MAX_MOD_NAMELIST];
>      char mod_name[MAX_MOD_NAME];
> +    char mod_version[MAX_MOD_VERSION];
>      ulong mod_flags;
>      struct syment *mod_symtable;
>      struct syment *mod_symend;
> diff --git a/help.c b/help.c
> index e95ac1d..1bac5e1 100644
> --- a/help.c
> +++ b/help.c
> @@ -5719,7 +5719,7 @@ NULL
>  char *help_mod[] = {
>  "mod",
>  "module information and loading of symbols and debugging data",
> -"-s module [objfile] | -d module | -S [directory] [-D|-t|-r|-R|-o|-g]",
> +"-s module [objfile] | -d module | -S [directory] [-D|-t|-r|-R|-o|-g|-v]",
>  " With no arguments, this command displays basic information of the currently",
>  " installed modules, consisting of the module address, name, base address,",
>  " size, the object file name (if known), and whether the module was compiled",
> @@ -5791,6 +5791,7 @@ char *help_mod[] = {
>  " -g When used with -s or -S, add a module object's section",
>  " start and end addresses to its symbol list.",
>  " -o Load module symbols with old mechanism.",
> +" -v Display modules with valid version.",
>  " ",
>  " If the %s session was invoked with the \"--mod <directory>\" option, or",
>  " a CRASH_MODULE_PATH environment variable exists, then /lib/modules/<release>",
> @@ -5881,6 +5882,15 @@ char *help_mod[] = {
>  " vxglm P(U)",
>  " vxgms P(U)",
>  " vxodm P(U)",
> +" ",
> +" Display modules with valid version:",
> +" ",
> +" %s> mod -v",
> +" NAME VERSION",
> +" ahci 3.0",
> +" vxlan 0.1.2.1",
> +" dca 1.12.1",
> +" ...",
>  NULL
>  };
>

There are many kernel modules, which do not have the actual value for the field "version"(null), E.g:

crash> struct module c008000005cb1d00
struct module {
...
  version = 0x0,
  srcversion = 0xc00000009c3628c0 "7D7FAEDDA764AC772D6F805",
...

Currently, it is also easy to view the version string, for example:

crash> mod
MODULE NAME TEXT_BASE SIZE OBJECT FILE
c008000004400080 libcrc32c c008000004260000 196608 (not loaded) [CONFIG_KALLSYMS]
...
c0080000044a0700 sg c008000004480000 262144 (not loaded) [CONFIG_KALLSYMS]
...
crash> struct module c0080000044a0700|grep -w version
  version = 0xc000000009d67f20 "3.5.36",

Could you please explain the current background? Why is it needed? As you saw, it's not too hard to get a module version string based on crash internal command.

Thanks
Lianbo

>
> diff --git a/kernel.c b/kernel.c
> index adb19ad..91eef2a 100644
> --- a/kernel.c
> +++ b/kernel.c
> @@ -3593,6 +3593,9 @@ module_init(void)
>      MEMBER_OFFSET_INIT(module_num_gpl_syms, "module",
>          "num_gpl_syms");
>
> +    if (MEMBER_EXISTS("module", "version"))
> +        MEMBER_OFFSET_INIT(module_version, "module", "version");
> +
>      if (MEMBER_EXISTS("module", "mem")) { /* 6.4 and later */
>          kt->flags2 |= KMOD_MEMORY; /* MODULE_MEMORY() can be used. */
>
> @@ -4043,6 +4046,7 @@ irregularity:
>  #define REMOTE_MODULE_SAVE_MSG (6)
>  #define REINIT_MODULES (7)
>  #define LIST_ALL_MODULE_TAINT (8)
> +#define LIST_ALL_MODULE_VERSION (9)
>
>  void
>  cmd_mod(void)
> @@ -4117,7 +4121,7 @@ cmd_mod(void)
>      address = 0;
>      flag = LIST_MODULE_HDR;
>
> -    while ((c = getopt(argcnt, args, "Rd:Ds:Sot")) != EOF) {
> +    while ((c = getopt(argcnt, args, "Rd:Ds:Sotv")) != EOF) {
>          switch(c)
>          {
>          case 'R':
> @@ -4195,6 +4199,13 @@ cmd_mod(void)
>              flag = LIST_ALL_MODULE_TAINT;
>              break;
>
> +        case 'v':
> +            if (flag)
> +                cmd_usage(pc->curcmd, SYNOPSIS);
> +            else
> +                flag = LIST_ALL_MODULE_VERSION;
> +            break;
> +
>          default:
>              argerrs++;
>              break;
> @@ -4578,10 +4589,12 @@ do_module_cmd(ulong flag, char *modref, ulong address,
>      struct load_module *lm, *lmp;
>      int maxnamelen;
>      int maxsizelen;
> +    int maxversionlen;
>      char buf1[BUFSIZE];
>      char buf2[BUFSIZE];
>      char buf3[BUFSIZE];
>      char buf4[BUFSIZE];
> +    char buf5[BUFSIZE];
>
>      if (NO_MODULES())
>          return;
> @@ -4744,6 +4757,37 @@ do_module_cmd(ulong flag, char *modref, ulong address,
>      case LIST_ALL_MODULE_TAINT:
>          show_module_taint();
>          break;
> +
> +    case LIST_ALL_MODULE_VERSION:
> +        maxnamelen = maxversionlen = 0;
> +
> +        for (i = 0; i < kt->mods_installed; i++) {
> +            lm = &st->load_modules[i];
> +            maxnamelen = strlen(lm->mod_name) > maxnamelen ?
> +                strlen(lm->mod_name) : maxnamelen;
> +
> +            maxversionlen = strlen(lm->mod_version) > maxversionlen ?
> +                strlen(lm->mod_version) : maxversionlen;
> +        }
> +
> +        fprintf(fp, "%s %s\n",
> +            mkstring(buf2, maxnamelen, LJUST, "NAME"),
> +            mkstring(buf5, maxversionlen, LJUST, "VERSION"));
> +
> +        for (i = 0; i < kt->mods_installed; i++) {
> +            lm = &st->load_modules[i];
> +            if ((!address || (lm->module_struct == address) ||
> +                (lm->mod_base == address)) &&
> +                strlen(lm->mod_version)) {
> +                fprintf(fp, "%s ", mkstring(buf2, maxnamelen,
> +                    LJUST, lm->mod_name));
> +                fprintf(fp, "%s ", mkstring(buf5, maxversionlen,
> +                    LJUST, lm->mod_version));
> +
> +                fprintf(fp, "\n");
> +            }
> +        }
> +        break;
>      }
>  }
>
> diff --git a/symbols.c b/symbols.c
> index d00fbd7..9d90df7 100644
> --- a/symbols.c
> +++ b/symbols.c
> @@ -1918,6 +1918,7 @@ store_module_symbols_6_4(ulong total, int mods_installed)
>  {
>      int i, m, t;
>      ulong mod, mod_next;
> +    ulong version;
>      char *mod_name;
>      uint nsyms, ngplsyms;
>      ulong syms, gpl_syms;
> @@ -1930,6 +1931,7 @@ store_module_symbols_6_4(ulong total, int mods_installed)
>      struct load_module *lm;
>      char buf1[BUFSIZE];
>      char buf2[BUFSIZE];
> +    char mod_version[BUFSIZE];
>      char *strbuf = NULL, *modbuf, *modsymbuf;
>      struct syment *sp;
>      ulong first, last;
> @@ -1980,6 +1982,13 @@ store_module_symbols_6_4(ulong total, int mods_installed)
>
>          mod_name = modbuf + OFFSET(module_name);
>
> +        BZERO(mod_version, BUFSIZE);
> +        if (MEMBER_EXISTS("module", "version")) {
> +            version = ULONG(modbuf + OFFSET(module_version));
> +            if (version)
> +                read_string(version, mod_version, BUFSIZE - 1);
> +        }
> +
>          lm = &st->load_modules[m++];
>          BZERO(lm, sizeof(struct load_module));
>
> @@ -2003,9 +2012,15 @@ store_module_symbols_6_4(ulong total, int mods_installed)
>              error(INFO, "module name greater than MAX_MOD_NAME: %s\n", mod_name);
>              strncpy(lm->mod_name, mod_name, MAX_MOD_NAME-1);
>          }
> +        if (strlen(mod_version) < MAX_MOD_VERSION)
> +            strcpy(lm->mod_version, mod_version);
> +        else {
> +            error(INFO, "module version greater than MAX_MOD_VERSION: %s\n", mod_version);
> +            strncpy(lm->mod_version, mod_version, MAX_MOD_VERSION-1);
> +        }
>          if (CRASHDEBUG(3))
> -            fprintf(fp, "%lx (%lx): %s syms: %d gplsyms: %d ksyms: %ld\n",
> -                mod, lm->mod_base, lm->mod_name, nsyms, ngplsyms, nksyms);
> +            fprintf(fp, "%lx (%lx): %s syms: %d gplsyms: %d ksyms: %ld version: %s\n",
> +                mod, lm->mod_base, lm->mod_name, nsyms, ngplsyms, nksyms, lm->mod_version);
>
>          lm->mod_flags = MOD_EXT_SYMS;
>          lm->mod_ext_symcnt = mcnt;
> @@ -2271,6 +2286,7 @@ store_module_symbols_v2(ulong total, int mods_installed)
>  {
>      int i, m;
>      ulong mod, mod_next;
> +    ulong version;
>      char *mod_name;
>      uint nsyms, ngplsyms;
>      ulong syms, gpl_syms;
> @@ -2285,6 +2301,7 @@ store_module_symbols_v2(ulong total, int mods_installed)
>      char buf2[BUFSIZE];
>      char buf3[BUFSIZE];
>      char buf4[BUFSIZE];
> +    char mod_version[BUFSIZE];
>      char *strbuf, *modbuf, *modsymbuf;
>      struct syment *sp;
>      ulong first, last;
> @@ -2344,6 +2361,13 @@ store_module_symbols_v2(ulong total, int mods_installed)
>
>          mod_name = modbuf + OFFSET(module_name);
>
> +        BZERO(mod_version, BUFSIZE);
> +        if (MEMBER_EXISTS("module", "version")) {
> +            version = ULONG(modbuf + OFFSET(module_version));
> +            if (version)
> +                read_string(version, mod_version, BUFSIZE - 1);
> +        }
> +
>          lm = &st->load_modules[m++];
>          BZERO(lm, sizeof(struct load_module));
>          lm->mod_base = ULONG(modbuf + MODULE_OFFSET2(module_module_core, rx));
> @@ -2357,11 +2381,19 @@ store_module_symbols_v2(ulong total, int mods_installed)
>                  mod_name);
>              strncpy(lm->mod_name, mod_name, MAX_MOD_NAME-1);
>          }
> +        if (strlen(mod_version) < MAX_MOD_VERSION)
> +            strcpy(lm->mod_version, mod_version);
> +        else {
> +            error(INFO,
> +                "module version greater than MAX_MOD_VERSION: %s\n",
> +                mod_version);
> +            strncpy(lm->mod_version, mod_version, MAX_MOD_VERSION-1);
> +        }
>          if (CRASHDEBUG(3))
>              fprintf(fp,
> -                "%lx (%lx): %s syms: %d gplsyms: %d ksyms: %ld\n",
> -                mod, lm->mod_base, lm->mod_name, nsyms,
> -                ngplsyms, nksyms);
> +                "%lx (%lx): %s syms: %d gplsyms: %d ksyms: %ld version: %s\n",
> +                mod, lm->mod_base, lm->mod_name, nsyms,
> +                ngplsyms, nksyms, lm->mod_version);
>          lm->mod_flags = MOD_EXT_SYMS;
>          lm->mod_ext_symcnt = mcnt;
>          lm->mod_init_module_ptr = ULONG(modbuf +
> @@ -10177,6 +10209,8 @@ dump_offset_table(char *spec, ulong makestruct)
>          OFFSET(module_next));
>      fprintf(fp, " module_name: %ld\n",
>          OFFSET(module_name));
> +    fprintf(fp, " module_version: %ld\n",
> +        OFFSET(module_version));
>      fprintf(fp, " module_syms: %ld\n",
>          OFFSET(module_syms));
>      fprintf(fp, " module_nsyms: %ld\n",
> --
> 2.43.0
>
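The listing logic in the LIST_ALL_MODULE_VERSION case boils down to two passes: find the widest name and version, then print left-justified columns for modules whose version string is non-empty. The sketch below illustrates that idea outside of crash. It is not the patch's code: struct mod_row, max_width() and format_versions() are hypothetical names, and printf's "%-*s" is used in place of crash's mkstring(..., LJUST, ...).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the two load_module fields that are listed. */
struct mod_row {
    const char *name;
    const char *version;    /* "" when the module has no version string */
};

/* First pass: widest name (use_version == 0) or widest version (!= 0). */
static int max_width(const struct mod_row *rows, int n, int use_version)
{
    int i, len, w = 0;

    for (i = 0; i < n; i++) {
        len = (int)strlen(use_version ? rows[i].version : rows[i].name);
        if (len > w)
            w = len;
    }
    return w;
}

/*
 * Second pass: write a header plus one row per module that has a
 * version. Returns the number of module rows written.
 */
static int format_versions(const struct mod_row *rows, int n,
                           char *out, size_t outsz)
{
    int namew = max_width(rows, n, 0);
    int verw = max_width(rows, n, 1);
    int i, k, printed = 0;
    size_t used;

    if (outsz == 0)
        return 0;
    k = snprintf(out, outsz, "%-*s %-*s\n", namew, "NAME", verw, "VERSION");
    if (k < 0 || (size_t)k >= outsz)
        return 0;
    used = (size_t)k;

    for (i = 0; i < n; i++) {
        if (rows[i].version[0] == '\0')
            continue;               /* same filter as strlen(lm->mod_version) */
        k = snprintf(out + used, outsz - used, "%-*s %-*s\n",
                     namew, rows[i].name, verw, rows[i].version);
        if (k < 0 || (size_t)k >= outsz - used) {
            out[used] = '\0';       /* drop the partially written row */
            break;
        }
        used += (size_t)k;
        printed++;
    }
    return printed;
}
```

As in the patch, the column widths are taken over all modules, including those that end up filtered out for lacking a version.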

1 year, 6 months


[PATCH] bugfix command "help -r" segv fault

by Guanyou Chen

Hi Lianbo, Tao

When the ELF Note does not contain CPU registers,
attempting to retrieve online CPU registers will cause a crash.

After:
CPU 6:
help: registers not collected for cpu 6
...

Signed-off-by: Guanyou.Chen <chenguanyou(a)xiaomi.com>
---
 netdump.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/netdump.c b/netdump.c
index 8ea5159..435793b 100644
--- a/netdump.c
+++ b/netdump.c
@@ -2780,6 +2780,10 @@ display_regs_from_elf_notes(int cpu, FILE *ofp)
                 nd->nt_prstatus_percpu[cpu];
         else
             note64 = (Elf64_Nhdr *)nd->nt_prstatus;
+        if (!note64) {
+            error(INFO, "registers not collected for cpu %d\n", cpu);
+            return;
+        }
         len = sizeof(Elf64_Nhdr);
         len = roundup(len + note64->n_namesz, 4);
         len = roundup(len + note64->n_descsz, 4);
@@ -2820,6 +2824,10 @@ display_regs_from_elf_notes(int cpu, FILE *ofp)
                 nd->nt_prstatus_percpu[cpu];
         else
             note32 = (Elf32_Nhdr *)nd->nt_prstatus;
+        if (!note32) {
+            error(INFO, "registers not collected for cpu %d\n", cpu);
+            return;
+        }
         len = sizeof(Elf32_Nhdr);
         len = roundup(len + note32->n_namesz, 4);
         len = roundup(len + note32->n_descsz, 4);
@@ -2857,6 +2865,10 @@ display_regs_from_elf_notes(int cpu, FILE *ofp)
         else
             note64 = (Elf64_Nhdr *)nd->nt_prstatus;

+        if (!note64) {
+            error(INFO, "registers not collected for cpu %d\n", cpu);
+            return;
+        }
         prs = (struct ppc64_elf_prstatus *)
             ((char *)note64 + sizeof(Elf64_Nhdr) + note64->n_namesz);
         prs = (struct ppc64_elf_prstatus *)roundup((ulong)prs, 4);
@@ -2903,6 +2915,10 @@ display_regs_from_elf_notes(int cpu, FILE *ofp)
                 nd->nt_prstatus_percpu[cpu];
         else
             note64 = (Elf64_Nhdr *)nd->nt_prstatus;
+        if (!note64) {
+            error(INFO, "registers not collected for cpu %d\n", cpu);
+            return;
+        }
         len = sizeof(Elf64_Nhdr);
         len = roundup(len + note64->n_namesz, 4);
         len = roundup(len + note64->n_descsz, 4);
--
2.34.1

Guanyou.
Thanks

1 year, 6 months


[PATCH] gdb bt: multiple stacks support (x86_64)

by Alexey Makhalov

The gdb target analyzes only one task at a time, and it backtraces only a straight C stack until the end of that stack. If stacks were concatenated during exceptions or interrupts, gdb bt will show only the topmost one.

Introduce multiple-stack support in the gdb target. Each stack can be observed as a different thread from gdb's perspective.

'gdb info threads' - to see the list of in-kernel stacks for a given task.
'gdb thread <Id>' - to switch.
'gdb bt' - to show it.

The implementation is machine specific. On x86_64, I use cmd_bt() to add additional gdb threads (gdb_add_substack(stack_id) call). Once added, gdb may call machdep->get_current_task_reg() with the corresponding stack_id (sid: new argument).

Note: the crash 'bt' command must be called for additional threads to appear.

No threads/stacks support for arm64 and ppc64; x86_64 only.

Example of a #GP fault in the kernel caught by the SCTP task:

crash> bt
PID: 94228  TASK: ffff96a6766a8000  CPU: 31  COMMAND: "SCTP"
 #0 [ffffbb67437e7220] panic at ffffffff99b4f60b
 #1 [ffffbb67437e72c0] die_addr at ffffffff99033650
 #2 [ffffbb67437e72f0] exc_general_protection at ffffffff99b9194b
 #3 [ffffbb67437e7390] asm_exc_general_protection at ffffffff99c00b47
    [exception RIP: crypto_aead_encrypt+9]
    RIP: ffffffff995ce269  RSP: ffffbb67437e7440  RFLAGS: 00010246
    RAX: 0fdd59d2b3d89ecb  RBX: 0000000000000000  RCX: 0000000000000c90
    RDX: ffff96a368508110  RSI: 0000000000000000  RDI: ffff96a348352060
    RBP: ffffbb67437e7650   R8: 0000000000000001   R9: ffff96a3685080c8
    R10: ffff96a348351c78  R11: 00000000d5a09e53  R12: 0000000000000008
    R13: ffff96a348352010  R14: ffff96a348352000  R15: 0000000000000001
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #4 [ffffbb67437e7440] echainiv_encrypt at ffffffffc0ae82c2 [echainiv]
 #5 [ffffbb67437e7658] crypto_aead_encrypt at ffffffff995ce27c
 #6 [ffffbb67437e7668] esp_output_tail at ffffffffc0add3fc [esp4]
 #7 [ffffbb67437e76f8] esp_output at ffffffffc0addedf [esp4]
 #8 [ffffbb67437e7760] xfrm_output_resume at ffffffff99a9186a
 #9 [ffffbb67437e77e0] xfrm_output at ffffffff99a91fba
#10 [ffffbb67437e7810] __xfrm4_output at ffffffff99a7b0e6
#11 [ffffbb67437e7820] xfrm4_output at ffffffff99a7b172
#12 [ffffbb67437e7890] ip_local_out at ffffffff99a000ef
#13 [ffffbb67437e78b8] __ip_queue_xmit at ffffffff99a0028e
#14 [ffffbb67437e7918] sctp_v4_xmit at ffffffffc0afe0f8 [sctp]
#15 [ffffbb67437e79f0] sctp_packet_singleton at ffffffffc0b0bc47 [sctp]
#16 [ffffbb67437e7a60] sctp_outq_flush at ffffffffc0b0c636 [sctp]
#17 [ffffbb67437e7b08] sctp_outq_uncork at ffffffffc0b0d85c [sctp]
#18 [ffffbb67437e7b18] sctp_do_sm at ffffffffc0afbaa6 [sctp]
#19 [ffffbb67437e7d08] __sctp_connect at ffffffffc0b17893 [sctp]
#20 [ffffbb67437e7d78] __sctp_setsockopt_connectx at ffffffffc0b17a6d [sctp]
#21 [ffffbb67437e7da8] sctp_getsockopt at ffffffffc0b1c892 [sctp]
#22 [ffffbb67437e7eb8] sock_common_getsockopt at ffffffff9993c6e7
#23 [ffffbb67437e7ec8] __sys_getsockopt at ffffffff9993afac
#24 [ffffbb67437e7f18] __x64_sys_getsockopt at ffffffff9993b0bf
#25 [ffffbb67437e7f28] x64_sys_call at ffffffff99004ca5
#26 [ffffbb67437e7f38] do_syscall_64 at ffffffff99b90e34
#27 [ffffbb67437e7f50] entry_SYSCALL_64_after_hwframe at ffffffff99c00126
    RIP: 00007f12c63028ea  RSP: 00007f10e41d9b28  RFLAGS: 00000206
    RAX: ffffffffffffffda  RBX: 0000000000000050  RCX: 00007f12c63028ea
    RDX: 000000000000006f  RSI: 0000000000000084  RDI: 0000000000000050
    RBP: 00007f10a00009b0   R8: 00007f10e41d9b3c   R9: 00007f10ac000a5c
    R10: 00007f10e41d9b40  R11: 0000000000000206  R12: 00007f10e41db120
    R13: 0000000000000050  R14: 0000000000000010  R15: 000000000289e070
    ORIG_RAX: 0000000000000037  CS: 0033  SS: 002b

crash> gdb bt
#0  0xffffffff998eaadf in __inb (port=100) at ./arch/x86/include/asm/shared/io.h:22
#1  i8042_read_status () at drivers/input/serio/i8042-acpipnpio.h:54
#2  i8042_panic_blink (state=<optimized out>) at drivers/input/serio/i8042.c:1137
#3  0xffffffff99b4f60b in panic (fmt=fmt@entry=0xffffffff9a42c4cb "Fatal exception") at kernel/panic.c:460
#4  0xffffffff99b49b84 in oops_end (flags=<optimized out>, flags@entry=582, regs=<optimized out>, regs@entry=0xffffbb67437e7398, signr=<optimized out>) at arch/x86/kernel/dumpstack.c:382
#5  0xffffffff99033650 in die_addr (str=str@entry=0xffffbb67437e7304 "general protection fault, probably for non-canonical address 0xfdd59d2b3d89edb", regs=regs@entry=0xffffbb67437e7398, err=err@entry=0, gp_addr=<optimized out>) at arch/x86/kernel/dumpstack.c:462
#6  0xffffffff99b9194b in __exc_general_protection (error_code=0, regs=0xffffbb67437e7398) at arch/x86/kernel/traps.c:784
#7  exc_general_protection (regs=0xffffbb67437e7398, error_code=0) at arch/x86/kernel/traps.c:729
#8  0xffffffff99c00b47 in asm_exc_general_protection () at ./arch/x86/include/asm/idtentry.h:564

crash> gdb info threads
  Id   Target Id                Frame
* 1    94228 SCTP (stack 0) 0xffffffff998eaadf in __inb (port=100) at ./arch/x86/include/asm/shared/io.h:22
  2    94228 SCTP (stack 1) crypto_aead_encrypt (req=req@entry=0xffff96a348352060) at crypto/aead.c:86

crash> gdb thread 2
[Switching to thread 2 (94228 SCTP (stack 1))]
#0  crypto_aead_encrypt (req=req@entry=0xffff96a348352060) at crypto/aead.c:86
86      crypto/aead.c: No such file or directory.

crash> gdb bt
#0  crypto_aead_encrypt (req=req@entry=0xffff96a348352060) at crypto/aead.c:86
#1  0xffffffffc0ae82c2 in echainiv_encrypt (req=0xffff96a348352010) at crypto/echainiv.c:82
#2  0xffffffff995ce27c in crypto_aead_encrypt (req=0xffff96a348352060) at crypto/aead.c:94
#3  0xffffffffc0add3fc in esp_output_tail ()
#4  0xffffffffc0addedf in esp_output ()
#5  0xffffffff99a9186a in xfrm_output_one (err=0, skb=0xffff96a3c852b300) at net/xfrm/xfrm_output.c:553
#6  xfrm_output_resume (sk=sk@entry=0xffff96a348368000, skb=skb@entry=0xffff96a3c852b300, err=<optimized out>, err@entry=1) at net/xfrm/xfrm_output.c:588
#7  0xffffffff99a91fba in xfrm_output2 (skb=0xffff96a3c852b300, sk=0xffff96a348368000, net=0xffff96a365582580) at net/xfrm/xfrm_output.c:615
#8  xfrm_output (sk=0xffff96a348368000, skb=0xffff96a3c852b300) at net/xfrm/xfrm_output.c:765
#9  0xffffffff99a7b0e6 in __xfrm4_output (net=<optimized out>, sk=<optimized out>, skb=<optimized out>) at net/ipv4/xfrm4_output.c:28
#10 0xffffffff99a7b172 in NF_HOOK_COND (pf=2 '\002', hook=4, okfn=0xffffffff99a7b0c0 <__xfrm4_output>, cond=<optimized out>, out=0xffff96a496ff2000, in=0x0, skb=0xffff96a3c852b300, sk=0xffff96a348368000, net=0xffff96a365582580) at ./include/linux/netfilter.h:291
#11 xfrm4_output (net=0xffff96a365582580, sk=0xffff96a348368000, skb=0xffff96a3c852b300) at net/ipv4/xfrm4_output.c:33
#12 0xffffffff99a000ef in dst_output (skb=0xffff96a368508110, sk=0x0, net=0xffff96a348352060) at ./include/net/dst.h:444
#13 ip_local_out (net=0xffff96a348352060, sk=0x0, skb=0xffff96a368508110) at net/ipv4/ip_output.c:126
#14 0xffffffff99a0028e in __ip_queue_xmit (sk=sk@entry=0xffff96a348368000, skb=skb@entry=0xffff96a3c852b300, fl=fl@entry=0xffff96a348351830, tos=tos@entry=186 '\272') at net/ipv4/ip_output.c:532
#15 0xffffffffc0afe0f8 in sctp_v4_xmit (skb=0xffff96a3c852b300, t=0xffff96a348351800) at net/sctp/protocol.c:1071
#16 0xffffffffc0b1f553 in sctp_packet_transmit (packet=packet@entry=0xffffbb67437e79f8, gfp=gfp@entry=3264) at net/sctp/output.c:653
#17 0xffffffffc0b0bc47 in sctp_packet_singleton (transport=<optimized out>, chunk=chunk@entry=0xffff96a34c96f500, gfp=3264) at net/sctp/outqueue.c:783
#18 0xffffffffc0b0c636 in sctp_outq_flush_ctrl (ctx=0xffffbb67437e7aa0) at net/sctp/outqueue.c:914
#19 sctp_outq_flush (q=0xffff96a3483585b8, rtx_timeout=rtx_timeout@entry=0, gfp=<optimized out>) at net/sctp/outqueue.c:1212
#20 0xffffffffc0b0d85c in sctp_outq_uncork (q=q@entry=0xffff96a3483585b8, gfp=gfp@entry=3264) at net/sctp/outqueue.c:764
#21 0xffffffffc0afbaa6 in sctp_cmd_interpreter (state=<optimized out>, status=<optimized out>, gfp=<optimized out>, commands=0xffffbb67437e7b68, event_arg=<optimized out>, asoc=0xffff96a348358000, ep=<optimized out>, subtype=..., event_type=<optimized out>) at net/sctp/sm_sideeffect.c:1819
#22 sctp_side_effects (gfp=<optimized out>, commands=0xffffbb67437e7b68, status=<optimized out>, event_arg=<optimized out>, asoc=<synthetic pointer>, ep=<optimized out>, state=<optimized out>, subtype=..., event_type=<optimized out>) at net/sctp/sm_sideeffect.c:1199
#23 sctp_do_sm (net=<optimized out>, event_type=event_type@entry=SCTP_EVENT_T_PRIMITIVE, subtype=..., subtype@entry=..., state=<optimized out>, ep=<optimized out>, asoc=<optimized out>, event_arg=<optimized out>, gfp=<optimized out>) at net/sctp/sm_sideeffect.c:1170
#24 0xffffffffc0b1e2f0 in sctp_primitive_ASSOCIATE (net=<optimized out>, asoc=asoc@entry=0xffff96a348358000, arg=arg@entry=0x0) at net/sctp/primitive.c:73
#25 0xffffffffc0b17893 in __sctp_connect (sk=sk@entry=0xffff96a348368000, kaddrs=kaddrs@entry=0xffff96a342085030, addrs_size=addrs_size@entry=16, flags=2050, assoc_id=assoc_id@entry=0xffffbb67437e7df4) at ./include/net/net_namespace.h:369
#26 0xffffffffc0b17a6d in __sctp_setsockopt_connectx (sk=sk@entry=0xffff96a348368000, kaddrs=kaddrs@entry=0xffff96a342085030, addrs_size=16, assoc_id=assoc_id@entry=0xffffbb67437e7df4) at net/sctp/socket.c:1334
#27 0xffffffffc0b1c892 in sctp_getsockopt_connectx3 (optlen=0x7f10e41d9b3c, optval=0x7f10e41d9b40 <error: Cannot access memory at address 0x7f10e41d9b40>, len=16, sk=0xffff96a348368000) at net/sctp/socket.c:1419
#28 sctp_getsockopt (sk=0xffff96a348368000, level=<optimized out>, optname=<optimized out>, optval=0x7f10e41d9b40 <error: Cannot access memory at address 0x7f10e41d9b40>, optlen=<optimized out>) at net/sctp/socket.c:8124
#29 0xffffffff9993c6e7 in sock_common_getsockopt (sock=<optimized out>, level=0, optname=1750106384, optval=0xc90 <error: Cannot access memory at address 0xc90>, optlen=0x1) at net/core/sock.c:3652
#30 0xffffffff9993afac in __sys_getsockopt (fd=<optimized out>, level=132, optname=111, optval=0x7f10e41d9b40 <error: Cannot access memory at address 0x7f10e41d9b40>, optlen=<optimized out>) at net/socket.c:2327
#31 0xffffffff9993b0bf in __do_sys_getsockopt (optlen=<optimized out>, optval=<optimized out>, optname=<optimized out>, level=<optimized out>, fd=<optimized out>) at net/socket.c:2342
#32 __se_sys_getsockopt (optlen=<optimized out>, optval=<optimized out>, optname=<optimized out>, level=<optimized out>, fd=<optimized out>) at net/socket.c:2339
#33 __x64_sys_getsockopt (regs=<optimized out>) at net/socket.c:2339
#34 0xffffffff99004ca5 in x64_sys_call (regs=regs@entry=0xffffbb67437e7f58, nr=<optimized out>) at ./arch/x86/include/generated/asm/syscalls_64.h:56
#35 0xffffffff99b90e34 in do_syscall_x64 (nr=<optimized out>, regs=0xffffbb67437e7f58) at arch/x86/entry/common.c:51
#36 do_syscall_64 (regs=0xffffbb67437e7f58, nr=<optimized out>) at arch/x86/entry/common.c:81
#37 0xffffffff99c00126 in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:121

Now we can use GDB to see the root cause.

Signed-off-by: Alexey Makhalov <alexey.makhalov(a)broadcom.com>
---
 arm64.c         |  2 +-
 crash_target.c  | 25 ++++++++++++++++++----
 defs.h          |  3 ++-
 gdb_interface.c |  6 +++---
 ppc64.c         |  2 +-
 x86_64.c        | 55 +++++++++++++++++++++++++++++++++++++++++++++++--
 6 files changed, 81 insertions(+), 12 deletions(-)

diff --git a/arm64.c b/arm64.c
index 608b19d..62f91d8 100644
--- a/arm64.c
+++ b/arm64.c
@@ -204,7 +204,7 @@ out:

 static int
 arm64_get_current_task_reg(int regno, const char *name,
-			   int size, void *value)
+			   int size, void *value, int unused)
 {
 	struct bt_info bt_info, bt_setup;
 	struct task_context *tc;
diff --git a/crash_target.c b/crash_target.c
index 1080976..8b17ef8 100644
--- a/crash_target.c
+++ b/crash_target.c
@@ -27,8 +27,9 @@ void crash_target_init (void);

 extern "C" int gdb_readmem_callback(unsigned long, void *, int, int);
 extern "C" int crash_get_current_task_reg (int regno, const char *regname,
-                                  int regsize, void *val);
+                                  int regsize, void *val, int sid);
 extern "C" int gdb_change_thread_context (void);
+extern "C" int gdb_add_substack (int sid);
 extern "C" void crash_get_current_task_info(unsigned long *pid, char **comm);

 /* The crash target. */
@@ -64,9 +65,10 @@ public:
     unsigned long pid;
     char *comm;
     crash_get_current_task_info(&pid, &comm);
-    return string_printf ("%ld %s", pid, comm);
+    if (thread_count(this) == 1)
+      return string_printf ("%ld %s", pid, comm);
+    return string_printf ("%ld %s (stack %ld)", pid, comm, ptid.tid());
   }
-
 };

 static void supply_registers(struct regcache *regcache, int regno)
@@ -79,7 +81,7 @@ static void supply_registers(struct regcache *regcache, int regno)
   if (regsize > sizeof (regval))
     error (_("fatal error: buffer size is not enough to fit register value"));

-  if (crash_get_current_task_reg (regno, regname, regsize, (void *)&regval))
+  if (crash_get_current_task_reg (regno, regname, regsize, (void *)&regval, inferior_thread()->ptid.tid()))
     regcache->raw_supply (regno, regval);
   else
     regcache->raw_supply (regno, NULL);
@@ -144,7 +146,22 @@ crash_target_init (void)
 extern "C" int
 gdb_change_thread_context (void)
 {
+  for (thread_info *tp : current_inferior()->threads_safe())
+    if (tp->ptid.tid_p())
+      delete_thread (tp);
   target_fetch_registers(get_current_regcache(), -1);
   reinit_frame_cache();
   return TRUE;
 }
+
+/* Add a thread for each additional stack. Use stack ID as a thread ID */
+extern "C" int
+gdb_add_substack (int sid)
+{
+  ptid_t ptid = ptid_t(CRASH_INFERIOR_PID, 0, sid);
+
+  thread_info *tp = find_thread_ptid (current_inferior(), ptid);
+  if (tp == nullptr)
+    add_thread_silent (current_inferior()->process_target(), ptid);
+  return TRUE;
+}
diff --git a/defs.h b/defs.h
index b93a7a6..bb2bc20 100644
--- a/defs.h
+++ b/defs.h
@@ -1081,7 +1081,7 @@ struct machdep_table {
 	void (*get_irq_affinity)(int);
 	void (*show_interrupts)(int, ulong *);
 	int (*is_page_ptr)(ulong, physaddr_t *);
-	int (*get_current_task_reg)(int, const char *, int, void *);
+	int (*get_current_task_reg)(int, const char *, int, void *, int);
 	int (*is_cpu_prstatus_valid)(int cpu);
 };

@@ -8301,5 +8301,6 @@ enum ppc64_regnum {

 /* crash_target.c */
 extern int gdb_change_thread_context (void);
+extern int gdb_add_substack (int sid);

 #endif /* !GDB_COMMON */
diff --git a/gdb_interface.c b/gdb_interface.c
index 315711e..c138c94 100644
--- a/gdb_interface.c
+++ b/gdb_interface.c
@@ -1074,12 +1074,12 @@ unsigned long crash_get_kaslr_offset(void)

 /* Callbacks for crash_target */
 int crash_get_current_task_reg (int regno, const char *regname,
-				int regsize, void *value);
+				int regsize, void *value, int sid);

 int crash_get_current_task_reg (int regno, const char *regname,
-				int regsize, void *value)
+				int regsize, void *value, int sid)
 {
 	if (!machdep->get_current_task_reg)
 		return FALSE;
-	return machdep->get_current_task_reg(regno, regname, regsize, value);
+	return machdep->get_current_task_reg(regno, regname, regsize, value, sid);
 }
diff --git a/ppc64.c b/ppc64.c
index 782107b..1cf06e3 100644
--- a/ppc64.c
+++ b/ppc64.c
@@ -2512,7 +2512,7 @@ ppc64_print_eframe(char *efrm_str, struct ppc64_pt_regs *regs,

 static int
 ppc64_get_current_task_reg(int regno, const char *name, int size,
-			   void *value)
+			   void *value, int unused)
 {
 	struct bt_info bt_info, bt_setup;
 	struct task_context *tc;
diff --git a/x86_64.c b/x86_64.c
index e7f8fe2..2e7cde4 100644
--- a/x86_64.c
+++ b/x86_64.c
@@ -126,7 +126,7 @@ static int x86_64_get_framesize(struct bt_info *, ulong, ulong, char *);
 static void x86_64_framesize_debug(struct bt_info *);
 static void x86_64_get_active_set(void);
 static int x86_64_get_kvaddr_ranges(struct vaddr_range *);
-static int x86_64_get_current_task_reg(int, const char *, int, void *);
+static int x86_64_get_current_task_reg(int, const char *, int, void *, int);
 static int x86_64_verify_paddr(uint64_t);
 static void GART_init(void);
 static void x86_64_exception_stacks_init(void);
@@ -143,6 +143,14 @@ struct machine_specific x86_64_machine_specific = { 0 };
 static const char *exception_functions_orig[];
 static const char *exception_functions_5_8[];

+/*
+ * Additional stacks entry registers for gdb target.
+ * See 'gdb info threads'
+ */
+#define MAX_STACKS_NUM 5
+ulong stack_idx;
+ulong stacks_regs[MAX_STACKS_NUM][SS_REGNUM + 1];
+
 /* Use this hardwired version -- sometimes the
  * debuginfo doesn't pick this up even though
  * it exists in the kernel; it shouldn't change.
@@ -3551,6 +3559,7 @@ x86_64_low_budget_back_trace_cmd(struct bt_info *bt_in)
 	irq_eframe = 0;
 	last_process_stack_eframe = 0;
 	bt->call_target = NULL;
+	stack_idx = 0;
 	rsp = bt->stkptr;
 	ms = machdep->machspec;

@@ -4159,6 +4168,7 @@ x86_64_dwarf_back_trace_cmd(struct bt_info *bt_in)
 	last_process_stack_eframe = 0;
 	bt->call_target = NULL;
 	bt->bptr = 0;
+	stack_idx = 0;
 	rsp = bt->stkptr;
 	if (!rsp) {
 		error(INFO, "cannot determine starting stack pointer\n");
@@ -4799,6 +4809,36 @@ x86_64_exception_frame(ulong flags, ulong kvaddr, char *local,
 	} else if (machdep->flags & ORC)
 		bt->bptr = rbp;
+
+	/*
+	 * Preserve registers set for each additional in-kernel stack
+	 * up to MAX_STACKS_NUM.
+	 */
+	if (!(cs & 3) && verified && stack_idx < MAX_STACKS_NUM) {
+		stacks_regs[stack_idx][RAX_REGNUM] = rax;
+		stacks_regs[stack_idx][RBX_REGNUM] = rbx;
+		stacks_regs[stack_idx][RCX_REGNUM] = rcx;
+		stacks_regs[stack_idx][RDX_REGNUM] = rdx;
+		stacks_regs[stack_idx][RSI_REGNUM] = rsi;
+		stacks_regs[stack_idx][RDI_REGNUM] = rdi;
+		stacks_regs[stack_idx][RBP_REGNUM] = rbp;
+		stacks_regs[stack_idx][RSP_REGNUM] = rsp;
+		stacks_regs[stack_idx][R8_REGNUM] = r8;
+		stacks_regs[stack_idx][R9_REGNUM] = r9;
+		stacks_regs[stack_idx][R10_REGNUM] = r10;
+		stacks_regs[stack_idx][R11_REGNUM] = r11;
+		stacks_regs[stack_idx][R12_REGNUM] = r12;
+		stacks_regs[stack_idx][R13_REGNUM] = r13;
+		stacks_regs[stack_idx][R14_REGNUM] = r14;
+		stacks_regs[stack_idx][R15_REGNUM] = r15;
+		stacks_regs[stack_idx][RIP_REGNUM] = rip;
+		stacks_regs[stack_idx][EFLAGS_REGNUM] = rflags;
+		stacks_regs[stack_idx][CS_REGNUM] = cs;
+		stacks_regs[stack_idx][SS_REGNUM] = ss;
+		/* Skip stack 0 (main stack), start with index 1 */
+		gdb_add_substack (stack_idx + 1);
+		stack_idx++;
+	}

 	if (kvaddr)
 		FREEBUF(pt_regs_buf);

@@ -9236,7 +9276,7 @@ x86_64_get_kvaddr_ranges(struct vaddr_range *vrp)

 static int
 x86_64_get_current_task_reg(int regno, const char *name,
-			    int size, void *value)
+			    int size, void *value, int sid)
 {
 	struct bt_info bt_info, bt_setup;
 	struct task_context *tc;
@@ -9256,6 +9296,17 @@ x86_64_get_current_task_reg(int regno, const char *name,
 	if (!tc)
 		return FALSE;

+	/* Non zero stack ID, use saved regs */
+	if (sid && sid <= MAX_STACKS_NUM) {
+		switch (regno) {
+		case RAX_REGNUM ... SS_REGNUM:
+			memcpy(value, &stacks_regs[sid - 1][regno], size > 8 ? 8 : size);
+			return TRUE;
+		default:
+			return FALSE;
+		}
+	}
+
 	/*
 	 * Task is active, grab CPU's registers
 	 */
--
2.43.5
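
The bookkeeping that the x86_64.c hunks add (a fixed-size table of saved register sets, filled while 'bt' walks exception frames and read back when gdb asks for a register of a given stack ID) can be modeled in isolation. What follows is only a minimal sketch under stated assumptions, not the patch itself: every demo_* name is hypothetical, the register enum is cut down to four entries instead of RAX_REGNUM..SS_REGNUM, and, unlike the patch, the lookup rejects stack IDs that have not been populated yet rather than checking only against the table size.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced register numbering (the patch indexes by
 * RAX_REGNUM..SS_REGNUM from crash's defs.h). */
enum demo_regnum { DEMO_RAX, DEMO_RBX, DEMO_RSP, DEMO_RIP, DEMO_NREGS };

#define DEMO_MAX_STACKS 5

/* Number of saved sub-stacks, and their register sets. */
static unsigned demo_stack_idx;
static uint64_t demo_stacks_regs[DEMO_MAX_STACKS][DEMO_NREGS];

/* Called at the start of every backtrace: forget earlier sub-stacks. */
static void demo_reset_stacks(void)
{
	demo_stack_idx = 0;
}

/* Save the registers of one exception frame.
 * Returns the new stack ID (1-based; 0 is the main stack),
 * or 0 if the table is already full. */
static int demo_save_stack(const uint64_t regs[DEMO_NREGS])
{
	if (demo_stack_idx >= DEMO_MAX_STACKS)
		return 0;
	memcpy(demo_stacks_regs[demo_stack_idx], regs,
	       sizeof(demo_stacks_regs[0]));
	return (int)++demo_stack_idx;
}

/* Copy one saved register of stack `sid` into `value`.
 * Returns 1 on success, 0 if sid or regno is out of range.
 * Assumption of this sketch: sid 0 (main stack) is handled elsewhere. */
static int demo_get_reg(int sid, int regno, size_t size, void *value)
{
	if (sid < 1 || (unsigned)sid > demo_stack_idx)
		return 0;
	if (regno < 0 || regno >= DEMO_NREGS)
		return 0;
	if (size > sizeof(uint64_t))
		size = sizeof(uint64_t);	/* never read past one slot */
	memcpy(value, &demo_stacks_regs[sid - 1][regno], size);
	return 1;
}
```

A caller would reset the table, save one register set per exception frame found during the walk, and later fetch, for example, DEMO_RIP of stack 1. Bounding the lookup by the number of populated entries is a choice of this sketch; it keeps a stale or never-written slot from being returned.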

1 year, 6 months


[ANNOUNCE] crash-8.0.6 is available

by lijiang

Hi,

Thank you all for your contributions to the crash-utility. crash-8.0.6 is now available.

Download from:
https://crash-utility.github.io/
or
https://github.com/crash-utility/crash/releases

The GitHub master branch serves as a development branch that will contain all patches that are queued for the next release:

$ git clone https://github.com/crash-utility/crash.git

Changelog:
f13853cef53f crash-8.0.5 -> crash-8.0.6
db0077614aae Fix for 'sys' to properly display the PANIC message
ca74157283dd Doc: add doc to state that the --log option is deprecated
968debd0d597 arm64: Add gdb stack unwind support
89ff1e457344 x86_64: Add gdb stack unwind support
6dfda0d22355 ppc64: Add gdb stack unwind support
1fd80c623c20 Preparing for gdb stack unwind support
7c8a7dddda66 vmware_guestdump: Various format versions support
c4db469af091 x86_64: Fix invalid input "=>" for bt command
21e0a345f973 Fix cpumask_t recursive dependence issue
32b03ca26229 Revert "arm64: section_size_bits compatible with macro definitions"
7b5c8bca7d05 X86 64: improve the method of determining whether kaslr is enabled
9babe985a7eb kmem: fix the determination for slab page
0d2ad774532d x86_64: Fix the bug of getting incorrect framesize
17248cf00276 arm64: Support 16K page, 48 VA bits and 4 level page table
19ce5a996ce7 arm64: fix 64K page and 52-bits VA support
3b8f9721e13d arm64: use the same expression to indicate ptrs_per_pgd
2ebf656a4a17 arm64: fix indent issue and refactor PTE_TO_PHYS
f20a94016148 “kmem address” not working properly when redzone is enabled
79b93ecb2e72 Fix a "Bus error" issue caused by 'crash --osrelease' or crash loading
af3d266aeb8c arm64: cleanup the pud description
f93d870f8b6a arm64: fix for 'help -m/-M' to correctly display the pmd description
bcdf0f798d01 arm64: Introduction of support for 16K page with 2-level table support
5218919ec108 s390x: Fix "bt -f/-F" command fail with seek error
321e1e854588 Fix a segfault issue due to the incorrect irq_stack_size on ARM64
5cd1c6ace5fe arm64: fix the determination of vmemmap and struct_page_size
f615f8fab7bf Fix "irq -a" exceeding the memory range issue
38f26cc8b930 LoongArch64: fix incorrect code in the main()
93d7f647c45b arm64: Introduction of support for 16K page with 3-level table support
1c6da3eaff82 arm64: Fix bt command show wrong stacktrace on ramdump source
af895b219876 arm64: fix a potential segfault when unwind frame
ce4ddc742fbd List: enable LIST_HEAD_FORMAT for -r option
3452fe802bf9 Fix "kmem -i" and "swap" commands on Linux 6.10-rc1 and later kernels
196c4b79c13d X86 64: fix a regression issue about kernel stack padding
a20eb05de3c1 Fix for failing to load kernel module
6752571d8d78 X86 64: fix for crash session loading failure
7c2c90d0b06a Fix "kmem -v" option on Linux 6.9 and later kernels
48764a14bc58 x86_64: fix for adding top_of_kernel_stack_padding for kernel stack
3879e9104826 Reflect __{start,end}_init_task kernel symbols rename
568c6f049ad4 arm64: section_size_bits compatible with macro definitions
af2ac4c41df6 Cleanup: replace struct zspage_5_17 with union
a584e9752fb2 Adding the zram decompression algorithm "lzo-rle"
9104e87db44e Mark start of 8.0.6 development phase with version 8.0.5++

Full ChangeLog:
https://crash-utility.github.io/changelog/ChangeLog-8.0.6.txt
or
https://github.com/crash-utility/crash/compare/8.0.5...8.0.6

1 year, 6 months


Re: [PATCH] RISCV64: add panic signature to panic_msg to properly display the PANIC message

by lijiang

On Sat, Nov 9, 2024 at 9:14 AM <devel-request(a)lists.crash-utility.osci.io> wrote:

> Date: Fri, 8 Nov 2024 21:13:19 +1300
> From: Tao Liu <ltao(a)redhat.com>
> Subject: [Crash-utility] Re: [PATCH] RISCV64: add panic signature to
>         panic_msg to properly display the PANIC message
> To: Austin Kim <austindh.kim(a)gmail.com>
> Cc: lijiang <lijiang(a)redhat.com>, devel(a)lists.crash-utility.osci.io,
>         Austin Kim <austindhkim(a)gmail.com>, 김동현 <austin.kim(a)lge.com>
> Message-ID:
>         <CAO7dBbWgemin4FPcSkgQ21dbbqW-S8KeFtNuaH8otOZvNWQVSA(a)mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> Hi Austin & Lianbo,
>
> On Fri, Nov 8, 2024 at 1:35 AM Austin Kim <austindh.kim(a)gmail.com> wrote:
> >
> > Hello Lianbo,
> >
> > On Wed, Nov 6, 2024 at 12:53 PM, lijiang <lijiang(a)redhat.com> wrote:
> > >
> > > Hi, Austin
> > > Thank you for the patch.
> > >
> > > On Fri, Nov 1, 2024 at 5:19 PM <devel-request(a)lists.crash-utility.osci.io> wrote:
> > >>
> > >> Date: Tue, 29 Oct 2024 17:32:07 +0900
> > >> From: Austin Kim <austindh.kim(a)gmail.com>
> > >> Subject: [Crash-utility] [PATCH] RISCV64: add panic signature to
> > >>         panic_msg to properly display the PANIC message
> > >> To: devel(a)lists.crash-utility.osci.io
> > >> Cc: austindh.kim(a)gmail.com, austin.kim(a)lge.com
> > >> Message-ID: <20241029083207.GA30130@adminpc-PowerEdge-R7525>
> > >> Content-Type: text/plain; charset=us-ascii
> > >>
> > >> Using 'sys' command, we can view the panic message with general system
> > >> information. If we run RISCV64-based vmcore, PANIC message is not properly
> > >> displayed.
> > >>
> > >> The reason is that "Unable to handle kernel" is first printed in the kernel log
> > >> when exception occurs in the RISC-V based Linux kernel. The corresponding
> > >> kernel commit is 21733cb518471.
> > >>
> > >> Without the patch:
> > >> crash> sys
> > >>       KERNEL: vmlinux  [TAINTED]
> > >>     DUMPFILE: vmcore
> > >>         CPUS: 4
> > >>         DATE: Thu Aug 22 16:13:08 KST 2024
> > >>       UPTIME: 00:33:25
> > >> LOAD AVERAGE: 0.07, 0.07, 0.02
> > >>        TASKS: 385
> > >>     NODENAME: starfive
> > >>      RELEASE: 6.6.20+
> > >>      VERSION: #13 SMP Mon Aug 19 12:58:52 KST 2024
> > >>      MACHINE: riscv64  (unknown Mhz)
> > >>       MEMORY: 4 GB
> > >>        PANIC: ""
> > >>
> > >> With the patch:
> > >> crash> sys
> > >>       KERNEL: vmlinux  [TAINTED]
> > >>     DUMPFILE: vmcore
> > >>         CPUS: 4
> > >>         DATE: Thu Aug 22 16:13:08 KST 2024
> > >>       UPTIME: 00:33:25
> > >> LOAD AVERAGE: 0.07, 0.07, 0.02
> > >>        TASKS: 385
> > >>     NODENAME: starfive
> > >>      RELEASE: 6.6.20+
> > >>      VERSION: #13 SMP Mon Aug 19 12:58:52 KST 2024
> > >>      MACHINE: riscv64  (unknown Mhz)
> > >>       MEMORY: 4 GB
> > >>        PANIC: "Unable to handle kernel access to user memory without uaccess routines at virtual address 0000000000000000"
> > >>
> > >> Signed-off-by: Austin Kim <austindh.kim(a)gmail.com>
> > >> ---
> > >>  task.c | 1 +
> > >>  1 file changed, 1 insertion(+)
> > >>
> > >> diff --git a/task.c b/task.c
> > >> index d52ce0b..443f488 100644
> > >> --- a/task.c
> > >> +++ b/task.c
> > >> @@ -6330,6 +6330,7 @@ static const char* panic_msg[] = {
> > >>         "[Hardware Error]: ",
> > >>         "Bad mode in ",
> > >>         "Oops: ",
> > >> +       "Unable to handle kernel access ",
> > >
> > >
> > > I would tend to search the panic keywords again as below, which can cover both riscv64 and aarch64 cases.
> > >
> > > diff --git a/task.c b/task.c
> > > index c131cc32067d..9613adebab57 100644
> > > --- a/task.c
> > > +++ b/task.c
> > > @@ -6392,6 +6392,9 @@ get_panicmsg(char *buf)
> > >                         get_symbol_data("sysrq_pressed", sizeof(int), &msg_found);
> > >                         break;
> > >                 }
> > > +
> > > +               /* try to search panic string with panic keywords*/
> > > +               search_panic_task_by_keywords(buf, &msg_found);
> > >         }
>
> With this patch applied, no regression found, I think this one can work.
>

Thank you for the confirmation, Austin and Tao. Applied:
https://github.com/crash-utility/crash/commit/db0077614aaeda6d0ed557f2b91...

Lianbo

>
> Thanks,
> Tao Liu
>
> > >
> > > found:
> > >
> > >
> > > What do you think? I haven't tested this one, not sure if it can work for you, could you please try it?
> >
> > Thank you for the positive feedback on the patch and for sharing
> > another great idea.
> > I tested the patch you suggested, and it worked well on my side.
> > Here’s the crash message:
> >
> > crash> sys | grep PANIC
> >        PANIC: "Unable to handle kernel access to user memory without uaccess routines at virtual address 0000000000000000"
> >
> > This new patch is useful not only for RISC-V but also for a wider
> > range of architectures,
> > and it seems like a better approach than modifying panic_msg[].
> >
> > Best regards,
> > Austin Kim
> >
> > > Tao, can we also do a regression test to double check if there are any risks?
> > >
> > >
> > > Thanks
> > > Lianbo
> > >
> > >
> > >>
> > >>  };
> > >>
> > >>  #define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))
> > >> --
> > >> 2.17.1
> > >
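
The thread above revolves around locating the panic line in the kernel log by matching known prefixes such as those in panic_msg[]. As a rough illustration of that idea only, here is a minimal sketch that scans a log buffer line by line for a small keyword list. It is not crash's get_panicmsg() or search_panic_task_by_keywords(): the demo_* names and the function signature are hypothetical, and the four keywords are just the strings quoted in the diffs above.

```c
#include <stddef.h>
#include <string.h>

/* Keywords quoted in the thread; the real panic_msg[] table in
 * task.c is longer. */
static const char *demo_panic_keywords[] = {
	"[Hardware Error]: ",
	"Bad mode in ",
	"Oops: ",
	"Unable to handle kernel access ",
};

#define DEMO_NKEYWORDS \
	(sizeof(demo_panic_keywords) / sizeof(demo_panic_keywords[0]))

/* Find the first log line containing any keyword and copy the text
 * from the keyword to the end of that line into `out` (truncated to
 * fit). Returns 1 if a match was found, 0 otherwise. */
static int demo_find_panic_msg(const char *log, char *out, size_t outlen)
{
	const char *line = log;

	if (outlen == 0)
		return 0;

	while (*line) {
		const char *eol = strchr(line, '\n');
		size_t linelen = eol ? (size_t)(eol - line) : strlen(line);

		for (size_t k = 0; k < DEMO_NKEYWORDS; k++) {
			/* strstr() scans to the end of the buffer, so only
			 * accept hits that start inside the current line.
			 * Simple but quadratic; fine for a sketch. */
			const char *hit = strstr(line, demo_panic_keywords[k]);

			if (hit && hit < line + linelen) {
				size_t n = (size_t)(line + linelen - hit);

				if (n >= outlen)
					n = outlen - 1;
				memcpy(out, hit, n);
				out[n] = '\0';
				return 1;
			}
		}
		line += linelen;
		if (*line == '\n')
			line++;
	}
	out[0] = '\0';
	return 0;
}
```

Fed a log whose second line reads "Unable to handle kernel access to user memory ...", the function copies that message without the timestamp prefix, which mirrors the PANIC string shown in the 'sys' output quoted above.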

1 year, 6 months


