On 2023/11/14 17:49, Tao Liu wrote:
There is an issue that, for kernel modules loaded by mod -s/-S, "dis -rl"
fails to display the module's code line number data after executing the
"bt" command in crash.
Without the patch:
crash> mod -S
crash> bt
PID: 1500 TASK: ff2bd8b093524000 CPU: 16 COMMAND: "lpfc_worker_0"
#0 [ff2c9f725c39f9e0] machine_kexec at ffffffff8e0686d3
...snip...
#7 [ff2c9f725c39fc00] page_fault at ffffffff8ea0114e
[exception RIP: lpfc_nlp_get+210]
RIP: ffffffffc0f60f82 RSP: ff2c9f725c39fcb0 RFLAGS: 00010046
RAX: 0000000000000046 RBX: ff2bd8d8ac056000 RCX: 0000000000fffffc
RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000046
RBP: ff2bd8d8ac056090 R8: 0000000000000000 R9: 0000000000000000
R10: ff2bd90d1f8701c0 R11: 0000000000000001 R12: ff2bd93320482ae0
R13: ff2bd93051a80524 R14: ff2bd93051a80000 R15: ff2bd9332079fc00
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#8 [ff2c9f725c39fcc0] __lpfc_sli_release_iocbq_s4 at ffffffffc0f2f425 [lpfc]
...snip...
crash> dis -rl ffffffffc0f60f82
0xffffffffc0f60eb0 <lpfc_nlp_get>: nopl 0x0(%rax,%rax,1) [FTRACE NOP]
0xffffffffc0f60eb5 <lpfc_nlp_get+5>: push %rbp
0xffffffffc0f60eb6 <lpfc_nlp_get+6>: push %rbx
0xffffffffc0f60eb7 <lpfc_nlp_get+7>: test %rdi,%rdi
With the patch:
crash> mod -S
crash> bt
PID: 1500 TASK: ff2bd8b093524000 CPU: 16 COMMAND: "lpfc_worker_0"
#0 [ff2c9f725c39f9e0] machine_kexec at ffffffff8e0686d3
...snip...
#7 [ff2c9f725c39fc00] page_fault at ffffffff8ea0114e
[exception RIP: lpfc_nlp_get+210]
RIP: ffffffffc0f60f82 RSP: ff2c9f725c39fcb0 RFLAGS: 00010046
RAX: 0000000000000046 RBX: ff2bd8d8ac056000 RCX: 0000000000fffffc
RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000046
RBP: ff2bd8d8ac056090 R8: 0000000000000000 R9: 0000000000000000
R10: ff2bd90d1f8701c0 R11: 0000000000000001 R12: ff2bd93320482ae0
R13: ff2bd93051a80524 R14: ff2bd93051a80000 R15: ff2bd9332079fc00
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#8 [ff2c9f725c39fcc0] __lpfc_sli_release_iocbq_s4 at ffffffffc0f2f425 [lpfc]
...snip...
crash> dis -rl ffffffffc0f60f82
/usr/src/debug/kernel-4.18.0-425.13.1.el8_7/linux-4.18.0-425.13.1.el8_7.x86_64/drivers/scsi/lpfc/lpfc_hbadisc.c: 6756
0xffffffffc0f60eb0 <lpfc_nlp_get>: nopl 0x0(%rax,%rax,1) [FTRACE NOP]
/usr/src/debug/kernel-4.18.0-425.13.1.el8_7/linux-4.18.0-425.13.1.el8_7.x86_64/drivers/scsi/lpfc/lpfc_hbadisc.c: 6759
0xffffffffc0f60eb5 <lpfc_nlp_get+5>: push %rbp
The root cause is that, after a kernel module has been loaded by the mod
command, its symtables are not expanded on the gdb side. The crash commands
bt and dis will each trigger such an expansion. However, the symtable
expansion is different for the 2 commands:
The stack trace of "dis -rl" for symtable expanding:
#0 0x00000000008d8d9f in add_compunit_symtab_to_objfile (cu=cu@entry=0xe6a77a0) at
symfile.c:2914
#1 0x00000000006d3293 in buildsym_compunit::end_symtab_with_blockvector
(this=<optimized out>, static_block=static_block@entry=0xfbe4b60, section=1,
expandable=expandable@entry=0) at buildsym.c:1072
#2 0x00000000006d336a in buildsym_compunit::end_symtab_from_static_block
(this=<optimized out>, static_block=static_block@entry=0xfbe4b60,
section=<optimized out>, expandable=expandable@entry=0) at buildsym.c:1106
#3 0x000000000077e8e9 in process_full_comp_unit (pretend_language=<optimized
out>, cu=0x8ee4c60) at /usr/include/c++/8/bits/unique_ptr.h:716
#4 process_queue (per_objfile=0xc54c870) at dwarf2/read.c:9220
#5 dw2_do_instantiate_symtab (per_cu=<optimized out>, per_objfile=0xc54c870,
skip_partial=<optimized out>) at dwarf2/read.c:2448
#6 0x000000000077ed67 in dw2_instantiate_symtab (per_cu=0xdd0a320,
per_objfile=0xc54c870, skip_partial=<optimized out>) at dwarf2/read.c:2472
#7 0x000000000077f75e in dw2_expand_all_symtabs (objfile=<optimized out>) at
dwarf2/read.c:3768
#8 0x00000000008f254d in gdb_get_line_number (req=0x7fffffffb1f0) at symtab.c:7112
#9 0x00000000008f22af in gdb_command_funnel_1 (req=0x7fffffffb1f0) at symtab.c:7023
#10 0x00000000008f2003 in gdb_command_funnel (req=0x7fffffffb1f0) at symtab.c:6965
#11 0x00000000005b7f02 in gdb_interface (req=req@entry=0x7fffffffb1f0) at
gdb_interface.c:409
#12 0x00000000005f5bd8 in get_line_number (addr=18446744072651935408,
buf=buf@entry=0x7fffffffd460 "", reserved=reserved@entry=0) at symbols.c:4440
#13 0x000000000059e574 in cmd_dis () at kernel.c:2143
The stack trace of "bt" for symtable expanding:
#0 0x00000000008d8d9f in add_compunit_symtab_to_objfile (cu=cu@entry=0x1ad15630) at
symfile.c:2914
#1 0x00000000006d3293 in buildsym_compunit::end_symtab_with_blockvector
(this=<optimized out>, static_block=static_block@entry=0x1db0be30, section=1,
expandable=expandable@entry=0) at buildsym.c:1072
#2 0x00000000006d336a in buildsym_compunit::end_symtab_from_static_block
(this=<optimized out>, static_block=static_block@entry=0x1db0be30,
section=<optimized out>, expandable=expandable@entry=0) at buildsym.c:1106
#3 0x000000000077e8e9 in process_full_comp_unit (pretend_language=<optimized
out>, cu=0x7465240) at /usr/include/c++/8/bits/unique_ptr.h:716
#4 process_queue (per_objfile=0xc113810) at dwarf2/read.c:9220
#5 dw2_do_instantiate_symtab (per_cu=<optimized out>, per_objfile=0xc113810,
skip_partial=<optimized out>) at dwarf2/read.c:2448
#6 0x000000000077ed67 in dw2_instantiate_symtab (per_cu=0xdd069d0,
per_objfile=0xc113810, skip_partial=<optimized out>) at dwarf2/read.c:2472
#7 0x000000000077f8ed in dw2_lookup_symbol (objfile=<optimized out>,
block_index=STATIC_BLOCK, name=0x7fffffffc890 "cpumask_t", domain=STRUCT_DOMAIN)
at dwarf2/read.c:3669
#8 0x00000000008e6d03 in lookup_symbol_via_quick_fns (objfile=0xdd277a0,
block_index=STATIC_BLOCK, name=0x7fffffffc890 "cpumask_t", domain=STRUCT_DOMAIN)
at symtab.c:2392
#9 0x00000000008e7153 in lookup_symbol_in_objfile (objfile=0xdd277a0,
block_index=STATIC_BLOCK, name=0x7fffffffc890 "cpumask_t", domain=STRUCT_DOMAIN)
at symtab.c:2541
#10 0x00000000008e73c6 in lookup_symbol_global_or_static_iterator_cb
(objfile=0xdd277a0, cb_data=0x7fffffffc470) at symtab.c:2615
#11 0x00000000008b99c4 in svr4_iterate_over_objfiles_in_search_order
(gdbarch=<optimized out>, cb=0x8e7342
<lookup_symbol_global_or_static_iterator_cb(objfile*, void*)>,
cb_data=0x7fffffffc470, current_objfile=0x0) at solib-svr4.c:3248
#12 0x00000000008e754e in lookup_global_or_static_symbol (name=0x7fffffffc890
"cpumask_t", block_index=STATIC_BLOCK, objfile=0x0, domain=STRUCT_DOMAIN) at
symtab.c:2660
#13 0x00000000008e75da in lookup_static_symbol (name=0x7fffffffc890
"cpumask_t", domain=STRUCT_DOMAIN) at symtab.c:2678
#14 0x00000000008e632c in lookup_symbol_aux (name=0x7fffffffc890
"cpumask_t", match_type=symbol_name_match_type::FULL, block=0x0,
domain=STRUCT_DOMAIN, language=language_c, is_a_field_of_this=0x0) at symtab.c:2122
#15 0x00000000008e5a7a in lookup_symbol_in_language (name=0x7fffffffc890
"cpumask_t", block=0x0, domain=STRUCT_DOMAIN, lang=language_c,
is_a_field_of_this=0x0) at symtab.c:1889
#16 0x00000000008e5b30 in lookup_symbol (name=0x7fffffffc890 "cpumask_t",
block=0x0, domain=STRUCT_DOMAIN, is_a_field_of_this=0x0) at symtab.c:1915
#17 0x00000000008f2a4a in gdb_get_datatype (req=0x7fffffffc730) at symtab.c:7229
#18 0x00000000008f22c0 in gdb_command_funnel_1 (req=0x7fffffffc730) at symtab.c:7027
#19 0x00000000008f2003 in gdb_command_funnel (req=0x7fffffffc730) at symtab.c:6965
#20 0x00000000005b7f02 in gdb_interface (req=req@entry=0x7fffffffc730) at
gdb_interface.c:409
#21 0x00000000005f8a9f in datatype_info (name=name@entry=0xa8454d
"cpumask_t", member=member@entry=0x0, dm=dm@entry=0xfffffffffffffffc) at
symbols.c:5715
#22 0x0000000000599947 in cpu_map_size (type=<optimized out>) at kernel.c:913
#23 0x00000000005a975d in get_cpus_online () at kernel.c:9556
#24 0x0000000000637a8b in diskdump_get_prstatus_percpu (cpu=16) at diskdump.c:2277
#25 0x000000000062f0e4 in get_netdump_regs_x86_64 (bt=0x7fffffffd950,
ripp=0x7fffffffd130, rspp=0x7fffffffd138) at netdump.c:3471
#26 0x000000000059fe68 in back_trace (bt=bt@entry=0x7fffffffd950) at kernel.c:3092
#27 0x00000000005ab1cb in cmd_bt () at kernel.c:2859
In the stack trace of "dis -rl", dw2_expand_all_symtabs() is called to expand
all symtables of the objfile, or "*.ko.debug" in our case. However, in
the stack trace of "bt", it doesn't expand all of them, but only the subset
of symtables which is enough to find a symbol by dw2_lookup_symbol(). As a
result, objfile->compunit_symtabs, which is the head of a singly linked list
of struct compunit_symtab, is not NULL but doesn't contain all symtables. It
will not be reinitialized in gdb_get_line_number() by "dis -rl" because the
!objfile_has_full_symbols(objfile) check will fail, so "dis -rl" cannot
display the proper code line number data.
This patch forces all the symtables of the module to be expanded during the
mod load phase, so no matter what commands follow, objfile->compunit_symtabs
always contains all symtables.
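To make the sequence above concrete, here is a minimal toy model in C. It is
only a sketch: every toy_* name, the struct, and NUM_CUS are hypothetical
stand-ins, not gdb's real data structures, and the only assumption about gdb
is the one stated above, namely that the "has full symbols" check passes as
soon as the symtab list is non-empty.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these are NOT gdb's real types or functions. */
#define NUM_CUS 4   /* hypothetical number of compilation units in the module */

struct toy_objfile {
    bool expanded[NUM_CUS];  /* true once a CU has a full symtab */
    int  n_expanded;         /* stands in for the compunit_symtabs list length */
};

/* Stands in for dw2_lookup_symbol(): expands only the CU holding the symbol. */
void toy_lookup_symbol(struct toy_objfile *o, int cu)
{
    assert(cu >= 0 && cu < NUM_CUS);
    if (!o->expanded[cu]) {
        o->expanded[cu] = true;
        o->n_expanded++;
    }
}

/* Stands in for expand_all_symtabs(): expands every CU of the objfile. */
void toy_expand_all(struct toy_objfile *o)
{
    for (int cu = 0; cu < NUM_CUS; cu++)
        toy_lookup_symbol(o, cu);
}

/* Stands in for objfile_has_full_symbols(): true once the list is non-empty. */
bool toy_has_full_symbols(const struct toy_objfile *o)
{
    return o->n_expanded > 0;
}

/* Stands in for the "dis -rl" path: expand everything only if nothing has
   been expanded yet, then report whether line data for `cu` is available. */
bool toy_get_line_number(struct toy_objfile *o, int cu)
{
    assert(cu >= 0 && cu < NUM_CUS);
    if (!toy_has_full_symbols(o))
        toy_expand_all(o);
    return o->expanded[cu];
}
```

Starting from a zeroed struct toy_objfile, calling toy_lookup_symbol(&o, 0)
(the "bt" case) and then toy_get_line_number(&o, 2) returns false, because the
partial expansion suppresses the full one. Calling toy_expand_all(&o) first,
which is what the patch does at load time in this model, makes the same query
return true.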
Thank you for looking into this issue.
One question: is "mod -S -r" a workaround for it?
I'm thinking that, if the current gdb's auto expansion is not good for
crash, maybe we can make the behavior of the "mod -r" option the default.
The option adds "-readnow" to the add-symbol-file command, and it looks the
same as your patch to me:
$ vim gdb-10.2/gdb/symfile.c
  /* We now have at least a partial symbol table. Check to see if the
     user requested that all symbols be read on initial access via either
     the gdb startup command line or on a per symbol file basis. Expand
     all partial symbol tables for this objfile if so. */
  if ((flags & OBJF_READNOW))
    {
      if (should_print)
        printf_filtered (_("Expanding full symbols from %ps...\n"),
                         styled_string (file_name_style.style (), name));
      if (objfile->sf)
        objfile->sf->qf->expand_all_symtabs (objfile);
    }
Thanks,
Kazu
>
> Signed-off-by: Tao Liu <ltao(a)redhat.com>
> ---
>
> PS: This patch is standalone and is not a follow-up of
> [PATCH v2] symbols: skip load .init.* sections if module was successfully initialized
>
> ---
> gdb-10.2.patch | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/gdb-10.2.patch b/gdb-10.2.patch
> index d81030d..0a9a4e1 100644
> --- a/gdb-10.2.patch
> +++ b/gdb-10.2.patch
> @@ -3187,3 +3187,14 @@ exit 0
> result = stringtab + symbol_entry->_n._n_n._n_offset;
> }
> else
> +--- gdb-10.2/gdb/symtab.c.orig
> ++++ gdb-10.2/gdb/symtab.c
> +@@ -7537,6 +7537,8 @@ gdb_add_symbol_file(struct gnu_request *req)
> + lm->loaded_objfile = objfile->separate_debug_objfile;
> + else
> + lm->loaded_objfile = objfile;
> ++ if (lm->loaded_objfile->sf)
> ++ lm->loaded_objfile->sf->qf->expand_all_symtabs(lm->loaded_objfile);
> + break;
> + }
> + }