Hi Tao,
thank you for the information.
I'm looking into it, I noticed that the unexpected symbol
"floopy_module_init"
is in section .init.text. Crash side doesn't have the symbol info, probably
the address is already freed and reused by the virtio_blk module?
crash> mod -S
...
crash> sym -m floppy | grep MODULE
ffffffffc0092000 MODULE START: floppy
ffffffffc00a2f29 MODULE END: floppy
crash> sym -m virtio_blk | grep MODULE
ffffffffc00a4000 MODULE START: virtio_blk
ffffffffc00a86ec MODULE END: virtio_blk
crash> gdb info symbol floppy_module_init
init_module in section .init.text of
/home/vmcore/symbol_err/usr/lib/debug/lib/modules/3.10.0-693.2.2.el7.x86_64/kernel/drivers/block/floppy.ko.debug
virtblk_freeze + 33 in section .text of
/home/vmcore/symbol_err/usr/lib/debug/lib/modules/3.10.0-693.2.2.el7.x86_64/kernel/drivers/block/virtio_blk.ko.debug
crash> gdb info address floppy_module_init
Symbol "floppy_module_init" is a function at address 0xffffffffc00a4131.
crash> sym floppy_module_init
symbol not found: floppy_module_init
possible alternatives:
(none found)
So still not sure, but this might be an issue of handling the init
section/symbols of modules..
Thanks,
Kazu
On 2023/11/07 17:20, Tao Liu wrote:
> Hi Kazu,
>
> On Tue, Nov 7, 2023 at 1:07 PM HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab(a)nec.com>
wrote:
>>
>> Hi Tao,
>>
>> thank you for the patches.
>>
>> On 2023/11/02 21:09, Tao Liu wrote:
>>> The address range may overlap for kernel modules global block vector,
>>> for example:
>>>
>>> module compunit_symtab block start block end
>>> floppy floppy.c ffffffffc0092000 ffffffffc00a4fa6
>>> virtio_blk virtio_blk.c ffffffffc00a4000 ffffffffc00fd080
>>
>> hmm, this condition looks strange in the first place, is it inevitable?
>> In other words, I wonder if there might be something wrong with how to
>> load module symbol files.
>>
>
> Honestly I'm not sure why there is address overlap of the 2 modules
> global block, it shouldn't happen from my view. But it can happen on
> another kernel(4.18.0-425.13.1.el8_7.x86_64), see the results blow:
>
> module compunit_symtab block start block end
> libata.ko.debug libata-core.c ffffffffc054f000 ffffffffc0591b18
> mgag200.ko.debug mgag200_drv.c ffffffffc03f3000 ffffffffc0595b0e
>
> module address boundary:
> crash> sym -l
> ffffffffc054f000 MODULE START: libata
> ...snip...
> ffffffffc0590000 MODULE END: libata
> ffffffffc0591000 MODULE START: mgag200
> ...snip...
> ffffffffc059a000 MODULE END: mgag200
>
>> What is the kernel version?
>
> The kernel which I used to create the patch is 3.10.0-693.2.2.el7.x86_64.
>
>> Do you have any patch to debug the gdb internal information?
>
> Sure, simple debug patch as follows:
> diff --git gdb-10.2/gdb/symtab.c.orig gdb-10.2/gdb/symtab.c
> index 82605cc..3af3570 100644
> --- gdb-10.2/gdb/symtab.c.orig
> +++ gdb-10.2/gdb/symtab.c
> @@ -2939,6 +2939,8 @@ find_pc_sect_compunit_symtab (CORE_ADDR pc,
> struct obj_section *section)
> bv = COMPUNIT_BLOCKVECTOR (cust);
> b = BLOCKVECTOR_BLOCK (bv, GLOBAL_BLOCK);
>
> + printf("%s %s %lx %lx\n", objfile_filename(obj_file),
> cust->name, BLOCK_START (b), BLOCK_END (b));
> +
> if (BLOCK_START (b) <= pc
> && BLOCK_END (b) > pc
> && (distance == 0
>>
>> I'd like to check vmcores on hand..
>
> I can share the vmcore with you by a private email later.
>
> Thanks,
> Tao Liu
>
>>
>> Thanks,
>> Kazu
>>
>>>
>>> According to the gdb source code, the distance(block end - block start) of
>>> floppy(0x12fa6) is smaller than virtio_blk(0x59080), so address
>>> 0xffffffffc00a45b0, 0xffffffffc00a4370, 0xffffffffc00a4260 are translated
>>> by floppy module instead of virtio_blk module. However according
>>> to crash, the address range for the 2 modules are:
>>>
>>> crash> sym -l
>>> ffffffffc0092000 MODULE START: floppy
>>> ...snip...
>>> ffffffffc00a2f29 MODULE END: floppy
>>> ffffffffc00a4000 MODULE START: virtio_blk
>>> ...snip...
>>> ffffffffc00a86ec MODULE END: virtio_blk
>>>
>>> So the block vector address arrange and distance is not a reliable way to
>>> determine the symbol belonging of a specific kernel module. This patch will
>>> let gdb to query the module address range from crash, in order to locate
the
>>> symbol belongings more percisely.
>>>
>>> Without the patch:
>>> crash> mod -S
>>> crash> struct blk_mq_ops 0xffffffffc00a7160
>>> struct blk_mq_ops {
>>> queue_rq = 0xffffffffc00a45b0 <floppy_module_init+1151>, <--
symbol translated from module floppy
>>> map_queue = 0xffffffff813015c0 <blk_mq_map_queue>,
>>> ...snip...
>>> complete = 0xffffffffc00a4370 <floppy_module_init+575>,
>>> init_request = 0xffffffffc00a4260 <floppy_module_init+303>,
>>> ...snip...
>>> }
>>>
>>> With the patch:
>>> crash> mod -S
>>> crash> struct blk_mq_ops 0xffffffffc00a7160
>>> struct blk_mq_ops {
>>> queue_rq = 0xffffffffc00a45b0 <virtio_queue_rq>, <-- symbol
translated from module virtio_blk
>>> map_queue = 0xffffffff813015c0 <blk_mq_map_queue>,
>>> ...snip...
>>> complete = 0xffffffffc00a4370 <virtblk_request_done>,
>>> init_request = 0xffffffffc00a4260 <virtblk_init_request>,
>>> ...snip...
>>> }
>>>
>>> Signed-off-by: Tao Liu <ltao(a)redhat.com>
>>> ---
>>> defs.h | 2 ++
>>> gdb-10.2.patch | 37 ++++++++++++++++++++++++++++++++++++-
>>> gdb_interface.c | 18 ++++++++++++++++++
>>> 3 files changed, 56 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/defs.h b/defs.h
>>> index 788f63a..09e6738 100644
>>> --- a/defs.h
>>> +++ b/defs.h
>>> @@ -5130,6 +5130,7 @@ struct syment *symbol_search(char *);
>>> int gdb_line_number_callback(ulong, ulong, ulong);
>>> int gdb_print_callback(ulong);
>>> char *gdb_lookup_module_symbol(ulong, ulong *);
>>> +ulong gdb_get_module_boundary(const char *, bool);
>>> extern "C" int same_file(char *, char *);
>>> #endif
>>>
>>> @@ -7687,6 +7688,7 @@ int gdb_readmem_callback(ulong, void *, int, int);
>>> int gdb_line_number_callback(ulong, ulong, ulong);
>>> int gdb_print_callback(ulong);
>>> char *gdb_lookup_module_symbol(ulong, ulong *);
>>> +ulong gdb_get_module_boundary(const char *, bool);
>>> void gdb_error_hook(void);
>>> void restore_gdb_sanity(void);
>>> int is_gdb_command(int, ulong);
>>> diff --git a/gdb-10.2.patch b/gdb-10.2.patch
>>> index 31135ca..a5c10d3 100644
>>> --- a/gdb-10.2.patch
>>> +++ b/gdb-10.2.patch
>>> @@ -3203,4 +3203,39 @@ exit 0
>>> +#endif
>>> for (compunit_symtab *cust : obj_file->compunits ())
>>> {
>>> - const struct block *b;
>>> \ No newline at end of file
>>> + const struct block *b;
>>> +--- gdb-10.2/gdb/symtab.c.orig
>>> ++++ gdb-10.2/gdb/symtab.c
>>> +@@ -2895,6 +2895,10 @@ iterate_over_symbols_terminated
>>> + return callback (&block_sym);
>>> + }
>>> +
>>> ++#ifdef CRASH_MERGE
>>> ++extern "C" ulong gdb_get_module_boundary(const char *, bool);
>>> ++#endif
>>> ++
>>> + /* Find the compunit symtab associated with PC and SECTION.
>>> + This will read in debug info as necessary. */
>>> +
>>> +@@ -2932,11 +2936,21 @@ find_pc_sect_compunit_symtab (CORE_ADDR pc, struct
obj_section *section)
>>> + for (objfile *obj_file : current_program_space->objfiles ())
>>> + {
>>> + #ifdef CRASH_MERGE
>>> ++ size_t sep;
>>> + std::string objfile_name = objfile_filename(obj_file);
>>> ++ const char *mod_name;
>>> +
>>> + if (objfile_name.find(".ko") != std::string::npos) {
>>> + if (obj_file->sf && obj_file->compunit_symtabs ==
nullptr)
>>> + obj_file->sf->qf->expand_all_symtabs(obj_file);
>>> ++ sep = objfile_name.find_last_of("/");
>>> ++ objfile_name = objfile_name.substr(sep + 1, objfile_name.size() -
sep - 1);
>>> ++ sep = objfile_name.find_first_of(".");
>>> ++ objfile_name = objfile_name.substr(0, sep);
>>> ++ mod_name = objfile_name.c_str();
>>> ++ if (pc < gdb_get_module_boundary(mod_name, true) ||
>>> ++ pc > gdb_get_module_boundary(mod_name, false))
>>> ++ continue;
>>> + }
>>> + #endif
>>> + for (compunit_symtab *cust : obj_file->compunits ())
>>> diff --git a/gdb_interface.c b/gdb_interface.c
>>> index b14319c..a58fd68 100644
>>> --- a/gdb_interface.c
>>> +++ b/gdb_interface.c
>>> @@ -947,6 +947,24 @@ gdb_lookup_module_symbol(ulong addr, ulong *offset)
>>> }
>>> }
>>>
>>> +ulong
>>> +gdb_get_module_boundary(const char *module, bool is_start)
>>> +{
>>> + char buf[256];
>>> + struct syment *sp;
>>> +
>>> + if (is_start) {
>>> + snprintf(buf, sizeof(buf), "_MODULE_START_%s",
module);
>>> + } else {
>>> + snprintf(buf, sizeof(buf), "_MODULE_END_%s",
module);
>>> + }
>>> + sp = symbol_search(buf);
>>> + if (sp)
>>> + return sp->value;
>>> + else
>>> + return is_start ? 0 : (ulong)(-1);
>>> +}
>>> +
>>> /*
>>> * Used by gdb_interface() to catch gdb-related errors, if desired.
>>> */