Hi Castor,
I was hoping to fold in your latest crash-4.0-3.14-sym.2.patch,
but upon testing it, I found a bug that needs attention, and I'm
not sure why it occurs.
Again, it has something to do with those "__key.####" bss symbols
in the modules.
Here's a "mod -S" on an x86 machine, where it bumps into one of those
symbols in the very first module it tries to load:
crash> set debug 1
debug: 1
crash> mod -S
load_module_symbols: scsi_transport_spi /lib/modules/2.6.18-1.2747.el5xen/kernel/drivers/scsi/scsi_transport_spi.ko ee031000 122600
186 symbols found in obj file /lib/modules/2.6.18-1.2747.el5xen/kernel/drivers/scsi/scsi_transport_spi.ko
scsi_transport_spi: update sec offset sym spi_device_configure @ ee031000 val 0 section .text
scsi_transport_spi: update sec offset sym spi_transport_exit @ ee033510 val 0 section .exit.text
scsi_transport_spi: update sec offset sym class_device_attr_period @ ee033570 val 30 section .rodata
scsi_transport_spi: update sec offset sym __ksymtab_spi_release_transport @ ee033ed8 val 0 section __ksymtab
scsi_transport_spi: update sec offset sym __kcrctab_spi_release_transport @ ee033f08 val 0 section __kcrctab
scsi_transport_spi: update sec offset sym __ksymtab_spi_populate_ppr_msg @ ee033f20 val 0 section __ksymtab_gpl
scsi_transport_spi: update sec offset sym __kcrctab_spi_populate_ppr_msg @ ee033f38 val 0 section __kcrctab_gpl
scsi_transport_spi: update sec offset sym __kstrtab_spi_release_transport @ ee033f44 val 0 section __ksymtab_strings
scsi_transport_spi: update sec offset sym ____versions @ ee034000 val 0 section __versions
scsi_transport_spi: update sec offset sym spi_transport_class @ ee036ae0 val 0 section .data
scsi_transport_spi: update sec offset sym __this_module @ ee036d80 val 0 section .gnu.linkonce.this_module
scsi_transport_spi: update sec offset sym __key.19739 @ ee037f80 val 0 section .bss
scsi_transport_spi: update sec offset sym __key.19739 @ ee037f80 val 0 section .bss
scsi_transport_spi: update sec offset sym __key.19739 @ ee037f80 val 0 section .bss
scsi_transport_spi: update sec offset sym __key.19739 @ ee037f80 val 0 section .bss
scsi_transport_spi: update sec offset sym __key.19739 @ ee037f80 val 0 section .bss
scsi_transport_spi: update sec offset sym __key.19739 @ ee037f80 val 0 section .bss
scsi_transport_spi: update sec offset sym __key.19739 @ ee037f80 val 0 section .bss
... [ spits out same message forever ] ...
i.e., it continues to loop in your new calculate_load_order_v2() function.
On an x86_64, I see the same thing, although it goes through several
other modules successfully before the infinite loop starts in the i2c_core
module:
crash> set debug 1
debug: 1
crash> mod -S
[ ... debug info removed ...]
load_module_symbols: i2c_core /lib/modules/2.6.18-1.2747.el5/kernel/drivers/i2c/i2c-core.ko ffffffff8817d000 b02600
244 symbols found in obj file /lib/modules/2.6.18-1.2747.el5/kernel/drivers/i2c/i2c-core.ko
update sec offset sym i2c_device_match @ ffffffff8817d000 val 0 section .text
update sec offset sym i2c_exit @ ffffffff8817e614 val 0 section .exit.text
update sec offset sym __ksymtab_i2c_smbus_write_i2c_block_data @ ffffffff8817e990 val 0 section __ksymtab
update sec offset sym __kcrctab_i2c_smbus_write_i2c_block_data @ ffffffff8817eb50 val 0 section __kcrctab
update sec offset sym __ksymtab_i2c_bus_type @ ffffffff8817ec30 val 0 section __ksymtab_gpl
update sec offset sym __kcrctab_i2c_bus_type @ ffffffff8817ec70 val 0 section __kcrctab_gpl
update sec offset sym __kstrtab_i2c_smbus_write_i2c_block_data @ ffffffff8817ec90 val 0 section __ksymtab_strings
update sec offset sym ____versions @ ffffffff8817efa0 val 0 section __versions
update sec offset sym i2c_bus_type @ ffffffff88182980 val 0 section .data
update sec offset sym __this_module @ ffffffff88182f80 val 0 section .gnu.linkonce.this_module
update sec offset sym __key.10825 @ ffffffff8818b180 val 0 section .bss
update sec offset sym __key.10825 @ ffffffff8818b180 val 0 section .bss
update sec offset sym __key.10825 @ ffffffff8818b180 val 0 section .bss
update sec offset sym __key.10825 @ ffffffff8818b180 val 0 section .bss
update sec offset sym __key.10825 @ ffffffff8818b180 val 0 section .bss
update sec offset sym __key.10825 @ ffffffff8818b180 val 0 section .bss
update sec offset sym __key.10825 @ ffffffff8818b180 val 0 section .bss
update sec offset sym __key.10825 @ ffffffff8818b180 val 0 section .bss
update sec offset sym __key.10825 @ ffffffff8818b180 val 0 section .bss
... [ repeat forever ] ...
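When reproducing this from a captured debug log, it can help to find the exact line where the output starts repeating. The following is only a minimal sketch in Python, not part of crash; the function name first_repeated_run and the threshold of 3 consecutive identical lines are assumptions chosen for illustration.

```python
def first_repeated_run(lines, threshold=3):
    """Return (index, line) for the first line that appears at least
    `threshold` times in a row, or None if there is no such run.

    Hypothetical helper for scanning saved "set debug 1" output.
    """
    run_start = 0
    # Walk one position past the end so the final run is also checked.
    for i in range(1, len(lines) + 1):
        if i == len(lines) or lines[i] != lines[run_start]:
            if i - run_start >= threshold:
                return run_start, lines[run_start]
            run_start = i
    return None


# Sample input: two distinct lines taken from the x86_64 log above,
# with the second one repeated as it is in the loop.
sample_log = [
    "update sec offset sym i2c_bus_type @ ffffffff88182980 val 0 section .data",
    "update sec offset sym __key.10825 @ ffffffff8818b180 val 0 section .bss",
    "update sec offset sym __key.10825 @ ffffffff8818b180 val 0 section .bss",
    "update sec offset sym __key.10825 @ ffffffff8818b180 val 0 section .bss",
]

result = first_repeated_run(sample_log)
print(result)
```

On the sample lines this reports the __key.10825 line as the start of the repeating run. It only locates the symptom in the log; it says nothing about why calculate_load_order_v2() keeps looping.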
At first I thought that it would start spinning upon encountering the first
module with one of those __key.#### symbols -- which was true in
the case of the x86 machine.
However, on the above x86_64 machine, several other modules with those
__key.#### bss symbols did get loaded OK before it got hung up
loading the i2c_core module.
For example, if I do the loads individually on the x86_64, the jbd module
loads OK, but the i2c_core module hangs:
crash> sym -m jbd | grep __key
ffffffff8804b0b0 (b) __key.16794
ffffffff8804b0b0 (b) __key.16795
crash> mod -s jbd
     MODULE       NAME   SIZE  OBJECT FILE
ffffffff88042e80  jbd   98609  /lib/modules/2.6.18-1.2747.el5/kernel/fs/jbd/jbd.ko
crash> sym -m i2c_core | grep __key
ffffffff8818b180 (b) __key.10825
ffffffff8818b180 (b) __key.10826
crash> mod -s i2c_core
< hang >
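One thing visible in the "sym -m" output above is that in both modules the two __key symbols sit at the same address, so a shared address by itself does not separate the module that loads (jbd) from the one that hangs (i2c_core). A minimal sketch for checking this on other modules is below. It is plain Python run on saved "sym -m" output, not part of crash; the regular expression and the helper names symbols_by_address and shared_addresses are assumptions, and the pattern only handles lines shaped like the ones shown here.

```python
import re
from collections import defaultdict

# Matches lines such as: "ffffffff8818b180 (b) __key.10825"
SYM_LINE = re.compile(r"^([0-9a-f]+)\s+\((\w)\)\s+(\S+)$")


def symbols_by_address(sym_output):
    """Group symbol names by address from saved 'sym -m' text."""
    groups = defaultdict(list)
    for line in sym_output.splitlines():
        match = SYM_LINE.match(line.strip())
        if match:
            addr, _sym_type, name = match.groups()
            groups[addr].append(name)
    return dict(groups)


def shared_addresses(sym_output):
    """Return only the addresses that carry more than one symbol."""
    return {
        addr: names
        for addr, names in symbols_by_address(sym_output).items()
        if len(names) > 1
    }


# The two excerpts from the session above.
jbd_syms = """ffffffff8804b0b0 (b) __key.16794
ffffffff8804b0b0 (b) __key.16795"""

i2c_core_syms = """ffffffff8818b180 (b) __key.10825
ffffffff8818b180 (b) __key.10826"""

jbd_shared = shared_addresses(jbd_syms)
i2c_shared = shared_addresses(i2c_core_syms)
print(jbd_shared)
print(i2c_shared)
```

Both excerpts yield one shared address with two __key names each. Whether that has anything to do with the loop is not known from this output; it only rules out "two symbols at one address" as the sole trigger.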
Can you see if you can reproduce this, and hopefully fix it?
Thanks,
Dave