On Fri, Mar 22, 2013 at 09:51:39AM -0400, Dave Anderson wrote:
----- Original Message -----
> On Thu, Mar 21, 2013 at 03:02:54PM -0400, Dave Anderson wrote:
> > If for some reason you can't get them, I can make them available to
> > you.
> > And Lei Wen can also give you a sample dumpfile from his
> > environment.
>
> Got them from Luc.
>
> > > Are you able to access module symbols on ARM dump (the one that Luc provided)?
> > > Or is it failing completely?
> >
> > I *think* so...
> >
> > This module text disassembly looks right:
> >
> > crash> dis usbnet_suspend
> > 0xbf000ae8 <usbnet_suspend>: push {r3, r4, r5, lr}
> > 0xbf000aec <usbnet_suspend+4>: add r0, r0, #32
> > 0xbf000af0 <usbnet_suspend+8>: mov r5, r1
> > 0xbf000af4 <usbnet_suspend+12>: bl 0xc01b8264 <dev_get_drvdata>
> > 0xbf000af8 <usbnet_suspend+16>: ldrb r3, [r0, #36] ; 0x24
> > 0xbf000afc <usbnet_suspend+20>: mov r4, r0
> > 0xbf000b00 <usbnet_suspend+24>: add r2, r3, #1
> > 0xbf000b04 <usbnet_suspend+28>: cmp r3, #0
> > 0xbf000b08 <usbnet_suspend+32>: strb r2, [r0, #36] ; 0x24
> > 0xbf000b0c <usbnet_suspend+36>: bne 0xbf000bdc <usbnet_suspend+244>
> > 0xbf000b10 <usbnet_suspend+40>: mrs r3, CPSR
> > 0xbf000b14 <usbnet_suspend+44>: orr r3, r3, #128 ; 0x80
> > 0xbf000b18 <usbnet_suspend+48>: msr CPSR_c, r3
> > 0xbf000b1c <usbnet_suspend+52>: mov r0, #1
> > 0xbf000b20 <usbnet_suspend+56>: bl 0xc0015f40 <add_preempt_count>
> > 0xbf000b24 <usbnet_suspend+60>: ldr r3, [r4, #200] ; 0xc8
> > 0xbf000b28 <usbnet_suspend+64>: cmp r3, #0
> > 0xbf000b2c <usbnet_suspend+68>: beq 0xbf000b70 <usbnet_suspend+136>
> > 0xbf000b30 <usbnet_suspend+72>: tst r5, #1024 ; 0x400
> > 0xbf000b34 <usbnet_suspend+76>: beq 0xbf000b70 <usbnet_suspend+136>
> > 0xbf000b38 <usbnet_suspend+80>: mrs r3, CPSR
> > ...
> >
> > This (r) data looks OK:
> >
> > crash> p smsc95xx_netdev_ops
> > smsc95xx_netdev_ops = $8 = {
> > ndo_init = 0,
> > ndo_uninit = 0,
> > ndo_open = 0xbf000514 <usbnet_open>,
> > ndo_stop = 0xbf000bec <usbnet_stop>,
> > ndo_start_xmit = 0xbf001a60 <usbnet_start_xmit>,
> > ndo_select_queue = 0,
> > ndo_change_rx_flags = 0,
> > ndo_set_rx_mode = 0,
> > ndo_set_multicast_list = 0xbf008abc <smsc95xx_set_multicast>,
> > ndo_set_mac_address = 0xc025d854 <eth_mac_addr>,
> > ndo_validate_addr = 0xc025d6f8 <eth_validate_addr>,
> > ndo_do_ioctl = 0xbf00926c <smsc95xx_ioctl>,
> > ndo_set_config = 0,
> > ndo_change_mtu = 0xbf000de0 <usbnet_change_mtu>,
> > ndo_neigh_setup = 0,
> > ndo_tx_timeout = 0xbf000d4c <usbnet_tx_timeout>,
> > ndo_get_stats64 = 0,
> > ndo_get_stats = 0,
> > ndo_vlan_rx_add_vid = 0,
> > ndo_vlan_rx_kill_vid = 0,
> > ndo_set_vf_mac = 0,
> > ndo_set_vf_vlan = 0,
> > ndo_set_vf_tx_rate = 0,
> > ndo_get_vf_config = 0,
> > ndo_set_vf_port = 0,
> > ndo_get_vf_port = 0,
> > ndo_setup_tc = 0,
> > ndo_add_slave = 0,
> > ndo_del_slave = 0,
> > ndo_fix_features = 0,
> > crash>
>
> I'm able to see the same.
>
> Setting suitable debug level reveals:
>
> bf00f040 (bf00f000): scsi_wait_scan syms: 0 gplsyms: 0 ksyms: 1
> bf00a1f8 (bf008000): smsc95xx syms: 0 gplsyms: 0 ksyms: 60
> bf002a40 (bf000000): usbnet syms: 0 gplsyms: 24 ksyms: 65
>
> The ksyms come from KALLSYMS, which by default includes only text and
> inittext symbols. This explains why Lei is not able to see data and other
> non-text symbols when he runs 'sym -m <module>'.
>
> So I believe crash on ARM works as it should in this case.
I note that the symbols exported by ARM modules prior to mod -[sS]
contain a bunch of "$d" and "$a" symbols. The ARM arm_verify_symbol()
function rejects symbols of that type, but it is only called if the
"mod -[sS]" function is run.
In other words, this is the flow during session initialization:

  module_init()
    store_module_symbols_v2() -> symbols from KALLSYMS + in-kernel module struct

And if "mod -[sS]" is done, it goes like this:

  cmd_mod()
    do_module_cmd()
      load_module_symbols()
        store_load_module_symbols() -> symbols from module.ko file
          machdep->verify_symbol()

So the "$d" and "$a" are there from the initialization-time onward.
But since store_module_symbols_v2() has never called machdep->verify_symbol()
I'm a bit hesitant to make it do so for all architectures without knowing the
consequences. But it certainly seems legitimate in the
"machine_type("ARM")" case.

Indeed. However, I'm a bit concerned because there is this check:

	if (STREQ(name, "swapper_pg_dir"))
		machdep->flags |= KSYMS_START;

	if (!name || !strlen(name) || !(machdep->flags & KSYMS_START))
		return FALSE;

so if KSYMS_START is not yet set (is that possible?) we might reject a
valid symbol from a module.
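
To make the ordering concern concrete, here is a minimal stand-alone model of such a check. It is only a sketch, not crash's code: struct model_machdep, model_verify_symbol() and is_arm_mapping_symbol() are hypothetical names, the NULL test is moved ahead of the string compare, and the mapping-symbol test assumes the ARM ELF convention of "$a", "$d" and "$t", optionally followed by ".suffix".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define KSYMS_START 0x1UL	/* same value as in defs.h */

/* Stand-in for crash's machdep table; only the flags word is modelled. */
struct model_machdep {
	unsigned long flags;
};

/* True for ARM mapping symbols: "$a", "$d", "$t", optionally ".suffix". */
static bool is_arm_mapping_symbol(const char *name)
{
	assert(name != NULL);
	if (name[0] != '$')
		return false;
	if (name[1] != 'a' && name[1] != 'd' && name[1] != 't')
		return false;
	return name[2] == '\0' || name[2] == '.';
}

/* Hypothetical model of the gate plus the mapping-symbol filter. */
static bool model_verify_symbol(struct model_machdep *md, const char *name)
{
	if (name == NULL || name[0] == '\0')
		return false;
	if (strcmp(name, "swapper_pg_dir") == 0)
		md->flags |= KSYMS_START;
	if (!(md->flags & KSYMS_START))
		return false;	/* nothing passes before swapper_pg_dir is seen */
	return !is_arm_mapping_symbol(name);
}
```

In this model a module symbol such as usbnet_open is rejected if it is checked before swapper_pg_dir has been seen, and accepted afterwards, while "$d" and "$a" are rejected either way. Whether that ordering can actually occur in crash is exactly the open question above.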
> > But the user-space vtop is clearly wrong:
> >
> > crash> vm
> > PID: 1495 TASK: c1ef1380 CPU: 0 COMMAND: "bash"
> > MM PGD RSS TOTAL_VM
> > c30cd1e0 c1de4000 1484k 2940k
> > VMA START END FLAGS FILE
> > c1e9ae90 8000 c2000 8001875 /bin/bash
> > c1e9aee8 c9000 ce000 8101877 /bin/bash
> > c1e9af40 ce000 d3000 100077
> > c2fc27b0 1247000 1268000 100077
> > c2fc2650 4001c000 4001d000 100077
> > c1e9af98 40038000 40055000 8000875 /lib/ld-linux.so.3
> > c2fc20d0 4005c000 4005d000 8100875 /lib/ld-linux.so.3
> > c2fc2758 4005d000 4005e000 8100877 /lib/ld-linux.so.3
> > ...
> >
> >
> > crash> vtop 8000
> > VIRTUAL PHYSICAL
> > 8000 8000
> >
> > PAGE DIRECTORY: c1de4000
> > PGD: c1de4000 => 412
> > PMD: c1de4000 => 412
> > PAGE: 0 (1MB)
> >
> >
> > VMA START END FLAGS FILE
> > c1e9ae90 8000 c2000 8001875 /bin/bash
> >
> > crash> vtop 4005d000
> > VIRTUAL PHYSICAL
> > 4005d000 4005d000
> >
> > PAGE DIRECTORY: c1de4000
> > PGD: c1de5000 => 40000412
> > PMD: c1de5000 => 40000412
> > PAGE: 40000000 (1MB)
> >
> >
> > VMA START END FLAGS FILE
> > c2fc2758 4005d000 4005e000 8100877 /lib/ld-linux.so.3
>
> This is actually a known issue on ARM (just remembered that). When the crash
> happens, the kernel identity-maps the whole user address space of the running
> process. This has been fixed by upstream commit:
>
> commit 2c8951ab0c337cb198236df07ad55f9dd4892c26
> Author: Will Deacon <will.deacon(a)arm.com>
> Date: Wed Jun 8 15:53:34 2011 +0100
>
> ARM: idmap: use idmap_pgd when setting up mm for reboot
>
> For soft-rebooting a system, it is necessary to map the MMU-off code
> with an identity mapping so that execution can continue safely once the
> MMU has been switched off.
>
> Currently, switch_mm_for_reboot takes out a 1:1 mapping from 0x0 to
> TASK_SIZE during reboot in the hope that the reset code lives at a
> physical address corresponding to a userspace virtual address.
>
> This patch modifies the code so that we switch to the idmap_pgd tables,
> which contain a 1:1 mapping of the cpu_reset code. This has the
> advantage of only remapping the code that we need and also means we
> don't need to worry about allocating a pgd from an atomic context in the
> case that the physical address of the cpu_reset code aliases with the
> virtual space used by the kernel.
>
> It went in for 3.2 and Luc's kernel is v3.1.1 which explains this.
>
> If you select any other task, vtop should work fine. For example, the cron daemon:
>
> crash> vm
> PID: 316 TASK: c2a7c160 CPU: 0 COMMAND: "crond"
> MM PGD RSS TOTAL_VM
> c30cd060 c0a70000 836k 2916k
> VMA START END FLAGS FILE
> c1cdd860 8000 15000 8001875 /usr/sbin/crond
> c1cddcd8 1c000 1d000 8101875 /usr/sbin/crond
> c1d7d758 1d000 1e000 8101877 /usr/sbin/crond
> c1cddd88 1e000 9e000 100077
> c1d7d5a0 9a4000 9c5000 100077
> ...
>
> crash> vtop 8000
> VIRTUAL PHYSICAL
> 8000 c1030000
>
> PAGE DIRECTORY: c0a70000
> PGD: c0a70000 => c2b3d831
> PMD: c0a70000 => c2b3d831
> PTE: c2b3d020 => c103018f
>
> PAGE: c1030000
>
> PTE PHYSICAL FLAGS
> c103018f c1030000 (PRESENT|YOUNG|EXEC)
>
> VMA START END FLAGS FILE
> c1cdd860 8000 15000 8001875 /usr/sbin/crond
>
> PAGE PHYSICAL MAPPING INDEX CNT FLAGS
> c047d600 c1030000 c09b1590 0 2 228
>
OK good, that explains that...
Is it something that can be worked-around, or is the original pgd
lost forever? If it is not recoverable, then maybe the user-space
vtop should recognize that the bait-and-switch has occurred and fail?
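
For reference, the arithmetic behind the bogus translations above can be reproduced with a small sketch. It assumes the ARM short-descriptor first-level format (4-byte entries, one per 1MB of virtual space, low two bits 0b10 marking a section); the helper names are hypothetical and none of this is crash code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTION_SHIFT	20
#define SECTION_MASK	(~((UINT32_C(1) << SECTION_SHIFT) - 1))	/* 0xfff00000 */
#define L1_TYPE_MASK	UINT32_C(0x3)
#define L1_TYPE_SECTION	UINT32_C(0x2)

/* Address of the first-level entry covering vaddr. */
static uint32_t l1_entry_addr(uint32_t pgd_base, uint32_t vaddr)
{
	return pgd_base + (vaddr >> SECTION_SHIFT) * 4;
}

/* True if the first-level entry describes a 1MB section. */
static bool l1_is_section(uint32_t entry)
{
	return (entry & L1_TYPE_MASK) == L1_TYPE_SECTION;
}

/* Physical address for vaddr under a section entry. */
static uint32_t section_translate(uint32_t entry, uint32_t vaddr)
{
	return (entry & SECTION_MASK) | (vaddr & ~SECTION_MASK);
}

/* True if the entry is a section that maps vaddr onto itself. */
static bool section_is_identity(uint32_t entry, uint32_t vaddr)
{
	return l1_is_section(entry) && section_translate(entry, vaddr) == vaddr;
}
```

With the values from the bash task, l1_entry_addr(0xc1de4000, 0x4005d000) gives 0xc1de5000, and the entry 0x40000412 found there is a section that maps 0x4005d000 onto itself. The crond entry 0xc2b3d831 is not a section, which matches the second-level PTE lookup shown for that task.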

In this case the original PGD is lost forever. But we can certainly detect
that and bail out instead of confusing our users. Maybe something like the
patch below?

Note that I have not tested it on a 3.2+ dump (I have none), but it works on
the dumps I have.

Per, Jan, any comments on this?

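
The decision the patch makes can be modelled in isolation as follows. This is only a sketch: KVER() is a stand-in for crash's LINUX() macro whose encoding is assumed here, and detect_idmap_flag() and uvtop_allowed() are hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed version encoding; a stand-in for crash's LINUX(x,y,z). */
#define KVER(x, y, z) \
	(((unsigned)(x) << 16) + ((unsigned)(y) << 8) + (unsigned)(z))

#define IDMAP_PGD 0x8UL		/* same value as the defs.h hunk */

/* Set IDMAP_PGD for 3.2+ kernels or when the idmap_pgd symbol exists. */
static unsigned long detect_idmap_flag(unsigned kver, bool have_idmap_sym)
{
	return (kver >= KVER(3, 2, 0) || have_idmap_sym) ? IDMAP_PGD : 0;
}

/* User vtop is refused only for the panic task on pre-idmap_pgd kernels. */
static bool uvtop_allowed(unsigned long flags, unsigned long task,
			  unsigned long panic_task)
{
	return (flags & IDMAP_PGD) || task != panic_task;
}
```

For a v3.1.1 kernel without the idmap_pgd symbol the flag stays clear, so translations are refused for the panic task only and every other task is still allowed.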
diff --git a/arm.c b/arm.c
index a3a7c23..03f63e6 100644
--- a/arm.c
+++ b/arm.c
@@ -265,6 +265,10 @@ arm_init(int when)
STRUCT_EXISTS("pteval_t"))
machdep->flags |= PGTABLE_V2;
+ if (THIS_KERNEL_VERSION >= LINUX(3,2,0) ||
+ symbol_exists("idmap_pgd"))
+ machdep->flags |= IDMAP_PGD;
+
machdep->section_size_bits = _SECTION_SIZE_BITS;
machdep->max_physmem_bits = _MAX_PHYSMEM_BITS;
@@ -352,6 +356,8 @@ arm_dump_machdep_table(ulong arg)
fprintf(fp, "%sPHYS_BASE", others++ ? "|" : "");
if (machdep->flags & PGTABLE_V2)
fprintf(fp, "%sPGTABLE_V2", others++ ? "|" : "");
+ if (machdep->flags & IDMAP_PGD)
+ fprintf(fp, "%sIDMAP_PGD", others++ ? "|" : "");
fprintf(fp, ")\n");
fprintf(fp, " kvbase: %lx\n", machdep->kvbase);
@@ -1042,6 +1048,15 @@ arm_uvtop(struct task_context *tc, ulong uvaddr, physaddr_t *paddr, int verbose)
if (!tc)
error(FATAL, "current context invalid\n");
+ /*
+ * Before idmap_pgd was introduced with upstream commit 2c8951ab0c
+ * (ARM: idmap: use idmap_pgd when setting up mm for reboot), the
+ * panic task pgd was overwritten by soft reboot code, so we can't do
+ * any vtop translations.
+ */
+ if (!(machdep->flags & IDMAP_PGD) && tc->task == tt->panic_task)
+ error(FATAL, "panic task pgd is trashed by soft reboot code\n");
+
*paddr = 0;
if (is_kernel_thread(tc->task) && IS_KVADDR(uvaddr)) {
diff --git a/defs.h b/defs.h
index 1f693c3..8b8b9f3 100755
--- a/defs.h
+++ b/defs.h
@@ -4649,6 +4649,7 @@ struct arm_pt_regs {
#define KSYMS_START (0x1)
#define PHYS_BASE (0x2)
#define PGTABLE_V2 (0x4)
+#define IDMAP_PGD (0x8)
struct machine_specific {
ulong phys_base;