----- Original Message -----
> Hello,
>
> I just noticed that on ppc64le, "bt" sometimes cannot find the stack
> info of the current process. For example, I have a vmcore captured by
> kdump on a ppc64le system running kernel version 3.10. The
> vmcore was captured when the kernel oopsed. bt finds no stack info:
Hello Han,
I've never worked on the backtrace code for ppc64, as it was written
and is maintained by IBM. Judging from the debug messages, the starting
IP/SP hooks are not being found. The crash call sequence presumably
looks like this:
  cmd_bt
  back_trace
  get_kdump_regs
  get_netdump_regs
  get_netdump_regs_ppc64 (should set up bt->machdep to point to the NT_PRSTATUS note)
  ppc64_get_stack_frame
  ppc64_get_dumpfile_stack_frame
  ppc64_kdump_stack_frame (should get the IP/SP pair from the NT_PRSTATUS note contents)
  ppc64_back_trace_cmd
  ppc64_back_trace
ppc64_kdump_stack_frame() should pull the starting NIP/KSP values from the
pt_regs structure in the per-cpu NT_PRSTATUS note, but it appears that it does
not, leaving the registers at their initial value of 0.
This causes the failure later on: ppc64_back_trace_cmd() prints the
"=> PC: 0 () FP: 0" debug message shown below, and ppc64_back_trace() then
prints the "cannot find the stack info." debug message.
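To illustrate why zeroed registers end in that message, here is a minimal Python sketch. It is not taken from the crash source: in_stack() and can_start_unwind() are hypothetical names, the non-zero IP is a made-up placeholder, and the rule that an unwinder needs a non-zero IP plus an SP inside the task's stack is my assumption. The stackbase/stacktop values and the in-range address are copied from the debug output below.

```python
# Hypothetical illustration only; this is not code from the crash utility.
STACKBASE = 0xc00000005069c000  # "stackbase" from the bt debug output
STACKTOP = 0xc0000000506a0000   # "stacktop" from the bt debug output


def in_stack(sp, base=STACKBASE, top=STACKTOP):
    """Return True if sp lies inside the task's kernel stack."""
    return base <= sp < top


def can_start_unwind(ip, sp):
    """Assumed rule: a backtrace needs a non-zero IP and an SP inside the stack."""
    return ip != 0 and in_stack(sp)


# Registers left at zero, as in the "instptr: 0 / stkptr: 0" debug output.
print(can_start_unwind(0, 0))  # False

# Placeholder IP (made up) with an address from the stack listing.
print(can_start_unwind(0xc000000000001234, 0xc00000005069f6a0))  # True
```

With both registers at 0 the check fails before any frame is examined, which would match the symptom in the report.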
Without the dumpfile, I can't offer much else. Can you verify the crash utility
call sequence above and, if it is as I suspect, figure out why
ppc64_kdump_stack_frame() is failing, or what other path is being taken?
Actually, if this is a compressed kdump, ppc64_kdump_stack_frame() will not be
called, and the register access is done inside ppc64_get_dumpfile_stack_frame().
The ppc64_get_dumpfile_stack_frame() function first grabs the registers from the pt_regs
structure in the per-cpu NT_PRSTATUS note, but then also checks the hard and soft IRQ
stacks, and the hardware interrupt stack, for known instances of kernel dump functions,
which would override the pt_regs contents. If nothing is found on those stacks,
the registers from the NT_PRSTATUS note are used.
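The selection order just described can be sketched roughly as follows. This is a Python illustration, not the crash implementation: pick_start(), symbol_name() and the DUMP_FUNCS set are hypothetical, and the function names in DUMP_FUNCS are simply ones that appear in the stack listing below, not a confirmed list of what crash looks for.

```python
# Hypothetical sketch of the selection order; not code from the crash utility.
# DUMP_FUNCS is an assumed example set, using names seen in the stack listing.
DUMP_FUNCS = ("crash_kexec", "machine_kexec", "crash_save_cpu")


def symbol_name(entry):
    """Strip a '+offset' suffix, e.g. 'crash_kexec+312' -> 'crash_kexec'."""
    return entry.split("+", 1)[0]


def pick_start(note_regs, irq_hits):
    """Choose where a backtrace would start.

    note_regs: (ip, sp) taken from the NT_PRSTATUS note, or None if absent.
    irq_hits:  list of (stack_address, "symbol+offset") found on the IRQ stacks.
    """
    # A known dump function found on an IRQ stack overrides the note registers.
    for sp, entry in irq_hits:
        if symbol_name(entry) in DUMP_FUNCS:
            return {"source": "irq stack", "sp": sp, "symbol": entry}
    # Otherwise fall back to the registers saved in the NT_PRSTATUS note.
    if note_regs is not None:
        ip, sp = note_regs
        return {"source": "NT_PRSTATUS", "ip": ip, "sp": sp}
    # Nothing usable: the caller is left with zeroed registers.
    return None


print(pick_start((0x10, 0x20), []))
print(pick_start((0x10, 0x20), [(0x30, "crash_kexec+312")]))
print(pick_start(None, [(0x30, "sys_read+304")]))
```

The last case, where neither source yields a starting point, is the one that would leave IP and SP at 0.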
Dave
>
> crash 7.0.9-2.ael7b
> Copyright (C) 2002-2014 Red Hat, Inc.
> Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
> Copyright (C) 1999-2006 Hewlett-Packard Co
> Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
> Copyright (C) 2005, 2011 NEC Corporation
> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
> This program is free software, covered by the GNU General Public License,
> and you are welcome to change it and/or distribute copies of it under
> certain conditions. Enter "help copying" to see the conditions.
> This program has absolutely no warranty. Enter "help warranty" for
> details.
>
> GNU gdb (GDB) 7.6
> Copyright (C) 2013 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "powerpc64le-unknown-linux-gnu"...
>
> KERNEL: /usr/lib/debug/lib/modules/3.10.0-221.ael7b.ppc64le/vmlinux
> DUMPFILE: /var/crash/127.0.0.1-2015.01.15-22:19:14/vmcore [PARTIAL DUMP]
> CPUS: 16
> DATE: Thu Jan 15 21:18:16 2015
> UPTIME: 17:53:43
> LOAD AVERAGE: 213.58, 213.23, 212.70
> TASKS: 1383
> NODENAME: thymelp2.isst.aus.stglabs.ibm.com
> RELEASE: 3.10.0-221.ael7b.ppc64le
> VERSION: #1 SMP Wed Jan 7 09:27:09 EST 2015
> MACHINE: ppc64le (3425 Mhz)
> MEMORY: 15 GB
> PANIC: "Oops: Kernel access of bad area, sig: 11 [#1]" (check log for details)
> PID: 1970
> COMMAND: "cat"
> TASK: c0000003130874a0 [THREAD_INFO: c00000005069c000]
> CPU: 5
> STATE: TASK_RUNNING (PANIC)
>
> crash> set debug 99
> debug: 99
> crash> bt
> PID: 1970 TASK: c0000003130874a0 CPU: 5 COMMAND: "cat"
> GETBUF(16384 -> 0)
> <readmem: c00000005069c000, KVADDR, "stack contents", 16384, (ROE), 10a81570>
> <read_diskdump: addr: c00000005069c000 paddr: 5069c000 cnt: 16384>
> read_diskdump: paddr/pfn: 5069c000/5069 -> cache physical page: 50690000
> c00000005069c018: do_no_restart_syscall
> c00000005069e870: blk_throtl_bio+240
> c00000005069e990: clone_endio
> c00000005069ea00: generic_make_request_checks+836
> c00000005069eab8: hardware_interrupt_common+128
> c00000005069eac0: generic_make_request+36
> c00000005069eb10: mempool_alloc_slab+36
> c00000005069eb30: mempool_alloc+256
> c00000005069eb50: mempool_alloc_slab+36
> c00000005069ebc0: get_request+948
> c00000005069ec00: __split_and_process_bio+1408
> c00000005069ec20: autoremove_wake_function
> c00000005069ec80: find_busiest_group+544
> c00000005069edf0: load_balance+684
> c00000005069ee10: blk_throtl_bio+240
> c00000005069ee70: find_busiest_group+544
> c00000005069eee0: dequeue_task_fair+968
> c00000005069ef30: clone_endio
> c00000005069ef50: get_page_from_freelist+1436
> c00000005069f0a0: pSeries_cause_ipi_mux+112
> c00000005069f0c0: smp_send_reschedule+164
> c00000005069f0e0: default_wake_function+708
> c00000005069f160: __wake_up_locked+116
> c00000005069f1b0: ep_poll_callback+444
> c00000005069f250: run_posix_cpu_timers+104
> c00000005069f2c0: hvterm_raw_put_chars+64
> c00000005069f2e0: hvc_console_print+336
> c00000005069f3a8: initial_stab+2048
> c00000005069f3b0: crash_save_cpu+252
> c00000005069f488: cik_cp_resume+13476
> c00000005069f490: dev_get_drvdata
> c00000005069f580: default_machine_kexec+332
> c00000005069f610: pSeries_machine_kexec+60
> c00000005069f680: machine_kexec+56
> c00000005069f6a0: crash_kexec+312
> c00000005069f6f0: dev_attr_show+64
> c00000005069f748: cik_cp_resume+13476
> c00000005069f750: dev_get_drvdata
> c00000005069f7f0: radeon_hwmon_show_temp+72
> c00000005069f800: slb_miss_realmode+80
> c00000005069f808: dev_get_drvdata
> c00000005069f810: radeon_hwmon_show_temp+32
> c00000005069f890: die+840
> c00000005069f930: bad_page_fault+224
> c00000005069f948: radeon_hwmon_show_temp+72
> c00000005069f9a0: handle_page_fault+44
> c00000005069fa00: dev_attr_show+64
> c00000005069fa58: cik_cp_resume+13476
> c00000005069fa60: dev_get_drvdata
> c00000005069fb00: radeon_hwmon_show_temp+72
> c00000005069fb10: slb_miss_realmode+80
> c00000005069fb18: dev_get_drvdata
> c00000005069fb20: radeon_hwmon_show_temp+32
> c00000005069fb60: handle_mm_fault+1724
> c00000005069fb80: sysfs_open_file
> c00000005069fbd0: handle_page_fault+16
> c00000005069fc90: alloc_pages_current+416
> c00000005069fd00: dev_attr_show+64
> c00000005069fd30: sysfs_read_file+220
> c00000005069fde0: sys_read+304
> c00000005069fe40: syscall_exit
> [3fffd0d6fe88] back_trace:
> task: c0000003130874a0
> flags: 0
> instptr: 0
> stkptr: 0
> bptr: 0
> stackbase: c00000005069c000
> stacktop: c0000000506a0000
> tc: 1003c7b9fa8 (1970, c0000003130874a0)
> hp: 0
> ref: 0
> stackbuf: 10a81570
> textlist: 0
> frameptr: 0
> call_target: none
> eframe_ip: 0
> debug: 0
> radix: 0
> cpumask: 0
> => PC: 0 () FP: 0
> GETBUF(248 -> 1)
> GETBUF(1500 -> 2)
> cannot find the stack info.
> FREEBUF(2)
> FREEBUF(1)
> crash>
>
>
> Is this a problem?
>
> Thanks in advance!