----- Original Message -----
Hello,
I just noticed that on ppc64le, "bt" sometimes cannot find the stack
info of the current process. For example, I have a vmcore captured by
kdump on a ppc64le system running a 3.10 kernel. The
vmcore was captured when the kernel oopsed, but no stack info is found by
bt:
Hello Han,
I've never worked on the backtrace code for ppc64, as it was written
by (and maintained by) IBM. From the debug messages, what happened is
that the starting IP/SP hooks are not being found. The crash command
sequence presumably looks like this:
  cmd_bt
  back_trace
  get_kdump_regs
  get_netdump_regs
  get_netdump_regs_ppc64 (should set up bt->machdep to point to the NT_PRSTATUS note)
  ppc64_get_stack_frame
  ppc64_get_dumpfile_stack_frame
  ppc64_kdump_stack_frame (should get the IP/SP pair based upon the NT_PRSTATUS note contents)
  ppc64_back_trace_cmd
  ppc64_back_trace
ppc64_kdump_stack_frame() should pull the starting NIP/KSP values from the
pt_regs structure in the per-cpu NT_PRSTATUS note, but it appears that it does
not, leaving the registers at their initialized values of zero.
This causes the failure later on, when ppc64_back_trace_cmd() is called and
prints the "=> PC: 0 () FP: 0" debug message shown below, after which
ppc64_back_trace() prints the "cannot find the stack info." debug message.
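To make the expected behavior concrete, here is a minimal sketch in C. It is not the crash utility's code: struct sketch_pt_regs, sketch_start_regs() and sketch_in_stack() are hypothetical names used only for illustration. It assumes the ppc64 pt_regs layout begins with gpr[32] followed by nip and msr, and that the kernel stack pointer is gpr[1].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the leading fields of the ppc64 kernel's
 * struct pt_regs (assumption: gpr[32], then nip, then msr). */
struct sketch_pt_regs {
    uint64_t gpr[32];
    uint64_t nip;
    uint64_t msr;
};

/* Pull the starting NIP/KSP pair out of a pt_regs copy taken from a
 * per-cpu NT_PRSTATUS note. Returns 1 if a usable pair was found.
 * On failure both outputs stay 0, which is the state the
 * "=> PC: 0 () FP: 0" debug line reports. */
static int sketch_start_regs(const struct sketch_pt_regs *regs,
                             uint64_t *nip, uint64_t *ksp)
{
    assert(nip != NULL && ksp != NULL);

    *nip = 0;
    *ksp = 0;

    if (regs == NULL)           /* no note was attached for this cpu */
        return 0;

    *nip = regs->nip;           /* next instruction pointer */
    *ksp = regs->gpr[1];        /* r1 holds the stack pointer on ppc64 */

    return *nip != 0 && *ksp != 0;
}

/* A stack pointer is only usable if it lies inside the task's stack,
 * i.e. stackbase <= sp < stacktop. */
static int sketch_in_stack(uint64_t sp, uint64_t stackbase, uint64_t stacktop)
{
    return sp >= stackbase && sp < stacktop;
}
```

With the stackbase (c00000005069c000) and stacktop (c0000000506a0000) values from the back_trace debug dump, a stack pointer of 0 fails the range check in sketch_in_stack(), which is consistent with the unwinder giving up.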
Without the dumpfile, I can't offer much else. Can you verify the crash utility
call trail above and, if it is as I suspect, figure out why ppc64_kdump_stack_frame()
is failing? Or what other path it is taking?
Dave
crash 7.0.9-2.ael7b
Copyright (C) 2002-2014 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "powerpc64le-unknown-linux-gnu"...
KERNEL: /usr/lib/debug/lib/modules/3.10.0-221.ael7b.ppc64le/vmlinux
DUMPFILE: /var/crash/127.0.0.1-2015.01.15-22:19:14/vmcore [PARTIAL DUMP]
CPUS: 16
DATE: Thu Jan 15 21:18:16 2015
UPTIME: 17:53:43
LOAD AVERAGE: 213.58, 213.23, 212.70
TASKS: 1383
NODENAME: thymelp2.isst.aus.stglabs.ibm.com
RELEASE: 3.10.0-221.ael7b.ppc64le
VERSION: #1 SMP Wed Jan 7 09:27:09 EST 2015
MACHINE: ppc64le (3425 Mhz)
MEMORY: 15 GB
PANIC: "Oops: Kernel access of bad area, sig: 11 [#1]" (check log for details)
PID: 1970
COMMAND: "cat"
TASK: c0000003130874a0 [THREAD_INFO: c00000005069c000]
CPU: 5
STATE: TASK_RUNNING (PANIC)
crash> set debug 99
debug: 99
crash> bt
PID: 1970 TASK: c0000003130874a0 CPU: 5 COMMAND: "cat"
GETBUF(16384 -> 0)
<readmem: c00000005069c000, KVADDR, "stack contents", 16384, (ROE), 10a81570>
<read_diskdump: addr: c00000005069c000 paddr: 5069c000 cnt: 16384>
read_diskdump: paddr/pfn: 5069c000/5069 -> cache physical page: 50690000
c00000005069c018: do_no_restart_syscall
c00000005069e870: blk_throtl_bio+240
c00000005069e990: clone_endio
c00000005069ea00: generic_make_request_checks+836
c00000005069eab8: hardware_interrupt_common+128
c00000005069eac0: generic_make_request+36
c00000005069eb10: mempool_alloc_slab+36
c00000005069eb30: mempool_alloc+256
c00000005069eb50: mempool_alloc_slab+36
c00000005069ebc0: get_request+948
c00000005069ec00: __split_and_process_bio+1408
c00000005069ec20: autoremove_wake_function
c00000005069ec80: find_busiest_group+544
c00000005069edf0: load_balance+684
c00000005069ee10: blk_throtl_bio+240
c00000005069ee70: find_busiest_group+544
c00000005069eee0: dequeue_task_fair+968
c00000005069ef30: clone_endio
c00000005069ef50: get_page_from_freelist+1436
c00000005069f0a0: pSeries_cause_ipi_mux+112
c00000005069f0c0: smp_send_reschedule+164
c00000005069f0e0: default_wake_function+708
c00000005069f160: __wake_up_locked+116
c00000005069f1b0: ep_poll_callback+444
c00000005069f250: run_posix_cpu_timers+104
c00000005069f2c0: hvterm_raw_put_chars+64
c00000005069f2e0: hvc_console_print+336
c00000005069f3a8: initial_stab+2048
c00000005069f3b0: crash_save_cpu+252
c00000005069f488: cik_cp_resume+13476
c00000005069f490: dev_get_drvdata
c00000005069f580: default_machine_kexec+332
c00000005069f610: pSeries_machine_kexec+60
c00000005069f680: machine_kexec+56
c00000005069f6a0: crash_kexec+312
c00000005069f6f0: dev_attr_show+64
c00000005069f748: cik_cp_resume+13476
c00000005069f750: dev_get_drvdata
c00000005069f7f0: radeon_hwmon_show_temp+72
c00000005069f800: slb_miss_realmode+80
c00000005069f808: dev_get_drvdata
c00000005069f810: radeon_hwmon_show_temp+32
c00000005069f890: die+840
c00000005069f930: bad_page_fault+224
c00000005069f948: radeon_hwmon_show_temp+72
c00000005069f9a0: handle_page_fault+44
c00000005069fa00: dev_attr_show+64
c00000005069fa58: cik_cp_resume+13476
c00000005069fa60: dev_get_drvdata
c00000005069fb00: radeon_hwmon_show_temp+72
c00000005069fb10: slb_miss_realmode+80
c00000005069fb18: dev_get_drvdata
c00000005069fb20: radeon_hwmon_show_temp+32
c00000005069fb60: handle_mm_fault+1724
c00000005069fb80: sysfs_open_file
c00000005069fbd0: handle_page_fault+16
c00000005069fc90: alloc_pages_current+416
c00000005069fd00: dev_attr_show+64
c00000005069fd30: sysfs_read_file+220
c00000005069fde0: sys_read+304
c00000005069fe40: syscall_exit
[3fffd0d6fe88] back_trace:
task: c0000003130874a0
flags: 0
instptr: 0
stkptr: 0
bptr: 0
stackbase: c00000005069c000
stacktop: c0000000506a0000
tc: 1003c7b9fa8 (1970, c0000003130874a0)
hp: 0
ref: 0
stackbuf: 10a81570
textlist: 0
frameptr: 0
call_target: none
eframe_ip: 0
debug: 0
radix: 0
cpumask: 0
=> PC: 0 () FP: 0
GETBUF(248 -> 1)
GETBUF(1500 -> 2)
cannot find the stack info.
FREEBUF(2)
FREEBUF(1)
crash>
Is this a problem?
Thanks in advance!
--
Crash-utility mailing list
Crash-utility(a)redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility