September 2006 - Crash-utility - Crash Utility List Archives

finding information about threads

by Guy Streeter

How can I find out what other tasks are threads of (or with) a given task? --Guy

19 years, 5 months


dis -l problem

by ville.mattila＠stonesoft.com

Hello, When trying to get line numbers for disassembler output, I get "dis: line numbers are not available" and normal disassembled output. What am I missing? thanks, - Ville

19 years, 5 months


[PATCH] remove busy-wait loop in cmdline.c

by Jean-Marc Saffroy

Hello,

While using crash I found that it would suck all my CPU when external commands are invoked (e.g. bt piped to $PAGER). The patch below works for me.

Cheers,
--
saffroy(a)gmail.com

Index: crash-4.0-3.4/cmdline.c
===================================================================
--- crash-4.0-3.4.orig/cmdline.c 2006-09-25 00:28:11.000000000 +0200
+++ crash-4.0-3.4/cmdline.c 2006-09-25 00:30:30.000000000 +0200
@@ -846,10 +846,8 @@
 	if (pc->stdpipe) {
 		close(fileno(pc->stdpipe));
 		pc->stdpipe = NULL;
-		if (pc->stdpipe_pid && PID_ALIVE(pc->stdpipe_pid)) {
-			while (!waitpid(pc->stdpipe_pid, &waitstatus, WNOHANG))
-				;
-		}
+		if (pc->stdpipe_pid && PID_ALIVE(pc->stdpipe_pid))
+			waitpid(pc->stdpipe_pid, &waitstatus, 0);
 		pc->stdpipe_pid = 0;
 	}
 	if (pc->pipe) {
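The difference between the two waiting strategies in the patch can be shown in a small user-space sketch. This is not crash code: wait_for_child() is a hypothetical helper name, and the retry on EINTR is an addition that the patch itself does not make.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Block until the child `pid` terminates and return its exit status,
 * or -1 if it did not exit normally or waitpid() failed.
 * Hypothetical helper for illustration; not part of crash.
 */
int wait_for_child(pid_t pid)
{
	int status;

	assert(pid > 0);

	/*
	 * With options == 0 the caller sleeps in the kernel until the child
	 * changes state, so no CPU time is consumed while waiting.
	 * By contrast, "while (!waitpid(pid, &status, WNOHANG)) ;" gets 0
	 * back immediately for as long as the child is alive, and spins.
	 */
	while (waitpid(pid, &status, 0) < 0) {
		if (errno != EINTR)	/* retry only when interrupted by a signal */
			return -1;
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would fork() the external command, keep the returned pid, and call wait_for_child(pid) once the pipe to the command has been closed.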

19 years, 5 months


Re: [Kgdb-bugreport] compiling kernel with -O0 flag (For optimal debugging with kgdb and/or crash).

by Piet Delaney

On Thu, 2006-09-21 at 23:54 +0300, emin ak wrote:
> Dear All;
> Firstly thank you very much for your great effort for kgdb that makes
> kernel much understandable.
> I'am using kgdb to debug tcp-ip stack but I have experienced serious
> difficulties while debugging inline functions.

Hi Emin:

Yep, I edit "static inline" to "static_inline" and then define static_inline as static for KGDB kernels. In include/linux/compiler-gcc3.h and include/linux/compiler-gcc4.h I added:

------------------------------------------------------------------
#if defined(CONFIG_KGDB) || defined(CONFIG_KEXEC)
# define static_inline static __attribute__ ((__unused__))
# define static__inline__ static __attribute__ ((__unused__))
# define INLINE __attribute__ ((__unused__))
# define __INLINE__ __attribute__ ((__unused__))
#else
# define static_inline static inline
# define static__inline__ static __inline__
# define INLINE inline
# define __INLINE__ __inline__
#endif
------------------------------------------------------------------

I'm using it today to understand the device mapping and encryption code. It's great! Inlines make skipping over code with the gdb 'next' instruction impossible and you can't see the local variables. I like having a large stack; compiling -O0 and without inlines can increase the stack size. I think I noticed more stability by adding this to include/asm-i386/thread_info.h:

------------------------------------------------------------------
#if defined(CONFIG_DEBUG_PREEMPT_AUDIT) || defined(CONFIG_KGDB) || defined(CONFIG_KEXEC)
#define THREAD_SIZE (8192 * 2)
#else
#ifdef CONFIG_4KSTACKS
#define THREAD_SIZE (4096)
#else
#define THREAD_SIZE (8192)
#endif
#endif
------------------------------------------------------------------

Without OPTIMIZATION I found the MMU code needs a tweak in ../linux-4/mm/memory.c:

------------------------------------------------------------------
#if !defined(__PAGETABLE_PUD_FOLDED) || defined(CONFIG_KGDB) || defined(CONFIG_KEXEC)
/*
 * Allocate page upper directory.
 *
 * We've already handled the fast-path in-line, and we own the
 * page table lock.
 */
pud_t fastcall *__pud_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long address)
{
	.
	.
	.
}

#if !defined(__PAGETABLE_PMD_FOLDED) || defined(CONFIG_KGDB) || defined(CONFIG_KEXEC)
/*
 * Allocate page middle directory.
 *
 * We've already handled the fast-path in-line, and we own the
 * page table lock.
 */
pmd_t fastcall *__pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
{
	.
	.
}
------------------------------------------------------------------

Maybe I should have used #if !defined(__OPTIMIZE__) in ../linux-4/mm/memory.c.

Another change is I needed to define a few network byte swapping functions. I currently define them in ../linux-4/net/core/sock.c but I'm not resistant to putting it in a better place:

------------------------------------------------------------------
/*
 * If compiling -O0 we need to define
 * these functions somewhere.
 */
#if !defined(__OPTIMIZE__)

#define ___htonl(x) __cpu_to_be32(x)
#define ___htons(x) __cpu_to_be16(x)
#define ___ntohl(x) __be32_to_cpu(x)
#define ___ntohs(x) __be16_to_cpu(x)

__u32 htonl(__be32 x)
{
	return(___htonl(x));
}

__u32 ntohl(__be32 x)
{
	return(___ntohl(x));
}

__be16 htons(__u16 x)
{
	return(___htons(x));
}

__u16 ntohs(__be16 x)
{
	return(___ntohs(x));
}

EXPORT_SYMBOL(htonl);
EXPORT_SYMBOL(ntohl);
EXPORT_SYMBOL(htons);
EXPORT_SYMBOL(ntohs);
#endif
------------------------------------------------------------------

Sometimes I only want to compile the tcp/ip code -O0, so I modified the networking Makefiles and added:

------------------------------------------------------------------
ifdef CONFIG_KGDB
CFLAGS += -gdwarf-2 -O0
else
ifdef CONFIG_KEXEC
CFLAGS += -gdwarf-2 -O0
endif
endif
------------------------------------------------------------------

In the top level kernel Makefile I have:

------------------------------------------------------------------
ifdef CONFIG_FRAME_POINTER
CFLAGS += -fno-omit-frame-pointer
else
CFLAGS += -fomit-frame-pointer
endif

#
# Compiling the complete kernel without optimization (-O0) for enhanced debugging
# with kgdb/kdump requires ./mm/memory.c to have:
#
# if !defined(__PAGETABLE_PUD_FOLDED) || defined(CONFIG_KGDB) || defined(CONFIG_KEXEC)
# and
# if !defined(__PAGETABLE_PMD_FOLDED) || defined(CONFIG_KGDB) || defined(CONFIG_KEXEC)
#
# A less invasive procedure is to use -O1 and only use -O0 for networking code.
# The networking Makefiles have been set up to support this. So just change
# -O0 to -O1 below and back out the kgdb change in ./mm/memory.c for a
# less invasive change. Compiling -O0 also required increasing ROUNDUP_WAIT in
# linux/kernel/kgdb.c; the value in the 2.6.12 patch was way too low and the value
# in 2.6.16 is marginal and frequently causes the lead CPU to time out prematurely
# waiting for other CPUs to stop.
#
ifdef CONFIG_DEBUG_INFO
ifdef CONFIG_KGDB
CFLAGS += -gdwarf-2 -O0
else
ifdef CONFIG_KEXEC
CFLAGS += -gdwarf-2 -O1
else
CFLAGS += -g
endif
endif
endif
------------------------------------------------------------------

> I know this is not a
> bug but with -O optimizations and inlines on tcp-ip stack, program
> counter goes everywhere madly even with step or next command and this
> makes debugging incomprehensible.

Yep, I don't understand why everyone else doesn't. It's also like using debug printf's, I like being able to trace the code to get the big picture and then a -O0 to look at details with kgdb. Some believe doing this kind of stuff is blasphemy. The Bible says I should be killed for working on Sunday; I happen to disagree.

> At this point I have two questions:
> 1- Is there any way to compile kernel with -O0 flag and if it's
> possible, may it cause any problems?

I offered to post them to Amit back on Sept 06 (2:45 PM) but I don't think I ever heard back. I'd prefer to see the -O0 and KGDB_DEBUG code for tracing the kgdb stub assimilated. If they would be accepted I could make a patch to Tom's git repository...

> 2- Why does kernel fail while compiling with O0 flag and why does
> linux kernel depends on inline functions so much?

I think it's an obsession with performance. As long as I/we can map "static inline" to "static" it's not a big deal.

> Is there anyone
> whoever uses kgdb for debugging linux tcp-ip stack or any effort to
> compile kernel with no optimization?

I'm using it every day; works great. I also recommend my SOCK_DEBUG, SKB_DEBUG, and TCP_DEBUG macros to trace the TCP code. I also indent the trace to make it easy to read.

function1() {
    function2() {
        function3() {
            function4();
        }
    }
}

The brackets make it easy to see the scope of the trace with vi. I like tracing with 'C' syntax since it's what the reader is used to. If folks are interested I could also add that to the git diff, but I think that likely belongs elsewhere and isn't likely the current dogma. See snippet from attached network trace.

I gave a talk at a USENIX conference back in the 1980s recommending a common UNIX tracing paradigm and a few liked it. The director of Siemens, Struck Zimmerman, didn't; you can't please everyone, so I just do what I think is best and live with the world not being as I'd expect it to be.

For TCP I'm using the attached sock.h fragment which has a backward compatible SOCK_DEBUG() macro. I used the same paradigm in skbuff.h; see attachment. Likewise I'm doing the same in kgdb.h; also attached.

In printk I added:

------------------------------------------------------------------
	for (tp = tbuf; tp < tbuf + tlen; tp++)
		emit_log_char(*tp);
	printed_len += tlen - 3;

#ifdef CONFIG_PRINTK_INDENT
	if (!in_interrupt()) {
		int depth = stack_depth();
		int i;

		if ((depth > 0) && (depth < 120)) {
			for(i = 0; i < depth; i++) {
				emit_log_char(' ');
				printed_len++;
			}
		}
	}
#endif
------------------------------------------------------------------

and I added a stack_depth() function to ../linux-4/arch/i386/kernel/traps.c:

------------------------------------------------------------------
int stack_depth(void)
{
	struct thread_info *tinfo;
	unsigned long ebp;
	int depth = 0;

#ifdef CONFIG_FRAME_POINTER
	asm("andl %%esp,%0; ":"=r" (tinfo) : "0" (~(THREAD_SIZE - 1)));
	asm ("movl %%ebp, %0" : "=r" (ebp) : );

	while (valid_stack_ptr(tinfo, (void *)ebp)) {
		ebp = *(unsigned long *)ebp;
		if (depth++ > 100) {
			break;
		}
	}
#endif
	return(depth);
}
------------------------------------------------------------------

Let me know if you have any questions. Sounds like you're on the right track; IMHO.

-piet

>
> Thanks alot.
> Emin
>
> -------------------------------------------------------------------------
> Take Surveys. Earn Cash. Influence the Future of IT
> Join SourceForge.net's Techsay panel and you'll get the chance to share your
> opinions on IT & business topics through brief surveys -- and earn cash
> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
> _______________________________________________
> Kgdb-bugreport mailing list
> Kgdb-bugreport(a)lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/kgdb-bugreport

--
Piet Delaney                Phone: (408) 200-5256
Blue Lane Technologies      Fax:   (408) 200-5299
10450 Bubb Rd.
Cupertino, Ca. 95014        Email: piet(a)bluelane.com
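The indented, brace-delimited trace described in this message can be tried in user space with a short sketch. It makes a deliberate simplification: instead of deriving the depth from the frame-pointer chain as stack_depth() does, it keeps an explicit nesting counter. The names trace_enter(), trace_leave() and trace_text() are hypothetical and do not come from the attached sock.h, skbuff.h or kgdb.h fragments.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static int trace_depth;		/* current nesting level */
static char trace_buf[1024];	/* collected trace text (truncates when full) */

/* Append one line to the trace, indented by four spaces per level. */
static void trace_line(const char *text)
{
	size_t used = strlen(trace_buf);

	snprintf(trace_buf + used, sizeof(trace_buf) - used,
		 "%*s%s\n", trace_depth * 4, "", text);
}

/* Record entry into `func` as "func() {" and indent what follows. */
void trace_enter(const char *func)
{
	char line[128];

	assert(func != NULL);
	snprintf(line, sizeof(line), "%s() {", func);
	trace_line(line);
	trace_depth++;
}

/* Record the matching closing brace at the caller's indentation. */
void trace_leave(void)
{
	if (trace_depth > 0)
		trace_depth--;
	trace_line("}");
}

/* Return everything traced so far. */
const char *trace_text(void)
{
	return trace_buf;
}
```

Because every entry opens a brace and every exit closes one, an editor's brace matching can jump across a whole call subtree, which is the property the message attributes to the 'C'-syntax trace.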

19 years, 5 months


kexec/kdump and crash in October's "Linux Magazine"

by Dave Anderson

For what it's worth, the crash utility got a little exposure in the October issue of Linux Magazine. In its "Gearheads" section, Sreekrishnan Venkateswaran of IBM India wrote a 5-page article titled "Using kexec and kdump". The first 2 pages were specific to the use of kexec/kdump, but the last 3 pages were dedicated to crash utility debugging sessions on two sample kdump vmcores. Of course, given the magazine's two-column-per-page output, crash's 80-column output gets mercilessly wrapped, and looks like hell... But hey, any publicity is good I guess... Dave

19 years, 5 months


crash version 4.0-3.4 is available

by Dave Anderson

- Implemented support for x86_64 and ia64 compressed kdump dumpfiles created by the makedumpfile command, which need to pass their respective physical address load locations in a kdump-specific dumpfile sub-header. (oomichi(a)mxs.nes.nec.co.jp)
- Fix for the "timer" command on 2.6.17 and later kernels. Without this patch, the command would spew out error messages of the sort:
    timer: invalid list entry: 0
    timer: ignoring faulty timer list at index 0 of timer array
  This was due to the kernel's tvec_bases data structures being moved out of the per-cpu memory regions, and replaced with just per-cpu pointers to the data. (anderson(a)redhat.com)
- Fix for ia64 machines whose kernel's text and static data region 5 segment is not loaded at physical address 64MB; live systems get the physical load address from /proc/iomem, while kdump dumpfiles contain the load address in the ELF header. Without this patch, the crash session would fail during initialization with a "crash: invalid kernel virtual address: [address] type: xtime" error message. The physical address may still be forcibly set using the command line option "--machdep phys_start=[address]". (anderson(a)redhat.com)
- When using the "--machdep phys_start=[address]" option on an ia64 machine, an irrelevant error message indicating "WARNING: invalid vm= option" would be displayed. (anderson(a)redhat.com)
- Updated the ppc64 page size determination from always using getpagesize() on the host machine to symbolically determining whether 64k page sizes are in use. (sachinp(a)in.ibm.com)
- Enhancement of the "sig" command to display the lists of both private and/or shared queued signals, if any. (olivier.daudel(a)u-paris10.fr)
- Adapted "mount [-n pid|task]" patch, which displays the mounted filesystems with respect to the namespace of a given pid or task. (olivier.daudel(a)u-paris10.fr)
- Fix for running crash without parameters on a live system that does not have a "/usr/src" directory, which would result in a segmentation violation. (holzheu(a)de.ibm.com)
- The /proc/version check against vmlinux "strings" output needed to be made aware that some other character may be adjacent to the "L" in the "Linux version..." string. This would lead to erroneous "vmlinux and /proc/version do not match!" errors during initialization. (holzheu(a)de.ibm.com)
- gdb-6.1.patch update for gdb-6.1/sim/ppc/debug.c to compile in SUSE build environment. (olh(a)suse.de)

(9/19/06)

Download from: http://people.redhat.com/anderson

19 years, 5 months


[PATCH] support vmcores with 64k page size on 4k page size kernels

by Sachin P. Sant

Hi Dave

Recently there were changes made to kexec tools to support 64K page size. With those changes, a vmcore file obtained from a kernel which supports 64K page size cannot be analyzed using crash on a machine running a kernel supporting 4K page size.

The following changes in the crash tool resolve the problem. Look if the symbol __hash_page_64K exists. This symbol is defined only for kernels with 64K PAGE SIZE support. If yes then the dump was taken with a kernel supporting 64k page size. So use the vmcore page size [64K] instead of getpagesize().

Thanks
-Sachin

* Recently there were changes made to kexec tools to support 64K page size. With those changes, a vmcore file obtained from a kernel which supports 64K page size cannot be analyzed using crash on a machine running a kernel supporting 4K page size. The following changes were made in the crash tool to solve the problem. Look if the symbol __hash_page_64K exists. If yes then the dump was taken with a kernel supporting 64k page size. So change the page size accordingly.

Signed-Off-By: Sachin Sant <sachinp(a)in.ibm.com>
--
diff -Naurp crash-4.0/defs.h crash-4.0-new/defs.h
--- crash-4.0/defs.h 2006-09-14 21:41:50.000000000 -0400
+++ crash-4.0-new/defs.h 2006-09-14 21:48:37.000000000 -0400
@@ -2297,6 +2297,7 @@ struct efi_memory_desc_t {
 #define _64BIT_
 #define MACHINE_TYPE "PPC64"
+#define PPC64_64K_PAGE_SIZE 65536
 #define PAGEBASE(X) (((ulong)(X)) & (ulong)machdep->pagemask)
 #define PTOV(X) ((unsigned long)(X)+(machdep->kvbase))
diff -Naurp crash-4.0/ppc64.c crash-4.0-new/ppc64.c
--- crash-4.0/ppc64.c 2006-09-14 21:41:50.000000000 -0400
+++ crash-4.0-new/ppc64.c 2006-09-14 21:50:26.000000000 -0400
@@ -67,19 +67,6 @@ ppc64_init(int when)
 		machdep->verify_symbol = ppc64_verify_symbol;
 		if (pc->flags & KERNEL_DEBUG_QUERY)
 			return;
-		machdep->pagesize = memory_page_size();
-		machdep->pageshift = ffs(machdep->pagesize) - 1;
-		machdep->pageoffset = machdep->pagesize - 1;
-		machdep->pagemask = ~((ulonglong)machdep->pageoffset);
-		machdep->stacksize = 4 * machdep->pagesize;
-		if ((machdep->pgd = (char *)malloc(PAGESIZE())) == NULL)
-			error(FATAL, "cannot malloc pgd space.");
-		if ((machdep->pmd = (char *)malloc(PAGESIZE())) == NULL)
-			error(FATAL, "cannot malloc pmd space.");
-		if ((machdep->ptbl = (char *)malloc(PAGESIZE())) == NULL)
-			error(FATAL, "cannot malloc ptbl space.");
-		if ((machdep->machspec->level4 = (char *)malloc(PAGESIZE())) == NULL)
-			error(FATAL, "cannot malloc level4 space.");
 		machdep->last_pgd_read = 0;
 		machdep->last_pmd_read = 0;
 		machdep->last_ptbl_read = 0;
@@ -93,6 +80,40 @@ ppc64_init(int when)
 		break;
 	case PRE_GDB:
+		/*
+		 * Recently there were changes made to kexec tools
+		 * to support 64K page size. With those changes
+		 * vmcore file obtained from a kernel which supports
+		 * 64K page size cannot be analyzed using crash on a
+		 * machine running with kernel supporting 4K page size
+		 *
+		 * The following modifications are required in crash
+		 * tool to be in sync with kexec tools.
+		 *
+		 * Look if the following symbol exists. If yes then
+		 * the dump was taken with a kernel supporting 64k
+		 * page size. So change the page size accordingly.
+		 *
+		 * Also moved the following code block from
+		 * PRE_SYMTAB case here.
+		 */
+		if (symbol_exists("__hash_page_64K"))
+			machdep->pagesize = PPC64_64K_PAGE_SIZE;
+		else
+			machdep->pagesize = memory_page_size();
+		machdep->pageshift = ffs(machdep->pagesize) - 1;
+		machdep->pageoffset = machdep->pagesize - 1;
+		machdep->pagemask = ~((ulonglong)machdep->pageoffset);
+		machdep->stacksize = 4 * machdep->pagesize;
+		if ((machdep->pgd = (char *)malloc(PAGESIZE())) == NULL)
+			error(FATAL, "cannot malloc pgd space.");
+		if ((machdep->pmd = (char *)malloc(PAGESIZE())) == NULL)
+			error(FATAL, "cannot malloc pmd space.");
+		if ((machdep->ptbl = (char *)malloc(PAGESIZE())) == NULL)
+			error(FATAL, "cannot malloc ptbl space.");
+		if ((machdep->machspec->level4 = (char *)malloc(PAGESIZE())) == NULL)
+			error(FATAL, "cannot malloc level4 space.");
+
 		machdep->kvbase = symbol_value("_stext");
 		machdep->identity_map_base = machdep->kvbase;
 		machdep->is_kvaddr = generic_is_kvaddr;
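The patch derives the shift, offset and mask values from the chosen page size. The standalone sketch below repeats that arithmetic outside of crash so the effect of switching from 4K to 64K can be checked in isolation. The struct and function names are hypothetical; only the three formulas are taken from the patch.

```c
#include <assert.h>
#include <strings.h>	/* ffs() */

/* Hypothetical container for the values crash keeps in machdep. */
struct page_geometry {
	unsigned int pagesize;
	unsigned int pageshift;
	unsigned long long pageoffset;
	unsigned long long pagemask;
};

/* Derive shift, offset and mask from a power-of-two page size. */
struct page_geometry page_geometry_for(unsigned int pagesize)
{
	struct page_geometry g;

	/* The formulas below are only meaningful for a power of two. */
	assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);

	g.pagesize = pagesize;
	g.pageshift = (unsigned int)ffs((int)pagesize) - 1;	/* log2(pagesize) */
	g.pageoffset = pagesize - 1;				/* low bits within a page */
	g.pagemask = ~g.pageoffset;				/* clears the in-page offset */
	return g;
}
```

With a 64K page size the shift becomes 16 instead of 12, so an address such as 0x12345678 is split at a different bit position. Using the host's getpagesize() on a 4K machine would therefore mis-translate addresses from a 64K vmcore.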

19 years, 5 months


Re: [Crash-utility] shared pending signal queue added to sig command

by Olivier Daudel

----- Original Message -----
From: Dave Anderson
To: Discussion list for crash utility usage, maintenance and development
Sent: Wednesday, September 13, 2006 10:57 PM
Subject: Re: [Crash-utility] shared pending signal queue added to sig command

> Sorry -- I can't use it as is because it doesn't work with 2.4 kernels,
> or at least with RHEL3 kernels. Here's a RHEL3 (2.4.21-37.ELsmp)

I don't actually have this Linux version, but looking at the task.c code, I suppose OFFSET_OPTION(sigpending_head, sigpending_list)) should do the job?

> Three other things:
>
> (1) Shouldn't your "SHARED_PENDING" line have a following "yes" or "no"
> indication, as is the case with the "SIGPENDING:" output?

For me the test on TIF_SIGPENDING is where the decision is made.

> (2) Thanks for remembering to add "signal_struct_shared_pending" to
> dump_offset_table() -- but also don't forget to add the new
> "sigpending_signal" addition you made to the size_table structure,
> which is also dumped inside the dump_offset_table() function.

OK.

> (3) The SIGQUEUE output is now associated with either SIGPENDING or
> SHARED_PENDING, so the two SIGQUEUE outputs should be indented so
> it's more obvious.

OK for that. Also, if we have both queues, I use PRIVATE_PENDING and SHARED_PENDING.

SIGPENDING: no
PRIVATE_PENDING
    SIGNAL: 0000000200000800
   BLOCKED: 8000000200000800
  SIGQUEUE:  SIG  SIGINFO
              12  f4561a34
              34  f45619a0
SHARED_PENDING
    SIGNAL: 8000000000000800
  SIGQUEUE:  SIG  SIGINFO
              12  f456190c
              64  f4561f68
              64  f4561ed4
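The SIGNAL masks in the sample output can be cross-checked against the SIGQUEUE entries with a small decoder. This is a sketch rather than crash code: signals_in_mask() is a hypothetical name, and it assumes the usual Linux convention that bit n-1 of the mask stands for signal number n.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Write the signal numbers set in `mask` to `out` (at most `max` entries)
 * in ascending order and return how many were written.
 * Assumes bit 0 of the mask corresponds to signal 1.
 */
size_t signals_in_mask(uint64_t mask, int *out, size_t max)
{
	size_t n = 0;
	int bit;

	assert(out != NULL);

	for (bit = 0; bit < 64 && n < max; bit++) {
		if (mask & ((uint64_t)1 << bit))
			out[n++] = bit + 1;
	}
	return n;
}
```

Under that convention the private mask 0000000200000800 decodes to signals 12 and 34, and the shared mask 8000000000000800 to signals 12 and 64, which are the signal numbers listed in the two SIGQUEUE sections.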

19 years, 5 months


[PATCH] SIGSEGV in build_searchdirs()

by Michael Holzheu

Hi Dave!

This patch fixes the following (minor) problem: If the directory "/usr/src" does not exist and crash is called without parameters, it dies with SIGSEGV. The reason is that the searchdirs buffer is not allocated if "/usr/src" is not present. This fix allocates the buffer in any case.

---
 filesys.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff -Naur crash-4.0-3.3/filesys.c crash-4.0-3.3-searchdirs-fix/filesys.c
--- crash-4.0-3.3/filesys.c 2006-09-07 21:00:08.000000000 +0200
+++ crash-4.0-3.3-searchdirs-fix/filesys.c 2006-09-14 14:02:35.000000000 +0200
@@ -315,14 +315,12 @@
 		for (dp = readdir(dirp); dp != NULL; dp = readdir(dirp))
 			cnt++;
-		if ((searchdirs = (char **)malloc(cnt * sizeof(char *)))
-		    == NULL) {
+		if ((searchdirs = calloc(cnt, sizeof(char *))) == NULL) {
 			error(INFO, "/usr/src/ directory list malloc: %s\n",
 				strerror(errno));
 			closedir(dirp);
 			return default_searchdirs;
 		}
-		BZERO(searchdirs, cnt * sizeof(char *));
 		for (i = 0; i < DEFAULT_SEARCHDIRS; i++)
 			searchdirs[i] = default_searchdirs[i];
@@ -357,6 +355,16 @@
 		closedir(dirp);
 		searchdirs[cnt] = NULL;
+	} else {
+		if ((searchdirs = calloc(cnt, sizeof(char *))) == NULL) {
+			error(INFO, "search directory list malloc: %s\n",
+				strerror(errno));
+			closedir(dirp);
+			return default_searchdirs;
+		}
+		for (i = 0; i < DEFAULT_SEARCHDIRS; i++)
+			searchdirs[i] = default_searchdirs[i];
+		cnt = DEFAULT_SEARCHDIRS;
 	}
 	if (redhat_kernel_directory_v1(dirbuf)) {
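The idea behind the fix, allocating the list whether or not any extra directories were found, can be sketched as a single helper that has no "directory missing" branch at all. build_search_list() and its parameters are hypothetical names chosen for this illustration; crash's build_searchdirs() is structured differently.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Return a calloc()ed, NULL-terminated array holding the default entries
 * followed by any extra entries, or NULL if allocation fails.
 * The array is allocated even when nextras is 0, so callers never see an
 * unallocated list.  The caller frees the array (not the strings).
 */
char **build_search_list(char *const *defaults, size_t ndefaults,
			 char *const *extras, size_t nextras)
{
	char **list;
	size_t i;

	assert(defaults != NULL || ndefaults == 0);
	assert(extras != NULL || nextras == 0);

	/* One extra slot for the terminator; calloc() zeroes it for us. */
	list = calloc(ndefaults + nextras + 1, sizeof(char *));
	if (list == NULL)
		return NULL;

	for (i = 0; i < ndefaults; i++)
		list[i] = defaults[i];
	for (i = 0; i < nextras; i++)
		list[ndefaults + i] = extras[i];

	return list;
}
```

Using calloc() also covers what the removed BZERO() call did in the original code, since the memory comes back already zeroed.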

19 years, 5 months


[PATCH] fix for '/proc/version' check

by Michael Holzheu

Hi Dave!

It can happen that crash does not extract the correct Linux version string from vmlinux. Therefore the comparison with /proc/version fails at startup:

WARNING: /usr/lib/debug/lib/modules/2.6.17-1.2519.4.21.el5/vmlinux and /proc/version do not match!

Crash gets the release string by searching strings in vmlinux. Something like:

>> strings vmlinux | grep "Linux version"
>`0Linux version 2.6.17-1.2519.4.21.el5 ...

On our system the string does not start with "Linux...", but with "0Linux...". Therefore crash assumes a wrong version string. To fix this, we should skip the leading "0":

---
diff -Naur crash-4.0-3.3/filesys.c crash-4.0-3.3-proc-version-check-fix/filesys.c
--- crash-4.0-3.3/filesys.c 2006-09-07 21:00:08.000000000 +0200
+++ crash-4.0-3.3-proc-version-check-fix/filesys.c 2006-09-13 15:36:54.000000000 +0200
@@ -244,10 +244,12 @@
 	found = FALSE;
 	while (fgets(buffer, BUFSIZE-1, pipe)) {
-		if (!strstr(buffer, "Linux version 2."))
+		char* ptr;
+		ptr = strstr(buffer, "Linux version 2.");
+		if (!ptr)
 			continue;
-		if (STREQ(buffer, kt->proc_version))
+		if (STREQ(ptr, kt->proc_version))
 			found = TRUE;
 		break;
 	}
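The core of the fix is to compare from the position where strstr() found the marker, not from the start of the line. A minimal standalone version of that comparison is sketched below; line_matches_version() is a hypothetical name, and strcmp() stands in for crash's STREQ() macro.

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if `line` contains the kernel banner and, starting at the
 * banner, is identical to `proc_version`; return 0 otherwise.
 * Any characters in front of "Linux version 2." are skipped.
 */
int line_matches_version(const char *line, const char *proc_version)
{
	const char *ptr;

	assert(line != NULL && proc_version != NULL);

	ptr = strstr(line, "Linux version 2.");
	if (ptr == NULL)
		return 0;

	/* Compare from the banner onwards, not from the start of the line. */
	return strcmp(ptr, proc_version) == 0;
}
```

A line such as "`0Linux version 2.6.17-1 ..." now matches a /proc/version string that begins with "Linux version 2.6.17-1 ...", while lines without the banner are still rejected.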

19 years, 5 months

