April 2010 - Crash-utility - Crash Utility List Archives

Re: [Crash-utility] dev -p command fails post linux 2.6.25

by Dave Anderson

----- "Sharyathi Nagesh" <sharyath(a)in.ibm.com> wrote:

> Hi Dave
> As you may be aware dev -p command post linux kernel 2.6.25 fails with
> "no PCI devices found on this system.". When I went through the kernel
> a specific commit has removed pci_devices variable from the kernel code
>
> ==================================================================
> git commit: 5ff580c10ec06fd296bd23d4570c1a95194094a0
> by Greg Kroah-Hartman <gregkh(a)suse.de>
>
>
> This is what he says in the commit
> ------------------------------------------------------------------
> This patch finally removes the global list of PCI devices. We are
> relying entirely on the list held in the driver core now, and do not
> need a separate "shadow" list as no one uses it.
> ------------------------------------------------------------------
>
> ==================================================================
>
> I saw some of your earlier postings where you have specifically
> mentioned about this problem:
> http://www.mail-archive.com/crash-utility@redhat.com/msg00346.html
>
>
> With this I wanted to know, if you intend to keep dev -p behavior as
> it is now or there is any plan to change it to display actual values?
>
> Thank You
> Sharyathi N

I (personally) have no plans to change it. If I remember correctly,
Bud Brown came up with an alternate scheme, but the imported data
from the kernel proper required to accomplish it was enormous
(bordering on absurd), so I suggested that it would be more
appropriate as an extension module.

Bud -- feel free to chime in here... ;-)

For that matter, even the "old" way required the import of ~1000
lines of kernel #define's -- which always bugged me -- and was
pretty much the only crash command that had to do such a thing.

Dave

15 years, 1 month


Re: [Crash-utility] dev -p command fails post linux 2.6.25

by bud.brown＠redhat.com

I've a crash extension called 'pci'; it's a direct replacement for the
'dev' command but contains its own (huge) internal database of pci
information.

I thought I'd finished integrating oui.txt from ieee and pci.ids from
sourceforge, but looking at the code today, I have *not* done that
yet.... hmmm, thought I'd finished that stuff. But it's on the todo
list. I wrote it about a year ago and haven't really done anything with
the sources since then. It only uses the pci information from
www.pcidatabase.com at present.

I'll see if I can get this up on the net for use if there is interest.
There are two parts: you download the source pci id information from
the web and run a pre-processor on it, and then compile the extension
with the header file you've created.

The ext-pci.so is currently ~1.1MB in size. Another todo list item was
to shrink the size of the database via restructuring it, but I don't
expect more than a 25%-33% reduction in size at most.

Bud Brown
Red Hat, Inc
Westford, MA
SEG/Storage Team

------------------------------

Message: 4
Date: Tue, 27 Apr 2010 16:36:31 +0530
From: Sharyathi Nagesh <sharyath(a)in.ibm.com>
To: Dave Anderson <anderson(a)redhat.com>
Cc: Crash-utility(a)redhat.com, huachenl(a)cn.ibm.com
Subject:
Message-ID: <4BD6C537.9030102(a)in.ibm.com>
Content-Type: text/plain; charset=UTF-8

Thanks for the reply Dave.
Even I feel it is better to change the error message to what you have
suggested. That is simplest way to address the issue as well. It gives
a clearer message to the user.
I am curious to hear from Bud on the alternate scheme and progress as
well. Bud your views ?

Thanks
Sharyathi

On 04/23/2010 08:01 PM, Dave Anderson wrote:
>
> ----- "Dave Anderson" <anderson(a)redhat.com> wrote:
>
>>> With this I wanted to know, if you intend to keep dev -p behavior
>>> as it is now or there is any plan to change it to display actual
>>> values?
>>>
>>> Thank You Sharyathi N
>>
>> I (personally) have no plans to change it. If I remember
>> correctly, Bud Brown came up with an alternate scheme, but the
>> imported data from the kernel proper required to accomplish it was
>> enormous (bordering on absurd), so I suggested that it would be
>> more appropriate as an extension module.
>>
>> Bud -- feel free to chime in here... ;-)
>>
>> For that matter, even the "old" way required the import of ~1000
>> lines of kernel #define's -- which always bugged me -- and was
>> pretty much the only crash command that had to do such a thing.
>>
>> Dave
>
> Actually, at a minimum, I should change this:
>
>         if (!symbol_exists("pci_devices"))
>                 error(FATAL, "no PCI devices found on this system.\n");
>
> to the generic option_not_supported() message.
>
> Dave

------------------------------

--
Crash-utility mailing list
Crash-utility(a)redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility

End of Crash-utility Digest, Vol 55, Issue 11
*********************************************
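For readers following the suggestion quoted above, here is a minimal sketch of the idea: when the "pci_devices" symbol is absent, report that the option is unsupported on this kernel rather than claiming that no PCI devices exist. Everything here is an assumption made for illustration. `sketch_symbol_exists()` is a hypothetical stand-in for crash's symbol lookup, the status enum is invented, and the message wording is not crash's actual option_not_supported() text.

```c
#include <stdio.h>
#include <string.h>

/* Outcome of the pre-check a "dev -p"-style command would run first. */
enum sketch_status { SKETCH_SUPPORTED, SKETCH_NOT_SUPPORTED };

/* Hypothetical stand-in for a symbol lookup: linear search of a name table. */
static int sketch_symbol_exists(const char *const *symbols, size_t count,
                                const char *name)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(symbols[i], name) == 0)
            return 1;
    }
    return 0;
}

/*
 * A missing "pci_devices" symbol means the kernel no longer exports the
 * list, which is different from a system that has no PCI devices.
 * The message text is illustrative only. Assumes msglen > 0.
 */
static enum sketch_status sketch_check_dev_p(const char *const *symbols,
                                             size_t count,
                                             char *msg, size_t msglen)
{
    if (!sketch_symbol_exists(symbols, count, "pci_devices")) {
        snprintf(msg, msglen, "dev: -p option not supported on this kernel");
        return SKETCH_NOT_SUPPORTED;
    }
    msg[0] = '\0';
    return SKETCH_SUPPORTED;
}
```

Called with a symbol table that lacks "pci_devices", `sketch_check_dev_p()` returns `SKETCH_NOT_SUPPORTED` and fills `msg`; with the symbol present it returns `SKETCH_SUPPORTED` and leaves `msg` empty.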

15 years, 2 months


Re: [Crash-utility] [PATCH] Use only tasks on online CPUs for bt -a

by Dave Anderson

----- "Michael Holzheu" <holzheu(a)linux.vnet.ibm.com> wrote:

> Hi Dave,
>
> On Mon, 2010-04-26 at 11:56 -0400, Dave Anderson wrote:
> > Sorry -- I take it back. Running a test shows that it breaks "bt -a"
> > on Xen dumpfiles where the cpus are marked offline prior to dumping
> > the kernel memory.
> >
> > I think this should be moved to the processor-specific backtrace functions,
> > which can just display "OFFLINE" or something to that effect.
>
> Ok, fine. What about the following...

That's good -- queued for the next release.

Thanks,
  Dave

> ---
>  s390.c  |    5 +++++
>  s390x.c |    5 +++++
>  2 files changed, 10 insertions(+)
>
> --- a/s390.c
> +++ b/s390.c
> @@ -603,11 +603,16 @@ s390_back_trace_cmd(struct bt_info *bt)
>          unsigned long async_start = 0, async_end = 0;
>          unsigned long panic_start = 0, panic_end = 0;
>          unsigned long stack_end, stack_start, stack_base;
> +        int cpu = bt->tc->processor;
>
>          if (bt->hp && bt->hp->eip) {
>                  error(WARNING,
>                  "instruction pointer argument ignored on this architecture!\n");
>          }
> +        if (is_task_active(bt->task) && (!(kt->cpu_flags[cpu] & ONLINE))) {
> +                fprintf(fp, " CPU offline\n");
> +                return;
> +        }
>          ksp = bt->stkptr;
>
>          /* print lowcore and get async stack when task has cpu */
> --- a/s390x.c
> +++ b/s390x.c
> @@ -836,11 +836,16 @@ s390x_back_trace_cmd(struct bt_info *bt)
>          unsigned long panic_start = 0, panic_end = 0;
>          unsigned long stack_end, stack_start, stack_base;
>          unsigned long r14;
> +        int cpu = bt->tc->processor;
>
>          if (bt->hp && bt->hp->eip) {
>                  error(WARNING,
>                  "instruction pointer argument ignored on this architecture!\n");
>          }
> +        if (is_task_active(bt->task) && (!(kt->cpu_flags[cpu] & ONLINE))) {
> +                fprintf(fp, " CPU offline\n");
> +                return;
> +        }
>          ksp = bt->stkptr;
>
>          /* print lowcore and get async stack when task has cpu */
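The check added by the patch above can be illustrated outside of crash with a small, self-contained sketch. It is not the crash implementation: `struct sketch_task`, `SKETCH_ONLINE`, and `sketch_back_trace()` are hypothetical names standing in for the bt_info structure, the ONLINE flag, and the architecture-specific backtrace function. The point it shows is that an active task on an offline CPU gets a one-line " CPU offline" notice instead of a stack unwind, while the task itself stays in every internal table.

```c
#include <stdio.h>

/* Hypothetical flag bit standing in for crash's ONLINE cpu flag. */
#define SKETCH_ONLINE 0x1UL

/* Hypothetical, reduced view of a task as the backtrace code sees it. */
struct sketch_task {
    int cpu;     /* CPU the task is associated with */
    int active;  /* nonzero if the task is in the active set */
};

/*
 * Returns 1 if a backtrace would be attempted, 0 if the task is active
 * on an offline CPU and only a notice is printed.
 */
static int sketch_back_trace(const struct sketch_task *t,
                             const unsigned long *cpu_flags, FILE *out)
{
    if (t->active && !(cpu_flags[t->cpu] & SKETCH_ONLINE)) {
        fprintf(out, " CPU offline\n");
        return 0;
    }
    /* A real implementation would unwind the stack here. */
    fprintf(out, " (stack unwind would start here)\n");
    return 1;
}
```

Note that a non-active task whose last CPU is now offline is still unwound: the guard applies only when both conditions hold, which mirrors the `is_task_active(...) && !(... & ONLINE)` test in the patch.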

15 years, 2 months


Re: [Crash-utility] [PATCH] Use only tasks on online CPUs for bt -a

by Dave Anderson

----- "Dave Anderson" <anderson(a)redhat.com> wrote:

> ----- "Michael Holzheu" <holzheu(a)linux.vnet.ibm.com> wrote:
>
> > Hello Dave,
> >
> > On Mon, 2010-04-26 at 10:29 -0400, Dave Anderson wrote:
> > > I'd prefer not to leave them out of the various internal task arrays,
> > > especially the active_set[] array. Regardless of their on/offline
> > > status, they do still exist as tasks, have runqueues, etc.
> >
> > Ok, fine.
> >
> > > If you're just worried about "bt -a", then why not just catch
> > > the offline status in the for loop inside "if (active)" section
> > > of cmd_bt()?
> >
> > Good idea! The following attached patch also works for me.
> >
> > Michael
>
> That looks good -- queued for the next release.
>
> Thanks,
>   Dave

Sorry -- I take it back. Running a test shows that it breaks "bt -a"
on Xen dumpfiles where the cpus are marked offline prior to dumping
the kernel memory.

I think this should be moved to the processor-specific backtrace
functions, which can just display "OFFLINE" or something to that effect.

Dave

> > ---
> >  kernel.c |    2 ++
> >  1 file changed, 2 insertions(+)
> >
> > --- a/kernel.c
> > +++ b/kernel.c
> > @@ -1989,6 +1989,8 @@ cmd_bt(void)
> >                                  free_all_bufs();
> >                                  continue;
> >                          }
> > +                        if (!(kt->cpu_flags[c] & ONLINE))
> > +                                continue;
> >                          if ((tc = task_to_context(tt->panic_threads[c]))) {
> >                                  pc->flags |= IN_FOREACH;
> >                                  DO_TASK_BACKTRACE();

15 years, 2 months


Re: [Crash-utility] [PATCH] Use only tasks on online CPUs for bt -a

by Dave Anderson

----- "Michael Holzheu" <holzheu(a)linux.vnet.ibm.com> wrote:

> Hello Dave,
>
> On Mon, 2010-04-26 at 10:29 -0400, Dave Anderson wrote:
> > I'd prefer not to leave them out of the various internal task arrays,
> > especially the active_set[] array. Regardless of their on/offline
> > status, they do still exist as tasks, have runqueues, etc.
>
> Ok, fine.
>
> > If you're just worried about "bt -a", then why not just catch
> > the offline status in the for loop inside "if (active)" section
> > of cmd_bt()?
>
> Good idea! The following attached patch also works for me.
>
> Michael

That looks good -- queued for the next release.

Thanks,
  Dave

> ---
>  kernel.c |    2 ++
>  1 file changed, 2 insertions(+)
>
> --- a/kernel.c
> +++ b/kernel.c
> @@ -1989,6 +1989,8 @@ cmd_bt(void)
>                                  free_all_bufs();
>                                  continue;
>                          }
> +                        if (!(kt->cpu_flags[c] & ONLINE))
> +                                continue;
>                          if ((tc = task_to_context(tt->panic_threads[c]))) {
>                                  pc->flags |= IN_FOREACH;
>                                  DO_TASK_BACKTRACE();

15 years, 2 months


Re: [Crash-utility] [PATCH] Use only tasks on online CPUs for bt -a

by Dave Anderson

----- "Michael Holzheu" <holzheu(a)linux.vnet.ibm.com> wrote:

> Hello Dave,
>
> Currently for "bt -a" also swapper tasks on offline CPUs are printed
> (at least on s390). Wouldn't it be better to only print a backtrace,
> when the task is running on an online CPU?
>
> My suggestion would be to implement that with the following patch
> by only setting the panic threads for online CPUs. I also attached a
> second alternative patch that fills the active set array only with
> tasks on online CPUs.
>
> What do you think?
>
> Michael

I'd prefer not to leave them out of the various internal task arrays,
especially the active_set[] array. Regardless of their on/offline
status, they do still exist as tasks, have runqueues, etc.

If you're just worried about "bt -a", then why not just catch
the offline status in the for loop inside "if (active)" section
of cmd_bt()? Or just indicate some kind of "OFFLINE" status in
the output?

Dave

> ---
>  task.c |    8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> --- a/task.c
> +++ b/task.c
> @@ -5658,8 +5658,12 @@ populate_panic_threads(void)
>          struct task_context *tc;
>
>          if (get_active_set()) {
> -                for (i = 0; i < NR_CPUS; i++)
> -                        tt->panic_threads[i] = tt->active_set[i];
> +                for (i = 0; i < NR_CPUS; i++) {
> +                        if (kt->cpu_flags[i] & ONLINE)
> +                                tt->panic_threads[i] = tt->active_set[i];
> +                        else
> +                                tt->panic_threads[i] = 0;
> +                }
>                  return;
>          }
>

15 years, 2 months


[PATCH] Use only tasks on online CPUs for bt -a

by Michael Holzheu

Hello Dave,

Currently for "bt -a" also swapper tasks on offline CPUs are printed
(at least on s390). Wouldn't it be better to only print a backtrace,
when the task is running on an online CPU?

My suggestion would be to implement that with the following patch
by only setting the panic threads for online CPUs. I also attached a
second alternative patch that fills the active set array only with
tasks on online CPUs.

What do you think?

Michael
---
 task.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/task.c
+++ b/task.c
@@ -5658,8 +5658,12 @@ populate_panic_threads(void)
         struct task_context *tc;

         if (get_active_set()) {
-                for (i = 0; i < NR_CPUS; i++)
-                        tt->panic_threads[i] = tt->active_set[i];
+                for (i = 0; i < NR_CPUS; i++) {
+                        if (kt->cpu_flags[i] & ONLINE)
+                                tt->panic_threads[i] = tt->active_set[i];
+                        else
+                                tt->panic_threads[i] = 0;
+                }
                 return;
         }
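The filtering in this patch can be tried in isolation with the following sketch. It is a simplified illustration under stated assumptions, not crash code: plain arrays replace the tt and kt tables, tasks are represented as unsigned long addresses, and `SKETCH_ONLINE` and `sketch_populate_panic_threads()` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical flag bit standing in for crash's ONLINE cpu flag. */
#define SKETCH_ONLINE 0x1UL

/*
 * Copy each CPU's active task into panic_threads, but store 0 for CPUs
 * whose flags do not have the online bit set. A later loop that skips
 * zero entries would then produce no backtrace for offline CPUs.
 */
static void sketch_populate_panic_threads(unsigned long *panic_threads,
                                          const unsigned long *active_set,
                                          const unsigned long *cpu_flags,
                                          size_t ncpus)
{
    for (size_t i = 0; i < ncpus; i++) {
        if (cpu_flags[i] & SKETCH_ONLINE)
            panic_threads[i] = active_set[i];
        else
            panic_threads[i] = 0;
    }
}
```

One consequence of this design is that the offline CPU's task disappears from the panic-thread table entirely, so any other consumer of that table loses sight of it as well.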

15 years, 2 months


Re: [Crash-utility] crash does not support recent qemu save-vm formats

by Dave Anderson

----- "Sergey Svishchev" <svs(a)ropnet.ru> wrote:

> Dave Anderson wrote:
>
> >> See also https://bugs.launchpad.net/ubuntu/+source/crash/+bug/559219
> >>
> >> --
> >> Sergey Svishchev
> >
> > BTW, if you follow the advice in the ubuntu bug description, does
> > the crash session work normally?
>
> (It's my bug report :-)
>
> Yes, mostly. Date is wrong (DATE: Sun Oct 26 03:35:21 2594), 'bt'
> reports some errors but otherwise backtraces look sane.
>
> crash> bt
> PID: 0 TASK: ffffffff81796600 CPU: 0 COMMAND: "swapper"
> #0 [ffffffff8176fe58] schedule at ffffffff81525251
> bt: invalid kernel virtual address: 41 type: "call byte"
> bt: invalid kernel virtual address: 1587b type: "call byte"
> bt: invalid kernel virtual address: 1587b type: "call byte"
> bt: invalid kernel virtual address: 1587b type: "call byte"
> bt: invalid kernel virtual address: 1587b type: "call byte"
> bt: invalid kernel virtual address: 1587b type: "call byte"
> bt: invalid kernel virtual address: 1587b type: "call byte"
> bt: invalid kernel virtual address: 1587b type: "call byte"
> bt: invalid kernel virtual address: 6db6db6db6db6db2 type: "call byte"
> bt: invalid kernel virtual address: 937fb type: "call byte"
> #1 [ffffffff8176ff00] cpu_idle at ffffffff81010e45
>
> --
> Sergey Svishchev

Yeah, something is still askew. I just got this from Paolo Bonzini,
the author of the qemu-related code in crash:

> > By any chance do you have any insights re: the structure-related
> > changes associated with CPU_SAVE_VERSION?. This is an upstream bug
> > report, but I note that in qemu-kvm-0.11.0-rc1.fc12.i686, it's equal
> > to 10 in target-i386/cpu.h.
>
> It needs an update for newer qemus. I'll take a look, as probably
> we'd want this on RHEL6 too (or maybe not, I used crash/kvm on RHEL6
> about a month ago).
>
> Paolo

Please keep that dump around for an upcoming test patch...

Thanks,
  Dave

15 years, 2 months


Re: [Crash-utility] crash does not support recent qemu save-vm formats

by Dave Anderson

----- "Sergey Svishchev" <svs(a)ropnet.ru> wrote:

> Hi,
>
> I've been trying to diagnose a livelock-like condition (no response to
> ping, serial console dead, KVM process consumes 100% CPU) of KVM virtual
> machines on Ubuntu 9.10 (Ubuntu AMD64 kernel 2.6.31-17-server, QEMU
> 0.11.0 and libvirt 0.7.0).
>
> "virsh dump" generates a dump file that crash cannot read:
>
> crash: qemu-load.c:501: cpu_init_load_64: Assertion `version_id >= 4 &&
> version_id <= 9' failed.
>
> Indeed, CPU_SAVE_VERSION in this dump file is 10.
>
> See also https://bugs.launchpad.net/ubuntu/+source/crash/+bug/559219
>
> --
> Sergey Svishchev

I'll check with the author of the crash qemu code.

BTW, if you follow the advice in the ubuntu bug description, does
the crash session work normally?

Dave

15 years, 2 months


crash does not support recent qemu save-vm formats

by Sergey Svishchev

Hi,

I've been trying to diagnose a livelock-like condition (no response to
ping, serial console dead, KVM process consumes 100% CPU) of KVM virtual
machines on Ubuntu 9.10 (Ubuntu AMD64 kernel 2.6.31-17-server, QEMU
0.11.0 and libvirt 0.7.0).

"virsh dump" generates a dump file that crash cannot read:

crash: qemu-load.c:501: cpu_init_load_64: Assertion `version_id >= 4 &&
version_id <= 9' failed.

Indeed, CPU_SAVE_VERSION in this dump file is 10.

See also https://bugs.launchpad.net/ubuntu/+source/crash/+bug/559219

--
Sergey Svishchev
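The assertion in the report encodes a supported range of CPU save versions (4 through 9), so a version 10 dump aborts the program. As a hedged illustration of the same range check expressed as a recoverable error, here is a small sketch. The function name, the macro names, and the message text are assumptions made for this example; only the 4..9 bounds come from the assertion text above, and this is not how qemu-load.c is actually written.

```c
#include <stdio.h>

/* Bounds taken from the assertion text in the report. */
#define SKETCH_MIN_CPU_VERSION 4
#define SKETCH_MAX_CPU_VERSION 9

/*
 * Returns 1 if version_id falls in the supported range. Otherwise
 * returns 0 and writes an explanatory message (illustrative wording)
 * into msg. Assumes msglen > 0.
 */
static int sketch_cpu_version_supported(int version_id,
                                        char *msg, size_t msglen)
{
    if (version_id < SKETCH_MIN_CPU_VERSION ||
        version_id > SKETCH_MAX_CPU_VERSION) {
        snprintf(msg, msglen,
                 "unsupported CPU save version %d (supported: %d-%d)",
                 version_id, SKETCH_MIN_CPU_VERSION, SKETCH_MAX_CPU_VERSION);
        return 0;
    }
    msg[0] = '\0';
    return 1;
}
```

With `version_id` set to 10, as in the dump described above, the function returns 0 and the caller can print the message and stop loading, instead of terminating on a failed assert.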

15 years, 2 months

