Re: [Crash-utility] dev -p command fails post linux 2.6.25
by Dave Anderson
----- "Sharyathi Nagesh" <sharyath(a)in.ibm.com> wrote:
> Hi Dave
> As you may be aware, the dev -p command fails on kernels after Linux 2.6.25 with
> "no PCI devices found on this system.". When I went through the kernel, I found that
> a specific commit has removed the pci_devices variable from the kernel code:
>
> ==================================================================
> git commit: 5ff580c10ec06fd296bd23d4570c1a95194094a0
> by Greg Kroah-Hartman <gregkh(a)suse.de>
>
>
> This is what he says in the commit
> ------------------------------------------------------------------------------------------------------------
> This patch finally removes the global list of PCI devices. We are
> relying entirely on the list held in the driver core now, and do not
> need a separate "shadow" list as no one uses it.
> ------------------------------------------------------------------------------------------------------------
>
> ==================================================================
>
> I saw some of your earlier postings where you specifically mentioned this problem:
> http://www.mail-archive.com/crash-utility@redhat.com/msg00346.html
>
>
> With this in mind, I wanted to know whether you intend to keep the dev -p behavior as it is now,
> or whether there is any plan to change it to display actual values.
>
> Thank You
> Sharyathi N
I (personally) have no plans to change it. If I remember correctly,
Bud Brown came up with an alternate scheme, but the imported data from
the kernel proper required to accomplish it was enormous (bordering
on absurd), so I suggested that it would be more appropriate as an
extension module.
Bud -- feel free to chime in here... ;-)
For that matter, even the "old" way required the import of ~1000 lines
of kernel #define's -- which always bugged me -- and was pretty much the
only crash command that had to do such a thing.
Dave
14 years, 6 months
Re: [Crash-utility] dev -p command fails post linux 2.6.25
by bud.brown@redhat.com
I have a crash extension called 'pci'. It's a direct replacement for the 'dev' command, but it contains its own (huge) internal database of PCI information. I thought I'd finished integrating oui.txt from IEEE and pci.ids from SourceForge, but looking at the code today, I have *not* done that yet. It's on the todo list. I wrote it about a year ago and haven't really done anything with the sources since then.
It only uses the PCI information from www.pcidatabase.com at present. I'll see if I can get this up on the net if there is interest. There are two parts: you download the source PCI ID information from the web and run a pre-processor on it, and then you compile the extension with the header file you've created.
The ext-pci.so is currently ~1.1MB in size. Another todo list item was to shrink the database by restructuring it, but I don't expect more than a 25%-33% reduction in size at most.
Bud Brown
Red Hat, Inc
Westford, MA
SEG/Storage Team
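As a rough illustration of the "generated header plus compiled lookup" approach Bud describes, here is a minimal sketch in C. It does not reflect the actual layout of his extension, which is not shown in this thread. The struct, table, and function names (demo_pci_vendor, demo_vendors, demo_vendor_name) are hypothetical, and the three-entry table stands in for whatever the pre-processor would emit.

```c
#include <stdlib.h>

/* Hypothetical record type that a pre-processor might emit into a header. */
struct demo_pci_vendor {
    unsigned short id;      /* PCI vendor ID */
    const char *name;       /* human-readable vendor name */
};

/* Stand-in for the generated table; it must be sorted by id for bsearch(). */
static const struct demo_pci_vendor demo_vendors[] = {
    { 0x1022, "Advanced Micro Devices" },
    { 0x10de, "NVIDIA" },
    { 0x8086, "Intel" },
};

/* Compare a vendor ID key against a table element. */
static int demo_cmp(const void *key, const void *elem)
{
    unsigned short k = *(const unsigned short *)key;
    unsigned short e = ((const struct demo_pci_vendor *)elem)->id;
    return (k > e) - (k < e);
}

/* Return the vendor name for an ID, or "unknown" if it is not in the table. */
const char *demo_vendor_name(unsigned short id)
{
    const struct demo_pci_vendor *v =
        bsearch(&id, demo_vendors,
                sizeof demo_vendors / sizeof demo_vendors[0],
                sizeof demo_vendors[0], demo_cmp);
    return v ? v->name : "unknown";
}
```

A sorted array with binary search keeps the lookup code small, but the strings themselves dominate the size of such a table, which is consistent with the modest size reduction Bud expects from restructuring alone.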
------------------------------
Message: 4
Date: Tue, 27 Apr 2010 16:36:31 +0530
From: Sharyathi Nagesh <sharyath(a)in.ibm.com>
To: Dave Anderson <anderson(a)redhat.com>
Cc: Crash-utility(a)redhat.com, huachenl(a)cn.ibm.com
Subject:
Message-ID: <4BD6C537.9030102(a)in.ibm.com>
Content-Type: text/plain; charset=UTF-8
Thanks for the reply Dave.
I also feel it is better to change the error message to what you have
suggested. That is the simplest way to address the issue, and it gives a
clearer message to the user.
I am curious to hear from Bud about the alternate scheme and its progress
as well. Bud, your views?
Thanks
Sharyathi
On 04/23/2010 08:01 PM, Dave Anderson wrote:
>
> ----- "Dave Anderson" <anderson(a)redhat.com> wrote:
>
>>> With this I wanted to know, if you intend to keep dev -p behavior
>>> as it is now or there is any plan to change it to display actual
>>> values?
>>>
>>> Thank You Sharyathi N
>>
>> I (personally) have no plans to change it. If I remember
>> correctly, Bud Brown came up with an alternate scheme, but the
>> imported data from the kernel proper required to accomplish it was
>> enormous (bordering on absurd), so I suggested that it would be
>> more appropriate as an extension module.
>>
>> Bud -- feel free to chime in here... ;-)
>>
>> For that matter, even the "old" way required the import of ~1000
>> lines of kernel #define's -- which always bugged me -- and was
>> pretty much the only crash command that had to do such a thing.
>>
>> Dave
>
> Actually, at a minimum, I should change this:
>
> if (!symbol_exists("pci_devices"))
>         error(FATAL, "no PCI devices found on this system.\n");
>
> to the generic option_not_supported() message.
>
> Dave
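The change Dave describes can be sketched as follows. This is a minimal, self-contained model, not crash source code: demo_symbol_exists(), demo_dev_pci(), the demo symbol list, and the message wording are all hypothetical stand-ins, and the text printed by the real option_not_supported() may differ.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the kernel symbol list of a post-2.6.25 kernel:
 * note that "pci_devices" is absent. */
static const char *demo_symbols[] = { "init_task", "jiffies", NULL };

/* Hypothetical stand-in for crash's symbol_exists(). */
int demo_symbol_exists(const char *name)
{
    for (size_t i = 0; demo_symbols[i] != NULL; i++)
        if (strcmp(demo_symbols[i], name) == 0)
            return 1;
    return 0;
}

enum demo_status { DEMO_OK = 0, DEMO_NOT_SUPPORTED = 1 };

/* Report "not supported" instead of claiming that the machine has no PCI
 * devices when the kernel simply no longer exports the list. */
enum demo_status demo_dev_pci(char *msg, size_t len)
{
    if (!demo_symbol_exists("pci_devices")) {
        snprintf(msg, len, "dev: -p option not supported on this kernel");
        return DEMO_NOT_SUPPORTED;
    }
    snprintf(msg, len, "walking pci_devices list");
    return DEMO_OK;
}
```

The point of the change is only the wording: a missing pci_devices symbol says something about the kernel version, not about the hardware, so the message should not state that no PCI devices exist.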
------------------------------
--
Crash-utility mailing list
Crash-utility(a)redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility
End of Crash-utility Digest, Vol 55, Issue 11
*********************************************
14 years, 7 months
Re: [Crash-utility] [PATCH] Use only tasks on online CPUs for bt -a
by Dave Anderson
----- "Michael Holzheu" <holzheu(a)linux.vnet.ibm.com> wrote:
> Hi Dave,
>
> On Mon, 2010-04-26 at 11:56 -0400, Dave Anderson wrote:
> > Sorry -- I take it back. Running a test shows that it breaks "bt -a"
> > on Xen dumpfiles where the cpus are marked offline prior to dumping
> > the kernel memory.
> >
> > I think this should be moved to the processor-specific backtrace functions,
> > which can just display "OFFLINE" or something to that effect.
>
> Ok, fine. What about the following...
That's good -- queued for the next release.
Thanks,
Dave
> ---
> s390.c | 5 +++++
> s390x.c | 5 +++++
> 2 files changed, 10 insertions(+)
>
> --- a/s390.c
> +++ b/s390.c
> @@ -603,11 +603,16 @@ s390_back_trace_cmd(struct bt_info *bt)
> unsigned long async_start = 0, async_end = 0;
> unsigned long panic_start = 0, panic_end = 0;
> unsigned long stack_end, stack_start, stack_base;
> + int cpu = bt->tc->processor;
>
> if (bt->hp && bt->hp->eip) {
> error(WARNING,
> "instruction pointer argument ignored on this architecture!\n");
> }
> + if (is_task_active(bt->task) && (!(kt->cpu_flags[cpu] & ONLINE))) {
> + fprintf(fp, " CPU offline\n");
> + return;
> + }
> ksp = bt->stkptr;
>
> /* print lowcore and get async stack when task has cpu */
> --- a/s390x.c
> +++ b/s390x.c
> @@ -836,11 +836,16 @@ s390x_back_trace_cmd(struct bt_info *bt)
> unsigned long panic_start = 0, panic_end = 0;
> unsigned long stack_end, stack_start, stack_base;
> unsigned long r14;
> + int cpu = bt->tc->processor;
>
> if (bt->hp && bt->hp->eip) {
> error(WARNING,
> "instruction pointer argument ignored on this architecture!\n");
> }
> + if (is_task_active(bt->task) && (!(kt->cpu_flags[cpu] & ONLINE))) {
> + fprintf(fp, " CPU offline\n");
> + return;
> + }
> ksp = bt->stkptr;
>
> /* print lowcore and get async stack when task has cpu */
14 years, 7 months
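For readers without the crash sources at hand, the check added by the s390/s390x patch above can be modelled in isolation. This is a sketch under stated assumptions: the struct layouts, DEMO_ONLINE, DEMO_NR_CPUS, and demo_arch_back_trace() are hypothetical simplifications of crash's kt->cpu_flags[], ONLINE, and the per-architecture back trace functions.

```c
#include <assert.h>
#include <stdio.h>

#define DEMO_NR_CPUS 4
#define DEMO_ONLINE  0x1u   /* hypothetical flag bit standing in for ONLINE */

struct demo_kernel_table {
    unsigned int cpu_flags[DEMO_NR_CPUS];
};

struct demo_task {
    int processor;   /* CPU the task is associated with */
    int active;      /* nonzero if the task is in the active set */
};

/* Returns 1 if a backtrace would be produced, 0 if the task is an active
 * task on an offline CPU and only a notice is printed. */
int demo_arch_back_trace(const struct demo_kernel_table *kt,
                         const struct demo_task *t, FILE *out)
{
    int cpu = t->processor;

    assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
    if (t->active && !(kt->cpu_flags[cpu] & DEMO_ONLINE)) {
        fprintf(out, " CPU offline\n");
        return 0;
    }
    fprintf(out, " (stack frames would be printed here)\n");
    return 1;
}
```

Because the check lives in the architecture-specific function, the task still appears in the "bt -a" output with a notice, and other architectures (including Xen dumpfiles) are unaffected.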
Re: [Crash-utility] [PATCH] Use only tasks on online CPUs for bt -a
by Dave Anderson
----- "Dave Anderson" <anderson(a)redhat.com> wrote:
> ----- "Michael Holzheu" <holzheu(a)linux.vnet.ibm.com> wrote:
>
> > Hello Dave,
> >
> > On Mon, 2010-04-26 at 10:29 -0400, Dave Anderson wrote:
> > > I'd prefer not to leave them out of the various internal task arrays,
> > > especially the active_set[] array. Regardless of their on/offline
> > > status, they do still exist as tasks, have runqueues, etc.
> >
> > Ok, fine.
> >
> > > If you're just worried about "bt -a", then why not just catch
> > > the offline status in the for loop inside "if (active)" section
> > > of cmd_bt()?
> >
> > Good idea! The following attached patch also works for me.
> >
> > Michael
>
> That looks good -- queued for the next release.
>
> Thanks,
> Dave
Sorry -- I take it back. Running a test shows that it breaks "bt -a"
on Xen dumpfiles where the cpus are marked offline prior to dumping
the kernel memory.
I think this should be moved to the processor-specific backtrace functions,
which can just display "OFFLINE" or something to that effect.
Dave
>
> > ---
> > kernel.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > --- a/kernel.c
> > +++ b/kernel.c
> > @@ -1989,6 +1989,8 @@ cmd_bt(void)
> > free_all_bufs();
> > continue;
> > }
> > + if (!(kt->cpu_flags[c] & ONLINE))
> > + continue;
> > if ((tc = task_to_context(tt->panic_threads[c]))) {
> > pc->flags |= IN_FOREACH;
> > DO_TASK_BACKTRACE();
14 years, 7 months
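The failure mode Dave reports can be reproduced with a small model of the generic loop. This is only a sketch: DEMO_ONLINE and demo_bt_a_with_skip() are hypothetical names, and the function counts CPUs rather than printing backtraces.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ONLINE 0x1u   /* hypothetical flag bit standing in for ONLINE */

/* Count the CPUs for which "bt -a" would still show a backtrace when
 * offline CPUs are skipped in the generic loop, as in the kernel.c patch. */
size_t demo_bt_a_with_skip(const unsigned int *cpu_flags, size_t ncpus)
{
    size_t shown = 0;

    assert(cpu_flags != NULL);
    for (size_t c = 0; c < ncpus; c++) {
        if (!(cpu_flags[c] & DEMO_ONLINE))
            continue;           /* offline: silently dropped */
        shown++;
    }
    return shown;
}
```

If every CPU is marked offline before the dump is taken, as in the Xen case described above, this loop shows nothing at all, which is why the check was moved into the s390-specific functions instead.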
Re: [Crash-utility] [PATCH] Use only tasks on online CPUs for bt -a
by Dave Anderson
----- "Michael Holzheu" <holzheu(a)linux.vnet.ibm.com> wrote:
> Hello Dave,
>
> On Mon, 2010-04-26 at 10:29 -0400, Dave Anderson wrote:
> > I'd prefer not to leave them out of the various internal task arrays,
> > especially the active_set[] array. Regardless of their on/offline
> > status, they do still exist as tasks, have runqueues, etc.
>
> Ok, fine.
>
> > If you're just worried about "bt -a", then why not just catch
> > the offline status in the for loop inside "if (active)" section
> > of cmd_bt()?
>
> Good idea! The following attached patch also works for me.
>
> Michael
That looks good -- queued for the next release.
Thanks,
Dave
> ---
> kernel.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> --- a/kernel.c
> +++ b/kernel.c
> @@ -1989,6 +1989,8 @@ cmd_bt(void)
> free_all_bufs();
> continue;
> }
> + if (!(kt->cpu_flags[c] & ONLINE))
> + continue;
> if ((tc = task_to_context(tt->panic_threads[c]))) {
> pc->flags |= IN_FOREACH;
> DO_TASK_BACKTRACE();
14 years, 7 months
Re: [Crash-utility] [PATCH] Use only tasks on online CPUs for bt -a
by Dave Anderson
----- "Michael Holzheu" <holzheu(a)linux.vnet.ibm.com> wrote:
> Hello Dave,
>
> Currently, "bt -a" also prints swapper tasks on offline CPUs
> (at least on s390). Wouldn't it be better to print a backtrace
> only when the task is running on an online CPU?
>
> My suggestion would be to implement that with the following patch
> by only setting the panic threads for online CPUs. I also attached a
> second alternative patch that fills the active set array only with
> tasks on online CPUs.
>
> What do you think?
>
> Michael
I'd prefer not to leave them out of the various internal task arrays,
especially the active_set[] array. Regardless of their on/offline
status, they do still exist as tasks, have runqueues, etc.
If you're just worried about "bt -a", then why not just catch
the offline status in the for loop inside "if (active)" section
of cmd_bt()? Or just indicate some kind of "OFFLINE" status in
the output?
Dave
> ---
> task.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> --- a/task.c
> +++ b/task.c
> @@ -5658,8 +5658,12 @@ populate_panic_threads(void)
> struct task_context *tc;
>
> if (get_active_set()) {
> - for (i = 0; i < NR_CPUS; i++)
> - tt->panic_threads[i] = tt->active_set[i];
> + for (i = 0; i < NR_CPUS; i++) {
> + if (kt->cpu_flags[i] & ONLINE)
> + tt->panic_threads[i] = tt->active_set[i];
> + else
> + tt->panic_threads[i] = 0;
> + }
> return;
> }
>
14 years, 7 months
[PATCH] Use only tasks on online CPUs for bt -a
by Michael Holzheu
Hello Dave,
Currently, "bt -a" also prints swapper tasks on offline CPUs
(at least on s390). Wouldn't it be better to print a backtrace
only when the task is running on an online CPU?
My suggestion would be to implement that with the following patch
by only setting the panic threads for online CPUs. I also attached a
second alternative patch that fills the active set array only with
tasks on online CPUs.
What do you think?
Michael
---
task.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
--- a/task.c
+++ b/task.c
@@ -5658,8 +5658,12 @@ populate_panic_threads(void)
         struct task_context *tc;

         if (get_active_set()) {
-                for (i = 0; i < NR_CPUS; i++)
-                        tt->panic_threads[i] = tt->active_set[i];
+                for (i = 0; i < NR_CPUS; i++) {
+                        if (kt->cpu_flags[i] & ONLINE)
+                                tt->panic_threads[i] = tt->active_set[i];
+                        else
+                                tt->panic_threads[i] = 0;
+                }
                 return;
         }
14 years, 7 months
Re: [Crash-utility] crash does not support recent qemu save-vm formats
by Dave Anderson
----- "Sergey Svishchev" <svs(a)ropnet.ru> wrote:
> Dave Anderson wrote:
>
> >> See also https://bugs.launchpad.net/ubuntu/+source/crash/+bug/559219
> >>
> >> --
> >> Sergey Svishchev
> >
> > BTW, if you follow the advice in the ubuntu bug description, does
> > the crash session work normally?
>
> (It's my bug report :-)
>
> Yes, mostly. Date is wrong (DATE: Sun Oct 26 03:35:21 2594), 'bt'
> reports some errors but otherwise backtraces look sane.
>
> crash> bt
> PID: 0 TASK: ffffffff81796600 CPU: 0 COMMAND: "swapper"
> #0 [ffffffff8176fe58] schedule at ffffffff81525251
> bt: invalid kernel virtual address: 41 type: "call byte"
> bt: invalid kernel virtual address: 1587b type: "call byte"
> bt: invalid kernel virtual address: 1587b type: "call byte"
> bt: invalid kernel virtual address: 1587b type: "call byte"
> bt: invalid kernel virtual address: 1587b type: "call byte"
> bt: invalid kernel virtual address: 1587b type: "call byte"
> bt: invalid kernel virtual address: 1587b type: "call byte"
> bt: invalid kernel virtual address: 1587b type: "call byte"
> bt: invalid kernel virtual address: 6db6db6db6db6db2 type: "call byte"
> bt: invalid kernel virtual address: 937fb type: "call byte"
> #1 [ffffffff8176ff00] cpu_idle at ffffffff81010e45
>
> --
> Sergey Svishchev
Yeah, something is still askew. I just got this from Paolo Bonzini,
the author of the qemu-related code in crash:
> > By any chance do you have any insights re: the structure-related
> > changes associated with CPU_SAVE_VERSION? This is an upstream bug
> > report, but I note that in qemu-kvm-0.11.0-rc1.fc12.i686, it's equal
> > to 10 in target-i386/cpu.h.
>
> It needs an update for newer qemus. I'll take a look, as probably
> we'd want this on RHEL6 too (or maybe not, I used crash/kvm on RHEL6
> about a month ago).
>
> Paolo
Please keep that dump around for an upcoming test patch...
Thanks,
Dave
14 years, 7 months
Re: [Crash-utility] crash does not support recent qemu save-vm formats
by Dave Anderson
----- "Sergey Svishchev" <svs(a)ropnet.ru> wrote:
> Hi,
>
> I've been trying to diagnose a livelock-like condition (no response to
> ping, serial console dead, KVM process consumes 100% CPU) of KVM virtual
> machines on Ubuntu 9.10 (Ubuntu AMD64 kernel 2.6.31-17-server, QEMU
> 0.11.0 and libvirt 0.7.0).
>
> "virsh dump" generates a dump file that crash cannot read:
>
> crash: qemu-load.c:501: cpu_init_load_64: Assertion `version_id >= 4 &&
> version_id <= 9' failed.
>
> Indeed, CPU_SAVE_VERSION in this dump file is 10.
>
> See also https://bugs.launchpad.net/ubuntu/+source/crash/+bug/559219
>
> --
> Sergey Svishchev
I'll check with the author of the crash qemu code.
BTW, if you follow the advice in the ubuntu bug description, does
the crash session work normally?
Dave
14 years, 7 months
crash does not support recent qemu save-vm formats
by Sergey Svishchev
Hi,
I've been trying to diagnose a livelock-like condition (no response to
ping, serial console dead, KVM process consumes 100% CPU) of KVM virtual
machines on Ubuntu 9.10 (Ubuntu AMD64 kernel 2.6.31-17-server, QEMU
0.11.0 and libvirt 0.7.0).
"virsh dump" generates a dump file that crash cannot read:
crash: qemu-load.c:501: cpu_init_load_64: Assertion `version_id >= 4 &&
version_id <= 9' failed.
Indeed, CPU_SAVE_VERSION in this dump file is 10.
See also https://bugs.launchpad.net/ubuntu/+source/crash/+bug/559219
--
Sergey Svishchev
14 years, 7 months
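The assertion quoted above aborts crash outright when the save-file version is outside the range it knows. A gentler alternative is to report the unsupported version and let the caller decide. The sketch below is an illustration only, not the code in qemu-load.c: demo_check_cpu_version() and the DEMO_ macros are hypothetical, and only the 4..9 range is taken from the assertion text.

```c
#include <stdio.h>

/* Range taken from the quoted assertion; the macro names are hypothetical. */
#define DEMO_CPU_VERSION_MIN 4u
#define DEMO_CPU_VERSION_MAX 9u

/* Return 0 if the version is in the supported range. Otherwise return -1
 * and write a diagnostic into msg instead of aborting the process. */
int demo_check_cpu_version(unsigned int version_id, char *msg, size_t len)
{
    if (version_id < DEMO_CPU_VERSION_MIN ||
        version_id > DEMO_CPU_VERSION_MAX) {
        snprintf(msg, len,
                 "unsupported CPU save version %u (supported: %u-%u)",
                 version_id, DEMO_CPU_VERSION_MIN, DEMO_CPU_VERSION_MAX);
        return -1;
    }
    if (len > 0)
        msg[0] = '\0';
    return 0;
}
```

With the dump described in this message, version 10 would be rejected with a readable diagnostic. Accepting version 10 would additionally require knowing how the saved CPU state layout changed, which this sketch does not attempt.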