[PATCH] Fix vmlinux verification for s390(x)
by Michael Holzheu
Hi Dave,
Another fix for s390(x) ...
When starting crash with an s390x standalone dump, we get the following message:
WARNING: machine type mismatch:
crash utility: S390X
/usr/lib/debug/lib/modules/2.6.18-86.el5/vmlinux: (unknown)
To fix that, this patch adds s390(x) support to the is_kernel() function,
where the vmlinux ELF file is verified.
Signed-off-by: Michael Holzheu <holzheu(a)linux.vnet.ibm.com>
---
symbols.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff -Naurp crash-4.0-6.2/symbols.c crash-4.0-6.2-s390x-warn-fix/symbols.c
--- crash-4.0-6.2/symbols.c 2008-04-14 17:49:21.000000000 +0200
+++ crash-4.0-6.2-s390x-warn-fix/symbols.c 2008-04-14 17:49:27.000000000 +0200
@@ -2540,6 +2540,11 @@ is_kernel(char *file)
goto bailout;
break;
+ case EM_S390:
+ if (machine_type_mismatch(file, "S390", NULL, 0))
+ goto bailout;
+ break;
+
default:
if (machine_type_mismatch(file, "(unknown)", NULL, 0))
goto bailout;
@@ -2573,6 +2578,11 @@ is_kernel(char *file)
goto bailout;
break;
+ case EM_S390:
+ if (machine_type_mismatch(file, "S390X", NULL, 0))
+ goto bailout;
+ break;
+
default:
if (machine_type_mismatch(file, "(unknown)", NULL, 0))
goto bailout;
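
For readers following the two hunks above: the first sits in the 32-bit ELF branch of is_kernel() and the second in the 64-bit branch, which is why the same EM_S390 case maps to two different names. Below is a minimal standalone sketch of that mapping. It is not crash code; the sk_ names are hypothetical, and the constants are copied from the ELF specification so the sketch builds without <elf.h>.

```c
/* ELF constants, defined locally (same values as in <elf.h>). */
#define SK_ELFCLASS32 1
#define SK_ELFCLASS64 2
#define SK_EM_S390    22

/*
 * Hypothetical helper: name the machine type for a given ELF class and
 * e_machine value. s390 and s390x share EM_S390, so only the ELF class
 * tells them apart.
 */
const char *sk_machine_name(int elf_class, int e_machine)
{
    if (e_machine == SK_EM_S390) {
        if (elf_class == SK_ELFCLASS32)
            return "S390";
        if (elf_class == SK_ELFCLASS64)
            return "S390X";
    }
    /* Anything not handled here falls into the default case. */
    return "(unknown)";
}
```

With this mapping a 64-bit EM_S390 vmlinux is named "S390X", which matches the crash utility's own machine type instead of "(unknown)".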
16 years, 7 months
[PATCH] Add large page support for s390x
by Michael Holzheu
Hi Dave,
Please include this patch in the next crash release:
The new z10 System z (s390x) machines now support large pages (1MB). This patch
updates the s390x page table walk function to handle them.
Signed-off-by: Michael Holzheu <holzheu(a)linux.vnet.ibm.com>
---
s390x.c | 5 +++++
1 files changed, 5 insertions(+)
diff -Naurp crash-4.0-6.2/s390x.c crash-4.0-6.2-s390x-large-page/s390x.c
--- crash-4.0-6.2/s390x.c 2008-04-11 14:42:35.000000000 +0200
+++ crash-4.0-6.2-s390x-large-page/s390x.c 2008-04-11 14:42:41.000000000 +0200
@@ -337,6 +337,11 @@ int s390x_vtop(ulong table, ulong vaddr,
level--;
}
+ /* Check if this is a large page. */
+ if (entry & 0x400ULL)
+ /* Add the 1MB page offset and return the final value. */
+ return table + (vaddr & 0xfffffULL);
+
/* Get the page table entry */
entry = _kl_pg_table_deref_s390x(vaddr, entry & ~0x7ffULL);
if (!entry)
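
As a worked example of the arithmetic in the hunk above, here is a small standalone sketch. It is not the crash implementation: the sk_ names are hypothetical, and it assumes that the frame address of a large page is the table entry with its low 20 bits cleared. Only the 0x400 flag test and the 0xfffff offset mask are taken from the patch itself.

```c
#include <stdint.h>

#define SK_STE_LARGE   0x400ULL    /* large-page flag tested by the patch */
#define SK_LARGE_MASK  0xfffffULL  /* byte offset within a 1MB page */

/*
 * Hypothetical helper: if 'entry' maps a large page, store the translated
 * address in *paddr and return 1. Otherwise return 0, meaning the caller
 * still has to walk the page table.
 */
int sk_large_page_vtop(uint64_t entry, uint64_t vaddr, uint64_t *paddr)
{
    if (!(entry & SK_STE_LARGE))
        return 0;

    /* Assumption: frame address = entry with the low 20 bits cleared. */
    *paddr = (entry & ~SK_LARGE_MASK) + (vaddr & SK_LARGE_MASK);
    return 1;
}
```

For an entry of 0x12300400 and a virtual address of 0x7ffabcde, the sketch yields 0x123abcde: the 1MB frame at 0x12300000 plus the offset 0xabcde.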
16 years, 7 months
Re: [Crash-utility] crash aborts with cannot determine idle task
by Dave Anderson
> While running crash-4.0-6.1 on a vmcore, crash is aborting with
>
> --------
> crash: cannot determine idle task addresses from init_tasks[] or runqueues[]
>
> crash: cannot resolve "init_task_union"
> -------
>
>
> during startup. The kernel is later than 2.6.18. The changelog
> http://people.redhat.com/anderson/crash.changelog.html mentions that this
> is possibly fixed in version 4.0-3.1. Hence could you please point me to the
> patch that fixed this problem.
>
> thanks,
> Chandru
That particular two-year-old patch simply recognized and dealt with the kernel
name change from "struct runqueue" to "struct rq":
--- kernel.c 2 Aug 2006 14:34:35 -0000 1.140
+++ kernel.c 2 Aug 2006 18:35:31 -0000 1.141
@@ -55,6 +55,7 @@
int i;
char *p1, *p2, buf[BUFSIZE];
struct syment *sp1, *sp2;
+ char *rqstruct;
if (pc->flags & KERNEL_DEBUG_QUERY)
return;
@@ -158,7 +159,15 @@
&kt->__per_cpu_offset[0]);
kt->flags |= PER_CPU_OFF;
}
- MEMBER_OFFSET_INIT(runqueue_cpu, "runqueue", "cpu");
+ if (STRUCT_EXISTS("runqueue"))
+ rqstruct = "runqueue";
+ else if (STRUCT_EXISTS("rq"))
+ rqstruct = "rq";
+
+ MEMBER_OFFSET_INIT(runqueue_cpu, rqstruct, "cpu");
+ /*
+ * 'cpu' does not exist in 'struct rq'.
+ */
if (VALID_MEMBER(runqueue_cpu) &&
(get_array_length("runqueue.cpu", NULL, 0) > 0)) {
MEMBER_OFFSET_INIT(cpu_s_curr, "cpu_s", "curr");
@@ -183,17 +192,17 @@
"runq_siblings: %d: __cpu_idx and __rq_idx arrays don't exist?\n",
kt->runq_siblings);
} else {
- MEMBER_OFFSET_INIT(runqueue_idle, "runqueue", "idle");
- MEMBER_OFFSET_INIT(runqueue_curr, "runqueue", "curr");
+ MEMBER_OFFSET_INIT(runqueue_idle, rqstruct, "idle");
+ MEMBER_OFFSET_INIT(runqueue_curr, rqstruct, "curr");
ASSIGN_OFFSET(runqueue_cpu) = INVALID_OFFSET;
}
- MEMBER_OFFSET_INIT(runqueue_active, "runqueue", "active");
- MEMBER_OFFSET_INIT(runqueue_expired, "runqueue", "expired");
- MEMBER_OFFSET_INIT(runqueue_arrays, "runqueue", "arrays");
+ MEMBER_OFFSET_INIT(runqueue_active, rqstruct, "active");
+ MEMBER_OFFSET_INIT(runqueue_expired, rqstruct, "expired");
+ MEMBER_OFFSET_INIT(runqueue_arrays, rqstruct, "arrays");
MEMBER_OFFSET_INIT(prio_array_queue, "prio_array", "queue");
MEMBER_OFFSET_INIT(prio_array_nr_active, "prio_array",
"nr_active");
- STRUCT_SIZE_INIT(runqueue, "runqueue");
+ STRUCT_SIZE_INIT(runqueue, rqstruct);
STRUCT_SIZE_INIT(prio_array, "prio_array");
/*
So that patch was required for 2.6.18.
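
The pattern in that diff (try the old structure name, then fall back to the new one) can be sketched outside of crash as follows. The sk_ helpers are hypothetical stand-ins: sk_struct_exists() only searches a list of names, whereas STRUCT_EXISTS() queries the debuginfo. Unlike the hunk above, which leaves rqstruct unassigned when neither name is found, the sketch returns NULL in that case.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for STRUCT_EXISTS(): look a name up in a
 * NULL-terminated list of structure names known to this kernel.
 */
int sk_struct_exists(const char *const *known, const char *name)
{
    for (; *known; known++)
        if (strcmp(*known, name) == 0)
            return 1;
    return 0;
}

/*
 * Return the first candidate name that exists in 'known', or NULL if
 * none of them does. Candidates are listed oldest name first.
 */
const char *sk_pick_struct(const char *const *known,
                           const char *const *candidates)
{
    for (; *candidates; candidates++)
        if (sk_struct_exists(known, *candidates))
            return *candidates;
    return NULL;
}
```

Called with the candidates "runqueue" and "rq", this returns "runqueue" on older kernels and "rq" once the structure has been renamed.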
When you say that the "kernel is later than 2.6.18", well, that doesn't
help me much.
Look at the crash function get_idle_threads() in task.c, which is where
you're failing. It runs through the history of the symbols that Linux
has used over the years for the run queues. For the most recent kernels,
it looks for the "per_cpu__runqueues" symbol. At least on 2.6.25-rc2,
the kernel still defines them in kernel/sched.c like this:
static DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
So if you do an "nm -Bn vmlinux | grep runqueues", you should see:
# nm -Bn vmlinux-2.6.25-rc1-ext4-1 | grep runqueues
ffffffff8082b700 d per_cpu__runqueues
#
I'm guessing that's not the problem -- so presuming that the symbol *does*
exist, find out why it's failing to increment "cnt" in this part of
get_idle_threads():
if (symbol_exists("per_cpu__runqueues") &&
VALID_MEMBER(runqueue_idle)) {
runqbuf = GETBUF(SIZE(runqueue));
for (i = 0; i < nr_cpus; i++) {
if ((kt->flags & SMP) && (kt->flags & PER_CPU_OFF)) {
runq = symbol_value("per_cpu__runqueues") +
kt->__per_cpu_offset[i];
} else
runq = symbol_value("per_cpu__runqueues");
readmem(runq, KVADDR, runqbuf,
SIZE(runqueue), "runqueues entry (per_cpu)",
FAULT_ON_ERROR);
tasklist[i] = ULONG(runqbuf + OFFSET(runqueue_idle));
if (IS_KVADDR(tasklist[i]))
cnt++;
}
}
Determine whether it even makes it to the inner for loop, whether
the pre-determined nr_cpus value makes sense, whether the SMP flag
reflects whether the kernel was compiled for SMP, whether the PER_CPU_OFF
flag was set, what address was calculated, etc...
Dave
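
To make the checks listed above concrete, here is a rough standalone model of the two decisions in that loop: which address is read for each cpu, and when "cnt" is incremented. It is only a sketch under stated assumptions. The sk_ names are hypothetical, and the comparison against 'kvbase' is a simplified stand-in for IS_KVADDR(), not the real check.

```c
#include <stdint.h>

/*
 * Model of the address selection: the per-cpu offset is added only when
 * both the SMP and PER_CPU_OFF conditions hold. Otherwise every cpu reads
 * the same runqueue at the symbol's own address.
 */
uint64_t sk_runq_addr(uint64_t sym, const uint64_t *per_cpu_offset,
                      int cpu, int smp, int per_cpu_off)
{
    if (smp && per_cpu_off)
        return sym + per_cpu_offset[cpu];
    return sym;
}

/*
 * Model of the counting step: an idle task pointer is counted only if it
 * looks like a kernel virtual address.
 * Assumption: "address >= kvbase" stands in for IS_KVADDR().
 */
int sk_count_idle(const uint64_t *idle, int nr_cpus, uint64_t kvbase)
{
    int cnt = 0;

    for (int i = 0; i < nr_cpus; i++)
        if (idle[i] >= kvbase)
            cnt++;
    return cnt;
}
```

In this model cnt stays at zero when nr_cpus is zero or when every idle pointer read back fails the address check, for example because the wrong runqueue address was calculated.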
16 years, 7 months
Re: [Crash-utility] "invalid structure member offset: task_struct_parent" on x86_64 rawhide
by Jeff Layton
On Fri, 04 Apr 2008 10:02:20 -0400
Dave Anderson <anderson(a)redhat.com> wrote:
> > Jeff Layton wrote:
> >> Looks like we might have gotten bitten by some upstream changes
> >> again...
> >>
> >> When I run crash on a recent rawhide x86_64 kernel, I seem to be
> >> getting this error:
> >>
> >> crash: invalid structure member offset: task_struct_parent
> >> FILE: task.c LINE: 2163 FUNCTION: store_context()
> >>
> >> [/usr/bin/crash] error trace: 49150a => 495bb8 => 4963be => 4fc1bc
> >> /usr/bin/nm: /usr/bin/crash: no symbols
> >> /usr/bin/nm: /usr/bin/crash: no symbols
> >> /usr/bin/nm: /usr/bin/crash: no symbols
> >> /usr/bin/nm: /usr/bin/crash: no symbols
> >>
> >> Relevant package versions:
> >>
> >> crash-4.0-6.2.x86_64
> >> kernel-2.6.25-0.185.rc7.git6.fc9.x86_64
> >>
> >> ...machine is an x86_64 FV xen guest. Any thoughts?
> >>
> >> Thanks,
> >
> > Yep, although the change is not upstream in Linus's tree, Roland's
> > linux-2.6-utrace.patch removes it in Fedora:
> >
> > @@ -1070,18 +1063,26 @@ struct task_struct {
> > /*
> > * pointers to (original) parent process, youngest child,
> > younger sibling,
> > * older sibling, respectively. (p->father can be replaced with
> > - * p->parent->pid)
> > + * p->real_parent->pid)
> > */
> > - struct task_struct *real_parent; /* real parent process (when
> > being debugged) */
> > - struct task_struct *parent; /* parent process */
> > + struct task_struct *real_parent; /* real parent process */
> > /*
> > - * children/sibling forms the list of my children plus the
> > - * tasks I'm ptracing.
> > + * children/sibling forms the list of my natural children
> > */
> >
> > AFAICT, task_struct.real_parent can be substituted. Try the attached
> > patch. (and then wait to see what else has been broken...)
> >
> > Dave
> >
> >
> >
> > ------------------------------------------------------------------------
> >
> > --- task.c.orig 2008-04-04 09:48:38.000000000 -0400
> > +++ task.c 2008-04-04 09:50:13.000000000 -0400
> > @@ -208,6 +208,9 @@
> > MEMBER_OFFSET_INIT(task_struct_processor, "task_struct", "processor");
> > MEMBER_OFFSET_INIT(task_struct_p_pptr, "task_struct", "p_pptr");
> > MEMBER_OFFSET_INIT(task_struct_parent, "task_struct", "parent");
> > + if (INVALID_MEMBER(task_struct_parent))
> > + MEMBER_OFFSET_INIT(task_struct_parent, "task_struct",
> > + "real_parent");
> > MEMBER_OFFSET_INIT(task_struct_has_cpu, "task_struct", "has_cpu");
> > MEMBER_OFFSET_INIT(task_struct_cpus_runnable,
> > "task_struct", "cpus_runnable");
>
That worked! I didn't do any extensive testing, but that seems to allow
crash to start and do a "ps".
Many thanks!
--
Jeff Layton <jlayton(a)redhat.com>
16 years, 7 months
"invalid structure member offset: task_struct_parent" on x86_64 rawhide
by Jeff Layton
Looks like we might have gotten bitten by some upstream changes
again...
When I run crash on a recent rawhide x86_64 kernel, I seem to be
getting this error:
crash: invalid structure member offset: task_struct_parent
FILE: task.c LINE: 2163 FUNCTION: store_context()
[/usr/bin/crash] error trace: 49150a => 495bb8 => 4963be => 4fc1bc
/usr/bin/nm: /usr/bin/crash: no symbols
/usr/bin/nm: /usr/bin/crash: no symbols
/usr/bin/nm: /usr/bin/crash: no symbols
/usr/bin/nm: /usr/bin/crash: no symbols
Relevant package versions:
crash-4.0-6.2.x86_64
kernel-2.6.25-0.185.rc7.git6.fc9.x86_64
...machine is an x86_64 FV xen guest. Any thoughts?
Thanks,
--
Jeff Layton <jlayton(a)redhat.com>
16 years, 7 months