Question for LKCD maintainers
by Dave Anderson
Long after I stopped tinkering with the LKCD code in crash,
changes were contributed to support physical memory zones
in the LKCD dumpfile format. Specifically there is this
piece of save_offset() in lkcd_common.c:
/* find the zone */
for (ii=0; ii < lkcd->num_zones; ii++) {
if (lkcd->zones[ii].start == zone) {
if (lkcd->zones[ii].pages[page].offset != 0) {
if (lkcd->zones[ii].pages[page].offset != off) {
error(INFO, "conflicting page: zone %lld, "
"page %lld: %lld, %lld != %lld\n",
(unsigned long long)zone,
(unsigned long long)page,
(unsigned long long)paddr,
(unsigned long long)off,
(unsigned long long) \
lkcd->zones[ii].pages[page].offset);
abort();
}
ret = 0;
} else {
lkcd->zones[ii].pages[page].offset = off;
ret = 1;
}
break;
}
}
The call to abort() above kills the crash session, which is both
annoying and unnecessary.
I am seeing it in a dumpfile from a customer who has their own dumping
scheme based upon LKCD version 7. I understand that this may be a
problem with their LKCD port, but nonetheless, it's the only place in
the crash utility that doesn't recover gracefully from dumpfile access
errors.
Anyway, I would like to either:
1. change the error(INFO...) to error(FATAL...) so that run-time
commands encountering this error will just fail, and the session
will return to the crash> prompt, or
2. return 0, so that a "seek error" can be subsequently displayed
by the readmem() command.
Number 2 is preferable, because it yields more clues as to where the
readmem() came from, but since I don't know much about the LKCD
physical memory zones stuff, is there any reason that shouldn't
be done?
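To make option 2 concrete, here is a minimal standalone sketch. It is not the lkcd_common.c code: `struct page_slot` and `record_offset()` are hypothetical stand-ins for the zone/page bookkeeping, and the choice of -1 for a conflict (so that it differs from "already recorded") is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for one entry of lkcd->zones[ii].pages[]. */
struct page_slot {
    uint64_t offset;    /* 0 means "no dumpfile offset recorded yet" */
};

/*
 * Record the dumpfile offset of a page.
 * Returns 1 if the offset was newly recorded, 0 if the same offset
 * was already recorded, and -1 on a conflict. The process is never
 * aborted; the caller decides how to report the failure.
 */
static int record_offset(struct page_slot *slot, uint64_t off, int debug)
{
    assert(slot != NULL);

    if (slot->offset == 0) {
        slot->offset = off;
        return 1;
    }
    if (slot->offset != off) {
        if (debug)
            fprintf(stderr, "conflicting page: %llu != %llu\n",
                    (unsigned long long)off,
                    (unsigned long long)slot->offset);
        return -1;
    }
    return 0;
}
```

A caller that sees -1 can fail just the one read and let the session continue.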
Thanks,
Dave
16 years, 12 months
files.c SIAL script
by Bernhard Walle
Hi,
http://people.redhat.com/anderson/extensions/files.c doesn't work on
2.6.22 kernels (and later). Fix is below.
Thanks,
Bernhard
--- files.c.orig 2007-11-30 14:25:05.000000000 +0100
+++ files.c 2007-11-30 14:29:39.000000000 +0100
@@ -139,7 +139,12 @@
printf("%sPID: %-5ld TASK: 0x%p CPU: %-2d COMMAND: \"%s\"\n",
newline ? "\n" : "", t->pid,
t,
- t->thread_info->cpu, getstr(t->comm));
+#if LINUX_RELEASE >= 0x020616
+ ((struct thread_info *)(t)->stack)->cpu,
+#else
+ t->thread_info->cpu,
+#endif
+ getstr(t->comm));
}
/* Traditional mask definitions for st_mode. */
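The constant 0x020616 in the patch reads as 2.6.22 if LINUX_RELEASE uses the same packing as the kernel's KERNEL_VERSION() macro. That is an assumption here, though the value suggests it. A small sketch of the packing, with hypothetical helper names:

```c
#include <assert.h>

/* Same packing as the kernel's KERNEL_VERSION(a, b, c) macro. */
static unsigned int pack_version(unsigned int major, unsigned int minor,
                                 unsigned int patch)
{
    /* Each of minor and patch must fit in one byte. */
    assert(minor < 256 && patch < 256);
    return (major << 16) + (minor << 8) + patch;
}

/*
 * Hypothetical predicate mirroring the #if in the patch: from 2.6.22 on,
 * thread_info is reached through t->stack, not t->thread_info.
 */
static int uses_task_stack(unsigned int release)
{
    return release >= pack_version(2, 6, 22);
}
```

Since 22 is 0x16, pack_version(2, 6, 22) gives 0x020616, the value tested in the patch.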
17 years, 1 month
analyze 32-bit core on 64-bit host ?
by Jan-Frode Myklebust
Am I not supposed to be able to run crash on a 64-bit host,
on a vmcore generated on a 32-bit host?
This core was generated on a 32-bit RHEL5:
$ file vmcore-2007-11-26-04-08-14
vmcore-2007-11-26-04-08-14: ELF 64-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style
And on my 64-bit RHEL5 I get:
$ crash /usr/lib/debug/lib/modules/2.6.18-8.1.6.el5PAE/vmlinux vmcore-2007-11-26-04-08-14
crash 4.0-4.6.1
<snip>
crash: vmcore-2007-11-26-04-08-14: not a supported file format
And the vmlinux I'm pointing at is the right 32-bit one:
$ rpm -qf /usr/lib/debug/lib/modules/2.6.18-8.1.6.el5PAE/vmlinux --queryformat '%{arch}\n'
i686
-jf
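For reference, the two facts that `file` reports above (a 64-bit ELF container and an i386 machine type) come from separate fields of the ELF header. The sketch below reads them from a raw header buffer. It is not crash's own format detection, the names are hypothetical, and it only handles little-endian headers.

```c
#include <assert.h>
#include <stddef.h>

enum { ELF_CLASS_32 = 1, ELF_CLASS_64 = 2 };    /* e_ident[EI_CLASS] */
enum { ELF_EM_386 = 3, ELF_EM_X86_64 = 62 };    /* e_machine */

struct elf_summary {
    int elf_class;      /* container width: ELF_CLASS_32 or ELF_CLASS_64 */
    int machine;        /* target architecture, e.g. ELF_EM_386 */
};

/*
 * Returns 0 on success, -1 if the buffer is too short, lacks the ELF
 * magic, or is not little-endian.
 */
static int summarize_elf(const unsigned char *hdr, size_t len,
                         struct elf_summary *out)
{
    assert(hdr != NULL && out != NULL);

    if (len < 20 || hdr[0] != 0x7f || hdr[1] != 'E' ||
        hdr[2] != 'L' || hdr[3] != 'F')
        return -1;
    if (hdr[5] != 1)    /* EI_DATA: 1 means little-endian (LSB) */
        return -1;

    out->elf_class = hdr[4];                    /* EI_CLASS */
    out->machine = hdr[18] | (hdr[19] << 8);    /* e_machine at offset 18 */
    return 0;
}
```

The vmcore above would show up as ELF_CLASS_64 with ELF_EM_386, so the class of the container and the architecture of the dumped kernel need not match.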
17 years, 1 month
crash version 4.0-4.10 is available
by Dave Anderson
- Fix a regression introduced in 4.0-4.9 that causes the "kmem -p"
command to fail in SPARSEMEM kernels that have the struct
page.index member embedded in an anonymous union, which occurred
when the CONFIG_SLUB-related modifications were made to the page
struct in 2.6.22. Without the patch, "kmem -p" fails with the error
message "kmem: invalid structure member offset: page_index".
(anderson(a)redhat.com)
Download from: http://people.redhat.com/anderson
17 years, 1 month
crash version 4.0-4.9 is available
by Dave Anderson
- Fix for the "kmem -p" command in kernels configured with
CONFIG_SPARSEMEM, i.e., not CONFIG_SPARSEMEM_EXTREME. Without
the patch, the page structure address for each physical page
was erroneous. (oomichi(a)mxs.nes.nec.co.jp)
- Fix for the "kmem -p" command output of MAPPING and INDEX values
on kernels where the mapping and index members of the page structure
are contained within anonymous unions. Without the patch, those
fields may be dashed-out.
(bob.montgomery(a)hp.com, anderson(a)redhat.com)
- Fix for the "mod" command to search for module object files in the
/lib/modules/<release>/updates directory tree before looking
in /lib/modules/<release>. (charlotte.richardson(a)stratus.com)
- Fix for the "waitq" command for 2.6.15-era and later kernels, which
replaced the __wait_queue.task member with the __wait_queue.private
member. Without the patch, the command would fail with the error
message: "waitq: invalid structure member offset: __wait_queue_task".
(atyson(a)hp.com)
- SIAL interpreter fix for an "operation on 'v1' may be undefined"
warning in sial_exeop(). (bwalle(a)suse.de)
- Fix for several unpredictable failure modes when attempting
"crash -h [command] > outputfile" from a shell command line.
(anderson(a)redhat.com)
- Addressed compiler warnings generated by extensions/echo.c and
extensions/dminfo.c. (bwalle(a)suse.de, anderson(a)redhat.com)
- Addressed compiler warnings generated by lkcd_common.c, lkcd_v8.c
and symbols.c when using:
-O2 -fmessage-length=0 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector
-fno-builtin-memset -fno-strict-aliasing
(bwalle(a)suse.de)
- Fix for "kmem -p" on i386 CONFIG_SPARSEMEM kernels with greater than
4GB of memory. Without the patch, the physical address value wraps
back to zero after physical page ffff0000.
(oomichi(a)mxs.nes.nec.co.jp)
- Fix to redirect SIAL script command output to pipes, files, etc., in
the same manner as native crash commands.
(Robert.Denman(a)teradata.com, anderson(a)redhat.com)
- Fix for ppc64 kernels with 64K pages whose PTE_RPN_SHIFT has changed
from 32 to 30. Without the patch, an initialization-time warning
message "WARNING: cannot access vmalloc'd module memory" would occur,
the "mod" command would fail with the same message, and "kmem -s"
failures could occur when attempting to read a kmem slab cache name
string. Translations and reads of vmalloc'd kernel virtual addresses
and user virtual addresses would appear to work, but bogus data was
returned because the resultant physical address that was read was
incorrect. (anderson(a)redhat.com)
- Fix for "kmem -s" if a slab cache whose name string cannot be read
is encountered. Without the patch, a fatal error message would be
displayed and the command aborted. With this patch, a non-fatal
warning message is displayed, and the cache name is indicated as
"(unknown)". (anderson(a)redhat.com)
- Fix for x86-64 SPARSEMEM kernels with CONFIG_NUMA off. Without the
patch, the crash session fails during initialization with the message
"crash: invalid structure member offset: pglist_data_node_mem_map".
(sachinp(a)in.ibm.com)
- Fix to use the ia64 physical start address from the LKCD dump header
instead of the default value. This was reported as a bug on an SGI
machine. (bwalle(a)suse.de)
- For s390[x] kernels the page table allocation method will be changed
such that instead of 3 levels, it will now be possible to allocate 4
levels. The current implementation of the page table walk functions
in the crash utility makes assumptions on how the page tables are
allocated by the kernel, e.g. 3 levels are hard coded. This patch
changes that, and the page table walk is done only according to the
s390 architecture without assumptions on the implementation in the
kernel. (holszheu(a)linux.venet.ibm.com)
- Fix for LKCD dumpfile access failures that abort() the crash session
after displaying an error message indicating a problem with physical
memory zones in the dumpfile. Without the patch, the crash session
would end immediately after displaying an error message of the sort:
"conflicting page: zone 0, page 0: 0, 177160130 != 65536". That
error message will now only be displayed if the crash debug mode is 1
or more, a readmem() "seek error" will be displayed instead, and the
session will return to the "crash>" prompt. (anderson(a)redhat.com)
Download from: http://people.redhat.com/anderson
17 years, 1 month
Re: [Crash-utility] Question for LKCD maintainers
by Dave Anderson
This works for me, and is what I'm going with unless
anybody objects. I'm returning -1 instead of calling
abort(), and the error message is now CRASHDEBUG(1)
because it tends to spew way too many of them over the
course of one readmem() call, not to mention that the
contents of the message are of little use to anybody
but an LKCD developer. I'm really only interested in
knowing that the dumpfile access failed, and this patch
accomplishes that without shutting down the crash session.
Dave
Index: lkcd_common.c
===================================================================
RCS file: /nfs/projects/cvs/crash/lkcd_common.c,v
retrieving revision 1.29
diff -u -r1.29 lkcd_common.c
--- lkcd_common.c 15 Nov 2007 15:44:38 -0000 1.29
+++ lkcd_common.c 16 Nov 2007 20:18:05 -0000
@@ -708,14 +708,15 @@
if (lkcd->zones[ii].start == zone) {
if (lkcd->zones[ii].pages[page].offset != 0) {
if (lkcd->zones[ii].pages[page].offset != off) {
- error(INFO, "conflicting page: zone %lld, "
+ if (CRASHDEBUG(1))
+ error(INFO, "LKCD: conflicting page: zone %lld, "
"page %lld: %lld, %lld != %lld\n",
(unsigned long long)zone,
(unsigned long long)page,
(unsigned long long)paddr,
(unsigned long long)off,
(unsigned long long)lkcd->zones[ii].pages[page].offset);
- abort();
+ return -1;
}
ret = 0;
} else {
17 years, 1 month
[PATCH] s390: Make page table functions more generic
by Michael Holzheu
Hi Dave,
For s390(x) kernels the page table allocation method will be changed.
Instead of 3 levels, it will now be possible to allocate 4 levels.
The current implementation of the page table walk functions in crash
makes assumptions on how the page tables are allocated by the kernel.
E.g. three levels are hard coded.
This patch changes that and the page table walk is done only according
to the s390 architecture without assumptions on the implementation in
the kernel.
So both old and new kernels are supported.
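As a reading aid for the 31-bit walk in the s390.c part of the patch, this standalone sketch splits a virtual address with the same shifts and masks the patch uses. The struct and function names are hypothetical. The limit check mirrors the patch's `offset >= (len + 1)*64` test, assuming 4-byte table entries as in the patch.

```c
#include <assert.h>

/* Pieces of a 31-bit s390 virtual address. */
struct s390_vaddr_parts {
    unsigned long sx;   /* segment table index, 11 bits */
    unsigned long px;   /* page table index, 8 bits */
    unsigned long bx;   /* byte offset within the 4K page, 12 bits */
};

static struct s390_vaddr_parts s390_split_vaddr(unsigned long vaddr)
{
    struct s390_vaddr_parts p;

    p.sx = (vaddr >> 20) & 0x7ffUL;
    p.px = (vaddr >> 12) & 0xffUL;
    p.bx = vaddr & 0xfffUL;
    return p;
}

/*
 * A table length field of `len` covers (len + 1) * 64 bytes of 4-byte
 * entries. Returns nonzero if entry `index` lies inside that range.
 */
static int s390_index_within_limit(unsigned long index, int len)
{
    assert(len >= 0);
    return index * 4 < ((unsigned long)len + 1) * 64;
}
```

With len = 127 the limit is 8192 bytes, i.e. 2048 four-byte entries, which matches the assumption documented in the patch for the segment table.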
---
s390.c | 144 +++++++++++++++++++++++-------------------------
s390x.c | 191 ++++++++++++++++++++++++++--------------------------------------
2 files changed, 151 insertions(+), 184 deletions(-)
diff -Naur crash-4.0-4.8/s390.c crash-4.0-4.8-page-table-walk/s390.c
--- crash-4.0-4.8/s390.c 2007-10-30 16:51:54.000000000 +0100
+++ crash-4.0-4.8-page-table-walk/s390.c 2007-11-15 15:44:07.000000000 +0100
@@ -21,17 +21,6 @@
#define S390_WORD_SIZE 4
#define S390_ADDR_MASK 0x7fffffff
-#define S390_PAGE_SHIFT 12
-#define S390_PAGE_SIZE (1UL << S390_PAGE_SHIFT)
-#define S390_PAGE_MASK (~(S390_PAGE_SIZE-1))
-
-#define S390_PGDIR_SHIFT 20
-#define S390_PGDIR_SIZE (1UL << S390_PGDIR_SHIFT)
-#define S390_PGDIR_MASK (~(S390_PGDIR_SIZE-1))
-
-#define S390_PTRS_PER_PGD 2048
-#define S390_PTRS_PER_PTE 256
-
#define S390_PMD_BASE_MASK (~((1UL<<6)-1))
#define S390_PT_BASE_MASK S390_PMD_BASE_MASK
#define S390_PAGE_BASE_MASK (~((1UL<<12)-1))
@@ -44,26 +33,10 @@
#define S390_PAGE_INVALID 0x400 /* HW invalid */
#define S390_PAGE_INVALID_MASK 0x601ULL /* for linux 2.6 */
#define S390_PAGE_INVALID_NONE 0x401ULL /* for linux 2.6 */
-#define S390_PAGE_TABLE_LEN 0xf /* only full page-tables */
-#define S390_PAGE_TABLE_INV 0x20 /* invalid page-table */
#define S390_PTE_INVALID_MASK 0x80000900
#define S390_PTE_INVALID(x) ((x) & S390_PTE_INVALID_MASK)
-#define S390_PMD_INVALID_MASK 0x80000000
-#define S390_PMD_INVALID(x) ((x) & S390_PMD_INVALID_MASK)
-
-/* pgd/pmd/pte query macros */
-#define s390_pmd_none(x) ((x) & S390_PAGE_TABLE_INV)
-#define s390_pmd_bad(x) (((x) & (~S390_PMD_BASE_MASK & \
- ~S390_PAGE_TABLE_INV)) != \
- S390_PAGE_TABLE_LEN)
-
-#define s390_pte_none(x) (((x) & (S390_PAGE_INVALID | S390_RO_S390 | \
- S390_PAGE_PRESENT)) == \
- S390_PAGE_INVALID)
-
-
#define ASYNC_STACK_SIZE STACKSIZE() // can be 4096 or 8192
#define KERNEL_STACK_SIZE STACKSIZE() // can be 4096 or 8192
@@ -73,8 +46,6 @@
* declarations of static functions
*/
static void s390_print_lowcore(char*, struct bt_info*,int);
-static unsigned long s390_pgd_offset(unsigned long, unsigned long);
-static unsigned long s390_pte_offset(unsigned long, unsigned long);
static int s390_kvtop(struct task_context *, ulong, physaddr_t *, int);
static int s390_uvtop(struct task_context *, ulong, physaddr_t *, int);
static int s390_vtop(unsigned long, ulong, physaddr_t*, int);
@@ -292,60 +263,87 @@
/*
* page table traversal functions
*/
-static unsigned long
-s390_pgd_offset(unsigned long pgd_base, unsigned long vaddr)
-{
- unsigned long pgd_off, pmd_base;
- pgd_off = ((vaddr >> S390_PGDIR_SHIFT) & (S390_PTRS_PER_PGD - 1))
- * S390_WORD_SIZE;
- readmem(pgd_base + pgd_off, PHYSADDR, &pmd_base,sizeof(long),
- "pgd_base",FAULT_ON_ERROR);
- return pmd_base;
-}
-
-unsigned long s390_pte_offset(unsigned long pte_base, unsigned long vaddr)
+/* Segment table traversal function */
+static ulong _kl_sg_table_deref_s390(ulong vaddr, ulong table, int len)
{
- unsigned pte_off, pte_val;
+ ulong offset, entry;
+
+ offset = ((vaddr >> 20) & 0x7ffUL) * 4;
+ if (offset >= (len + 1)*64)
+ /* Offset is over the table limit. */
+ return 0;
+ readmem(table + offset, KVADDR, &entry, sizeof(entry), "entry",
+ FAULT_ON_ERROR);
- pte_off = ((vaddr >> S390_PAGE_SHIFT) & (S390_PTRS_PER_PTE - 1))
- * S390_WORD_SIZE;
- readmem(pte_base + pte_off, PHYSADDR, &pte_val, sizeof(long),
- "pte_val",FAULT_ON_ERROR);
- return pte_val;
+ /*
+ * Check if the segment table entry could be read and doesn't have
+ * any of the reserved bits set.
+ */
+ if (entry & 0x80000000UL)
+ return 0;
+ /* Check if the segment table entry has the invalid bit set. */
+ if (entry & 0x40UL)
+ return 0;
+ /* Segment table entry is valid and well formed. */
+ return entry;
+}
+
+/* Page table traversal function */
+static ulong _kl_pg_table_deref_s390(ulong vaddr, ulong table, int len)
+{
+ ulong offset, entry;
+
+ offset = ((vaddr >> 12) & 0xffUL) * 4;
+ if (offset >= (len + 1)*64)
+ /* Offset is over the table limit. */
+ return 0;
+ readmem(table + offset, KVADDR, &entry, sizeof(entry), "entry",
+ FAULT_ON_ERROR);
+ /*
+ * Check if the page table entry could be read and doesn't have
+ * any of the reserved bits set.
+ */
+ if (entry & 0x80000900UL)
+ return 0;
+ /* Check if the page table entry has the invalid bit set. */
+ if (entry & 0x400UL)
+ return 0;
+ /* Page table entry is valid and well formed. */
+ return entry;
}
-/*
- * Generic vtop function for user and kernel addresses
- */
+/* lookup virtual address in page tables */
static int
-s390_vtop(unsigned long pgd_base, ulong kvaddr, physaddr_t *paddr, int verbose)
+s390_vtop(unsigned long table, ulong vaddr, physaddr_t *phys_addr, int verbose)
{
- unsigned pte_base, pte_val;
+ ulong entry, paddr;
+ int len;
- /* get the pgd entry */
- pte_base = s390_pgd_offset(pgd_base,kvaddr);
- if(S390_PMD_INVALID(pte_base) ||
- s390_pmd_bad(pte_base) ||
- s390_pmd_none(pte_base)) {
- *paddr = 0;
- return FALSE;
- }
- /* get the pte */
- pte_base = pte_base & S390_PT_BASE_MASK;
- pte_val = s390_pte_offset(pte_base,kvaddr);
- if(S390_PTE_INVALID(pte_val) ||
- s390_pte_none(pte_val)){
- *paddr = 0;
+ /*
+ * Get the segment table entry.
+ * We assume that the segment table length field in the asce
+ * is set to the maximum value of 127 (which translates to
+ * a segment table with 2048 entries) and that the addressing
+ * mode is 31 bit.
+ */
+ entry = _kl_sg_table_deref_s390(vaddr, table, 127);
+ if (!entry)
return FALSE;
- }
- if(!s390_pte_present(pte_val)){
- /* swapped out */
- *paddr = pte_val;
+ table = entry & 0x7ffffc00UL;
+ len = entry & 0xfUL;
+
+ /* Get the page table entry */
+ entry = _kl_pg_table_deref_s390(vaddr, table, len);
+ if (!entry)
return FALSE;
- }
- *paddr = (pte_val & S390_PAGE_BASE_MASK) |
- (kvaddr & (~(S390_PAGE_MASK)));
+
+ /* Isolate the page origin from the page table entry. */
+ paddr = entry & 0x7ffff000UL;
+
+ /* Add the page offset and return the final value. */
+ *phys_addr = paddr + (vaddr & 0xfffUL);
+
return TRUE;
}
diff -Naur crash-4.0-4.8/s390x.c crash-4.0-4.8-page-table-walk/s390x.c
--- crash-4.0-4.8/s390x.c 2007-10-30 16:51:54.000000000 +0100
+++ crash-4.0-4.8-page-table-walk/s390x.c 2007-11-15 15:44:33.000000000 +0100
@@ -20,24 +20,6 @@
#define S390X_WORD_SIZE 8
-#define S390X_PAGE_SHIFT 12
-#define S390X_PAGE_SIZE (1ULL << S390X_PAGE_SHIFT)
-#define S390X_PAGE_MASK (~(S390X_PAGE_SIZE-1))
-
-#define S390X_PGDIR_SHIFT 31
-#define S390X_PGDIR_SIZE (1ULL << S390X_PGDIR_SHIFT)
-#define S390X_PGDIR_MASK (~(S390X_PGDIR_SIZE-1))
-
-#define S390X_PMD_SHIFT 20
-#define S390X_PMD_SIZE (1ULL << S390X_PMD_SHIFT)
-#define S390X_PMD_MASK (~(S390X_PMD_SIZE-1))
-
-#define S390X_PTRS_PER_PGD 2048
-#define S390X_PTRS_PER_PMD 2048
-#define S390X_PTRS_PER_PTE 256
-
-#define S390X_PMD_BASE_MASK (~((1ULL<<12)-1))
-#define S390X_PT_BASE_MASK (~((1ULL<<11)-1))
#define S390X_PAGE_BASE_MASK (~((1ULL<<12)-1))
/* Flags used in entries of page dirs and page tables.
@@ -48,37 +30,11 @@
#define S390X_PAGE_INVALID 0x400ULL /* HW invalid */
#define S390X_PAGE_INVALID_MASK 0x601ULL /* for linux 2.6 */
#define S390X_PAGE_INVALID_NONE 0x401ULL /* for linux 2.6 */
-#define S390X_PMD_ENTRY_INV 0x20ULL /* invalid segment table entry */
-#define S390X_PGD_ENTRY_INV 0x20ULL /* invalid region table entry */
-#define S390X_PMD_ENTRY 0x00
-#define S390X_PGD_ENTRY_FIRST 0x05 /* first part of pmd is valid */
-#define S390X_PGD_ENTRY_SECOND 0xc7 /* second part of pmd is valid */
-#define S390X_PGD_ENTRY_FULL 0x07 /* complete pmd is valid */
/* bits 52, 55 must contain zeroes in a pte */
#define S390X_PTE_INVALID_MASK 0x900ULL
#define S390X_PTE_INVALID(x) ((x) & S390X_PTE_INVALID_MASK)
-/* pgd/pmd/pte query macros */
-#define s390x_pgd_none(x) ((x) & S390X_PGD_ENTRY_INV)
-#define s390x_pgd_bad(x) !( (((x) & S390X_PGD_ENTRY_FIRST) == \
- S390X_PGD_ENTRY_FIRST) || \
- (((x) & S390X_PGD_ENTRY_SECOND) == \
- S390X_PGD_ENTRY_SECOND) || \
- (((x) & S390X_PGD_ENTRY_FULL) == \
- S390X_PGD_ENTRY_FULL))
-
-#define s390x_pmd_none(x) ((x) & S390X_PMD_ENTRY_INV)
-#define s390x_pmd_bad(x) (((x) & (~S390X_PT_BASE_MASK & \
- ~S390X_PMD_ENTRY_INV)) != \
- S390X_PMD_ENTRY)
-
-#define s390x_pte_none(x) (((x) & (S390X_PAGE_INVALID | \
- S390X_PAGE_RO | \
- S390X_PAGE_PRESENT)) == \
- S390X_PAGE_INVALID)
-
-
#define ASYNC_STACK_SIZE STACKSIZE() // can be 8192 or 16384
#define KERNEL_STACK_SIZE STACKSIZE() // can be 8192 or 16384
@@ -88,9 +44,6 @@
* declarations of static functions
*/
static void s390x_print_lowcore(char*, struct bt_info*,int);
-static unsigned long s390x_pgd_offset(unsigned long, unsigned long);
-static unsigned long s390x_pmd_offset(unsigned long, unsigned long);
-static unsigned long s390x_pte_offset(unsigned long, unsigned long);
static int s390x_kvtop(struct task_context *, ulong, physaddr_t *, int);
static int s390x_uvtop(struct task_context *, ulong, physaddr_t *, int);
static int s390x_vtop(unsigned long, ulong, physaddr_t*, int);
@@ -304,81 +257,97 @@
}
}
-/*
+/*
* page table traversal functions
*/
-unsigned long s390x_pgd_offset(unsigned long pgd_base, unsigned long vaddr)
-{
- unsigned long pgd_off, pmd_base;
-
- pgd_off = ((vaddr >> S390X_PGDIR_SHIFT) &
- (S390X_PTRS_PER_PGD - 1)) * 8;
- readmem(pgd_base + pgd_off, PHYSADDR, &pmd_base, sizeof(long),
- "pmd_base",FAULT_ON_ERROR);
-
- return pmd_base;
-}
-unsigned long s390x_pmd_offset(unsigned long pmd_base, unsigned long vaddr)
-{
- unsigned long pmd_off, pte_base;
-
- pmd_off = ((vaddr >> S390X_PMD_SHIFT) & (S390X_PTRS_PER_PMD - 1))
- * 8;
- readmem(pmd_base + pmd_off, PHYSADDR, &pte_base, sizeof(long),
- "pte_base",FAULT_ON_ERROR);
- return pte_base;
-}
-
-unsigned long s390x_pte_offset(unsigned long pte_base, unsigned long vaddr)
-{
- unsigned long pte_off, pte_val;
-
- pte_off = ((vaddr >> S390X_PAGE_SHIFT) & (S390X_PTRS_PER_PTE - 1))
- * 8;
- readmem(pte_base + pte_off, PHYSADDR, &pte_val, sizeof(long),
- "pte_val",FAULT_ON_ERROR);
- return pte_val;
+/* Region or segment table traversal function */
+static ulong _kl_rsg_table_deref_s390x(ulong vaddr, ulong table,
+ int len, int level)
+{
+ ulong offset, entry;
+
+ offset = ((vaddr >> (11*level + 20)) & 0x7ffULL) * 8;
+ if (offset >= (len + 1)*4096)
+ /* Offset is over the table limit. */
+ return 0;
+ readmem(table + offset, KVADDR, &entry, sizeof(entry), "entry",
+ FAULT_ON_ERROR);
+ /*
+ * Check if the segment table entry could be read and doesn't have
+ * any of the reserved bits set.
+ */
+ if ((entry & 0xcULL) != (level << 2))
+ return 0;
+ /* Check if the region table entry has the invalid bit set. */
+ if (entry & 0x40ULL)
+ return 0;
+ /* Region table entry is valid and well formed. */
+ return entry;
}
-/*
- * Generic vtop function for user and kernel addresses
- */
-static int
-s390x_vtop(unsigned long pgd_base, ulong kvaddr, physaddr_t *paddr, int verbose)
+/* Page table traversal function */
+static ulong _kl_pg_table_deref_s390x(ulong vaddr, ulong table)
{
- unsigned long pmd_base, pte_base, pte_val;
+ ulong offset, entry;
- /* get the pgd entry */
- pmd_base = s390x_pgd_offset(pgd_base,kvaddr);
- if(s390x_pgd_bad(pmd_base) ||
- s390x_pgd_none(pmd_base)){
- *paddr = 0;
+ offset = ((vaddr >> 12) & 0xffULL) * 8;
+ readmem(table + offset, KVADDR, &entry, sizeof(entry), "entry",
+ FAULT_ON_ERROR);
+ /*
+ * Check if the page table entry could be read and doesn't have
+ * any of the reserved bits set.
+ */
+ if (entry & 0x900ULL)
+ return 0;
+ /* Check if the page table entry has the invalid bit set. */
+ if (entry & 0x400ULL)
+ return 0;
+ /* Page table entry is valid and well formed. */
+ return entry;
+}
+
+/* lookup virtual address in page tables */
+int s390x_vtop(ulong table, ulong vaddr, physaddr_t *phys_addr, int verbose)
+{
+ ulong entry, paddr;
+ int level, len;
+
+ /*
+ * Walk the region and segment tables.
+ * We assume that the table length field in the asce is set to the
+ * maximum value of 3 (which translates to a region first, region
+ * second, region third or segment table with 2048 entries) and that
+ * the addressing mode is 64 bit.
+ */
+ len = 3;
+ /* Read the first entry to find the number of page table levels. */
+ readmem(table, KVADDR, &entry, sizeof(entry), "entry", FAULT_ON_ERROR);
+ level = (entry & 0xcULL) >> 2;
+ if ((vaddr >> (31 + 11*level)) != 0ULL) {
+ /* Address too big for the number of page table levels. */
return FALSE;
}
- /* get the pmd */
- pmd_base = pmd_base & S390X_PMD_BASE_MASK;
- pte_base = s390x_pmd_offset(pmd_base,kvaddr);
- if(s390x_pmd_bad(pte_base) ||
- s390x_pmd_none(pte_base)) {
- *paddr = 0;
- return FALSE;
+ while (level >= 0) {
+ entry = _kl_rsg_table_deref_s390x(vaddr, table, len, level);
+ if (!entry)
+ return 0;
+ table = entry & ~0xfffULL;
+ len = entry & 0x3ULL;
+ level--;
}
- /* get the pte */
- pte_base = pte_base & S390X_PT_BASE_MASK;
- pte_val = s390x_pte_offset(pte_base,kvaddr);
- if (S390X_PTE_INVALID(pte_val) ||
- s390x_pte_none(pte_val)){
- *paddr = 0;
- return FALSE;
- }
- if(!s390x_pte_present(pte_val)){
- /* swapped out */
- *paddr = pte_val;
+
+ /* Get the page table entry */
+ entry = _kl_pg_table_deref_s390x(vaddr, entry & ~0x7ffULL);
+ if (!entry)
return FALSE;
- }
- *paddr = (pte_val & S390X_PAGE_BASE_MASK) |
- (kvaddr & (~(S390X_PAGE_MASK)));
+
+ /* Isolate the page origin from the page table entry. */
+ paddr = entry & ~0xfffULL;
+
+ /* Add the page offset and return the final value. */
+ *phys_addr = paddr + (vaddr & 0xfffULL);
+
return TRUE;
}
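The level arithmetic in the s390x part can be tried in isolation with the sketch below. The function names are hypothetical, and level 0 is taken to be the segment table, as in the patch. One caveat is a property of C, not something the patch states: shifting a 64-bit value by 64 is undefined, and `31 + 11*level` reaches 64 at level 3, so the sketch handles that case separately.

```c
#include <assert.h>
#include <stdint.h>

/* Index into a region table (level 1..3) or the segment table (level 0). */
static unsigned int s390x_table_index(uint64_t vaddr, int level)
{
    assert(level >= 0 && level <= 3);
    return (unsigned int)((vaddr >> (11 * level + 20)) & 0x7ffULL);
}

/* Nonzero if vaddr fits in the address space that `level` can span. */
static int s390x_addr_fits(uint64_t vaddr, int level)
{
    int shift = 31 + 11 * level;

    assert(level >= 0 && level <= 3);
    if (shift >= 64)    /* level 3 spans all 64 bits; avoid shifting by 64 */
        return 1;
    return (vaddr >> shift) == 0;
}
```

For example, address 0x80000000 does not fit a walk that starts at the segment table (level 0), but it does fit from level 1, where it selects index 1.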
17 years, 1 month
[PATCH] LKCD: Use kernel start address from dump header
by Bernhard Walle
Hi,
This patch uses the kernel start address from the dump header on IA64
instead of the default value. This has been reported as a bug on an SGI
machine -- lcrash was able to open the dump because it uses the start
address of the header, crash was not.
Troy, ACK?
Dave: That's my last LKCD patch for now. ;-)
Signed-off-by: Bernhard Walle <bwalle(a)suse.de>
---
defs.h | 2 ++
ia64.c | 11 ++++++++++-
lkcd_common.c | 21 +++++++++++++++++++++
lkcd_fix_mem.c | 10 ++++++++++
4 files changed, 43 insertions(+), 1 deletion(-)
--- a/defs.h
+++ b/defs.h
@@ -3874,6 +3874,7 @@ int fix_addr_v8(struct _dump_header_asm_
int lkcd_dump_init_v8_arch(struct _dump_header_s *dh);
int fix_addr_v7(int);
int get_lkcd_regs_for_cpu_arch(int cpu, ulong *eip, ulong *esp);
+int lkcd_get_kernel_start_v8(ulong *addr);
/*
* lkcd_v8.c
@@ -4144,6 +4145,7 @@ int lkcd_load_dump_page_header(void *, u
void lkcd_dumpfile_complaint(uint32_t, uint32_t, int);
int set_mb_benchmark(ulong);
ulonglong fix_lkcd_address(ulonglong);
+int lkcd_get_kernel_start(ulong *addr);
int get_lkcd_regs_for_cpu(struct bt_info *bt, ulong *eip, ulong *esp);
/*
--- a/ia64.c
+++ b/ia64.c
@@ -3810,7 +3810,16 @@ ia64_calc_phys_start(void)
phys_start);
}
return;
- }
+ } else if (LKCD_DUMPFILE()) {
+
+ if (lkcd_get_kernel_start(&phys_start)) {
+ machdep->machspec->phys_start = phys_start;
+ if (CRASHDEBUG(1))
+ fprintf(fp,
+ "LKCD dump: phys_start: %lx\n",
+ phys_start);
+ }
+ }
if ((vd = get_kdump_vmcore_data())) {
/*
--- a/lkcd_common.c
+++ b/lkcd_common.c
@@ -787,6 +787,27 @@ get_offset(uint64_t paddr)
}
+#ifdef IA64
+
+int
+lkcd_get_kernel_start(ulong *addr)
+{
+ if (!addr)
+ return 0;
+
+ switch (lkcd->version)
+ {
+ case LKCD_DUMP_V8:
+ case LKCD_DUMP_V9:
+ return lkcd_get_kernel_start_v8(addr);
+
+ default:
+ return 0;
+ }
+}
+
+#endif
+
int
lkcd_lseek(physaddr_t paddr)
--- a/lkcd_fix_mem.c
+++ b/lkcd_fix_mem.c
@@ -97,4 +97,14 @@ get_lkcd_switch_stack(ulong task)
return 0;
}
+int lkcd_get_kernel_start_v8(ulong *addr)
+{
+ if (!addr)
+ return 0;
+
+ *addr = ((dump_header_asm_t *)lkcd->dump_header_asm)->dha_kernel_addr;
+
+ return 1;
+}
+
#endif // IA64
17 years, 1 month
Re: Error when analysing dump on 2.6.21.4 kernel
by Sachin P. Sant
Ankita Garg wrote:
> Hi,
>
> Am working on backporting relocatable kernel support for x86_64 from
> 2.6.22.1 kernel to 2.6.21.4. kdump is working fine. But when opening the
> vmcore file with crash, I get the following error:
I had a discussion with Ankita about this problem. This is what I
think is happening.
This x86-64 kernel has CONFIG_NUMA off with SPARSEMEM support.
The failure occurs at line 11738 in memory.c [ This is with the
latest crash ]
crash: invalid structure member offset: pglist_data_node_mem_map
FILE: memory.c LINE: 11738 FUNCTION: dump_memory_nodes()
Looking at the crash source here is the code in question :
11728 if (IS_SPARSEMEM()) {
11729 zone_mem_map = 0;
11730 zone_start_mapnr = 0;
11731 if (zone_size) {
11732 phys = PTOB(zone_start_pfn);
11733 zone_start_mapnr = phys/PAGESIZE();
11734 }
11735
11736 } else if (!(vt->flags & NODES) &&
11737 INVALID_MEMBER(zone_zone_mem_map)) {
11738 readmem(pgdat+OFFSET(pglist_data_node_mem_map),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11739 KVADDR, &zone_mem_map, sizeof(void *),
11740 "contig_page_data mem_map",FAULT_ON_ERROR);
11741 if (zone_size)
11742 zone_mem_map += cum_zone_size * SIZE(page);
The code is trying to read pglist_data_node_mem_map value which does not exist.
[Since CONFIG_NUMA is off]. It should have entered the if (IS_SPARSEMEM())
condition [ line 11728 ] since SPARSEMEM is enabled for this kernel.
The flag value of SPARSEMEM is set by this code in memory.c
if (kernel_symbol_exists("mem_map")) {
        get_symbol_data("mem_map", sizeof(char *), &vt->mem_map);
        vt->flags |= FLATMEM;
} else if (kernel_symbol_exists("mem_section"))
        vt->flags |= SPARSEMEM;
else
        vt->flags |= DISCONTIGMEM;
But what I found was that the SPARSEMEM flag is not set; instead FLATMEM is set,
because the mem_map symbol exists in this particular kernel. [ The mem_section
kernel symbol is also present in this kernel. ]
[crash-4.0-4.8]# cat /boot/System.map | grep mem_map
ffffffff8072dab0 B mem_map
[crash-4.0-4.8]# cat /boot/System.map | grep mem_section
ffffffff8072e800 B mem_section
From kernel source mm/memory.c: mem_map is defined if CONFIG_NEED_MULTIPLE_NODES
is not defined, which is the case here.
I am not an mm expert, so I can't tell what to make of this situation where
both the mem_map and mem_section kernel symbols exist. Anyone?
Anyway as for the crash problem this could be fixed by rearranging the
above code as follows:
- if (kernel_symbol_exists("mem_map")) {
+ if (kernel_symbol_exists("mem_section"))
+ vt->flags |= SPARSEMEM;
+ else if (kernel_symbol_exists("mem_map")) {
get_symbol_data("mem_map", sizeof(char *), &vt->mem_map);
vt->flags |= FLATMEM;
- } else if (kernel_symbol_exists("mem_section"))
- vt->flags |= SPARSEMEM;
- else
+ } else
But since I am not very sure about the mm code, there might be a better way to
fix this.
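The effect of the reordering can be checked without a kernel. The sketch below is self-contained, with a plain string array standing in for the kernel symbol table; `has_symbol()` and `pick_mem_model()` are hypothetical names, not crash functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum mem_model { MODEL_FLATMEM, MODEL_SPARSEMEM, MODEL_DISCONTIGMEM };

/* Linear search of a stand-in symbol list. */
static int has_symbol(const char *const *syms, size_t n, const char *name)
{
    size_t i;

    assert(syms != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (strcmp(syms[i], name) == 0)
            return 1;
    return 0;
}

/* Proposed order: mem_section wins even when mem_map is also present. */
static enum mem_model pick_mem_model(const char *const *syms, size_t n)
{
    if (has_symbol(syms, n, "mem_section"))
        return MODEL_SPARSEMEM;
    if (has_symbol(syms, n, "mem_map"))
        return MODEL_FLATMEM;
    return MODEL_DISCONTIGMEM;
}
```

With both symbols present, as in the System.map output above, this order selects SPARSEMEM, where the original order would have selected FLATMEM.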
Thanks
-Sachin
17 years, 1 month
[PATCH] fix for ppc64 virtual-to-physical translation w/64K pages
by Dave Anderson
A recent patch to the upstream 2.6.23 kernel, and RHEL5
kernels from 2.6.18-40.el5 and beyond, changes the ppc64
PTE_RPN_SHIFT in pgtable-64k.h from 32 to 30 bits. This
in turn causes the crash utility's virtual-to-physical address
translation of kernel vmalloc() and user-space virtual addresses
to fail on ppc64 kernels configured with 64K pages. During
crash session initialization, this warning message is displayed:
WARNING: cannot access vmalloc'd module memory
The same message will appear if the "mod" command is subsequently
attempted.
Translations and reads of vmalloc'd kernel virtual addresses and
user virtual addresses will appear to work, but bogus data will
be returned because the resultant physical address that is read
is incorrect.
The attached patch recognizes whether this patch has been applied
and adjusts the PTE shift value accordingly. There's also some
additional "help -m" output for the machine-specific data that
was missing from the display.
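To show why the wrong shift gives plausible but bogus results, here is a small sketch. It assumes the page frame number is obtained by shifting the PTE right by the PTE shift. The function names are hypothetical, and the flag argument stands in for the symbol check the patch uses to tell the two kernels apart.

```c
#include <assert.h>
#include <stdint.h>

/* PTE shift for 64K pages: 30 on kernels with the change, 32 before it. */
static unsigned int pte_shift_64k(int has_demote_segment_4k)
{
    return has_demote_segment_4k ? 30 : 32;
}

/* Extract the page frame number from a PTE (assumed layout, see above). */
static uint64_t pte_to_pfn(uint64_t pte, unsigned int shift)
{
    assert(shift < 64);
    return pte >> shift;
}
```

A PTE built as (0x1234 << 30) | 1 yields frame 0x1234 with shift 30 but frame 0x48d with shift 32. The second value is still a valid-looking frame number, so the read goes through and returns data from the wrong page.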
The patch tests OK, and I don't believe any other vtop-related
changes are required, but given that ppc64.c was contributed
by IBM, can I get an ACK from the IBM brain-trust out there?
Thanks,
Dave
--- crash-4.0-4.8/defs.h 2007-11-13 11:21:32.000000000 -0500
+++ next/defs.h 2007-11-13 11:20:00.000000000 -0500
@@ -2526,7 +2526,8 @@
#define PMD_INDEX_SIZE_L4_64K 12
#define PUD_INDEX_SIZE_L4_64K 0
#define PGD_INDEX_SIZE_L4_64K 4
-#define PTE_SHIFT_L4_64K 32
+#define PTE_SHIFT_L4_64K_V1 32
+#define PTE_SHIFT_L4_64K_V2 30
#define PMD_MASKED_BITS_64K 0x1ff
#define L4_OFFSET(vaddr) ((vaddr >> (machdep->machspec->l4_shift)) & 0x1ff)
--- crash-4.0-4.8/ppc64.c 2007-11-13 11:21:32.000000000 -0500
+++ next/ppc64.c 2007-11-13 11:20:00.000000000 -0500
@@ -160,7 +160,8 @@
m->l2_index_size = PMD_INDEX_SIZE_L4_64K;
m->l3_index_size = PUD_INDEX_SIZE_L4_64K;
m->l4_index_size = PGD_INDEX_SIZE_L4_64K;
- m->pte_shift = PTE_SHIFT_L4_64K;
+ m->pte_shift = symbol_exists("demote_segment_4k") ?
+ PTE_SHIFT_L4_64K_V2 : PTE_SHIFT_L4_64K_V1;
m->l2_masked_bits = PMD_MASKED_BITS_64K;
} else {
/* 4K pagesize */
@@ -305,7 +306,7 @@
void
ppc64_dump_machdep_table(ulong arg)
{
- int others;
+ int i, c, others;
others = 0;
fprintf(fp, " flags: %lx (", machdep->flags);
@@ -368,10 +369,43 @@
fprintf(fp, " max_physmem_bits: %ld\n", machdep->max_physmem_bits);
fprintf(fp, " sections_per_root: %ld\n", machdep->sections_per_root);
fprintf(fp, " machspec: %lx\n", (ulong)machdep->machspec);
- fprintf(fp, " pgd_index_size: %d\n", machdep->machspec->l4_index_size);
- fprintf(fp, " pud_index_size: %d\n", machdep->machspec->l3_index_size);
- fprintf(fp, " pmd_index_size: %d\n", machdep->machspec->l2_index_size);
- fprintf(fp, " pte_index_size: %d\n", machdep->machspec->l1_index_size);
+ fprintf(fp, " hwintrstack[%d]: ", NR_CPUS);
+ for (c = 0; c < NR_CPUS; c++) {
+ for (others = 0, i = c; i < NR_CPUS; i++) {
+ if (machdep->machspec->hwintrstack[i])
+ others++;
+ }
+ if (!others) {
+ fprintf(fp, "%s%s",
+ c && ((c % 4) == 0) ? "\n " : "",
+ c ? "(remainder unused)" : "(unused)");
+ break;
+ }
+
+ fprintf(fp, "%s%016lx ",
+ ((c % 4) == 0) ? "\n " : "",
+ machdep->machspec->hwintrstack[c]);
+ }
+ fprintf(fp, "\n");
+ fprintf(fp, " hwstackbuf: %lx\n", (ulong)machdep->machspec->hwstackbuf);
+ fprintf(fp, " hwstacksize: %ld\n", (ulong)machdep->machspec->hwstacksize);
+ fprintf(fp, " level4: %lx\n", (ulong)machdep->machspec->level4);
+ fprintf(fp, " last_level4_read: %lx\n", (ulong)machdep->machspec->last_level4_read);
+ fprintf(fp, " l4_index_size: %d\n", machdep->machspec->l4_index_size);
+ fprintf(fp, " l3_index_size: %d\n", machdep->machspec->l3_index_size);
+ fprintf(fp, " l2_index_size: %d\n", machdep->machspec->l2_index_size);
+ fprintf(fp, " l1_index_size: %d\n", machdep->machspec->l1_index_size);
+ fprintf(fp, " ptrs_per_l3: %d\n", machdep->machspec->ptrs_per_l3);
+ fprintf(fp, " ptrs_per_l2: %d\n", machdep->machspec->ptrs_per_l2);
+ fprintf(fp, " ptrs_per_l1: %d\n", machdep->machspec->ptrs_per_l1);
+ fprintf(fp, " l4_shift: %d\n", machdep->machspec->l4_shift);
+ fprintf(fp, " l3_shift: %d\n", machdep->machspec->l3_shift);
+ fprintf(fp, " l2_shift: %d\n", machdep->machspec->l2_shift);
+ fprintf(fp, " l1_shift: %d\n", machdep->machspec->l1_shift);
+ fprintf(fp, " pte_shift: %d\n", machdep->machspec->pte_shift);
+ fprintf(fp, " l2_masked_bits: %d\n", machdep->machspec->l2_masked_bits);
+
+
}
/*
17 years, 1 month