xencrash fixes for xen-3.3.0
by Itsuro ODA
Hi,
This patch updates the Xen hypervisor analysis function of the
crash command so that it works with xen-3.3.0 (the newest version of Xen).
* PERCPU_SHIFT becomes 13 (from 12) in the xen-3.3.0.
This value is calculated from "__per_cpu_start" and "__per_cpu_data_end".
* "jiffies" does not exist in the xen-3.3.0.
It was used to show the uptime. I found that there is no alternative
(i.e. the xen hypervisor does not have the uptime).
So if "jiffies" does not exist, "--:--:--" is shown as UPTIME in
the sys command.
(Would it be better to eliminate the whole UPTIME line?)
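For the first item above, the shift selection can be sketched as a small standalone function. This is only an illustration of the test the patch applies (a per-CPU data area larger than 4096 bytes means a shift of 13, otherwise 12); the function name and its arguments are hypothetical and are not part of crash, where the two values come from symbol_value().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of crash: choose the per-CPU shift from
 * the addresses of "__per_cpu_start" and "__per_cpu_data_end", using the
 * same 4096-byte threshold as the patch.
 */
int percpu_shift_from_span(uint64_t per_cpu_start, uint64_t per_cpu_data_end)
{
    assert(per_cpu_data_end >= per_cpu_start);

    uint64_t span = per_cpu_data_end - per_cpu_start;

    /* A data area that fits in one 4 KiB page keeps the old shift of 12. */
    return (span > 4096) ? 13 : 12;
}
```

With this rule a dump from an older hypervisor, whose per-CPU data fits in one page, keeps the shift of 12, so no version check is needed.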
--- example ---
crash> sys
KERNEL: xen-syms
DUMPFILE: vmcore
CPUS: 4
DOMAINS: 5
UPTIME: --:--:--
MACHINE: Intel(R) Core(TM)2 Quad CPU Q9450 @ 2.66GHz (2660 Mhz)
MEMORY: 2 GB
----------------
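The fallback shown in the example can be sketched as follows. This is a simplified stand-in, not crash's convert_time(): the function name, the hz parameter and the HH:MM:SS layout are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical formatter: returns "--:--:--" when no uptime is available
 * (jiffies == 0), otherwise writes HH:MM:SS into buf and returns buf.
 */
const char *format_uptime(unsigned long long jiffies, unsigned int hz,
                          char *buf, size_t len)
{
    if (jiffies == 0 || hz == 0)
        return "--:--:--";

    unsigned long long secs = jiffies / hz;

    snprintf(buf, len, "%02llu:%02llu:%02llu",
             secs / 3600, (secs / 60) % 60, secs % 60);
    return buf;
}
```

Because the caller passes the jiffies value in once, the uptime is read a single time and then either formatted or replaced by the placeholder.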
This patch is for crash-4.0-7.2.
Thanks
Itsuro Oda
---
--- xen_hyper_defs.h.org 2008-10-06 13:45:39.000000000 +0900
+++ xen_hyper_defs.h 2008-10-06 13:44:44.000000000 +0900
@@ -134,9 +134,8 @@
#endif
#if defined(X86) || defined(X86_64)
-#define XEN_HYPER_PERCPU_SHIFT 12
#define xen_hyper_per_cpu(var, cpu) \
- ((ulong)(var) + (((ulong)(cpu))<<XEN_HYPER_PERCPU_SHIFT))
+ ((ulong)(var) + (((ulong)(cpu))<<xht->percpu_shift))
#elif defined(IA64)
#define xen_hyper_per_cpu(var, cpu) \
((xht->flags & XEN_HYPER_SMP) ? \
@@ -404,6 +403,7 @@
ulong *cpumask;
uint *cpu_idxs;
ulong *__per_cpu_offset;
+ int percpu_shift;
};
struct xen_hyper_dumpinfo_context {
--- xen_hyper.c.org 2008-10-06 13:41:14.000000000 +0900
+++ xen_hyper.c 2008-10-06 14:15:03.000000000 +0900
@@ -71,6 +71,8 @@
#endif
#if defined(X86) || defined(X86_64)
+ xht->percpu_shift =
+ (symbol_value("__per_cpu_data_end") - symbol_value("__per_cpu_start") > 4096) ? 13: 12;
member_offset = MEMBER_OFFSET("cpuinfo_x86", "x86_model_id");
buf = GETBUF(XEN_HYPER_SIZE(cpuinfo_x86));
if (xen_hyper_test_pcpu_id(XEN_HYPER_CRASHING_CPU())) {
@@ -1746,9 +1748,11 @@
tmp2 = (ulong)jiffies_64;
jiffies_64 = (ulonglong)(tmp2 - tmp1);
}
- } else {
+ } else if (symbol_exists("jiffies")) {
get_symbol_data("jiffies", sizeof(long), &jiffies);
jiffies_64 = (ulonglong)jiffies;
+ } else {
+ jiffies_64 = 0; /* hypervisor does not have uptime */
}
return jiffies_64;
--- xen_hyper_command.c.org 2008-10-07 08:05:37.000000000 +0900
+++ xen_hyper_command.c 2008-10-07 08:24:29.000000000 +0900
@@ -1022,7 +1022,8 @@
(buf1, "%d\n", XEN_HYPER_NR_DOMAINS()));
/* !!!Display a date here if it can be found. */
XEN_HYPER_PRI(fp, len, "UPTIME: ", buf1, flag,
- (buf1, "%s\n", convert_time(xen_hyper_get_uptime_hyper(), buf2)));
+ (buf1, "%s\n", (xen_hyper_get_uptime_hyper() ?
+ convert_time(xen_hyper_get_uptime_hyper(), buf2) : "--:--:--")));
/* !!!Display a version here if it can be found. */
XEN_HYPER_PRI_CONST(fp, len, "MACHINE: ", flag);
if (strlen(uts->machine)) {
---
--
Itsuro ODA <oda(a)valinux.co.jp>
16 years, 1 month
Re: Bt -r on IA64
by Dave Anderson
This is a post from a non-member that I approved (as list moderator),
but for some reason it didn't get sent out:
caiqian(a)redhat.com wrote:
> Hi,
>
> I would like to check with you about one thing that happens on IA64 only.
>
> crash> bt -r
>
> ...
> a0000001007ec100: 0000000000000000 00000000000001f4
> a0000001007ec110: 0000000000000000 0000000000000000
> a0000001007ec120: 0000000000000000 0000000000000000
> a0000001007ec130: 0000000000000000 v+10731618616
> a0000001007ec140: v+4808933688 0000000000000000
> a0000001007ec150: v+4350567296 0000000000000000
> a0000001007ec160: 0000000000000000 0000000000000000
> a0000001007ec170: 0000000000000000 0000000000000000
> a0000001007ec180: 0000000000000000 0000000000000000
> a0000001007ec190: init_task v+10731618728
> a0000001007ec1a0: v+10731618728 init_task+424
> a0000001007ec1b0: init_task+424 init_task
> ...
>
> What are those "v+XXXXXXXXXX"?
>
> Thanks,
> Cai Qian
Yeah, you'll also see the same thing if you do a "rd -s <address>",
like this example:
crash> bt -r
...
e0000100ff0d1640: v+1103759521536 v+1103765979464
e0000100ff0d1650: v+1103758983168 scsi_softirq_done+656
e0000100ff0d1660: 000000000000030a scsi_execute_async+6928
e0000100ff0d1670: v+1103759521536 000000000251e042
...
crash> rd -s e0000100ff0d1640 8
e0000100ff0d1640: v+1103759521536 v+1103765979464
e0000100ff0d1650: v+1103758983168 scsi_softirq_done+656
e0000100ff0d1660: 000000000000030a scsi_execute_async+6928
e0000100ff0d1670: v+1103759521536 000000000251e042
crash>
The "v+offset" values are translations of unity-mapped (ia64 region 7)
kernel virtual addresses:
crash> rd e0000100ff0d1640 8
e0000100ff0d1640: e0000100fd31b700 e0000100fd944148 ..1.....HA......
e0000100ff0d1650: e0000100fd298000 a000000203f9bb90 ..).............
e0000100ff0d1660: 000000000000030a a000000203fbc3d0 ................
e0000100ff0d1670: e0000100fd31b700 000000000251e042 ..1.....B.Q.....
crash>
which are seen as offsets from the closest "symbol", which is "v":
crash> sym -l
...
a000000100dc1e00 (b) xfrm_state_gc_work
a000000100dc1e60 (b) __key.33744
a000000100dc1e68 (B) unix_socket_table
a000000100dc2670 (A) _end
e000000000000000 (A) v
ffffffffffff0000 (D) __per_cpu_start
ffffffffffff0000 (d) per_cpu__cpu_idle_state
ffffffffffff0008 (D) per_cpu__pfm_syst_info
ffffffffffff0010 (D) per_cpu__pmu_owner
...
This has to do with older ia64 kernels which compiled the kernel text
and data segments into region 7, based at e000000000000000. Newer
ia64 kernels map the kernel text and static data segments into
ia64 region 5, based at a000000000000000, although all other
physical memory is still unity-mapped at ia64 region 7.
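The closest-symbol translation described above can be sketched like this. It is a simplified illustration, not crash's actual lookup code: the three-entry table is copied from the "sym -l" excerpt, and the struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct sym {
    uint64_t addr;
    const char *name;
};

/* Sorted excerpt of the "sym -l" listing above. */
static const struct sym symtab[] = {
    { 0xa000000100dc2670ULL, "_end" },
    { 0xe000000000000000ULL, "v" },
    { 0xffffffffffff0000ULL, "__per_cpu_start" },
};

/*
 * Format addr as "symbol+offset" using the closest symbol at or below it.
 * Values below every symbol are printed as plain hex.
 */
const char *closest_symbol(uint64_t addr, char *buf, size_t len)
{
    const struct sym *best = NULL;

    for (size_t i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++)
        if (symtab[i].addr <= addr)
            best = &symtab[i];

    if (best == NULL)
        snprintf(buf, len, "%016llx", (unsigned long long)addr);
    else
        snprintf(buf, len, "%s+%llu", best->name,
                 (unsigned long long)(addr - best->addr));
    return buf;
}
```

For the first word of the "rd" output, e0000100fd31b700 minus e000000000000000 is 1103759521536 in decimal, which is why that address is displayed as v+1103759521536.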
Although it could probably be done with an ia64-specific hack,
I'm somewhat hesitant to change this behavior.
Dave
16 years, 1 month
Re: [Crash-utility] Crash setup!
by Dave Anderson
> But the /dev/crash driver does require small modifications to the kernel source,
> primarily to EXPORT_SYMBOL_GPL() the page_is_ram() function.
Interesting -- FWIW, the EXPORT_SYMBOL_GPL() requirement for page_is_ram()
may no longer be required if an analogous, static, version of page_is_ram()
were to be written into the crash driver itself -- and that static version
could use the e820_any_mapped() function, which is EXPORT_SYMBOL_GPL() in
the upstream kernel:
/*
* This function checks if any part of the range <start,end> is mapped
* with type.
*/
int
e820_any_mapped(u64 start, u64 end, unsigned type)
{
int i;
for (i = 0; i < e820.nr_map; i++) {
const struct e820entry *ei = &e820.map[i];
if (type && ei->type != type)
continue;
if (ei->addr >= end || ei->addr + ei->size <= start)
continue;
return 1;
}
return 0;
}
EXPORT_SYMBOL_GPL(e820_any_mapped);
For that matter, e820_any_mapped() is also in RHEL5. But it was not
in the 2.6.9-based RHEL4 kernel, which was what the RHEL5 version of the
crash driver was based upon.
But RHEL5 also has modified the x86-only page_is_ram() to check for efi_enabled,
and if it's set, to use the "memmap" efi_memory_map instead of the e820 map.
Although, that's not done upstream.
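A driver-local page_is_ram() built on e820_any_mapped() might look roughly like the sketch below. It is a userspace mock for illustration only: the e820 map contents are made up, the name crash_page_is_ram() is hypothetical, and in the real driver the function would be static and would use the kernel's own e820 structures. It also ignores the efi_enabled case mentioned above.

```c
#include <stdint.h>

#define E820_RAM   1    /* e820 type for usable RAM */
#define PAGE_SHIFT 12

struct e820entry {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

struct e820map {
    int nr_map;
    struct e820entry map[4];
};

/* Made-up map: 0-640 KiB and 1 MiB-2 GiB are RAM. */
struct e820map e820 = {
    2,
    {
        { 0x0,      0xa0000,    E820_RAM },
        { 0x100000, 0x7ff00000, E820_RAM },
    }
};

/* Same logic as the upstream function quoted above. */
int e820_any_mapped(uint64_t start, uint64_t end, unsigned type)
{
    int i;

    for (i = 0; i < e820.nr_map; i++) {
        const struct e820entry *ei = &e820.map[i];

        if (type && ei->type != type)
            continue;
        if (ei->addr >= end || ei->addr + ei->size <= start)
            continue;
        return 1;
    }
    return 0;
}

/* Hypothetical stand-in for page_is_ram(): is any part of this page RAM? */
int crash_page_is_ram(unsigned long pfn)
{
    uint64_t start = (uint64_t)pfn << PAGE_SHIFT;

    return e820_any_mapped(start, start + (1ULL << PAGE_SHIFT), E820_RAM);
}
```

Note that e820_any_mapped() reports any overlap with a RAM range, so a page that is only partly covered by a RAM entry would also be accepted; whether that is acceptable for the driver is an open question.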
Anyway, just another data point...
Dave
16 years, 1 month
Crash setup!
by Jayaraman, Bhaskar
Hi, I want to run crash on a dom0 or paravirtualized guest. Please let me know if there is a patch that I can include with the Linux source to build the /dev/crash driver with the Xen kernel. Also, is there any specific way to build the crash command for use with a Xen kernel?
Any pointers to documentation would also be very useful.
Thanks in advance.
Bhaskar.
16 years, 1 month
target compilation?
by Jun Koi
Hi,
I looked at configure.c, and found some code like this:
void
get_current_configuration(void)
{
FILE *fp;
static char buf[512];
char *p;
#ifdef __alpha__
target_data.target = ALPHA;
#endif
#ifdef __i386__
target_data.target = X86;
#endif
#ifdef __powerpc__
target_data.target = PPC;
#endif
#ifdef __ia64__
target_data.target = IA64;
#endif
....
}
I have a few questions:
- Is it correct that the above code is trying to find out the architecture
(meaning the target here) we are compiling our code on?
- Who defines those architecture macros in the above code, like "__i386__"
(in the check "#ifdef __i386__")? I guessed that the architecture is
defined in a particular header file in /usr/include, but I cannot
find anything there. So I think that those macros are defined by the
compilation process of crash, but again I don't see anywhere in the
source doing that.
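For reference, macros such as "__i386__" are predefined by the compiler itself for the architecture it is generating code for, which is why they do not appear in /usr/include or in the crash sources; with GCC, "gcc -dM -E - < /dev/null" prints the full list. A minimal sketch of the same kind of check follows. The function name and the returned strings are hypothetical, and the "__x86_64__" branch is an addition that is not in the excerpt above.

```c
/*
 * Hypothetical helper: report which architecture the compiler is
 * targeting, based only on compiler-predefined macros.
 */
const char *build_target_name(void)
{
#if defined(__alpha__)
    return "ALPHA";
#elif defined(__i386__)
    return "X86";
#elif defined(__powerpc__)
    return "PPC";
#elif defined(__ia64__)
    return "IA64";
#elif defined(__x86_64__)
    return "X86_64";
#else
    return "UNKNOWN";
#endif
}
```

Exactly one branch survives preprocessing, so the result is fixed when the file is compiled rather than detected when the program runs.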
Thanks,
J
target we want
16 years, 1 month
"cannot access vmalloc'd module memory" when loading kdump'ed vmcore in crash
by Worth, Kevin
Hello kexec and crash mailing lists,
Sorry to spam whoever's code this ISN'T an issue with, but I really am unsure of whether this is a kdump or a crash issue. I am running Ubuntu 7.04 with a 2.6.20 kernel (includes Ubuntu's patches; source at http://packages.ubuntu.com/feisty/linux-source-2.6.20 ) and a modified VMSPLIT/PAGE_OFFSET value (see bottom for details) on an i386 machine with 4GB of memory. At first I thought this could be an issue with makedumpfile stripping out things it shouldn't, but I've found that setting up my initrd script so that it simply performs "cp /proc/vmcore /var/crash/vmcore" results in the same issue.
I've tried this with both crash 4.0-6.3 and 4.0-7.2 and get the same result. Unfortunately I'm locked at kernel 2.6.20 for other reasons, or else I would try a newer kernel.
If anyone can offer suggestions of what to try, please let me know. If this is something that has already been resolved elsewhere, sorry to waste time, and if someone can point me to what resolved it, perhaps I can look at backporting the fix myself. Thanks for your time.
crash-4.0-7.2$ ./crash ~/vmcore ~/targetfiles/vmlinux-2.6.20-17.39-custom2
crash 4.0-7.2
<snip>Copyright notices...</snip>
GNU gdb 6.1
<snip>Copyright notices...</snip>
This GDB was configured as "i686-pc-linux-gnu"...
please wait... (gathering module symbol data)
WARNING: cannot access vmalloc'd module memory
KERNEL: /home/worthk/targetfiles/vmlinux-2.6.20-17.39-custom2
DUMPFILE: /home/worthk/vmcore
CPUS: 2
DATE: Wed Oct 1 12:30:50 2008
UPTIME: 00:35:11
LOAD AVERAGE: 0.07, 0.09, 0.08
TASKS: 94
NODENAME: test-module
RELEASE: 2.6.20-17.39-custom2
VERSION: #3 SMP Wed Sep 24 10:11:03 PDT 2008
MACHINE: i686 (2200 Mhz)
MEMORY: 5 GB
<6>SysRq : Trigger a crashdump"
PID: 4304
COMMAND: "bash"
TASK: 5d7e9030 [THREAD_INFO: f4b70000]
CPU: 0
STATE: TASK_RUNNING (SYSRQ)
crash> mod -s test
mod: cannot access vmalloc'd module memory
My kernel config is a bit outside the norm, in that the VMSPLIT value has been modified to give 3GB of memory to kernelspace and 1GB of memory to userspace. Below is a diff between the default Ubuntu "generic" config and mine:
diff /boot/config-2.6.20-17-generic /boot/config-2.6.20-17.37-custom2
3,4c3,4
< # Linux kernel version: 2.6.20-17-generic
< # Wed Aug 20 14:43:36 2008
---
> # Linux kernel version: 2.6.20-17.37-custom2
> # Tue Aug 19 18:50:53 2008
33c33
< CONFIG_VERSION_SIGNATURE="Ubuntu 2.6.20-17.39-generic"
---
> CONFIG_VERSION_SIGNATURE="Ubuntu 2.6.20-17.37-generic"
51c51
< # CONFIG_EMBEDDED is not set
---
> CONFIG_EMBEDDED=y
188,190c188,194
< CONFIG_HIGHMEM4G=y
< # CONFIG_HIGHMEM64G is not set
< CONFIG_PAGE_OFFSET=0xC0000000
---
> # CONFIG_HIGHMEM4G is not set
> CONFIG_HIGHMEM64G=y
> # CONFIG_VMSPLIT_3G is not set
> # CONFIG_VMSPLIT_3G_OPT is not set
> # CONFIG_VMSPLIT_2G is not set
> CONFIG_VMSPLIT_1G=y
> CONFIG_PAGE_OFFSET=0x40000000
191a196
> CONFIG_X86_PAE=y
204c209
< # CONFIG_RESOURCES_64BIT is not set
---
> CONFIG_RESOURCES_64BIT=y
1161a1167
> CONFIG_IDE_MAX_HWIFS=4
1443a1450
> # CONFIG_PATA_PLATFORM is not set
1525a1533
> CONFIG_I2O_EXT_ADAPTEC_DMA64=y
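The effect of the PAGE_OFFSET change in the diff above can be checked with a little arithmetic. The sketch below assumes a 32-bit (4 GiB) virtual address space in which everything at or above PAGE_OFFSET belongs to the kernel; the function names are hypothetical.

```c
#include <stdint.h>

#define ADDRESS_SPACE_32 (1ULL << 32)   /* 4 GiB of virtual addresses */

/* User space covers virtual addresses [0, PAGE_OFFSET). */
uint64_t user_space_bytes(uint32_t page_offset)
{
    return page_offset;
}

/* Kernel space covers virtual addresses [PAGE_OFFSET, 4 GiB). */
uint64_t kernel_space_bytes(uint32_t page_offset)
{
    return ADDRESS_SPACE_32 - page_offset;
}

/* Nonzero if vaddr lies in the kernel part of the address space. */
int is_kernel_address(uint32_t vaddr, uint32_t page_offset)
{
    return vaddr >= page_offset;
}
```

With CONFIG_PAGE_OFFSET=0x40000000 this gives 1 GiB of user space and 3 GiB of kernel space, and the TASK address 5d7e9030 shown in the crash output is a kernel address. Under the default 0xC0000000 split the same value would fall in user space.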
Kevin Worth
Network Security Software Engineer
ProCurve networking by HP
kevin.worth(a)hp.com
ph 916.785.4528
fx 916.785.1196
16 years, 1 month
Re: [Xen-devel] crash can't analyze memory dumpfile of Xen
by Itsuro ODA
Hi,
I found the root cause of this problem is that the value of "PERCPU_SHIFT"
was changed to 13 from 12.
The quick workaround is to apply the following patch to the crash command:
----------------------------------------------------------------------
--- xen_hyper_defs.h.org 2008-10-03 14:46:28.000000000 +0900
+++ xen_hyper_defs.h 2008-10-03 14:46:50.000000000 +0900
@@ -134,7 +134,7 @@
#endif
#if defined(X86) || defined(X86_64)
-#define XEN_HYPER_PERCPU_SHIFT 12
+#define XEN_HYPER_PERCPU_SHIFT 13
#define xen_hyper_per_cpu(var, cpu) \
((ulong)(var) + (((ulong)(cpu))<<XEN_HYPER_PERCPU_SHIFT))
#elif defined(IA64)
------------------------------------------------------------------------
I need to think about backward compatibility. I wonder how to determine
the value of "PERCPU_SHIFT". The change of "PERCPU_SHIFT" was made at
a certain point in xen-unstable before the xen-3.3 release, so the xen
version number (3.3) can't be used as the key... I will consider this more...
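To see why the wrong shift breaks the analysis, the per-CPU address calculation from the macro can be written as a plain function with the shift passed in. This is only a sketch: the function name is hypothetical and the values used below are made up.

```c
#include <stdint.h>

/*
 * Same arithmetic as the xen_hyper_per_cpu() macro, with the shift passed
 * in instead of hard-coded: each CPU's copy of a variable sits
 * (cpu << shift) bytes above the base address.
 */
uint64_t per_cpu_addr(uint64_t var, unsigned int cpu, int percpu_shift)
{
    return var + ((uint64_t)cpu << percpu_shift);
}
```

For a variable at 0x100000, CPU 1 resolves to 0x101000 with a shift of 12 but to 0x102000 with a shift of 13. CPU 0 is unaffected, so with the wrong shift only the data of the other CPUs is read from the wrong place.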
Thanks.
Itsuro Oda
On Fri, 05 Sep 2008 13:42:46 +0900
Itsuro ODA <oda(a)valinux.co.jp> wrote:
> Hi,
>
> I received the dump file via FTP from Yuji and I could reproduce
> the problem.
>
> Hmm, it seems the format of the crash note section is not what is expected.
> (and there is another problem; "jiffies" is lost.)
> I will check more deeply.
>
> Until the problem is fixed, try the following quick hack.
> ------------------------------------------------------------------
> --- xen_hyper.c.org 2008-09-05 12:48:57.000000000 +0900
> +++ xen_hyper.c 2008-09-05 13:32:28.000000000 +0900
> @@ -150,7 +150,7 @@
> * Do some initialization.
> */
> #ifndef IA64
> - xen_hyper_dumpinfo_init();
> +// xen_hyper_dumpinfo_init(); /* XXX: should be fixed !! */
> #endif
> xhmachdep->pcpu_init();
> xen_hyper_domain_init();
> @@ -1746,9 +1746,11 @@
> tmp2 = (ulong)jiffies_64;
> jiffies_64 = (ulonglong)(tmp2 - tmp1);
> }
> - } else {
> + } else if (symbol_exists("jiffies")) {
> get_symbol_data("jiffies", sizeof(long), &jiffies);
> jiffies_64 = (ulonglong)jiffies;
> + } else {
> + jiffies_64 = 0; /* XXX: find alternative !! */
> }
>
> return jiffies_64;
> ------------------------------------------------------------------
> (the "dumpinfo" sub command cannot be used.)
>
> Thanks.
> Itsuro Oda
>
> On Thu, 04 Sep 2008 16:29:51 +0900
> Yuji Shimada <shimada-yxb(a)necst.nec.co.jp> wrote:
>
> > Hi ODA-san,
> >
> > Thank you so much for your reply.
> >
> > > What arch did you use ?
> >
> > I used x86_64 arch.
> >
> > > If you send me xen-syms-3.3-unstable and dumpfile.core
> > > I will investigate more.
> >
> > Please find "xen-syms-3.3-unstable", attached with the mail.
> > If you need "dumpfile.core" to investigate this issue, please let me know.
> >
> > In that case, I think I should send you "dumpfile.core" stored on
> > a disk by post, or upload it to your site, because it is very large.
> >
> > Thanks,
> >
> > --
> > Yuji Shimada
>
> --
> Itsuro ODA <oda(a)valinux.co.jp>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel(a)lists.xensource.com
> http://lists.xensource.com/xen-devel
--
Itsuro ODA <oda(a)valinux.co.jp>
16 years, 1 month
crash versioning?
by Jun Koi
Hi,
I noticed that the way Dave names crash versions is a bit special (I have
never seen it anywhere else): 4.0-7.1, 4.0-7.2, .... What is the point of
naming versions that way?
Thanks,
J
16 years, 1 month
question on some command params
by Jun Koi
Hi,
I found that the command-line parameters below have no documentation
anywhere, so could somebody explain their meaning?
- memory_module
- no_modules
- no_ikconfig
- no_namelist_gzip
- no_kmem_cache
- kmem_cache_delay
- readnow
- buildinfo
- zero_excluded
Many thanks,
J
16 years, 1 month