January 2010 - Crash-utility - Crash Utility List Archives

Re: [Crash-utility] Running idle threads show wrong CPU numbers

by Dave Anderson

----- "Michael Holzheu" <holzheu(a)linux.vnet.ibm.com> wrote:

> Hi Dave,
>
> I have a problem with a dump where I have defined five CPUs and two of
> them are offline. In fact the logical CPUs are defined as follows:
>
> 0 on
> 1 on
> 2 off
> 3 off
> 4 on
>
> The CPU online map looks correct:
>
> crash> print/x *cpu_online_mask
> $4 = {
>   bits = {0x13} ---> b10011
> }
>
> When I issue "ps" I see that all running tasks are idle, but the CPU
> numbers are not correct (0,1,2 and not 0,1,4):
>
>    PID    PPID  CPU   TASK    ST  %MEM   VSZ   RSS  COMM
> >    0      0    0   800ef0   RU   0.0     0     0  [swapper]
> >    0      0    1  18c24240  RU   0.0     0     0  [swapper]
> >    0      0    2  18c2c340  RU   0.0     0     0  [swapper]
>
> I tried to debug the problem, but got stuck somewhere in "task.c". I
> think there is a problem with the idle threads initialization, where the
> online map is not considered.
>
> Maybe you can see the bug immediately. Otherwise I will have to spend
> more effort debugging that problem. I hope not :-)

Does "sys" show 5 or 3 cpus? I'm guessing it shows 3, but should show 5.

It looks like the s390/s390x files need to use "get_highest_cpu_online()+1"
(like x86_64 and ppc64) in order to determine the number of cpus to
account for. As it is now, they do this, and would therefore only account
for the first 3 cpus:

  int
  s390x_get_smp_cpus(void)
  {
          return get_cpus_online();
  }

  int
  s390_get_smp_cpus(void)
  {
          return get_cpus_online();
  }

Dave
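The difference between the two counts can be checked with a few lines of standalone C. This is only a sketch, not crash's source: count_online() and highest_online() are hypothetical stand-ins for get_cpus_online() and get_highest_cpu_online(), and the online map is assumed to fit in a single word holding the 0x13 value from the dump.

```c
/* Hypothetical stand-in for get_cpus_online():
 * the number of bits set in the online mask. */
int count_online(unsigned long mask)
{
        int n = 0;

        for (; mask != 0; mask >>= 1)
                n += (int)(mask & 1UL);
        return n;
}

/* Hypothetical stand-in for get_highest_cpu_online():
 * the index of the highest set bit, or -1 for an empty mask. */
int highest_online(unsigned long mask)
{
        int highest = -1;

        for (int cpu = 0; mask != 0; mask >>= 1, cpu++) {
                if (mask & 1UL)
                        highest = cpu;
        }
        return highest;
}
```

For the mask 0x13, count_online() gives 3 while highest_online() + 1 gives 5, so only the second value covers logical CPU 4.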

15 years, 6 months


Re: [Crash-utility] Using crash - is a debug kernel required during vmcore collection (additional question - typical vmcore size)

by Gallus

On 29 January 2010 15:47, Dave Anderson <anderson(a)redhat.com> wrote:

>
> If you're asking whether the secondary kdump kernel needs to
> be the same as the crashing kernel, then the answer is no.
>
> Dave
>

That was my question. Thank you for the answer.

What is the size of the vmcore file? Is it the size of all allocated
virtual memory at the moment of the kernel panic?

Regards,
Gallus

15 years, 6 months


Re: [Crash-utility] Using crash - is a debug kernel required during vmcore collection

by Dave Anderson

----- "Gallus" <gall.cwpl(a)gmail.com> wrote:

> I have a simple question: In order to use crash, the vmcore doesn't
> have to be collected under "debug" kernel? The symbols can be provided
> later, during the analysis with the crash tool, right?

I'm not sure I understand your question.

Are you asking that if the vmcore of a particular kernel was collected,
and you do not have the debuginfo vmlinux that is associated with it,
can you still analyze the vmcore?

If that's what you're asking, then yes, there are ways to do that.
You can rebuild the same kernel, and use the newly-built debuginfo
vmlinux file along with the System.map file of the original kernel,
like this:

  $ crash vmlinux-built-after-the-fact System-map vmcore

You will have a few restrictions -- such as not being able to get
line-number information from commands that display it. This is the
*only* reason a System.map file is *ever* needed, and that's because
the symbol values of the rebuilt debuginfo vmlinux file typically do
not match those of the original crashing kernel.

If you're asking whether the secondary kdump kernel needs to
be the same as the crashing kernel, then the answer is no.

Dave

15 years, 6 months


Using crash - is a debug kernel required during vmcore collection

by Gallus

I have a simple question: In order to use crash, the vmcore doesn't have
to be collected under "debug" kernel? The symbols can be provided later,
during the analysis with the crash tool, right?

Regards,
Gallus

15 years, 6 months


Re: [Crash-utility] crash fails to build with gcc-4.5

by Dave Anderson

----- "Troy Heber" <troy.heber(a)hp.com> wrote:

> When trying to build crash with gcc-4.5 on x86-64 you get:
>
> unwind_x86_32_64.c:50:2: error: initializer element is not constant
> unwind_x86_32_64.c:50:2: error: (near initialization for 'reg_info[7].offs')
> unwind_x86_32_64.c:50:2: error: initializer element is not constant
> unwind_x86_32_64.c:50:2: error: (near initialization for 'reg_info[8].offs')
> unwind_x86_32_64.c:50:2: error: initializer element is not constant
> ...
>
> When you start to dig into this you quickly end up playing with lots
> of really fun macros from unwind_x86_64.h. Eventually, you end up
> playing with this one:
>
> #define BUILD_BUG_ON_ZERO(e) (sizeof(char[1 - 2 * !!(e)]) - 1)
>
> If you pull this macro out and play with it by itself it seems to
> work fine with both gcc-4.5 and gcc < 4.5. It is only when it is used in
> combinations with the other macro expression that gcc-4.5 fails to
> evaluate it and I have no clue why.
>
> When looking at the BUILD_BUG_ON_ZERO macro upstream in
> include/linux/kernel.h we can see it has been replaced with this
> version:
>
> #define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))
>
> It turns out that gcc-4.5 is perfectly happy with the updated version!
>
>
> This was done in commit: 8c87df457cb58fe75b9b893007917cf8095660a0
>
>     BUILD_BUG_ON(): fix it and a couple of bogus uses of it
>
>     gcc permitting variable length arrays makes the current construct used for
>     BUILD_BUG_ON() useless, as that doesn't produce any diagnostic if the
>     controlling expression isn't really constant. Instead, this patch makes
>     it so that a bit field gets used here. Consequently, those uses where the
>     condition isn't really constant now also need fixing.
>
>     Note that in the gfp.h, kmemcheck.h, and virtio_config.h cases
>     MAYBE_BUILD_BUG_ON() really just serves documentation purposes - even if
>     the expression is compile time constant (__builtin_constant_p() yields
>     true), the array is still deemed of variable length by gcc, and hence the
>     whole expression doesn't have the intended effect.
>
> It looks like this could end up being a potential bug in gcc. I'll
> file a bug with gcc and try to provide them with a simplified test
> case. However, since this macro changed upstream and acts as a
> workaround for the issue I would propose making the update in crash
> as well.
>
> Troy

I've been tempted to just rip out unwind_x86_32_64.c, unwind_x86_64.h
and unwind_x86.h since they're pretty much useless. The unwind code
in those files is only used if explicitly requested by "set unwind on"
*and* if the kernel supports it (which it hasn't since Jan Beulich's
x86/x86_64 temporary DWARF/unwind stuff was pulled).

But thanks for digging this out -- queued for the next release.

Dave

>
> ---
> diff --git a/unwind_x86_64.h b/unwind_x86_64.h
> index a79c2d5..52fcf7a 100644
> --- a/unwind_x86_64.h
> +++ b/unwind_x86_64.h
> @@ -61,7 +61,7 @@ extern void free_unwind_table(void);
>  #define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
>  #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
>  #define BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)]))
> -#define BUILD_BUG_ON_ZERO(e) (sizeof(char[1 - 2 * !!(e)]) - 1)
> +#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))
>  #define FIELD_SIZEOF(t, f) (sizeof(((t*)0)->f))
>  #define get_unaligned(ptr) (*(ptr))
>  //#define __get_user(x,ptr) __get_user_nocheck((x),(ptr),sizeof(*(ptr)))
>
> --
> Crash-utility mailing list
> Crash-utility(a)redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility

15 years, 6 months


crash fails to build with gcc-4.5

by Troy Heber

When trying to build crash with gcc-4.5 on x86-64 you get:

unwind_x86_32_64.c:50:2: error: initializer element is not constant
unwind_x86_32_64.c:50:2: error: (near initialization for 'reg_info[7].offs')
unwind_x86_32_64.c:50:2: error: initializer element is not constant
unwind_x86_32_64.c:50:2: error: (near initialization for 'reg_info[8].offs')
unwind_x86_32_64.c:50:2: error: initializer element is not constant
...

When you start to dig into this you quickly end up playing with lots
of really fun macros from unwind_x86_64.h. Eventually, you end up
playing with this one:

#define BUILD_BUG_ON_ZERO(e) (sizeof(char[1 - 2 * !!(e)]) - 1)

If you pull this macro out and play with it by itself it seems to
work fine with both gcc-4.5 and gcc < 4.5. It is only when it is used in
combinations with the other macro expression that gcc-4.5 fails to
evaluate it and I have no clue why.

When looking at the BUILD_BUG_ON_ZERO macro upstream in
include/linux/kernel.h we can see it has been replaced with this
version:

#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))

It turns out that gcc-4.5 is perfectly happy with the updated version!

This was done in commit: 8c87df457cb58fe75b9b893007917cf8095660a0

    BUILD_BUG_ON(): fix it and a couple of bogus uses of it

    gcc permitting variable length arrays makes the current construct used for
    BUILD_BUG_ON() useless, as that doesn't produce any diagnostic if the
    controlling expression isn't really constant. Instead, this patch makes
    it so that a bit field gets used here. Consequently, those uses where the
    condition isn't really constant now also need fixing.

    Note that in the gfp.h, kmemcheck.h, and virtio_config.h cases
    MAYBE_BUILD_BUG_ON() really just serves documentation purposes - even if
    the expression is compile time constant (__builtin_constant_p() yields
    true), the array is still deemed of variable length by gcc, and hence the
    whole expression doesn't have the intended effect.

It looks like this could end up being a potential bug in gcc. I'll
file a bug with gcc and try to provide them with a simplified test
case. However, since this macro changed upstream and acts as a
workaround for the issue I would propose making the update in crash
as well.

Troy

---
diff --git a/unwind_x86_64.h b/unwind_x86_64.h
index a79c2d5..52fcf7a 100644
--- a/unwind_x86_64.h
+++ b/unwind_x86_64.h
@@ -61,7 +61,7 @@ extern void free_unwind_table(void);
 #define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
 #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
 #define BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)]))
-#define BUILD_BUG_ON_ZERO(e) (sizeof(char[1 - 2 * !!(e)]) - 1)
+#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))
 #define FIELD_SIZEOF(t, f) (sizeof(((t*)0)->f))
 #define get_unaligned(ptr) (*(ptr))
 //#define __get_user(x,ptr) __get_user_nocheck((x),(ptr),sizeof(*(ptr)))
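As a rough illustration of why the bit-field form suits a static initializer, here is a small standalone sketch. It is not taken from crash: struct demo_regs and demo_sp_offset are made-up names, and the zero size of the empty struct relies on GNU C behaviour (as implemented by gcc and clang when compiling C).

```c
#include <stddef.h>

/* Bit-field form from the patch. A true (non-zero) e requests a
 * bit-field of width -1, which the compiler rejects; a false e gives
 * an empty struct, whose size is 0 under the GNU C extension. */
#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))

/* Hypothetical register layout, for illustration only. */
struct demo_regs {
        long ip;
        long sp;
};

/* File-scope initializer: every operand must be an integer constant
 * expression. The check adds 0 when the condition is false and would
 * stop the build if it were true. */
const size_t demo_sp_offset =
        offsetof(struct demo_regs, sp) + BUILD_BUG_ON_ZERO(sizeof(char) != 1);
```

Changing the condition to something true, for example sizeof(char) == 1, should turn the initializer into a compile-time error instead of a silently wrong value.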

15 years, 6 months


An example of crashdc output

by Louis Bouchard

Hello everyone,

For those interested, here is an example of crashdc output (BASIC mode):

http://cariblog.kamikamamak.com/crashdc-example-in-basic-mode/

The latest development bits are available here:

http://sourceforge.net/projects/crashdc/files/

FYI, I'm still planning/hoping to have crashdc as part of the crash
utility RPM, but sf.net makes it easier for me to make it publicly
available for early testing. It is also intended to be made easily
available for our internal support staff.

Kind Regards,

...Louis

-- 
Louis Bouchard, Linux Support Engineer
Team lead, EMEA Linux Competency Center,
Linux Ambassador, HP

HP Services                  1 Ave du Canada
HP France                    Z.A. de Courtaboeuf
louis.bouchard(a)hp.com      91 947 Les Ulis
http://www.hp.com/go/linux   France
http://www.hp.com/fr

15 years, 6 months


Running idle threads show wrong CPU numbers

by Michael Holzheu

Hi Dave,

I have a problem with a dump where I have defined five CPUs and two of
them are offline. In fact the logical CPUs are defined as follows:

0 on
1 on
2 off
3 off
4 on

The CPU online map looks correct:

crash> print/x *cpu_online_mask
$4 = {
  bits = {0x13} ---> b10011
}

When I issue "ps" I see that all running tasks are idle, but the CPU
numbers are not correct (0,1,2 and not 0,1,4):

   PID    PPID  CPU   TASK    ST  %MEM   VSZ   RSS  COMM
>    0      0    0   800ef0   RU   0.0     0     0  [swapper]
>    0      0    1  18c24240  RU   0.0     0     0  [swapper]
>    0      0    2  18c2c340  RU   0.0     0     0  [swapper]

I tried to debug the problem, but got stuck somewhere in "task.c". I
think there is a problem with the idle threads initialization, where the
online map is not considered.

Maybe you can see the bug immediately. Otherwise I will have to spend
more effort debugging that problem. I hope not :-)

Michael
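To make the expected numbering concrete, the sketch below maps the n-th idle task to the n-th online CPU in the mask. It is purely illustrative and says nothing about how task.c actually works: nth_online_cpu() is a hypothetical helper, and the online map is assumed to fit in one word.

```c
/* Hypothetical helper: return the logical CPU number of the n-th set
 * bit in the online mask (n counted from 0), or -1 if fewer than
 * n + 1 CPUs are online. */
int nth_online_cpu(unsigned long mask, int n)
{
        for (int cpu = 0; mask != 0; mask >>= 1, cpu++) {
                if ((mask & 1UL) && n-- == 0)
                        return cpu;
        }
        return -1;
}
```

With the mask 0x13, indexes 0, 1 and 2 map to CPUs 0, 1 and 4, which is the numbering "ps" would be expected to show for the three idle threads.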

15 years, 6 months


Re: [Crash-utility] crash-5.0: Segmentation fault with x86_64_get_active_set

by Dave Anderson

----- "ville mattila" <ville.mattila(a)stonesoft.com> wrote:

>
>
> > > Hello,
> > >
> > > I get segmentation fault from our 64-bit kernel crash
> > > This crash is caused by "echo c > /proc/sysrq-trigger".
> > > The reason seems to be that the x86_64_cpu_pda_init is
> > > not called at least gdb do not break there.

... [ snip ] ...

A patch for your initialization-time segmentation violation is attached.

... [ snip ] ...

But as for this one:

>
> Btw, the "struct" command caused another segmentation fault.
> Here is gdb bt:
>
> (gdb) bt
> #0  0x00007f74b3524a92 in strcmp () from /lib/libc.so.6
> #1  0x0000000000534284 in lookup_partial_symtab (name=0x120e3c0 "x8664_pda") at symtab.c:276
> #2  0x00000000005344ed in lookup_symtab (name=0x120e3c0 "x8664_pda") at symtab.c:228
> #3  0x000000000060019d in c_lex () at c-exp.y:2149
> #4  0x00000000006008f5 in c_parse_internal () at c-exp.c.tmp:1468
> #5  0x00000000006022dd in c_parse () at c-exp.y:2225
> #6  0x000000000055f614 in parse_exp_in_context (stringptr=0x7fffbc2f2260, block=<value optimized out>, comma=<value optimized out>, void_context_p=0, out_subexp=0x0) at parse.c:1094
> #7  0x000000000055f924 in parse_expression (string=0x7fffbc2f2950 "x8664_pda") at parse.c:1144
> #8  0x000000000053291b in gdb_command_funnel (req=0xca2c00) at symtab.c:4992
> #9  0x00000000004c1740 in gdb_interface (req=0xca2c00) at gdb_interface.c:407
> #10 0x00000000004e9dca in datatype_info (name=0xb618a7 "x8664_pda", member=0x0, dm=0x7fffbc2f3620) at symbols.c:4146
> #11 0x00000000004eb1ee in arg_to_datatype (s=0xb618a7 "x8664_pda", dm=0x7fffbc2f3620, flags=524290) at symbols.c:4867
> #12 0x00000000004efa1b in cmd_datatype_common (flags=2048) at symbols.c:4664
> #13 0x000000000045efd9 in exec_command () at main.c:644
> #14 0x000000000045f1fa in main_loop () at main.c:603
> #15 0x00000000005452a9 in captured_command_loop (data=0x120e3c0) at ./main.c:226
> #16 0x00000000005434e4 in catch_errors (func=0x5452a0 <captured_command_loop>, func_args=0x0, errstring=0x7f9d7c "", mask=<value optimized out>) at exceptions.c:520
> #17 0x0000000000544d36 in captured_main (data=<value optimized out>) at ./main.c:924
> #18 0x00000000005434e4 in catch_errors (func=0x544340 <captured_main>, func_args=0x7fffbc2f38b0, errstring=0x7f9d7c "", mask=<value optimized out>) at exceptions.c:520
> #19 0x000000000054412f in gdb_main_entry (argc=<value optimized out>, argv=<value optimized out>) at ./main.c:939
> #20 0x000000000045fece in main (argc=3, argv=0x7fffbc2f3a08) at main.c:517
> (gdb) frame 1
> #1  0x0000000000534284 in lookup_partial_symtab (name=0x120e3c0 "x8664_pda") at symtab.c:276
> 276         if (FILENAME_CMP (name, pst->filename) == 0)
> (gdb) p name
> $4 = 0x120e3c0 "x8664_pda"
> (gdb) p pst
> $5 = (struct partial_symtab *) 0x14d6040
> (gdb) p pst->filename
> $6 = 0x0
> (gdb) p *pst
> $7 = {next = 0x0, filename = 0x0, fullname = 0x0, dirname = 0x0,
>   objfile = 0x0, section_offsets = 0x0, textlow = 0, texthigh = 0,
>   dependencies = 0x0, number_of_dependencies = 0, globals_offset = 0,
>   n_global_syms = 0, statics_offset = 0, n_static_syms = 0, symtab = 0x0,
>   read_symtab = 0, read_symtab_private = 0x0, readin = 0 '\0'}
> (gdb)

I cannot reproduce it, even with your supplied kernel/dumpfile pair:

  # crash vmlinux-2.6.31.7+up-syms kerneldump-20100114-091104
  ... [ snip ] ...
  crash> struct x8664_pda
  struct: invalid data structure reference: x8664_pda
  crash>

While walking through the "ALL_PSYMTABS" list of partial_symtabs in
lookup_partial_symtab(), I never see a NULL-filled partial_symtab
structure as you show above. What I do see is 13434 partial_symtab
structures, with pst->filenames starting from:

  /workarea5/work/fw/mulperi/ararat/kernel/sg-kernel.git/arch/x86/lib/csum-copy_64.S

until the last one:

  /workarea5/work/fw/mulperi/ararat/kernel/sg-kernel.git/arch/x86/kernel/head_64.S

So I don't know what the deal is with that one.

Dave
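The backtrace shows strcmp() being handed pst->filename while it is 0x0, which is what faults. The following standalone sketch reproduces that shape of list walk with a NULL check added. It is not gdb's code and not a fix proposed in this thread: struct demo_psymtab and demo_lookup() are hypothetical, cut-down stand-ins used only to show why a zero-filled node is fatal to an unguarded comparison.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for a partial_symtab list node. */
struct demo_psymtab {
        struct demo_psymtab *next;
        const char *filename;
};

/* Return the first node whose filename equals name. Nodes without a
 * filename are skipped rather than passed to strcmp(), which has
 * undefined behaviour for a NULL argument. */
struct demo_psymtab *demo_lookup(struct demo_psymtab *head, const char *name)
{
        for (struct demo_psymtab *pst = head; pst != NULL; pst = pst->next) {
                if (pst->filename == NULL)
                        continue;
                if (strcmp(name, pst->filename) == 0)
                        return pst;
        }
        return NULL;
}
```

A guard like this would only hide the symptom; the open question in the thread remains how a zero-filled entry ended up on the list in the first place.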

15 years, 6 months


crashdc : an update

by Louis Bouchard

Hello everyone and happy new year.

I hope that you will pardon me for hijacking the mailing list. Here is a
quick update on the status of crashdc.

I have now received 'clearance' from my employer to release crashdc to
the public, so it is now visible here:

http://crashdc.svn.sourceforge.net/viewvc/crashdc/

I'm in the process of testing the basic functionality on all types of
kernels delivered by each distro (xen, pae, bigsmp, xenpae, etc.) for
i386. I will run the same tests again for x86_64 after that. Since
testing on IA64 will require setting up a completely different test
environment, I will delay IA64 until crashdc has reached 'beta'.

For a few more details, including architectural diagrams, I posted an
update on my blog here:

http://cariblog.kamikamamak.com/2010/01/14/crashdc-an-update/

Kind Regards,

...Louis

P.S. Please let me know if you prefer that I refrain from posting things
about crashdc here.

-- 
Louis Bouchard, Linux Support Engineer
Team lead, EMEA Linux Competency Center,
Linux Ambassador, HP

HP Services                  1 Ave du Canada
HP France                    Z.A. de Courtaboeuf
louis.bouchard(a)hp.com      91 947 Les Ulis
http://www.hp.com/go/linux   France
http://www.hp.com/fr

15 years, 6 months

