June 2009 - Crash-utility - Crash Utility List Archives

by Dave Anderson

- Kdump ELF vmcores contain NT_PRSTATUS notes for online cpus only, so
  if cpus have been offlined prior to a crash, there will be fewer notes
  than the number of cpus in the system, and therefore there will not be
  a one-to-one correlation between each cpu and its associated
  NT_PRSTATUS note. That causes backtrace failures for architectures
  like ppc64 that depend upon the contents of the NT_PRSTATUS notes for
  gathering the starting stack location.
  (chandru(a)in.ibm.com, anderson(a)redhat.com)

- Fix and enhancement for the "dev" command. When the command was run
  against 2.6.26 or later kernels, it would fail with the error message
  "dev: invalid structure member offset: char_device_struct_fops".
  Additionally, even when the command did work, more often than not it
  would fail to determine the file_operations structure associated with
  the block or character device, and erroneously display "(none)" or
  "(unused)". This patch makes a more comprehensive search for the
  file_operations structure, and instead of just displaying its address
  and symbolic translation, it will display the address of the data
  structure that contains the pointer to the file_operations structure,
  along with the symbolic translation of the file_operations structure.
  For character devices, the containing structure is a "cdev", and for
  block devices the containing structure is a "gendisk". The command
  output adds new CDEV and GENDISK columns, and under the OPERATIONS
  column is the symbolic translation of its file_operations structure.
  (anderson(a)redhat.com, bob.montgomery(a)hp.com)

- Fix for a potential segmentation violation when running "foreach bt"
  on a very active live system with many processes starting and ending.
  Without the patch, a segmentation violation could occur when a "bt"
  was attempted on a task that had become non-existent. This would
  happen on x86_64 or ppc64 machines, and was due to the usage of a
  kernel stack pointer taken from a stale/invalid task_struct. The
  command will now recognize the bad stack pointer and display the error
  message "bt: task no longer exists" or "bt: invalid/stale stack
  pointer for this task: <address>".
  (anderson(a)redhat.com)

- Fix to correctly read LKCD Version 8 and later x86 dumpfile headers.
  (talk90091e(a)gmail.com)

- If a kdump NMI issued to a non-crashing x86_64 cpu was received while
  running in schedule(), after having set the next task as "current" in
  the cpu's runqueue, but prior to changing the kernel stack to that of
  the next task, then a backtrace would fail to make the transition from
  the NMI exception stack back to the process stack, with the error
  message "bt: cannot transition from exception stack to current process
  stack". This patch will report inconsistencies found between a task
  marked as the current task in a cpu's runqueue, and the task found in
  the per-cpu x8664_pda "pcurrent" field (2.6.29 and earlier) or the
  per-cpu "current_task" variable (2.6.30 and later). If it can be
  safely determined that the runqueue setting (used by default) is
  premature, then the crash utility's internal per-cpu active task will
  be changed to be the task indicated by the appropriate architecture
  specific value. Also, a new "set -a <task>" option has been added to
  manually set a task to be the "active" task on its cpu.
  (anderson(a)redhat.com)

- Fix for x86_64 "bt" command when transitioning from the IRQ stack back
  to the process stack on 2.6.29 and later kernels. Without the patch,
  the interrupt exception frame address on the process stack would be
  incorrectly determined, and its display would typically be preceded by
  "[exception RIP: unknown or invalid address]", and the backtrace would
  fail from that point on.
  (anderson(a)redhat.com)

- Enhancement to the "runq" command to show the current task in each
  cpu's runqueue, plus a few formatting changes to make the output
  easier to understand.
  (anderson(a)redhat.com)

- Fix for a memory leak when running on live systems, due to the
  repetitive reallocation of the internal array of active tasks.
  (anderson(a)redhat.com)

- Fix for usage with vmlinux debuginfo files using Dwarf 3 format, for
  example, the Fedora 2.6.31-0.24.rc0.git18.fc12 kernel. Without the
  patch, the crash session fails during initialization with the error
  message: "Dwarf Error: wrong version in compilation unit header (is 3,
  should be 2) [in module <path-to>/vmlinux]", followed by the erroneous
  message "crash: <path-to>/vmlinux: no debugging data available". The
  patch simply accepts the Dwarf 3 header, and the embedded gdb-6.1
  version still appears to work with the updated vmlinux debuginfo file
  format.
  (anderson(a)redhat.com)

- Fix for faulty invocation failure when a System.map file is used as an
  argument with a compressed diskdump or compressed kdump dumpfile. If
  the System.map argument appears after the vmcore file on the command
  line, as in: "crash vmcore System.map vmlinux", the crash session
  fails immediately with the error message: "crash: vmcore:
  initialization failed". With the patch, the arguments may be entered
  in any order.
  (anderson(a)redhat.com)

- Fix for a potential segmentation violation during invocation if a
  vmcore file, a System.map file, and a non-matching vmlinux file are
  used as command line arguments. The problem is that whenever a
  System.map file is used, it is presumed that the user knows what he is
  doing, and that the vmlinux file is not the same as the kernel that
  generated the vmcore; therefore the vmlinux/vmcore matching and
  verification routines are not performed. However, if the kernel data
  structures in the non-matching vmlinux vary widely enough from the
  kernel that generated the vmcore, all manner of bogus data may be read
  and consumed. The reported segmentation violation occurred when using
  a vmcore created from a "stock" Red Hat kernel with a vmlinux file
  from a Red Hat "debug" kernel, where the kernel data structures are
  significantly different. The patch adds several new defensive
  mechanisms, and displays additional warning messages, when invalid or
  questionable data is read, and as a result the crash session will fail
  in a more reasonable manner.
  (anderson(a)redhat.com)

- Adjusted several virtual and physical memory address definitions for
  2.6.31 x86_64 kernels: MAX_PHYSMEM_BITS, VMALLOC_START, VMALLOC_END,
  VMEMMAP_VADDR, VMEMMAP_END, MODULES_VADDR and MODULES_END. Without the
  patch, when run against CONFIG_SPARSEMEM_VMEMMAP 2.6.31 kernels, the
  "kmem -i" option would hang, and when run against CONFIG_SLUB and
  CONFIG_SPARSEMEM_VMEMMAP 2.6.31 kernels, the "kmem -s" option would
  report numerous errors indicating "kmem: read error: kernel virtual
  address: <address> type: page inuse", where the <address> was a
  legitimate virtual-memmap page structure address.
  (anderson(a)redhat.com)

- Improvement for CONFIG_SLUB "kmem -s" or "kmem -S" options when an
  invalid slab page link address is encountered. Without the patch, the
  commands fail with a generic "invalid kernel virtual address" read
  error message, and "kmem -s" would not display any previously
  collected statistics. With the patch, the error message displays the
  slab cache name, the list type, and the invalid pointer found, for
  example, "kmem: dentry: partial list: page.lru.next: 100100".
  (anderson(a)redhat.com)

Download from: http://people.redhat.com/anderson

16 years, 6 months


Re: nr_cpus is not calculated properly

by Dave Anderson

----- "Wei Jiang" <talk90091e(a)gmail.com> wrote:

> In my test, I did not see any exceptions else due to my 32bits dump
> file is corrupted. As you know, a incorrect nr_cpus will
> lead to some following fields(dha_smp_current_task, dha_stack) are
> pointed to a error location, which might be a potential defect and
> will be raised in future.

Actually I don't know -- so that's why I asked. I almost never
see an LKCD dumpfile.

Anyway, the fix is queued for the next release.

Thanks,
  Dave

16 years, 7 months


Re: [Crash-utility] [RFC][PATCH]: crash aborts with cannot determine idle task

by Dave Anderson

----- "Chandru" <chandru(a)in.ibm.com> wrote:

> > Yes, I tested these changes and they work fine.
> >
> > Thanks,
> > Chandru
>
> Hello Dave,
>
> Could you please let me know if these changes will make it into the next
> version of crash utility ?,

Yes they will -- I just wanted your sign-off before I checked them in.

Thanks again,
  Dave

16 years, 7 months


Re: [Crash-utility] [RFC][PATCH]: crash aborts with cannot determine idle task

by Dave Anderson

----- "Chandru" <chandru(a)in.ibm.com> wrote:

> Hi Dave,
>
> Thanks a lot for catching the segfault issue and finding the root cause for it. Here
> follows the updated patch taking in the suggestions from the review comments.
>
> kdump installs NT_PRSTATUS notes into vmcore file only to the cpus that were
> online at the time of crash. In such cases, while reading in the notes from the
> dump file, we are unsure of the cpu to NT_PRSTATUS mapping. The cpu
> possible, present and online map is not available until cpu_maps_init() initializes
> them. Hence we remap the prstatus pointer array to online cpus just after
> a call to this function.
>
> Signed-off-by: Chandru Siddalingappa <chandru(a)linux.vnet.ibm.com>
> Reviewed-by: Dave Anderson <anderson(a)redhat.com>
> Cc: Haren Myneni <haren(a)us.ibm.com>
> ---

This one looks good. The only change that I will make is in the
map_cpu_prstatus() function -- which should just return immediately
if get_cpus_online() is equal to nd->num_prstatus_notes.

Thanks,
  Dave

>
> --- crash-4.0-8.10/ppc64.c.orig 2009-06-08 16:08:09.000000000 +0530
> +++ crash-4.0-8.10/ppc64.c 2009-06-09 15:45:39.000000000 +0530
> @@ -2407,13 +2407,16 @@ ppc64_paca_init(void)
>          if (!symbol_exists("paca"))
>                  error(FATAL, "PPC64: Could not find 'paca' symbol\n");
>
> -        if (cpu_map_addr("present"))
> +        if (cpu_map_addr("possible"))
> +                map = POSSIBLE;
> +        else if (cpu_map_addr("present"))
>                  map = PRESENT;
>          else if (cpu_map_addr("online"))
>                  map = ONLINE;
>          else
> -                error(FATAL,
> -                    "PPC64: cannot find 'cpu_present_map' or 'cpu_online_map'
> symbols\n");
> +                error(FATAL,
> +                    "PPC64: cannot find 'cpu_possible_map' or\
> +                    'cpu_present_map' or 'cpu_online_map' symbols\n");
>
>          if (!MEMBER_EXISTS("paca_struct", "data_offset"))
>                  return;
> @@ -2423,8 +2426,8 @@ ppc64_paca_init(void)
>
>          cpu_paca_buf = GETBUF(SIZE(ppc64_paca));
>
> -        if (!(nr_paca = get_array_length("paca", NULL, 0)))
> -                nr_paca = NR_CPUS;
> +        if (!(nr_paca = get_array_length("paca", NULL, 0)))
> +                nr_paca = (kt->kernel_NR_CPUS ? kt->kernel_NR_CPUS : NR_CPUS);
>
>          if (nr_paca > NR_CPUS) {
>                  error(WARNING,
> @@ -2435,7 +2438,7 @@ ppc64_paca_init(void)
>
>          for (i = cpus = 0; i < nr_paca; i++) {
>                  /*
> -                 * CPU present (or online)?
> +                 * CPU present or online or can exist in the system(possible)?
>                   */
>                  if (!in_cpu_map(map, i))
>                          continue;
> --- crash-4.0-8.10/kernel.c.orig 2009-06-08 16:07:53.000000000 +0530
> +++ crash-4.0-8.10/kernel.c 2009-06-09 15:01:51.000000000 +0530
> @@ -74,6 +74,9 @@ kernel_init()
>
>          cpu_maps_init();
>
> +        if (KDUMP_DUMPFILE())
> +                map_cpu_prstatus();
> +
>          kt->stext = symbol_value("_stext");
>          kt->etext = symbol_value("_etext");
>          get_text_init_space();
> --- crash-4.0-8.10/netdump.c.orig 2009-06-08 16:07:58.000000000 +0530
> +++ crash-4.0-8.10/netdump.c 2009-06-09 16:24:52.000000000 +0530
> @@ -45,6 +45,38 @@ static void check_dumpfile_size(char *);
>          (machine_type("IA64") || machine_type("PPC64"))
>
>  /*
> + * kdump installs NT_PRSTATUS elf notes only to the cpus
> + * that were online during dumping. Hence we call into
> + * this function after reading the cpu map from the kernel,
> + * to remap the NT_PRSTATUS notes only to the online cpus
> + */
> +void map_cpu_prstatus(void)
> +{
> +        void *nt_ptr;
> +        int i, j, nrcpus;
> +
> +        /* temporary buffer to hold the prstatus_percpu array */
> +        if ((nt_ptr = (void *)calloc(nd->num_prstatus_notes,
> +                        sizeof(void *))) == NULL)
> +                error(FATAL,
> +                    "cannot allocate a buffer to hold prstatus_percpu array\n");
> +
> +        memcpy((void *)nt_ptr, nd->nt_prstatus_percpu,
> +                (nd->num_prstatus_notes * sizeof(void *)));
> +        memset(nd->nt_prstatus_percpu, 0,
> +                (nd->num_prstatus_notes * sizeof(void *)));
> +
> +        nrcpus = (kt->kernel_NR_CPUS ? kt->kernel_NR_CPUS : NR_CPUS);
> +
> +        /* re-populate the array with the notes mapping to online cpus */
> +        for (i = 0, j = 0; i < nrcpus; i++)
> +                if (in_cpu_map(ONLINE, i))
> +                        ((unsigned long *)nd->nt_prstatus_percpu)[i] =
> +                                ((unsigned long *)nt_ptr)[j++];
> +        free(nt_ptr);
> +}
> +
> +/*
>   * Determine whether a file is a netdump/diskdump/kdump creation,
>   * and if TRUE, initialize the vmcore_data structure.
>   */
> @@ -618,7 +650,7 @@ get_netdump_panic_task(void)
>          crashing_cpu = -1;
>          if (kernel_symbol_exists("crashing_cpu")) {
>                  get_symbol_data("crashing_cpu", sizeof(int), &i);
> -                if ((i >= 0) && (i < nd->num_prstatus_notes)) {
> +                if ((i >= 0) && in_cpu_map(ONLINE, i)) {
>                          crashing_cpu = i;
>                          if (CRASHDEBUG(1))
>                                  error(INFO,
> @@ -2236,7 +2268,7 @@ get_netdump_regs_ppc64(struct bt_info *b
>           * CPUs if they responded to an IPI.
>           */
>          if (nd->num_prstatus_notes > 1) {
> -                if (bt->tc->processor >= nd->num_prstatus_notes)
> +                if (!nd->nt_prstatus_percpu[bt->tc->processor])
>                          error(FATAL,
>                              "cannot determine NT_PRSTATUS ELF note "
>                              "for %s task: %lx\n",

16 years, 7 months
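The remapping idea in the patch above (notes are stored packed, one per online cpu, and need to be moved to the slot of the cpu they belong to) can be sketched in isolation. This is a simplified model, not crash's code: `remap_notes`, `DEMO_MAX_CPUS`, and the plain `int` online array are hypothetical stand-ins for `nd->nt_prstatus_percpu`, `NR_CPUS`, and `in_cpu_map(ONLINE, i)`, and the consistency checks are this sketch's own choice rather than anything the thread specifies.

```c
#include <stddef.h>

#define DEMO_MAX_CPUS 64   /* hypothetical upper bound for this sketch */

/*
 * On entry, notes[0..num_notes-1] hold one entry per online cpu, in cpu
 * order. On success, notes[cpu] holds that cpu's entry and every offline
 * cpu's slot is NULL. Returns 0 on success, -1 if the inputs disagree.
 */
static int remap_notes(void *notes[DEMO_MAX_CPUS], size_t num_notes,
                       const int online[DEMO_MAX_CPUS], size_t nr_cpus)
{
    void *packed[DEMO_MAX_CPUS];
    size_t cpu, next = 0, online_count = 0;

    if (nr_cpus > DEMO_MAX_CPUS || num_notes > nr_cpus)
        return -1;

    for (cpu = 0; cpu < nr_cpus; cpu++)
        if (online[cpu])
            online_count++;
    if (online_count != num_notes)
        return -1;          /* notes cannot be paired with cpus */

    /* Save the packed entries, then clear every slot. */
    for (cpu = 0; cpu < nr_cpus; cpu++) {
        packed[cpu] = (cpu < num_notes) ? notes[cpu] : NULL;
        notes[cpu] = NULL;
    }

    /* Hand the saved entries out to the online cpus, in order. */
    for (cpu = 0; cpu < nr_cpus; cpu++)
        if (online[cpu])
            notes[cpu] = packed[next++];

    return 0;
}
```

With cpus 0 and 2 online out of four, two packed notes end up in slots 0 and 2, and slots 1 and 3 are NULL. A NULL slot is then what a check such as `!nd->nt_prstatus_percpu[bt->tc->processor]` in the patch would catch.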


Re: nr_cpus is not calculated properly

by Dave Anderson

----- "Dave Anderson" <anderson(a)redhat.com> wrote:

> ----- "Wei Jiang" <talk90091e(a)gmail.com> wrote:
...
> > So this line
> > 140 nr_cpus = (hdr_size - offset) / sizeof(dump_CPU_info_t);
> >
> > would not get a correct nr_cpus due to the sizeof().
> >
> > A patch to fix this problem as below.
>
> BTW, what exactly are the ramifications without the patch -- does the
> crash session die during initialization? How come nobody ran into
> this issue given that the code has been in place for almost 2 years?

Again -- what actually happens as a result of the incorrect nr_cpus
calculation? I need something to put in the crash.changelog.

Dave

16 years, 7 months


Re: [Crash-utility] Re: nr_cpus is not calculated properly

by Dave Anderson

----- "Bernhard Walle" <bernhard.walle(a)gmx.de> wrote:

> Dave Anderson wrote:
> >>
> >> As we know, on x86(32 bits), uint32_t is 4 bytes and uint64_t is 8
> >> bytes.
> >>
> >> So this line
> >> 140 nr_cpus = (hdr_size - offset) / sizeof(dump_CPU_info_t);
> >>
> >> would not get a correct nr_cpus due to the sizeof().
> >>
> >> A patch to fix this problem as below.
> >
> > BTW, what exactly are the ramifications without the patch -- does the
> > crash session die during initialization? How come nobody ran into
> > this issue given that the code has been in place for almost 2 years?
> >
>
> > 4.0-4.8 - ...
> >
> > - Change for support of LKCD dumpfile version 8 and later to determine
> > the backtrace starting registers from the dumpfile header. Increase
> > (maximum) NR_CPUS for ia64 to 4096.
> > (bwalle(a)suse.de)
> >
> > ...
> >
> > (10/30/07)
> >
> > Anyway, the patch looks reasonable to me, but I don't touch the LKCD
> > code without a sign-off from the LKCD maintainers on this mailing list.
> >
> > LKCD maintainers -- do you have any objection to this patch?
>
> Sorry for that mistake, it was me. :-(
>
> It's a copy & paste error (the members are just copied from the
> dump_header_asm_t definition above. And I acknowledge the patch (from
> reading it, I have no test material here any more). Troy may give the
> ultimate acknowledge. ;-)
>
> Regards,
> Bernhard

Good, thanks Bernhard -- it looked pretty obvious, and I'll put it in.

I still wish the guy had indicated exactly what the failure mode was.
It looks like, at a minimum, there could be one or two LKCD-specific
warning messages during initialization, but the crash session should
still come up, right? Given that "nr_cpus" is a local variable and
has nothing to do with crash utility's determination of how many cpus
there are, I wonder what other problems might arise?

Dave

16 years, 7 months


Re: nr_cpus is not calculated properly

by Dave Anderson

----- "Wei Jiang" <talk90091e(a)gmail.com> wrote:

> Hi,
>
> I found nr_cpus is not calculated properly in 32 bits(x86) at
> crash-4.0-8.9.
>
> Around line 140 in file lkcd_v8.c.
> 137  * to find out how many CPUs are configured.
> 138  */
> 139 offset = offsetof(dump_header_asm_t, dha_smp_regs[0]);
> 140 nr_cpus = (hdr_size - offset) / sizeof(dump_CPU_info_t);
> 141
> 142 fprintf(stderr, "CPU number NR_CPUS %d \n", NR_CPUS);
> 143 fprintf(stderr, "header_asm_t size %d \n",
> sizeof(dump_header_asm_t));
>
> And in the corresponding head file.
> # cat -n lkcd_dump_v8.h|grep -A 20 434
> 434 /* smp specific */
> 435 uint32_t dha_smp_num_cpus;
> 436 uint32_t dha_dumping_cpu;
> 437 struct pt_regs dha_smp_regs[NR_CPUS];
> 438 uint32_t dha_smp_current_task[NR_CPUS];
> 439 uint32_t dha_stack[NR_CPUS];
> 440 uint32_t dha_stack_ptr[NR_CPUS];
> 441 } __attribute__((packed)) dump_header_asm_t;
> 442
> 443 /*
> 444  * CPU specific part of dump_header_asm_t
> 445  */
> 446 typedef struct dump_CPU_info_s {
> 447 struct pt_regs dha_smp_regs;
> 448 uint64_t dha_smp_current_task;
> 449 uint64_t dha_stack;
> 450 uint64_t dha_stack_ptr;
> 451 } __attribute__ ((packed)) dump_CPU_info_t;
> 452
> 453
> 454 /*
>
> As we know, on x86(32 bits), uint32_t is 4 bytes and uint64_t is 8
> bytes.
>
> So this line
> 140 nr_cpus = (hdr_size - offset) / sizeof(dump_CPU_info_t);
>
> would not get a correct nr_cpus due to the sizeof().
>
> A patch to fix this problem as below.

BTW, what exactly are the ramifications without the patch -- does the
crash session die during initialization? How come nobody ran into
this issue given that the code has been in place for almost 2 years?

  4.0-4.8 - ...

          - Change for support of LKCD dumpfile version 8 and later to determine
            the backtrace starting registers from the dumpfile header. Increase
            (maximum) NR_CPUS for ia64 to 4096.
            (bwalle(a)suse.de)

          ...

          (10/30/07)

Anyway, the patch looks reasonable to me, but I don't touch the LKCD
code without a sign-off from the LKCD maintainers on this mailing list.

LKCD maintainers -- do you have any objection to this patch?

Thanks,
  Dave

>
> Thanks.
> -Wj
>
> --- lkcd_dump_v8.h.orig 2009-04-16 13:14:22.000000000 -0400
> +++ lkcd_dump_v8.h 2009-06-10 03:31:37.815122032 -0400
> @@ -445,9 +445,9 @@ typedef struct _dump_header_asm_s {
>   */
>  typedef struct dump_CPU_info_s {
>          struct pt_regs dha_smp_regs;
> -        uint64_t dha_smp_current_task;
> -        uint64_t dha_stack;
> -        uint64_t dha_stack_ptr;
> +        uint32_t dha_smp_current_task;
> +        uint32_t dha_stack;
> +        uint32_t dha_stack_ptr;
>  } __attribute__ ((packed)) dump_CPU_info_t;

16 years, 7 months


nr_cpus is not calculated properly

by Wei Jiang

Hi,

I found nr_cpus is not calculated properly in 32 bits(x86) at
crash-4.0-8.9.

Around line 140 in file lkcd_v8.c.
137  * to find out how many CPUs are configured.
138  */
139 offset = offsetof(dump_header_asm_t, dha_smp_regs[0]);
140 nr_cpus = (hdr_size - offset) / sizeof(dump_CPU_info_t);
141
142 fprintf(stderr, "CPU number NR_CPUS %d \n", NR_CPUS);
143 fprintf(stderr, "header_asm_t size %d \n",
sizeof(dump_header_asm_t));

And in the corresponding head file.
# cat -n lkcd_dump_v8.h|grep -A 20 434
434 /* smp specific */
435 uint32_t dha_smp_num_cpus;
436 uint32_t dha_dumping_cpu;
437 struct pt_regs dha_smp_regs[NR_CPUS];
438 uint32_t dha_smp_current_task[NR_CPUS];
439 uint32_t dha_stack[NR_CPUS];
440 uint32_t dha_stack_ptr[NR_CPUS];
441 } __attribute__((packed)) dump_header_asm_t;
442
443 /*
444  * CPU specific part of dump_header_asm_t
445  */
446 typedef struct dump_CPU_info_s {
447 struct pt_regs dha_smp_regs;
448 uint64_t dha_smp_current_task;
449 uint64_t dha_stack;
450 uint64_t dha_stack_ptr;
451 } __attribute__ ((packed)) dump_CPU_info_t;
452
453
454 /*

As we know, on x86(32 bits), uint32_t is 4 bytes and uint64_t is 8
bytes.

So this line
140 nr_cpus = (hdr_size - offset) / sizeof(dump_CPU_info_t);

would not get a correct nr_cpus due to the sizeof().

A patch to fix this problem as below.

Thanks.
-Wj

--- lkcd_dump_v8.h.orig 2009-04-16 13:14:22.000000000 -0400
+++ lkcd_dump_v8.h 2009-06-10 03:31:37.815122032 -0400
@@ -445,9 +445,9 @@ typedef struct _dump_header_asm_s {
  */
 typedef struct dump_CPU_info_s {
         struct pt_regs dha_smp_regs;
-        uint64_t dha_smp_current_task;
-        uint64_t dha_stack;
-        uint64_t dha_stack_ptr;
+        uint32_t dha_smp_current_task;
+        uint32_t dha_stack;
+        uint32_t dha_stack_ptr;
 } __attribute__ ((packed)) dump_CPU_info_t;
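The arithmetic behind the report can be reproduced with a small stand-alone sketch. Everything here is illustrative: `demo_pt_regs` is a hypothetical 68-byte stand-in for the 32-bit `struct pt_regs` (only its size matters), the two record types mirror the per-cpu struct before and after the patch, and the per-cpu region is assumed to be exactly the four packed arrays from the header. `__attribute__((packed))` is the same GCC/Clang extension the header itself uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the 32-bit x86 pt_regs; only its size matters. */
struct demo_pt_regs { uint32_t regs[17]; };

/* Per-cpu record as declared before the patch (64-bit fields). */
typedef struct {
    struct demo_pt_regs dha_smp_regs;
    uint64_t dha_smp_current_task;
    uint64_t dha_stack;
    uint64_t dha_stack_ptr;
} __attribute__((packed)) cpu_info_u64_t;

/* Per-cpu record after the patch (32-bit fields, like the header arrays). */
typedef struct {
    struct demo_pt_regs dha_smp_regs;
    uint32_t dha_smp_current_task;
    uint32_t dha_stack;
    uint32_t dha_stack_ptr;
} __attribute__((packed)) cpu_info_u32_t;

/* Bytes taken by the four per-cpu arrays in a header written for `cpus` cpus. */
static size_t per_cpu_region_bytes(size_t cpus)
{
    return cpus * (sizeof(struct demo_pt_regs) + 3 * sizeof(uint32_t));
}

/* Same shape as line 140: region size divided by the per-cpu record size. */
static size_t count_cpus(size_t region_bytes, size_t record_size)
{
    return region_bytes / record_size;
}
```

Under these assumptions the unpatched record is 92 bytes and the patched one is 80, so a header written for 8 cpus (640 bytes of per-cpu data) divides out to 6 with the 64-bit record and to 8 with the 32-bit one.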

16 years, 7 months


Re: [Crash-utility] [RFC][PATCH]: crash aborts with cannot determine idle task

by Dave Anderson

----- "Dave Anderson" <anderson(a)redhat.com> wrote:

> And lastly, when I run a kernel with this patch against a set of x86_64-only
> dumpfiles, I get a segmentation violation like this on certain kdump
> kernels:
>
> ...
> please wait... (determining panic task)
> Program received signal SIGSEGV, Segmentation fault.
> 0x000000000051c79c in get_netdump_panic_task () at netdump.c:719
> 719 len = roundup(len + note64->n_namesz, 4);
> (gdb) bt
> #0 0x000000000051c79c in get_netdump_panic_task () at netdump.c:719
> #1 0x0000000000521ae5 in get_kdump_panic_task () at netdump.c:2316
> #2 0x00000000004a5550 in get_dumpfile_panic_task () at task.c:5493
> #3 0x00000000004a51b1 in panic_search () at task.c:5386
> #4 0x00000000004a2ef6 in get_panic_context () at task.c:4574
> #5 0x00000000004974ee in task_init () at task.c:456
> #6 0x0000000000449e3a in main_loop () at main.c:536
> ...
>
> And if I remove the call to map_prstatus_array(), it works OK again.
>
> I haven't dug into what changed to cause the problem though...

The problem is this memset() statement, which makes no sense:

+void map_prstatus_array(void)
+{
+        void *nt_ptr;
+        int i, j;
+
+        /* temporary buffer to hold the prstatus_percpu array */
+        if ((nt_ptr = (void *)calloc(nd->num_prstatus_notes,
+                        sizeof(void *))) == NULL)
+                error(FATAL,
+                    "cannot allocate a buffer to hold prstatus_percpu array\n");
+
+        memcpy((void *)nt_ptr, nd->nt_prstatus_percpu,
+                nd->num_prstatus_notes * sizeof(void *));
+        memset(nd->nt_prstatus_percpu, 0, nd->num_prstatus_notes);

...because it zeroes out the first few bytes (whatever the number of
NT_PRSTATUS sections there are) of the first entry in the array.
So for example, here's a before-and-after of the contents of a kdump's
nd->nt_prstatus_percpu[] array which has just 2 NT_PRSTATUS sections:

before memset():

  1d9f5dc8 1d9f5f2c 0 0 0 0 0 0 0 0 0 0 0

after memset():

  1d9f0000 1d9f5f2c 0 0 0 0 0 0 0 0 0 0 0

And then depending upon whether the resultant virtual address actually
exists in the crash utility's virtual address space, it craps out in
get_netdump_panic_task() when it tries to access the faulty address.

Dave
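The difference between passing an element count and a byte count to memset() can be seen in a minimal stand-alone sketch. The helper names and `DEMO_SLOTS` are hypothetical; the two sample values are the ones quoted in the message above.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_SLOTS 4   /* hypothetical array size for the illustration */

/*
 * Passes the element count as the length, so only `notes` BYTES are
 * cleared; with 2 notes that is part of the first entry and nothing else.
 */
static void clear_by_count(uintptr_t *slots, size_t notes)
{
    memset(slots, 0, notes);
}

/* Scales the count by the entry size, so `notes` whole entries are cleared. */
static void clear_by_bytes(uintptr_t *slots, size_t notes)
{
    memset(slots, 0, notes * sizeof(slots[0]));
}
```

Called with 2 notes on an array starting with 0x1d9f5dc8 and 0x1d9f5f2c, `clear_by_count()` leaves the second entry untouched and corrupts the first (on a little-endian machine it becomes 0x1d9f0000, matching the dump shown), while `clear_by_bytes()` zeroes both entries.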

16 years, 7 months


Re: [Crash-utility] dev command deteriorates with new kernels

by Dave Anderson

----- "Dave Anderson" <anderson(a)redhat.com> wrote:

> ----- "Bob Montgomery" <bob.montgomery(a)hp.com> wrote:
>
> > On Fri, 2009-06-05 at 14:53 +0000, Dave Anderson wrote:
> >
> > >
> > > I've attached what I'm going with. I've added the capability of getting
> > > the file_operations from the cdev_map when necessary. The block device
> > > code was also suffering from bit-rot as well, and so I put in a new
> > > collector function that uses the bdev_map as well.
> >
> > Dave, this looks good. Two issues:
> >
> > 1) Add "-f" to dev help? (What does it mean to still be a "(none)" device?)
>
> It means that a pointer to a file_operations either doesn't exist
> (or that I have no clue how to find it...) For the hell of it I
> added that -f flag to show those devices in case somebody's
> interested.
> >
> > 2) The old code found the block extended device number (a feature added
> > to the kernel by a 25 Aug 2008 patch from Tejun Heo):
> >
> > 259 blkext (unknown)
> >
> > Also shown in /proc/devices:
> > ...
> > Block devices:
> > 1 ramdisk
> > 259 blkext
> > 7 loop
> > 11 sr
> > 104 cciss0
> >
> > Deliberate omission?
>
> I did see that, and I forget now how the old code found it (although the
> function still exists), but the structures being used now are bdev_map.probes[]
> and major_names[]:
>
> crash> whatis struct kobj_map
> struct kobj_map {
>     struct probe *probes[255];
>     struct mutex *lock;
> }
> SIZE: 2048
> crash> whatis major_names
> struct blk_major_name *major_names[255];
> crash>
>
> where the kernel's kobj_map.probes[] array size is just hardwired to 255,
> and the major_names[] array size is BLKDEV_MAJOR_HASH_SIZE which is 255.
> So obviously 259 won't be found.

Correction -- it does appear in the major_names[] array, in a 2.6.30
kernel for example, like this:

  crash> p * major_names[4]
  $51 = {
    next = 0x0,
    major = 259,
    name = "blkext\000\000\000\000\000\000\000\000\000"
  }

where it appears to be the only major_names[] entry whose "major"
value doesn't equal the index into the array (i.e., 259 != 4).
But the bdev_map.probes[4] entry is unused.

Dave

>
> If you want to figure out how to show it, send me a patch.
>
> At this point I'm about ready to deprecate the whole command... ;-)
>
> Dave
>
> >
> >
> > Thanks for cleaning this up,
> > Bob Montgomery
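One plausible reading of why major 259 turns up at index 4 is that major_names[] is a hash table keyed by the major number modulo the table size, which is consistent with the entry carrying a `next` pointer and with 259 % 255 == 4. That hashing rule is an assumption here, not something stated in the thread; the sketch below only shows the arithmetic, and `demo_major_to_index` and `DEMO_BLKDEV_MAJOR_HASH_SIZE` are hypothetical names.

```c
/* Table size as quoted in the message above (BLKDEV_MAJOR_HASH_SIZE is 255). */
#define DEMO_BLKDEV_MAJOR_HASH_SIZE 255

/* Assumed hashing rule: bucket index is the major modulo the table size. */
static unsigned int demo_major_to_index(unsigned int major)
{
    return major % DEMO_BLKDEV_MAJOR_HASH_SIZE;
}
```

Under that assumption, majors below 255 (7, 11, 104) land at the index equal to their own value, and 259 is the only one in the /proc/devices listing above that wraps around, to index 4.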

16 years, 7 months

