Dave Anderson wrote:
 Sachin P. Sant wrote:
 
> Ankita Garg wrote:
>
>> Hi,
>>
>> I am working on backporting relocatable kernel support for x86_64 from
>> the 2.6.22.1 kernel to 2.6.21.4. kdump is working fine, but when opening
>> the vmcore file with crash, I get the following error:
>
>
> I had a discussion with Ankita about this problem. This is what I
> think is happening.
>
> This x86-64 kernel is built with CONFIG_NUMA off and SPARSEMEM enabled.
>
> The failure occurs at line 11738 in memory.c [this is with the
> latest crash]:
>
> crash: invalid structure member offset: pglist_data_node_mem_map
>       FILE: memory.c  LINE: 11738  FUNCTION: dump_memory_nodes()
>
> Looking at the crash source, here is the code in question:
>
> 11728        if (IS_SPARSEMEM()) {
> 11729                zone_mem_map = 0;
> 11730                zone_start_mapnr = 0;
> 11731                if (zone_size) {
> 11732                        phys = PTOB(zone_start_pfn);
> 11733                        zone_start_mapnr = phys/PAGESIZE();
> 11734                }
> 11735
> 11736        } else if (!(vt->flags & NODES) &&
> 11737                INVALID_MEMBER(zone_zone_mem_map)) {
> 11738                readmem(pgdat+OFFSET(pglist_data_node_mem_map),
>                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 11739                        KVADDR, &zone_mem_map, sizeof(void *),
> 11740                        "contig_page_data mem_map", FAULT_ON_ERROR);
> 11741                if (zone_size)
> 11742                        zone_mem_map += cum_zone_size * SIZE(page);
>
> The code is trying to read the pglist_data_node_mem_map member, which does
> not exist [since CONFIG_NUMA is off]. It should have entered the
> if (IS_SPARSEMEM()) branch [line 11728], since SPARSEMEM is enabled for
> this kernel. The SPARSEMEM flag is set by this code in memory.c:
>
>         if (kernel_symbol_exists("mem_map")) {
>                 get_symbol_data("mem_map", sizeof(char *), &vt->mem_map);
>                 vt->flags |= FLATMEM;
>         } else if (kernel_symbol_exists("mem_section"))
>                 vt->flags |= SPARSEMEM;
>         else
>                 vt->flags |= DISCONTIGMEM;
>
> But what I found is that the SPARSEMEM flag is not set. Instead, FLATMEM
> is set, because the mem_map symbol exists in this particular kernel and is
> checked first [the mem_section kernel symbol is also present in this kernel]:
>
> [crash-4.0-4.8]# cat /boot/System.map | grep mem_map
> ffffffff8072dab0 B mem_map
>
> [crash-4.0-4.8]# cat /boot/System.map | grep mem_section
> ffffffff8072e800 B mem_section
>
> From the kernel source, mm/memory.c: mem_map is defined if
> CONFIG_NEED_MULTIPLE_NODES is not defined, which is the case here.
> I am not an mm expert, so I can't tell what to make of this situation,
> where both the mem_map and mem_section kernel symbols exist. Anyone?
>
> Anyway, as for the crash problem, it could be fixed by rearranging the
> above code as follows:
>
> -       if (kernel_symbol_exists("mem_map")) {
> +       if (kernel_symbol_exists("mem_section"))
> +               vt->flags |= SPARSEMEM;
> +       else if (kernel_symbol_exists("mem_map")) {
>                get_symbol_data("mem_map", sizeof(char *), &vt->mem_map);
>                vt->flags |= FLATMEM;
> -       } else if (kernel_symbol_exists("mem_section"))
> -               vt->flags |= SPARSEMEM;
> -       else
> +       } else
>                 vt->flags |= DISCONTIGMEM;
>
>
> But since I am not very sure about the mm code, there might be a
> better way to fix this.
>
> Thanks
> -Sachin
 
 
The crash patch above looks fine to me -- I'll give it a test run.

Sachin,

Your patch tested fine on my stable of sample dumpfiles -- queued for
the next release.
Thanks,
   Dave