On 2017/3/17 4:57, Dave Anderson wrote:
 Yueyi Li,
 The live system case addition to your patch is interesting, and is the only thing
 I can really test.
 I tried your patch on 2 live systems which are KASLR-capable, but where
 CONFIG_RANDOMIZE_BASE is not configured.   One of them is 4.8-based, and
 the other is 4.10-based.  I commented out the ioctl() call to the /dev/crash
 driver that returns the value of kimage_voffset, and invoked the session
 as "crash --kaslr=0".  Doing it that way forces your arm64_calc_kimage_voffset()
 function to run its ACTIVE() section.
 However, it only works on one of them, the 4.10-based system.
 The problem is related to the value of the kernel's actual PHYS_OFFSET value that
 is stored in the "memstart_addr" symbol, vs. what is seen in the /proc/iomem output.
 On the Linux 4.10-based system where your patch works OK, these are the values:
    # cat /proc/iomem
    ...
    4000000000-40001fffff : reserved
    4000200000-4001ffffff : System RAM
      4000280000-4000ccffff : Kernel code
      4000e00000-40016effff : Kernel data
    40023b0000-4ff733ffff : System RAM
    4ff7340000-4ff77cffff : reserved
    4ff77d0000-4ff792ffff : System RAM
    4ff7930000-4ff7e7ffff : reserved
    4ff7e80000-4ff7e8ffff : System RAM
    4ff7e90000-4ff7efffff : reserved
    4ff7f10000-4ff800ffff : reserved
    4ff8010000-4fffffffff : System RAM
    ...
    crash> px memstart_addr
    memstart_addr = $1 = 0x4000000000
    crash> help -m | grep phys_offset
               phys_offset: 4000000000
    crash>
 Note that /proc/iomem shows the lowest "RAM" section at 4000200000,
 whereas the real PHYS_OFFSET is 4000000000.  So your arm64_calc_kimage_voffset()
 function calculates its local phys_offset value as 4000200000 from the output of
 /proc/iomem, and uses it to calculate the *correct* value of kimage_voffset.
 But that seems strange to me, given that 4000200000 is not the actual PHYS_OFFSET
 value that is stored in memstart_addr, and which is used by the crash utility for
 address translation.
PHYS_OFFSET should be the offset between the physical address that VA_START
maps to and address zero. On most arm64 systems, VA_START maps to the start of
physical memory, but your 4.10-based system has a 2M reserved memory region at
the bottom of memory, so PHYS_OFFSET there is 0x4000200000.
 However, on the Linux 4.8-based system where it fails, these are the values,
 where you'll note that there are no "reserved" sections shown:
    
    # cat /proc/iomem
    ...
    4000000000-40001fffff : System RAM
    4000200000-43fa59ffff : System RAM
      4000280000-4000c7ffff : Kernel code
      4000d90000-400166ffff : Kernel data
    43fa5a0000-43fa9dffff : System RAM
    43fa9e0000-43ff99ffff : System RAM
    43ff9a0000-43ff9affff : System RAM
    43ff9b0000-43ff9bffff : System RAM
    43ff9c0000-43ff9effff : System RAM
    43ff9f0000-43ffffffff : System RAM
    ...
    crash> px memstart_addr
    memstart_addr = $1 = 0x4000000000
    crash> help -m | grep phys_offset
               phys_offset: 4000000000
    crash>
 Also note that the /proc/iomem base RAM and memstart_addr values are the
 same 4000000000.  So in this case, your arm64_calc_kimage_voffset() function
 calculates the temporary phys_offset as 4000000000 from /proc/iomem, and uses it
 to calculate an *incorrect* value of kimage_voffset, and the crash session
 fails.  (Interestingly enough, if I hard-code the temporary value to be
 4000200000, it works OK!)
 So that makes me wonder how this makes sense when arm64_calc_kimage_voffset()
 uses your ELF dumpfile as the source of phys_offset?  Do you have the
 capability of looking at /proc/iomem on the system that was the source
 of your dumpfile?  Do your /proc/iomem values reflect what's in the
 dumpfile's lowest PT_LOAD segment? 
Your 4.8-based system also has the 2M reserved memory region, but there it is
reported as 'System RAM'.  At boot, the kernel requests the first 'System RAM'
region that is big enough to hold the kernel code, and the first region on the
4.8-based system is too small for that.
So this is a bug in how the crash utility calculates phys_offset on a live
system, and it seems the committer was aware of it.
In arm64_calc_phys_offset():
         /*
          * Memory regions are sorted in ascending order. We take the
          * first region which should be correct for most uses.
          */
         errflag = 1;
         while (fgets(buf, BUFSIZE, iomem)) {
             if (strstr(buf, ": System RAM")) {
                 clean_line(buf);
                 errflag = 0;
                 break;
             }
         }
         fclose(iomem);
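
One possible way to avoid picking the wrong region is sketched below. This
is only an illustration under assumptions, not code from crash or from any
version of the patch: the helper name kernel_ram_base() and the NO_KERNEL_RAM
sentinel are hypothetical. Instead of taking the first "System RAM" line, it
returns the start of the "System RAM" region that contains the "Kernel code"
entry, which is 4000200000 for both /proc/iomem listings quoted above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sentinel: no System RAM region holding the kernel was found. */
#define NO_KERNEL_RAM (~0ULL)

/*
 * Hypothetical helper (not part of crash): scan /proc/iomem-style text and
 * return the start address of the "System RAM" region that contains the
 * "Kernel code" child entry, or NO_KERNEL_RAM if there is none.
 */
unsigned long long
kernel_ram_base(const char *iomem_text)
{
    unsigned long long start, end;
    unsigned long long ram_start = 0, ram_end = 0;
    int have_ram = 0;
    const char *line = iomem_text;

    assert(iomem_text != NULL);

    while (*line) {
        const char *eol = strchr(line, '\n');
        size_t full = eol ? (size_t)(eol - line) : strlen(line);
        size_t len = full;
        char buf[256];

        /* Copy one line into a bounded, NUL-terminated buffer. */
        if (len >= sizeof(buf))
            len = sizeof(buf) - 1;
        memcpy(buf, line, len);
        buf[len] = '\0';

        /* sscanf skips the leading indentation of child entries. */
        if (sscanf(buf, "%llx-%llx", &start, &end) == 2) {
            if (strstr(buf, ": System RAM")) {
                /* Remember the most recent System RAM region. */
                ram_start = start;
                ram_end = end;
                have_ram = 1;
            } else if (strstr(buf, ": Kernel code") && have_ram &&
                       start >= ram_start && end <= ram_end) {
                /* This is the region the kernel image was loaded into. */
                return ram_start;
            }
        }

        /* Advance past this line and its newline, if any. */
        line += full;
        if (*line == '\n')
            line++;
    }
    return NO_KERNEL_RAM;
}
```

A caller would read /proc/iomem into a buffer and pass it in. The helper
relies on "Kernel code" being listed after the region that contains it, as in
the listings above, and it needs the real addresses to be visible in
/proc/iomem.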
 Dave
I made a fix in patch v4; please check.
Thanks,
Yueyi Li