On Tue, May 24, 2016 at 01:59:06PM -0400, Dave Anderson wrote:
----- Original Message -----
> Yet some issues, but ...
>
Hi Takahiro,
Here are my general comments on my testing of the v2 patch, followed
by a few comments in the patch itself.
Thank you for your quick review/testing, again.
First, the combination of the new memory map layout and KASLR is somewhat
confusing. I am testing your patch on a 4.6.0-0.rc7.git2.1.fc25 kernel
that has this configuration:
Right. I use "kaslr kernel" or "kaslr-enabled kernel" with two
meanings (almost interchangeably):
- a kernel with the new memory map, that is v4.6 or later
- a kernel capable of KASLR (configured with CONFIG_RANDOMIZE_BASE)
When we talked offline, you mentioned the possibility of backporting
KASLR to older kernels, so I hesitated to use "v4.6 or later".
But I now think we'd better use "v4.6 or later" for clarity.
config-arm64:# CONFIG_RANDOMIZE_BASE is not set
So KASLR doesn't really enter into the picture. But when bringing
up the crash session, it shows the "kaslr kernel" WARNING:
# ./crash
crash 7.1.5++
Copyright (C) 2002-2016 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
WARNING: kimage_voffset not identified for kaslr kernel
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "aarch64-unknown-linux-gnu"...
KERNEL: /usr/lib/debug/lib/modules/4.6.0-0.rc7.git2.1.fc25.aarch64/vmlinux
DUMPFILE: /dev/crash
CPUS: 8
DATE: Tue May 24 10:08:08 2016
UPTIME: 11 days, 18:32:41
LOAD AVERAGE: 0.17, 0.09, 0.12
TASKS: 197
NODENAME: apm-mustang-ev3-36.khw.lab.eng.bos.redhat.com
RELEASE: 4.6.0-0.rc7.git2.1.fc25.aarch64
VERSION: #1 SMP Thu May 12 13:28:43 UTC 2016
MACHINE: aarch64 (unknown Mhz)
MEMORY: 16 GB
PID: 7556
COMMAND: "crash"
TASK: fffffe00beb45400 [THREAD_INFO: fffffe00beb98000]
CPU: 7
STATE: TASK_RUNNING (ACTIVE)
crash>
Why show that WARNING in this case? I understand that it's not
stashed during early initialization:
If we don't know kimage_voffset, we can't analyze the contents of
memory (live or kdump).
crash> help -m
flags: 104000c5
(KSYMS_START|VM_L2_64K|VMEMMAP|IRQ_STACKS|MACHDEP_BT_TEXT|NEW_VMEMMAP)
... [ cut ] ...
memory map layout: new <--- BTW, this is redundant/not-a-member, and we've got NEW_VMEMMAP
VA_BITS: 42
userspace_top: 0000040000000000
page_offset: fffffe0000000000
vmalloc_start_addr: fffffc0008000000
vmalloc_end: fffffdff5ffeffff
modules_vaddr: fffffc0000000000
modules_end: fffffc0007ffffff
vmemmap_vaddr: fffffdff80000000
vmemmap_end: fffffdffffffffff
kimage_text: fffffc0008080000
kimage_end: fffffc0009070000
kimage_voffset: 0000000000000000 <-- available if kernel is not randomized
phys_offset: 4000000000
...
But it can be read:
crash> px kimage_voffset
kimage_voffset = $1 = 0xfffffbc008000000
crash>
So because it wasn't determined during session initialization,
it falls into the "no randomness" section of arm64_VTOP():
Exactly.
ulong
arm64_VTOP(ulong addr)
{
        if (!(machdep->flags & NEW_VMEMMAP) ||
            (addr >= machdep->machspec->page_offset)) {
                return machdep->machspec->phys_offset
                        + (addr - machdep->machspec->page_offset);
        } else {
                if (machdep->machspec->kimage_voffset)
                        return addr - machdep->machspec->kimage_voffset;
                else /* no randomness */
                        return machdep->machspec->phys_offset
                                + (addr - machdep->machspec->vmalloc_start_addr);
        }
}
That works, but if "kimage_voffset" can be read normally later on, perhaps it
should update "machdep->machspec->kimage_voffset" at that time?
It is possible, but the check in arm64_VTOP() would still be necessary.
There are some discrepancies with respect to the calculated addresses
and what is seen by looking at the reported memory map in the kernel log:
[ 0.000000] Virtual kernel memory layout:
[ 0.000000] modules : 0xfffffc0000000000 - 0xfffffc0008000000 ( 128 MB)
[ 0.000000] vmalloc : 0xfffffc0008000000 - 0xfffffdfedfff0000 ( 2043 GB)
[ 0.000000] .text : 0xfffffc0008080000 - 0xfffffc0008890000 ( 8256 KB)
.rodata : 0xfffffc0008890000 - 0xfffffc0008c10000 ( 3584 KB)
.init : 0xfffffc0008c10000 - 0xfffffc0008d50000 ( 1280 KB)
.data : 0xfffffc0008d50000 - 0xfffffc0008eaac00 ( 1387 KB)
[ 0.000000] vmemmap : 0xfffffdfee0000000 - 0xfffffdffe0000000 ( 4 GB maximum)
                      0xfffffdfee0000000 - 0xfffffdfee1000000 ( 16 MB actual)
[ 0.000000] fixed : 0xfffffdfffe7d0000 - 0xfffffdfffec00000 ( 4288 KB)
[ 0.000000] PCI I/O : 0xfffffdfffee00000 - 0xfffffdffffe00000 ( 16 MB)
[ 0.000000] memory : 0xfffffe0000000000 - 0xfffffe0400000000 ( 16384 MB)
Comparing it to the calculated values:
VA_BITS: 42
userspace_top: 0000040000000000
page_offset: fffffe0000000000 OK
vmalloc_start_addr: fffffc0008000000 OK
vmalloc_end: fffffdff5ffeffff ? (seems wrong)
modules_vaddr: fffffc0000000000 OK
modules_end: fffffc0007ffffff OK (-1)
vmemmap_vaddr: fffffdff80000000 ? (seems wrong)
vmemmap_end: fffffdffffffffff ? (seems wrong)
kimage_text: fffffc0008080000 OK
kimage_end: fffffc0009070000 OK
kimage_voffset: 0000000000000000
phys_offset: 4000000000
Starting with vmalloc_start_addr/vmalloc_end, if I run "kmem -v" to dump the
vmalloc allocations, there are some allocations that are below the
0xfffffc0008000000 start value shown in the log and by your calculation.
I'm not clear on what that means:
crash> kmem -v
VMAP_AREA VM_STRUCT ADDRESS RANGE SIZE
fffffe03daefd900 fffffe03daefd880 fffffc0000c10000 - fffffc0000c30000 131072
fffffe00bf2c3a00 fffffe00bf2c3980 fffffc0000c90000 - fffffc0000cb0000 131072
fffffe00bf2c7400 fffffe00bf2c7380 fffffc0000d50000 - fffffc0000d70000 131072
fffffe03dc76aa00 fffffe03dc76a980 fffffc0000d90000 - fffffc0000db0000 131072
fffffe03dc76be00 fffffe03dc76bd80 fffffc0000dd0000 - fffffc0000ee0000 1114112
fffffe03d90e7900 fffffe03d90e7b80 fffffc0000f40000 - fffffc0000fb0000 458752
fffffe00bec12f00 fffffe00bec12b00 fffffc0000fe0000 - fffffc0001000000 131072
fffffe00bf3a1100 fffffe00bf3a1080 fffffc0001020000 - fffffc0001050000 196608
fffffe00bf3a3280 fffffe00bf3a3200 fffffc0001070000 - fffffc0001090000 131072
fffffe03ddf7fc00 fffffe03ddf7a880 fffffc0001230000 - fffffc0001250000 131072
fffffe03fd508900 fffffe03fd508880 fffffc0008000000 - fffffc0008020000 131072
fffffe03fd509580 fffffe03fd508d00 fffffc0008020000 - fffffc0008040000 131072
fffffe03fd508a00 fffffe03fd508980 fffffc0008040000 - fffffc0008070000 196608
... [ cut ] ...
fffffe03d7235000 fffffe03d7233480 fffffc0014b30000 - fffffc0014b50000 131072
fffffe00b921e700 fffffe00b921df00 fffffc0014b70000 - fffffc0014b90000 131072
fffffe00b851eb00 fffffe00b8512900 fffffdfedfcf0000 - fffffdfedfe70000 1572864
fffffe03dca4ef00 fffffe03dca4ef80 fffffdfedfe70000 - fffffdfedfff0000 1572864
crash>
Although the end value of fffffdfedfff0000 does match up with the value in the
kernel log, your calculation is 2GB higher at fffffdff5ffeffff:
crash> eval fffffdff5ffeffff - 0xfffffdfedfff0000
hexadecimal: 7fffffff
decimal: 2147483647
octal: 17777777777
binary: 0000000000000000000000000000000001111111111111111111111111111111
crash> eval 7fffffff + 1
hexadecimal: 80000000 (2GB)
decimal: 2147483648
octal: 20000000000
binary: 0000000000000000000000000000000010000000000000000000000000000000
crash>
With respect to the vmemmap range, the vmemmap address in the log file is the
one that is displayed by "kmem -p":
crash> kmem -p
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
fffffdfee0000000 4000000000 fffffe03fd441950 2f 1 1002c
referenced,uptodate,lru,mappedtodisk
fffffdfee0000040 4000010000 fffffe03fd441950 30 1 1002c
referenced,uptodate,lru,mappedtodisk
fffffdfee0000080 4000020000 fffffe03fd441950 31 1 1002c
referenced,uptodate,lru,mappedtodisk
fffffdfee00000c0 4000030000 fffffe03fd441950 32 1 1002c
referenced,uptodate,lru,mappedtodisk
...
fffffdfee0ffff00 43fffc0000 0 0 1 4000000000000400
reserved
fffffdfee0ffff40 43fffd0000 0 0 1 4000000000000400
reserved
fffffdfee0ffff80 43fffe0000 0 0 1 4000000000000400
reserved
fffffdfee0ffffc0 43ffff0000 0 0 1 4000000000000400
reserved
crash>
And "kmem" alone recognizes it:
crash> kmem fffffdfee0000000
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
fffffdfee0000000 4000000000 fffffe03fd441950 2f 1 1002c
referenced,uptodate,lru,mappedtodisk
crash>
But your calculated vmemmap_vaddr above doesn't seem right:
crash> kmem fffffdff80000000
kmem: WARNING: cannot make virtual-to-physical translation: fffffdff80000000
fffffdff80000000: kernel virtual address not found in mem map
crash>
Do your numbers match up on a dump that actually has CONFIG_RANDOMIZE_BASE?
In my environment, the numbers above match up on a dump whether
CONFIG_RANDOMIZE_BASE is enabled or not.
Are you using a mainline kernel (final v4.6, not -rcX)?
Anyway, onto the patch...
>
> changes in v2:
> * Fixed build warnings
> * Moved ARM64_NEW_VMEMMAP to machdep->flags
> * Show additional kaslr-related parameters in arm64_dump_machdep_table()
> * Handle a VMCOREINFO, "NUMBER(kimage_voffset)"
>
> ===8<===
> >From 080a54ec232ac48ef2d8123cbedcf0f1fe27e391 Mon Sep 17 00:00:00 2001
> From: AKASHI Takahiro <takahiro.akashi(a)linaro.org>
> Date: Mon, 16 May 2016 17:31:55 +0900
> Subject: [PATCH v2] arm64: fix kernel memory map handling for kaslr-enabled
> kernel
>
> In kernel v4.6, Kernel ASLR (KASLR) is supported on arm64, and the start
> address of the kernel image can be randomized if CONFIG_RANDOMIZE_BASE is
> enabled.
> Even worse, the kernel image is no longer mapped in the linear mapping, but
> in the vmalloc area (i.e. below PAGE_OFFSET).
>
> Now, according to the kernel's memory.h, converting a virtual address to
> a physical address should be done like below:
>
> #define __virt_to_phys(x) ({ \
> phys_addr_t __x = (phys_addr_t)(x); \
> __x & BIT(VA_BITS - 1) ? (__x & ~PAGE_OFFSET) + PHYS_OFFSET : \
> (__x - kimage_voffset); })
>
> Please note that PHYS_OFFSET is no longer equal to the start address of
> the first usable memory block in SYSTEM RAM due to the fact mentioned
> above.
>
> This patch addresses this change and allows the crash utility to access
> memory contents with correct addresses.
> *However*, it is still rough-edged and we need more refinements.
>
> 1-1) On a live system, crash with this patch won't work because
> we currently have no way to know kimage_voffset.
>
> 1-2) For a core dump file, we can do simply:
> $ crash <vmlinux> <vmcore>
> as long as the file has "NUMBER(kimage_voffset)"
> (RELOC_AUTO|KASLR is automatically set.)
>
> I'm planning to add this enhancement in my next version of kexec/kdump
> patch, i.e. v17.
>
> 2) My current change breaks backward compatibility.
> Crash won't work with v4.5 or earlier kernels, partly because I changed
> the calculation of VA_BITS. (I think that the existing code is wrong.)
> So far I haven't had enough time to look into this issue as I'm focusing
> on KASLR support.
OK, but you still haven't shown how the current VA_BITS determination is
not a correct representation of CONFIG_ARM64_VA_BITS.
At least, the current VA_BITS doesn't work with my calculations, which
were derived directly from v4.6's "asm/memory.h".
> 3) Backtracing a 'panic'ed task fails:
> crash> bt
> PID: 999 TASK: ffffffe960081800 CPU: 0 COMMAND: "sh"
> bt: WARNING: corrupt prstatus? pstate=0x80000000, but no user frame found
> bt: WARNING: cannot determine starting stack frame for task ffffffe960081800
>
> frame->pc indicates that it is a kernel address, so obviously something
> is wrong. I'm digging into it.
Can you add a debug statement that displays frame->pc, frame->sp, and frame->fp?
I've already identified the cause.
pt_regs->pstate was mistakenly set from "spsr_el1" in panic().
But I'm having difficulty fixing the problem, as we have no direct way
to retrieve the value of the *current* PSTATE.
> Signed-off-by: AKASHI Takahiro <takahiro.akashi(a)linaro.org>
> ---
> arm64.c | 217 ++++++++++++++++++++++++++++++++++++++++++++++++--------------
> defs.h | 24 +++++--
> main.c | 7 +-
> symbols.c | 10 +--
> 4 files changed, 196 insertions(+), 62 deletions(-)
>
> diff --git a/arm64.c b/arm64.c
> index 34c8c59..6b97093 100644
> --- a/arm64.c
> +++ b/arm64.c
> @@ -72,6 +72,21 @@ static int arm64_get_crash_notes(void);
> static void arm64_calc_VA_BITS(void);
> static int arm64_is_uvaddr(ulong, struct task_context *);
>
> +ulong
> +arm64_VTOP(ulong addr)
> +{
> + if (!(machdep->flags & NEW_VMEMMAP) ||
> + (addr >= machdep->machspec->page_offset)) {
> + return machdep->machspec->phys_offset
> + + (addr - machdep->machspec->page_offset);
> + } else {
> + if (machdep->machspec->kimage_voffset)
> + return addr - machdep->machspec->kimage_voffset;
> + else /* no randomness */
> + return machdep->machspec->phys_offset
> + + (addr - machdep->machspec->vmalloc_start_addr);
> + }
> +}
>
> /*
> * Do all necessary machine-specific setup here. This is called several
> times
> @@ -81,6 +96,7 @@ void
> arm64_init(int when)
> {
> ulong value;
> + char *string;
> struct machine_specific *ms;
>
> #if defined(__x86_64__)
> @@ -102,9 +118,34 @@ arm64_init(int when)
> if (machdep->cmdline_args[0])
> arm64_parse_cmdline_args();
> machdep->flags |= MACHDEP_BT_TEXT;
> +
> + ms = machdep->machspec;
> + if (!ms->kimage_voffset &&
> + (string = pc->read_vmcoreinfo("NUMBER(kimage_voffset)"))) {
> + ms->kimage_voffset = htol(string, QUIET, NULL);
> + free(string);
> + }
It doesn't make sense to call pc->read_vmcoreinfo() if !DUMPFILE().
Do you think so?
By default, pc->read_vmcoreinfo is set to no_vmcoreinfo.
So it doesn't hurt anything.
> +
> + if (ms->kimage_voffset) {
> + machdep->flags |= NEW_VMEMMAP;
> +
> + /* if not specified in command line */
> + if (!kt->relocate && !(kt->flags2 & (RELOC_AUTO|KASLR)))
> + kt->flags2 |= (RELOC_AUTO|KASLR);
> + }
> +
If CONFIG_RANDOMIZE_BASE is NOT set, should RELOC_AUTO|KASLR be set? They
don't get set on my live system.
Not necessarily, but the current implementation of derive_kaslr_offset()
just works even in this case.
(This is the reason that I don't think we need a VMCOREINFO for
CONFIG_RANDOMIZE_BASE.)
> break;
>
> case PRE_GDB:
> + /* This check is somewhat redundant */
> + if (kernel_symbol_exists("kimage_voffset")) {
> + machdep->flags |= NEW_VMEMMAP;
> +
> + if (!machdep->machspec->kimage_voffset)
> + error(WARNING,
> + "kimage_voffset not identified for kaslr kernel\n");
> + }
> +
As mentioned in the general discussion above, the WARNING above makes
no sense if CONFIG_RANDOMIZE_BASE is not configured.
What if I change the message to say "for v4.6 or later kernels"?
I'm now thinking that CONFIG_RANDOMIZE_BASE (and maybe others, like KASAN?)
should be passed in the dumpfile with VMCOREINFO_CONFIG().
Regarding KASAN, we only need to check for the existence of the "kasan_init"
symbol, as you recommended before :)
> if (!machdep->pagesize) {
> /*
> * Kerneldoc Documentation/arm64/booting.txt describes
> @@ -160,16 +201,35 @@ arm64_init(int when)
> machdep->pagemask = ~((ulonglong)machdep->pageoffset);
>
> arm64_calc_VA_BITS();
> - machdep->machspec->page_offset = ARM64_PAGE_OFFSET;
> + ms = machdep->machspec;
> + ms->page_offset = ARM64_PAGE_OFFSET;
> + /* FIXME: idmap for NEW_VMEMMAP */
> machdep->identity_map_base = ARM64_PAGE_OFFSET;
> - machdep->machspec->userspace_top = ARM64_USERSPACE_TOP;
> - machdep->machspec->modules_vaddr = ARM64_MODULES_VADDR;
> - machdep->machspec->modules_end = ARM64_MODULES_END;
> - machdep->machspec->vmalloc_start_addr = ARM64_VMALLOC_START;
> - machdep->machspec->vmalloc_end = ARM64_VMALLOC_END;
> - machdep->kvbase = ARM64_VMALLOC_START;
> - machdep->machspec->vmemmap_vaddr = ARM64_VMEMMAP_VADDR;
> - machdep->machspec->vmemmap_end = ARM64_VMEMMAP_END;
> + machdep->kvbase = ARM64_VA_START;
> + ms->userspace_top = ARM64_USERSPACE_TOP;
> + if (machdep->flags & NEW_VMEMMAP) {
> + struct syment *sp;
> +
> + sp = kernel_symbol_search("_text");
> + ms->kimage_text = (sp ? sp->value : 0);
> + sp = kernel_symbol_search("_end");
> + ms->kimage_end = (sp ? sp->value : 0);
> +
> + ms->modules_vaddr = ARM64_VA_START;
> + if (kernel_symbol_exists("kasan_init"))
> + ms->modules_vaddr += ARM64_KASAN_SHADOW_SIZE;
> + ms->modules_end = ms->modules_vaddr
> + + ARM64_MODULES_VSIZE -1;
> +
> + ms->vmalloc_start_addr = ms->modules_end + 1;
> + } else {
> + ms->modules_vaddr = ARM64_PAGE_OFFSET - MEGABYTES(64);
> + ms->modules_end = ARM64_PAGE_OFFSET - 1;
> + ms->vmalloc_start_addr = ARM64_VA_START;
> + }
> + ms->vmalloc_end = ARM64_VMALLOC_END;
> + ms->vmemmap_vaddr = ARM64_VMEMMAP_VADDR;
> + ms->vmemmap_end = ARM64_VMEMMAP_END;
>
> switch (machdep->pagesize)
> {
> @@ -232,8 +292,6 @@ arm64_init(int when)
> machdep->stacksize = ARM64_STACK_SIZE;
> machdep->flags |= VMEMMAP;
>
> - arm64_calc_phys_offset();
> -
> machdep->uvtop = arm64_uvtop;
> machdep->kvtop = arm64_kvtop;
> machdep->is_kvaddr = generic_is_kvaddr;
> @@ -262,6 +320,10 @@ arm64_init(int when)
> machdep->dumpfile_init = NULL;
> machdep->verify_line_number = NULL;
> machdep->init_kernel_pgd = arm64_init_kernel_pgd;
> +
> + /* use machdep parameters */
> + arm64_calc_phys_offset();
> +
This doesn't make any sense to me. I understand you've done it because
you are now doing a readmem() in arm64_calc_phys_offset(). But readmem()
calls are never done this early. And in fact, on the live system the
readmem() in arm64_calc_phys_offset() fails like so (running "crash -d4"):
<readmem: fffffc0008d70bb0, KVADDR, "memstart_addr", 8, (ROE|Q), 3ffcc2842c0>
<read_memory_device: addr: fffffc0008d70bb0 paddr: d70bb0 cnt: 8>
crash: read error: kernel virtual address: fffffc0008d70bb0 type: "memstart_addr"
It fails above because it needs the phys_base in order to do the readmem(). During
runtime it has it:
This is because we don't know kimage_voffset on a live system
(and that is why I mentioned this issue in my commit log).
As far as the kdump case is concerned, readmem() at this point just works
for any address in the kernel image. This use of readmem() is intentional.
crash> set debug 4
debug: 4
crash> rd memstart_addr
<addr: fffffc0008d70bb0 count: 1 flag: 490 (KVADDR)>
<readmem: fffffc0008d70bb0, KVADDR, "64-bit KVADDR", 8, (FOE), 3ffcbfbfc28>
<read_memory_device: addr: fffffc0008d70bb0 paddr: 4000d70bb0 cnt: 8>
fffffc0008d70bb0: 0000004000000000 ....@...
crash>
So without kimage_voffset, how would that readmem() possibly work?
You need the *contents* of memstart_addr in order to *read* memstart_addr.
Now, if kimage_voffset were passed into the vmcore, then I guess you could
read it. But why do it that way? That's what diskdump_get_phys_base() and
kdump_get_phys_base() are supposed to do.
Because I believe that this is the *common* place to do it for either
diskdump or kdump (or even a live system, if we know kimage_voffset).
> break;
>
> case POST_GDB:
> @@ -409,6 +471,8 @@ arm64_dump_machdep_table(ulong arg)
> fprintf(fp, "%sIRQ_STACKS", others++ ? "|" : "");
> if (machdep->flags & MACHDEP_BT_TEXT)
> fprintf(fp, "%sMACHDEP_BT_TEXT", others++ ? "|" :
"");
> + if (machdep->flags & NEW_VMEMMAP)
> + fprintf(fp, "%sNEW_VMEMMAP", others++ ? "|" : "");
> fprintf(fp, ")\n");
>
> fprintf(fp, " kvbase: %lx\n", machdep->kvbase);
> @@ -494,6 +558,8 @@ arm64_dump_machdep_table(ulong arg)
> ms = machdep->machspec;
>
> fprintf(fp, " machspec: %lx\n", (ulong)ms);
> + fprintf(fp, " memory map layout: %s\n",
> + (machdep->flags & NEW_VMEMMAP) ? "new" : "old");
As before, these internal "help" structure dumps just show existing
structure fields, and NEW_VMEMMAP is already shown in the flags.
I put it in as useful information for people who might have concerns
about those odd numbers/kernel layout.
But if you don't like it, I will remove it.
> fprintf(fp, " VA_BITS: %ld\n", ms->VA_BITS);
> fprintf(fp, " userspace_top: %016lx\n", ms->userspace_top);
> fprintf(fp, " page_offset: %016lx\n", ms->page_offset);
> @@ -503,6 +569,11 @@ arm64_dump_machdep_table(ulong arg)
> fprintf(fp, " modules_end: %016lx\n", ms->modules_end);
> fprintf(fp, " vmemmap_vaddr: %016lx\n", ms->vmemmap_vaddr);
> fprintf(fp, " vmemmap_end: %016lx\n", ms->vmemmap_end);
> + if (machdep->flags & NEW_VMEMMAP) {
> + fprintf(fp, " kimage_text: %016lx\n", ms->kimage_text);
> + fprintf(fp, " kimage_end: %016lx\n", ms->kimage_end);
> + fprintf(fp, " kimage_voffset: %016lx\n", ms->kimage_voffset);
> + }
> fprintf(fp, " phys_offset: %lx\n", ms->phys_offset);
> fprintf(fp, "__exception_text_start: %lx\n",
ms->__exception_text_start);
> fprintf(fp, " __exception_text_end: %lx\n",
ms->__exception_text_end);
> @@ -543,6 +614,42 @@ arm64_dump_machdep_table(ulong arg)
> }
> }
>
> +static int
> +arm64_parse_machdep_arg_l(char *argstring, char *param, ulong *value)
> +{
> + int len;
> + int megabytes = FALSE;
> + char *p;
> +
> + len = strlen(param);
> + if (!STRNEQ(argstring, param) || (argstring[len] != '='))
> + return FALSE;
> +
> + if ((LASTCHAR(argstring) == 'm') ||
> + (LASTCHAR(argstring) == 'M')) {
> + LASTCHAR(argstring) = NULLCHAR;
> + megabytes = TRUE;
> + }
> +
> + p = argstring + len + 1;
> + if (strlen(p)) {
> + int flags = RETURN_ON_ERROR | QUIET;
> + int err = 0;
> +
> + if (megabytes) {
> + *value = dtol(p, flags, &err);
> + if (!err)
> + *value = MEGABYTES(*value);
> + } else {
> + *value = htol(p, flags, &err);
> + }
> +
> + if (!err)
> + return TRUE;
> + }
> +
> + return FALSE;
> +}
>
> /*
> * Parse machine dependent command line arguments.
> @@ -554,11 +661,10 @@ arm64_dump_machdep_table(ulong arg)
> static void
> arm64_parse_cmdline_args(void)
> {
> - int index, i, c, err;
> + int index, i, c;
> char *arglist[MAXARGS];
> char buf[BUFSIZE];
> char *p;
> - ulong value = 0;
>
> for (index = 0; index < MAX_MACHDEP_ARGS; index++) {
> if (!machdep->cmdline_args[index])
> @@ -580,39 +686,23 @@ arm64_parse_cmdline_args(void)
> c = parse_line(buf, arglist);
>
> for (i = 0; i < c; i++) {
> - err = 0;
> -
> - if (STRNEQ(arglist[i], "phys_offset=")) {
> - int megabytes = FALSE;
> - int flags = RETURN_ON_ERROR | QUIET;
> -
> - if ((LASTCHAR(arglist[i]) == 'm') ||
> - (LASTCHAR(arglist[i]) == 'M')) {
> - LASTCHAR(arglist[i]) = NULLCHAR;
> - megabytes = TRUE;
> - }
> -
> - p = arglist[i] + strlen("phys_offset=");
> - if (strlen(p)) {
> - if (megabytes)
> - value = dtol(p, flags, &err);
> - else
> - value = htol(p, flags, &err);
> - }
> -
> - if (!err) {
> - if (megabytes)
> - value = MEGABYTES(value);
> -
> - machdep->machspec->phys_offset = value;
> -
> - error(NOTE,
> - "setting phys_offset to: 0x%lx\n\n",
> - machdep->machspec->phys_offset);
> + if (arm64_parse_machdep_arg_l(arglist[i],
> + "phys_offset",
> + &machdep->machspec->phys_offset)) {
> + error(NOTE,
> + "setting phys_offset to: 0x%lx\n\n",
> + machdep->machspec->phys_offset);
> +
> + machdep->flags |= PHYS_OFFSET;
> + continue;
> + } else if (arm64_parse_machdep_arg_l(arglist[i],
> + "kimage_voffset",
> + &machdep->machspec->kimage_voffset)) {
> + error(NOTE,
> + "setting kimage_voffset to: 0x%lx\n\n",
> + machdep->machspec->kimage_voffset);
>
> - machdep->flags |= PHYS_OFFSET;
> - continue;
> - }
> + continue;
> }
>
> error(WARNING, "ignoring --machdep option: %s\n",
> @@ -627,10 +717,20 @@ arm64_calc_phys_offset(void)
> {
> struct machine_specific *ms = machdep->machspec;
> ulong phys_offset;
> + struct syment *sp;
> + ulong value;
>
> if (machdep->flags & PHYS_OFFSET) /* --machdep override */
> return;
>
> + sp = kernel_symbol_search("memstart_addr");
> + if (sp && readmem(sp->value, KVADDR, (char *)&value, sizeof(value),
> + "memstart_addr", QUIET|RETURN_ON_ERROR)) {
> + ms->phys_offset = value;
> + machdep->flags |= PHYS_OFFSET;
> + return;
> + }
> +
Also, I may be missing something, but the PHYS_OFFSET flag means that it was
passed in via --machdep, and is only used in this function to avoid any further
attempts to figure out what it is. Here you are setting the flag for no particular
reason that I can see. Why?
Well, I don't remember very well. If not necessary, I will remove it.
But again, doing a readmem() call this early is setting a bad precedent --
they have *always* been delayed until after gdb_session_init() in main_loop().
And the work of determining phys_base should be done by the dumpfile call-out
functions that already exist.
> /*
> * Next determine suitable value for phys_offset. User can override this
> * by passing valid '--machdep phys_offset=<addr>' option.
> @@ -2377,6 +2477,11 @@ arm64_IS_VMALLOC_ADDR(ulong vaddr)
> {
> struct machine_specific *ms = machdep->machspec;
>
> + if ((machdep->flags & NEW_VMEMMAP) &&
> + (vaddr >= machdep->machspec->kimage_text) &&
> + (vaddr <= machdep->machspec->kimage_end))
> + return FALSE;
> +
> return ((vaddr >= ms->vmalloc_start_addr && vaddr <=
> ms->vmalloc_end) ||
> ((machdep->flags & VMEMMAP) &&
> (vaddr >= ms->vmemmap_vaddr && vaddr <=
ms->vmemmap_end))
> ||
> @@ -2407,7 +2512,11 @@ arm64_calc_VA_BITS(void)
>
> for (bitval = highest_bit_long(value); bitval; bitval--) {
> if ((value & (1UL << bitval)) == 0) {
> +#if 1
> + machdep->machspec->VA_BITS = bitval + 1;
> +#else /* FIXME: bug? */
> machdep->machspec->VA_BITS = bitval + 2;
> +#endif
> break;
> }
> }
Again, a non-starter until the above is resolved -- but you know that already...
Just a reminder for now.
> @@ -2459,10 +2568,22 @@ arm64_calc_virtual_memory_ranges(void)
> break;
> }
>
> - vmemmap_size = ALIGN((1UL << (ms->VA_BITS - machdep->pageshift)) * SIZE(page), PUD_SIZE);
> + if (machdep->flags & NEW_VMEMMAP)
> +#define STRUCT_PAGE_MAX_SHIFT 6
> + vmemmap_size = 1UL << (ms->VA_BITS - machdep->pageshift - 1
> + + STRUCT_PAGE_MAX_SHIFT);
> + else
> + vmemmap_size = ALIGN((1UL << (ms->VA_BITS - machdep->pageshift)) * SIZE(page), PUD_SIZE);
> +
> vmalloc_end = (ms->page_offset - PUD_SIZE - vmemmap_size - SZ_64K);
> - vmemmap_start = vmalloc_end + SZ_64K;
> - vmemmap_end = vmemmap_start + vmemmap_size;
> +
> + if (machdep->flags & NEW_VMEMMAP) {
> + vmemmap_start = ms->page_offset - vmemmap_size;
> + vmemmap_end = ms->page_offset;
> + } else {
> + vmemmap_start = vmalloc_end + SZ_64K;
> + vmemmap_end = vmemmap_start + vmemmap_size;
> + }
>
> ms->vmalloc_end = vmalloc_end - 1;
> ms->vmemmap_vaddr = vmemmap_start;
> diff --git a/defs.h b/defs.h
> index a09fa9a..2bbb55b 100644
> --- a/defs.h
> +++ b/defs.h
> @@ -2844,8 +2844,8 @@ typedef u64 pte_t;
>
> #define PTOV(X) \
> ((unsigned long)(X)-(machdep->machspec->phys_offset)+(machdep->machspec->page_offset))
> -#define VTOP(X) \
> - ((unsigned long)(X)-(machdep->machspec->page_offset)+(machdep->machspec->phys_offset))
> +
> +#define VTOP(X) arm64_VTOP((ulong)(X))
>
> #define USERSPACE_TOP (machdep->machspec->userspace_top)
> #define PAGE_OFFSET (machdep->machspec->page_offset)
> @@ -2938,18 +2938,23 @@ typedef signed int s32;
> #define VM_L3_4K (0x10)
> #define KDUMP_ENABLED (0x20)
> #define IRQ_STACKS (0x40)
> +#define NEW_VMEMMAP (0x80)
>
> /*
> * sources: Documentation/arm64/memory.txt
> * arch/arm64/include/asm/memory.h
> * arch/arm64/include/asm/pgtable.h
> */
> -
> -#define ARM64_PAGE_OFFSET ((0xffffffffffffffffUL) << (machdep->machspec->VA_BITS - 1))
> +#define ARM64_VA_START ((0xffffffffffffffffUL) \
> + << machdep->machspec->VA_BITS)
> +#define ARM64_PAGE_OFFSET ((0xffffffffffffffffUL) \
> + << (machdep->machspec->VA_BITS - 1))
> #define ARM64_USERSPACE_TOP ((1UL) << machdep->machspec->VA_BITS)
> -#define ARM64_MODULES_VADDR (ARM64_PAGE_OFFSET - MEGABYTES(64))
> -#define ARM64_MODULES_END (ARM64_PAGE_OFFSET - 1)
> -#define ARM64_VMALLOC_START ((0xffffffffffffffffUL) << machdep->machspec->VA_BITS)
> +
> +/* only used for v4.6 or later */
> +#define ARM64_MODULES_VSIZE MEGABYTES(128)
> +#define ARM64_KASAN_SHADOW_SIZE (1UL << (machdep->machspec->VA_BITS - 3))
> +
> /*
> * The following 3 definitions are the original values, but are obsolete
> * for 3.17 and later kernels because they are now build-time calculations.
> @@ -3028,6 +3033,10 @@ struct machine_specific {
> ulong kernel_flags;
> ulong irq_stack_size;
> ulong *irq_stacks;
> + /* only needed for kaslr-enabled kernel */
> + ulong kimage_voffset;
> + ulong kimage_text;
> + ulong kimage_end;
> };
The comment is wrong -- the kimage_text and kimage_end values are needed for
NEW_VMEMMAP kernels that are NOT configured with CONFIG_RANDOMIZE_BASE.
Unless you consider my live system "kaslr-enabled"?
I will change it to "for v4.6 or later".
I have limited (or no) time to investigate the live-system case, so if
you have any suggestions or your own patch, that would be very much appreciated.
Thanks,
-Takahiro AKASHI
>
> struct arm64_stackframe {
> @@ -5385,6 +5394,7 @@ void unwind_backtrace(struct bt_info *);
> #ifdef ARM64
> void arm64_init(int);
> void arm64_dump_machdep_table(ulong);
> +ulong arm64_VTOP(ulong);
> int arm64_IS_VMALLOC_ADDR(ulong);
> ulong arm64_swp_type(ulong);
> ulong arm64_swp_offset(ulong);
> diff --git a/main.c b/main.c
> index 821bb4e..3f24908 100644
> --- a/main.c
> +++ b/main.c
> @@ -227,9 +227,10 @@ main(int argc, char **argv)
> optarg);
> }
> } else if (STREQ(long_options[option_index].name, "kaslr")) {
> - if (!machine_type("X86_64"))
> - error(INFO, "--kaslr only valid "
> - "with X86_64 machine type.\n");
> + if (!machine_type("X86_64") &&
> + !machine_type("ARM64"))
> + error(INFO, "--kaslr not valid "
> + "with this machine type.\n");
> else if (STREQ(optarg, "auto"))
> kt->flags2 |= (RELOC_AUTO|KASLR);
> else {
> diff --git a/symbols.c b/symbols.c
> index a8d3563..df27793 100644
> --- a/symbols.c
> +++ b/symbols.c
> @@ -593,7 +593,8 @@ kaslr_init(void)
> {
> char *string;
>
> - if (!machine_type("X86_64") || (kt->flags & RELOC_SET))
> + if ((!machine_type("X86_64") && !machine_type("ARM64")) ||
> + (kt->flags & RELOC_SET))
> return;
>
> /*
> @@ -712,7 +713,7 @@ store_symbols(bfd *abfd, int dynamic, void *minisyms,
> long symcount,
> if (machine_type("X86")) {
> if (!(kt->flags & RELOC_SET))
> kt->flags |= RELOC_FORCE;
> - } else if (machine_type("X86_64")) {
> + } else if (machine_type("X86_64") || machine_type("ARM64")) {
> if ((kt->flags2 & RELOC_AUTO) && !(kt->flags & RELOC_SET))
> derive_kaslr_offset(abfd, dynamic, from,
> fromend, size, store);
> @@ -783,7 +784,8 @@ store_sysmap_symbols(void)
> error(FATAL, "symbol table namespace malloc: %s\n",
> strerror(errno));
>
> - if (!machine_type("X86") && !machine_type("X86_64"))
> + if (!machine_type("X86") && !machine_type("X86_64") &&
> + !machine_type("ARM64"))
> kt->flags &= ~RELOC_SET;
>
> first = 0;
> @@ -4681,7 +4683,7 @@ value_search(ulong value, ulong *offset)
> if ((sp = machdep->value_to_symbol(value, offset)))
> return sp;
>
> - if (IS_VMALLOC_ADDR(value))
> + if (IS_VMALLOC_ADDR(value))
> goto check_modules;
>
> if ((sp = symval_hash_search(value)) == NULL)
> --
> 2.8.1
>
> --
Looking better, still a ways to go...
Thanks,
Dave
--
Crash-utility mailing list
Crash-utility(a)redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility
--
Thanks,
-Takahiro AKASHI