[PATCH] Fix typo for 'bf -FF' command
by Austin Kim
When we use the 'help bt' command, the usage text for 'bt' is printed as below.
crash> help bt
NAME
bt - backtrace
...
crash> bf -FF
...
#4 [ffff810072b47f10] vfs_write at ffffffff800789d8
ffff810072b47f18: [ffff81007e020380:files_cache] [ffff81007e2c2880:filp]
ffff810072b47f28: 0000000000000002 fffffffffffffff7
ffff810072b47f38: 00002b141825d000 sys_write+69
#5 [ffff810072b47f40] sys_write at ffffffff80078f75
But the 'bf -FF' example is misleading, because 'bf' is not a valid
command and running it fails as below.
crash> bf -FF 1
crash: command not found: bf
But 'bt -FF 1' shows valid output.
crash> bt -FF 1
PID: 1 TASK: cf932d40 CPU: 0 COMMAND: "systemd"
#0 [<c0c1609c>] (__schedule) from [<c0c162c0>]
[PC: c0c1609c LR: c0c162c0 SP: cf92fe18 SIZE: 72]
cf92fe18: 40060093 _end+238341700 _end+239562316 trace_buffer_unlock_commit_regs+280
cf92fe28: 001c754b schedule+132 schedule+124 00000004
cf92fe38: 001c754b 00000000 __stack_chk_guard 00000000
cf92fe48: 00000000 80060013 _end+239554564 00000000
cf92fe58: _end+239562340 schedule+132
#1 [<c0c162c0>] (schedule) from [<c0c1ab5c>]
[PC: c0c162c0 LR: c0c1ab5c SP: cf92fe60 SIZE: 8]
cf92fe60: _end+239562452 schedule_hrtimeout_range_clock+300
So fix the 'bf -FF' typo in the help text as below to avoid confusion.
Signed-off-by: Austin Kim <austindh.kim(a)gmail.com>
---
help.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/help.c b/help.c
index 2b2285b..934d8bc 100644
--- a/help.c
+++ b/help.c
@@ -2133,7 +2133,7 @@ char *help_bt[] = {
" ffff810072b47f38: 00002b141825d000 sys_write+69 ",
" #5 [ffff810072b47f40] sys_write at ffffffff80078f75",
" ...",
-" %s> bf -FF",
+" %s> bt -FF",
" ...",
" #4 [ffff810072b47f10] vfs_write at ffffffff800789d8",
" ffff810072b47f18: [ffff81007e020380:files_cache] [ffff81007e2c2880:filp]",
--
2.6.2
5 years
[PATCH] crash/arm64: Determine vabits_actual value from 'TCR_EL1.T1SZ' value in vmcoreinfo
by Bhupesh Sharma
I have recently sent a kernel patch upstream to add 'TCR_EL1.T1SZ' to
vmcoreinfo for arm64 (see [0]), instead of VA_BITS_ACTUAL.
'crash' can read the 'TCR_EL1.T1SZ' value from vmcoreinfo
[which indicates the size offset of the memory region addressed by
TTBR1_EL1] and use it to determine the vabits_actual
value.
[0].http://lists.infradead.org/pipermail/kexec/2019-November/023962.html
Cc: Dave Anderson <anderson(a)redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi(a)linaro.org>
Cc: Prabhakar Kushwaha <prabhakar.pkin(a)gmail.com>
Cc: crash-utility(a)redhat.com
Signed-off-by: Bhupesh Sharma <bhsharma(a)redhat.com>
---
arm64.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/arm64.c b/arm64.c
index af7147d24e20..083491331985 100644
--- a/arm64.c
+++ b/arm64.c
@@ -3856,8 +3856,17 @@ arm64_calc_VA_BITS(void)
} else if (ACTIVE())
error(FATAL, "cannot determine VA_BITS_ACTUAL: please use /proc/kcore\n");
else {
- if ((string = pc->read_vmcoreinfo("NUMBER(VA_BITS_ACTUAL)"))) {
- value = atol(string);
+ if ((string = pc->read_vmcoreinfo("NUMBER(tcr_el1_t1sz)"))) {
+ /* See ARMv8 ARM for the description of
+ * TCR_EL1.T1SZ and how it can be used
+ * to calculate the vabits_actual
+ * supported by underlying kernel.
+ *
+ * Basically:
+ * vabits_actual = 64 - T1SZ;
+ */
+ value = 64 - strtoll(string, NULL, 0);
+ fprintf(fp, "vmcoreinfo : vabits_actual: %ld\n", value);
free(string);
machdep->machspec->VA_BITS_ACTUAL = value;
machdep->machspec->VA_BITS = value;
--
2.7.4
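The calculation described in the patch above can be sketched in isolation. This is a minimal illustration, not code from crash: the helper name `vabits_from_t1sz` and its range check are assumptions added here, and only the relation vabits_actual = 64 - T1SZ comes from the patch.

```c
#include <stdlib.h>

/*
 * Hypothetical helper (not part of crash): derive vabits_actual from the
 * string value of a NUMBER(tcr_el1_t1sz) vmcoreinfo entry.
 * Returns -1 if the string holds no number or T1SZ is outside 0..63
 * (T1SZ is a 6-bit field; the range check is an assumption of this sketch).
 */
static long vabits_from_t1sz(const char *string)
{
	char *end;
	long long t1sz = strtoll(string, &end, 0);	/* base 0: decimal or 0x hex */

	if (end == string || t1sz < 0 || t1sz > 63)
		return -1;

	return 64 - (long)t1sz;				/* vabits_actual = 64 - T1SZ */
}
```

For example, a T1SZ string of "16" would give 48 and "0x19" would give 39, while a non-numeric string gives -1.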
5 years
Undeliverable email sent to crash-utility@redhat.com
by Dave Anderson
A new problem has arisen that affects all Red Hat external mailing lists. It is
related to a DMARC policy that our security team recently changed. As a result,
the moderators of several Red Hat mailing lists (including me) have started
seeing posts to their mailing lists rejected as Undeliverable. Multiple internal
support tickets have been filed in hopes of a resolution, but there is no
workaround that I am aware of (except for some kind of mailing list
configuration that needs to be done at the sender's site, which is
unacceptable).
Hopefully it will be fixed soon. If you receive such a response, please
re-send it and cc: anderson(a)redhat.com.
The response looks like this:
us-smtp-1.mimecast.com rejected your message to the following email addresses:
Discussion list for crash utility usage, maintenance and development (crash-utility(a)redhat.com)
Your message couldn't be delivered. It appears that the email address you sent your message to wasn't found at the destination domain, or the recipient's mailbox is unavailable. The email address might be misspelled or it might not exist. Try to fix the problem by doing one or more of the following:
Send the message again. Before you do, delete and retype the complete address. If your email program automatically suggests an address to use don't select it.
Clear the recipient Auto-Complete List entry in your email program by following the steps in this article. Then resend the message, but before you do, delete and retype the complete address. If your email program suggests an address to use don't select it.
Contact the recipient by some other means (by phone, for example) to confirm you're using the right address. Ask them if they've set up an email forwarding rule that could be forwarding your message to an incorrect address.
If you're still unable to fix the problem, ask the recipient to tell their email admin about the problem, and give them the server that reported the error below.
For Email Admins
When Office 365 tried to send the message, the external email server returned the error below. This error was reported by an email server outside Office 365, and if the sender is unable to fix the problem by correcting the recipient's email address or clearing the Auto-Complete List entry, then it's likely a problem that only the recipient's email admin can fix.
Check the error for information about where the problem is happening. For example, look for a domain name. The domain name will tell you which organization was responsible for the error. The recipient's email server could be causing the problem, or it could be due to a third-party service that your organization or the recipient's organization is using to process or filter email messages.
If you can't fix the problem, contact the responsible party's email admin. This could be the recipient's email admin, your smart host service admin, or someone similar. Give them the error and the name of the server that reported the error to help them troubleshoot the issue.
Unfortunately, Office 365 support is unlikely to be able to help with these kinds of externally reported errors.
us-smtp-1.mimecast.com gave this error:
Remote server returned unknown recipient or mailbox unavailable -> 550 Invalid Recipient - https://community.mimecast.com/docs/DOC-1369#550 [vVcOhPBXNFukJRNZ9UedSA.us264]
5 years
Re: [Crash-utility] Fix for the determination of the ARM64 page size
by Dave Anderson
----- Original Message -----
> Hi Dave and yueyi,
> I understand what you mean. Following your suggestions above, I made the changes in patch v2.
Looks good -- queued for crash-7.2.8:
https://github.com/crash-utility/crash/commit/babd7ae62d4e8fd6f93fd30b880...
Thanks,
Dave
>
> Best regards,
> Qiwu
>
>
> -----Original Message-----
> From: Yueyi Li <liyueyi(a)live.com>
> Sent: Friday, November 15, 2019 12:17 AM
> To: Discussion list for crash utility usage, maintenance and development
> <crash-utility(a)redhat.com>; 陈启武 <chenqiwu(a)xiaomi.com>
> Subject: [External Mail]Re: [Crash-utility] Fix for the determination of the
> ARM64 page size
>
>
>
> On 2019/11/14 22:14, Dave Anderson wrote:
> >
> >
> > ----- Original Message -----
> >> Hi Dave,
> >> Since linux 4.4 and later kernels will always be able to determine
> >> the page size by reading the kernel flags (if there is no
> >> vmcoreinfo), I agree checking for (THIS_KERNEL_VERSION < LINUX(4,4,0)) is
> >> more reasonable.
> Hi Qiwu,
>
> Did you notice this line?
>
> if (!machdep->pagesize &&
> kernel_symbol_exists("swapper_pg_dir") &&
> kernel_symbol_exists("idmap_pg_dir")) {
>
> That means "pagesize" could not be read from VMCOREINFO or the kernel image
> header, so the kernel version must be earlier than Linux 4.4 if this code
> section is executed. So just changing it back should be OK.
>
> Besides, I can only see your message as quoted by Dave. Could you please add
> the mailing list crash-utility(a)redhat.com to the 'CC' list, or just send mail
> to the mailing list for any discussion?
>
> Thanks,
> Yueyi
> >
> > Did you finish reading my response from yesterday?
> >
> > There is no reason to check for (THIS_KERNEL_VERSION < LINUX(4,4,0)),
> > because the code section will *only* be executed if the kernel is earlier
> > than Linux 4.4.
> >
> > Again: the "else" section is dead code because it can never be executed:
> >
> > + if (THIS_KERNEL_VERSION < LINUX(4,16,0)) {
> > + value = symbol_value("swapper_pg_dir") -
> > + symbol_value("idmap_pg_dir");
> > + } else {
> > + if (kernel_symbol_exists("tramp_pg_dir"))
> > + value = symbol_value("tramp_pg_dir");
> > + else if (kernel_symbol_exists("reserved_ttbr0"))
> > + value = symbol_value("reserved_ttbr0");
> > + else
> > + value = symbol_value("swapper_pg_dir");
> > +
> > + value -= symbol_value("idmap_pg_dir");
> > + }
> >
> > You can just use "swapper_pg_dir" and "idmap_pg_dir".
> >
> > Dave
> >
> >
> >
> >
> >
> >>
> >> Best regards,
> >> Qiwu
> >>
> >> -----Original Message-----
> >> From: Dave Anderson <anderson(a)redhat.com>
> >> Sent: Wednesday, November 13, 2019 11:28 PM
> >> To: 陈启武 <chenqiwu(a)xiaomi.com>
> >> Cc: Discussion list for crash utility usage, maintenance and
> >> development <crash-utility(a)redhat.com>
> >> Subject: Re: [External Mail]Re: Fix for the determination of the
> >> ARM64 page size
> >>
> >>
> >>
> >> ----- Original Message -----
> >>> Hi Dave,
> >>> I found the bug with an ELF format arm64 ramdump (no vmcoreinfo)
> >>> from Linux 3.18.
> >>> As we know, the page size flags entry was introduced in Linux
> >>> 4.4-rc1, so for ELF format arm64 ramdump files from kernels
> >>> earlier than Linux 4.4 the PAGE_SIZE cannot be determined
> >>> by the following steps:
> >>> (1) checking the vmcoreinfo data, and
> >>> (2) checking the kernel image header for the flags field.
> >>>
> >>> If we set aside those two steps, could the PAGE_SIZE be
> >>> determined by the third step for kernels earlier than Linux 4.16?
> >>> I think the answer is no, because the symbol order from lowest to
> >>> highest value is idmap_pg_dir -> swapper_pg_dir -> reserved_ttbr0 ->
> >>> tramp_pg_dir.
> >>> idmap_pg_dir = .;
> >>> . += IDMAP_DIR_SIZE;
> >>> swapper_pg_dir = .;
> >>> . += SWAPPER_DIR_SIZE;
> >>>
> >>> #ifdef CONFIG_ARM64_SW_TTBR0_PAN
> >>> reserved_ttbr0 = .;
> >>> . += RESERVED_TTBR0_SIZE;
> >>> #endif
> >>>
> >>> #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> >>> tramp_pg_dir = .;
> >>> . += PAGE_SIZE;
> >>> #endif
> >>>
> >>> Linux 4.16 and later kernels, which contain commit
> >>> 1e1b8c04fa3451e2b7190930adae43c95f0fae31, changed the symbol
> >>> order. From lowest to highest value it is idmap_pg_dir ->
> >>> tramp_pg_dir -> reserved_ttbr0 -> swapper_pg_dir.
> >>> idmap_pg_dir = .;
> >>> . += IDMAP_DIR_SIZE;
> >>>
> >>> #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> >>> tramp_pg_dir = .;
> >>> . += PAGE_SIZE;
> >>> #endif
> >>>
> >>> #ifdef CONFIG_ARM64_SW_TTBR0_PAN
> >>> reserved_ttbr0 = .;
> >>> . += RESERVED_TTBR0_SIZE;
> >>> #endif
> >>> swapper_pg_dir = .;
> >>> . += PAGE_SIZE;
> >>> swapper_pg_end = .;
> >>>
> >>> So we must consider kernels earlier than Linux 4.16,
> >>> especially kernels earlier than Linux 4.4 that lack commit
> >>> 9d372c9fab34cd8803141871195141995f85c7f7.
> >>
> >> But we really only need to consider kernels that are earlier than
> >> Linux 4.4, because Linux 4.4 and later kernels will always be able to
> >> determine the page size by reading the kernel flags (if there is no
> >> vmcoreinfo). So the code below that you are patching will only be
> >> executed if:
> >>
> >> (1) there is no vmcoreinfo, and
> >> (2) no kernel flags (in kernels earlier than Linux 4.4):
> >>
> >> That being the case, I don't see how it would ever be possible for the
> >> "else"
> >> section below to ever be executed:
> >>
> >> + if (THIS_KERNEL_VERSION < LINUX(4,16,0)) {
> >> + value = symbol_value("swapper_pg_dir") -
> >> + symbol_value("idmap_pg_dir");
> >> + } else {
> >> + if (kernel_symbol_exists("tramp_pg_dir"))
> >> + value = symbol_value("tramp_pg_dir");
> >> + else if (kernel_symbol_exists("reserved_ttbr0"))
> >> + value = symbol_value("reserved_ttbr0");
> >> + else
> >> + value = symbol_value("swapper_pg_dir");
> >> +
> >> + value -= symbol_value("idmap_pg_dir");
> >> + }
> >>
> >> I was going to suggest checking for (THIS_KERNEL_VERSION <
> >> LINUX(4,4,0)), but I don't think that's even necessary given that the
> >> code sequence above will *only* be executed if the kernel is earlier
> >> than Linux 4.4. So the "else" section has become dead code.
> >>
> >> Dave
> >>
> >>
> >>> Best regards,
> >>> Qiwu
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: Dave Anderson <anderson(a)redhat.com>
> >>> Sent: Tuesday, November 12, 2019 11:34 PM
> >>> To: 陈启武 <chenqiwu(a)xiaomi.com>
> >>> Cc: Discussion list for crash utility usage, maintenance and
> >>> development <crash-utility(a)redhat.com>
> >>> Subject: [External Mail]Re: Fix for the determination of the ARM64
> >>> page size
> >>>
> >>>
> >>> ----- Original Message -----
> >>>> Hi Dave,
> >>>> There is a bug in the determination of the ARM64 page size that
> >>>> happens with a kernel 3.18 crash kdump.
> >>>
> >>> If it is a kdump, there should be a PAGESIZE vmcoreinfo entry.
> >>> As far as I can tell, the PAGE_SIZE has always been exported as the
> >>> second item for all architectures here:
> >>>
> >>> static int __init crash_save_vmcoreinfo_init(void)
> >>> {
> >>> VMCOREINFO_OSRELEASE(init_uts_ns.name.release);
> >>> VMCOREINFO_PAGESIZE(PAGE_SIZE);
> >>> ...
> >>>
> >>> What does "help -D" show for the vmcoreinfo data on your dumpfile?
> >>>
> >>>
> >>>> The crash session failed immediately with the error message "crash:
> >>>> cannot determine page size", since the page size cannot be
> >>>> determined from the kernel image header flags field or from the
> >>>> subtraction of symbol addresses.
> >>>> ffffffc0024df000 A idmap_pg_dir
> >>>> ffffffc0024e2000 A swapper_pg_dir
> >>>> ffffffc0024e4000 A tramp_pg_dir
> >>>> So value = symbol_value("tramp_pg_dir") -
> >>>> symbol_value("idmap_pg_dir") = 5 * PAGE_SIZE. The resulting value
> >>>> is determined by the order of the symbol addresses:
> >>>> [kernel-3.18/arch/arm64/kernel/vmlinux.lds.S]
> >>>> BSS_SECTION(0, 0, 0)
> >>>>
> >>>> . = ALIGN(PAGE_SIZE);
> >>>> idmap_pg_dir = .;
> >>>> . += IDMAP_DIR_SIZE;
> >>>> swapper_pg_dir = .;
> >>>> . += SWAPPER_DIR_SIZE;
> >>>>
> >>>> #ifdef CONFIG_ARM64_SW_TTBR0_PAN
> >>>> reserved_ttbr0 = .;
> >>>> . += RESERVED_TTBR0_SIZE;
> >>>> #endif
> >>>>
> >>>> #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> >>>> tramp_pg_dir = .;
> >>>> . += PAGE_SIZE;
> >>>> #endif
> >>>>
> >>>> Linux 4.16 and later kernels changed the order of the symbol
> >>>> definitions because they contain commit
> >>>> 1e1b8c04fa3451e2b7190930adae43c95f0fae31.
> >>>> Crash utility upstream commit
> >>>> 764e2d09978bb3f87dfaff4c6a59d4a5cc00f277 was made to handle that, but
> >>>> it ignores the determination of the ARM64 page size on kernels
> >>>> earlier than Linux 4.16.
> >>>>
> >>>> So I recommend this patch to fix it.
> >>>
> >>> I have several old arm64 dumpfiles, with kernel versions 3.19, 4.2,
> >>> 4.4, 4.5, 4.7,
> >>> 4.9 and 4.14. However, none of them reach your patch because the
> >>> code section that you are patching is only used as a third option after:
> >>>
> >>> (1) checking the vmcoreinfo data, and
> >>> (2) checking the kernel image header for the flags field.
> >>>
> >>> In Linux 4.4, this patch added the page size to the kernel image header:
> >>>
> >>> commit 9d372c9fab34cd8803141871195141995f85c7f7
> >>> Author: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
> >>> Date: Mon Oct 19 14:19:36 2015 +0100
> >>>
> >>> arm64: Add page size to the kernel image header
> >>>
> >>> This patch adds the page size to the arm64 kernel image header
> >>> so that one can infer the PAGESIZE used by the kernel. This will
> >>> be helpful to diagnose failures to boot the kernel with page size
> >>> not supported by the CPU.
> >>>
> >>> And later on in Linux 4.6, "_kernel_flags_le" was replaced by
> >>> "_kernel_flags_le_lo32" and "_kernel_flags_le_hi32":
> >>>
> >>> commit 6ad1fe5d9077a1ab40bf74b61994d2e770b00b14
> >>> Author: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
> >>> Date: Sat Dec 26 13:48:02 2015 +0100
> >>>
> >>> arm64: avoid R_AARCH64_ABS64 relocations for Image header fields
> >>>
> >>> Unfortunately, the current way of using the linker to emit build time
> >>> constants into the Image header will no longer work once we switch to
> >>> the use of PIE executables. The reason is that such constants are emitted
> >>> into the binary using R_AARCH64_ABS64 relocations, which are resolved at
> >>> runtime, not at build time, and the places targeted by those relocations
> >>> will contain zeroes before that.
> >>>
> >>> So refactor the endian swapping linker script constant generation code so
> >>> that it emits the upper and lower 32-bit words separately.
> >>>
> >>> Anyway, given that the page size flags entry was introduced in Linux
> >>> 4.4, I don't believe your patch checking for LINUX(4,16,0) is correct:
> >>>
> >>> + if (THIS_KERNEL_VERSION < LINUX(4,16,0)) {
> >>> + value = symbol_value("swapper_pg_dir") -
> >>> + symbol_value("idmap_pg_dir");
> >>> + } else {
> >>>
> >>> Do you agree?
> >>>
> >>> Dave
> >>>
> >>> #/******This e-mail and its attachments contain confidential information
> >>> from XIAOMI, which is intended only for the person or entity whose
> >>> address is listed above. Any use of the information contained herein
> >>> in any way (including, but not limited to, total or partial
> >>> disclosure, reproduction, or dissemination) by persons other than
> >>> the intended
> >>> recipient(s) is prohibited. If you receive this e-mail in error,
> >>> please notify the sender by phone or email immediately and delete
> >>> it!******/#
> >>>
> >>
> >>
> >
> > --
> > Crash-utility mailing list
> > Crash-utility(a)redhat.com
> > https://www.redhat.com/mailman/listinfo/crash-utility
> >
>
5 years
Re: [Crash-utility] Fix for the determination of the ARM64 page size
by Dave Anderson
----- Original Message -----
> Hi Dave,
> Since linux 4.4 and later kernels will always be able to determine the page
> size by reading the kernel flags (if there is no vmcoreinfo), I agree
> checking for (THIS_KERNEL_VERSION < LINUX(4,4,0)) is more reasonable.
Did you finish reading my response from yesterday?
There is no reason to check for (THIS_KERNEL_VERSION < LINUX(4,4,0)), because
the code section will *only* be executed if the kernel is earlier than Linux 4.4.
Again: the "else" section is dead code because it can never be executed:
+ if (THIS_KERNEL_VERSION < LINUX(4,16,0)) {
+ value = symbol_value("swapper_pg_dir") -
+ symbol_value("idmap_pg_dir");
+ } else {
+ if (kernel_symbol_exists("tramp_pg_dir"))
+ value = symbol_value("tramp_pg_dir");
+ else if (kernel_symbol_exists("reserved_ttbr0"))
+ value = symbol_value("reserved_ttbr0");
+ else
+ value = symbol_value("swapper_pg_dir");
+
+ value -= symbol_value("idmap_pg_dir");
+ }
You can just use "swapper_pg_dir" and "idmap_pg_dir".
Dave
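As an illustration of the arithmetic discussed in this thread, the sketch below uses the three symbol addresses reported in the original bug report. The function `pagesize_from_distance` is hypothetical, and the rule it encodes (a distance of exactly two or three pages of 4K or 64K) is an assumption made for this example, not something confirmed in the thread.

```c
#include <stdint.h>

/* Symbol addresses quoted in this thread for a 3.18 arm64 kernel. */
#define IDMAP_PG_DIR	0xffffffc0024df000ULL
#define SWAPPER_PG_DIR	0xffffffc0024e2000ULL
#define TRAMP_PG_DIR	0xffffffc0024e4000ULL

/*
 * Hypothetical lookup: map the distance between two page-table symbols
 * to a page size, assuming the distance must be exactly 2 or 3 pages
 * of 4K or 64K. Returns 0 when the distance matches none of them.
 */
static unsigned long pagesize_from_distance(uint64_t distance)
{
	if (distance == 2 * 4096ULL || distance == 3 * 4096ULL)
		return 4096;
	if (distance == 2 * 65536ULL || distance == 3 * 65536ULL)
		return 65536;
	return 0;	/* no match: page size cannot be determined */
}
```

With these addresses, swapper_pg_dir - idmap_pg_dir is 0x3000 (three 4K pages), which this lookup accepts, while tramp_pg_dir - idmap_pg_dir is 0x5000 (five 4K pages), which it rejects.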
>
> Best regards,
> Qiwu
>
> -----Original Message-----
> From: Dave Anderson <anderson(a)redhat.com>
> Sent: Wednesday, November 13, 2019 11:28 PM
> To: 陈启武 <chenqiwu(a)xiaomi.com>
> Cc: Discussion list for crash utility usage, maintenance and development
> <crash-utility(a)redhat.com>
> Subject: Re: [External Mail]Re: Fix for the determination of the ARM64 page
> size
>
>
>
> ----- Original Message -----
> > Hi Dave,
> > I found the bug with an ELF format arm64 ramdump (no vmcoreinfo) from
> > Linux 3.18.
> > As we know, the page size flags entry was introduced in Linux
> > 4.4-rc1, so for ELF format arm64 ramdump files from kernels
> > earlier than Linux 4.4 the PAGE_SIZE cannot be determined
> > by the following steps:
> > (1) checking the vmcoreinfo data, and
> > (2) checking the kernel image header for the flags field.
> >
> > If we set aside those two steps, could the PAGE_SIZE be
> > determined by the third step for kernels earlier than Linux 4.16?
> > I think the answer is no, because the symbol order from lowest to
> > highest value is idmap_pg_dir -> swapper_pg_dir -> reserved_ttbr0 ->
> > tramp_pg_dir.
> > idmap_pg_dir = .;
> > . += IDMAP_DIR_SIZE;
> > swapper_pg_dir = .;
> > . += SWAPPER_DIR_SIZE;
> >
> > #ifdef CONFIG_ARM64_SW_TTBR0_PAN
> > reserved_ttbr0 = .;
> > . += RESERVED_TTBR0_SIZE;
> > #endif
> >
> > #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> > tramp_pg_dir = .;
> > . += PAGE_SIZE;
> > #endif
> >
> > Linux 4.16 and later kernels, which contain commit
> > 1e1b8c04fa3451e2b7190930adae43c95f0fae31, changed the symbol
> > order. From lowest to highest value it is idmap_pg_dir -> tramp_pg_dir ->
> > reserved_ttbr0 -> swapper_pg_dir.
> > idmap_pg_dir = .;
> > . += IDMAP_DIR_SIZE;
> >
> > #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> > tramp_pg_dir = .;
> > . += PAGE_SIZE;
> > #endif
> >
> > #ifdef CONFIG_ARM64_SW_TTBR0_PAN
> > reserved_ttbr0 = .;
> > . += RESERVED_TTBR0_SIZE;
> > #endif
> > swapper_pg_dir = .;
> > . += PAGE_SIZE;
> > swapper_pg_end = .;
> >
> > So we must consider kernels earlier than Linux 4.16,
> > especially kernels earlier than Linux 4.4 that lack commit
> > 9d372c9fab34cd8803141871195141995f85c7f7.
>
> But we really only need to consider kernels that are earlier than Linux 4.4,
> because Linux 4.4 and later kernels will always be able to determine the
> page size by reading the kernel flags (if there is no vmcoreinfo). So the
> code below that you are patching will only be executed if:
>
> (1) there is no vmcoreinfo, and
> (2) no kernel flags (in kernels earlier than Linux 4.4):
>
> That being the case, I don't see how it would ever be possible for the "else"
> section below to ever be executed:
>
> + if (THIS_KERNEL_VERSION < LINUX(4,16,0)) {
> + value = symbol_value("swapper_pg_dir") -
> + symbol_value("idmap_pg_dir");
> + } else {
> + if (kernel_symbol_exists("tramp_pg_dir"))
> + value = symbol_value("tramp_pg_dir");
> + else if (kernel_symbol_exists("reserved_ttbr0"))
> + value = symbol_value("reserved_ttbr0");
> + else
> + value = symbol_value("swapper_pg_dir");
> +
> + value -= symbol_value("idmap_pg_dir");
> + }
>
> I was going to suggest checking for (THIS_KERNEL_VERSION < LINUX(4,4,0)), but
> I don't think that's even necessary given that the code sequence above will
> *only* be executed if the kernel is earlier than Linux 4.4. So the "else"
> section has become dead code.
>
> Dave
>
>
> > Best regards,
> > Qiwu
> >
> >
> > -----Original Message-----
> > From: Dave Anderson <anderson(a)redhat.com>
> > Sent: Tuesday, November 12, 2019 11:34 PM
> > To: 陈启武 <chenqiwu(a)xiaomi.com>
> > Cc: Discussion list for crash utility usage, maintenance and
> > development <crash-utility(a)redhat.com>
> > Subject: [External Mail]Re: Fix for the determination of the ARM64
> > page size
> >
> >
> > ----- Original Message -----
> > > Hi Dave,
> > > There is a bug in the determination of the ARM64 page size that
> > > happens with a kernel 3.18 crash kdump.
> >
> > If it is a kdump, there should be a PAGESIZE vmcoreinfo entry.
> > As far as I can tell, the PAGE_SIZE has always been exported as the
> > second item for all architectures here:
> >
> > static int __init crash_save_vmcoreinfo_init(void)
> > {
> > VMCOREINFO_OSRELEASE(init_uts_ns.name.release);
> > VMCOREINFO_PAGESIZE(PAGE_SIZE);
> > ...
> >
> > What does "help -D" show for the vmcoreinfo data on your dumpfile?
> >
> >
> > > The crash session failed immediately with the error message "crash:
> > > cannot determine page size", since the page size cannot be
> > > determined from the kernel image header flags field or from the
> > > subtraction of symbol addresses.
> > > ffffffc0024df000 A idmap_pg_dir
> > > ffffffc0024e2000 A swapper_pg_dir
> > > ffffffc0024e4000 A tramp_pg_dir
> > > So value = symbol_value("tramp_pg_dir") -
> > > symbol_value("idmap_pg_dir") = 5 * PAGE_SIZE. The resulting value
> > > is determined by the order of the symbol addresses:
> > > [kernel-3.18/arch/arm64/kernel/vmlinux.lds.S]
> > > BSS_SECTION(0, 0, 0)
> > >
> > > . = ALIGN(PAGE_SIZE);
> > > idmap_pg_dir = .;
> > > . += IDMAP_DIR_SIZE;
> > > swapper_pg_dir = .;
> > > . += SWAPPER_DIR_SIZE;
> > >
> > > #ifdef CONFIG_ARM64_SW_TTBR0_PAN
> > > reserved_ttbr0 = .;
> > > . += RESERVED_TTBR0_SIZE;
> > > #endif
> > >
> > > #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> > > tramp_pg_dir = .;
> > > . += PAGE_SIZE;
> > > #endif
> > >
> > > Linux 4.16 and later kernels changed the order of the symbol
> > > definitions because they contain commit
> > > 1e1b8c04fa3451e2b7190930adae43c95f0fae31.
> > > Crash utility upstream commit
> > > 764e2d09978bb3f87dfaff4c6a59d4a5cc00f277 was made to handle that, but
> > > it ignores the determination of the ARM64 page size on kernels
> > > earlier than Linux 4.16.
> > >
> > > So I recommend this patch to fix it.
> >
> > I have several old arm64 dumpfiles, with kernel versions 3.19, 4.2,
> > 4.4, 4.5, 4.7,
> > 4.9 and 4.14. However, none of them reach your patch because the code
> > section that you are patching is only used as a third option after:
> >
> > (1) checking the vmcoreinfo data, and
> > (2) checking the kernel image header for the flags field.
> >
> > In Linux 4.4, this patch added the page size to the kernel image header:
> >
> > commit 9d372c9fab34cd8803141871195141995f85c7f7
> > Author: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
> > Date: Mon Oct 19 14:19:36 2015 +0100
> >
> > arm64: Add page size to the kernel image header
> >
> > This patch adds the page size to the arm64 kernel image header
> > so that one can infer the PAGESIZE used by the kernel. This will
> > be helpful to diagnose failures to boot the kernel with page size
> > not supported by the CPU.
> >
> > And later on in Linux 4.6, "_kernel_flags_le" was replaced by
> > "_kernel_flags_le_lo32" and "_kernel_flags_le_hi32":
> >
> > commit 6ad1fe5d9077a1ab40bf74b61994d2e770b00b14
> > Author: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
> > Date: Sat Dec 26 13:48:02 2015 +0100
> >
> > arm64: avoid R_AARCH64_ABS64 relocations for Image header fields
> >
> > Unfortunately, the current way of using the linker to emit build time
> > constants into the Image header will no longer work once we switch to
> > the use of PIE executables. The reason is that such constants are emitted
> > into the binary using R_AARCH64_ABS64 relocations, which are resolved at
> > runtime, not at build time, and the places targeted by those relocations
> > will contain zeroes before that.
> >
> > So refactor the endian swapping linker script constant generation code so
> > that it emits the upper and lower 32-bit words separately.
> >
> > Anyway, given that the page size flags entry was introduced in Linux
> > 4.4, I don't believe your patch checking for LINUX(4,16,0) is correct:
> >
> > + if (THIS_KERNEL_VERSION < LINUX(4,16,0)) {
> > + value = symbol_value("swapper_pg_dir") -
> > + symbol_value("idmap_pg_dir");
> > + } else {
> >
> > Do you agree?
> >
> > Dave
> >
> >
>
>
5 years
Re: [Crash-utility] [External Mail]Re: Fix for the determination of the ARM64 page size
by Dave Anderson
----- Original Message -----
> Hi Dave,
> I found the bug with an ELF format arm64 ramdump (no vmcoreinfo) from Linux 3.18.
> As we know, the page size flags entry was introduced in Linux 4.4-rc1,
> so for ELF format arm64 ramdump files from kernels earlier than Linux 4.4
> the PAGE_SIZE cannot be determined by the following steps:
> (1) checking the vmcoreinfo data, and
> (2) checking the kernel image header for the flags field.
>
> If we set aside those two steps, could the PAGE_SIZE be determined by
> the third step for kernels earlier than Linux 4.16?
> I think the answer is no, because the symbol order from lowest to highest
> value is idmap_pg_dir -> swapper_pg_dir -> reserved_ttbr0 -> tramp_pg_dir.
> idmap_pg_dir = .;
> . += IDMAP_DIR_SIZE;
> swapper_pg_dir = .;
> . += SWAPPER_DIR_SIZE;
>
> #ifdef CONFIG_ARM64_SW_TTBR0_PAN
> reserved_ttbr0 = .;
> . += RESERVED_TTBR0_SIZE;
> #endif
>
> #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> tramp_pg_dir = .;
> . += PAGE_SIZE;
> #endif
>
> Linux 4.16 and later kernels, which contain commit
> 1e1b8c04fa3451e2b7190930adae43c95f0fae31, changed the symbol order.
> From lowest to highest value it is idmap_pg_dir -> tramp_pg_dir ->
> reserved_ttbr0 -> swapper_pg_dir.
> idmap_pg_dir = .;
> . += IDMAP_DIR_SIZE;
>
> #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> tramp_pg_dir = .;
> . += PAGE_SIZE;
> #endif
>
> #ifdef CONFIG_ARM64_SW_TTBR0_PAN
> reserved_ttbr0 = .;
> . += RESERVED_TTBR0_SIZE;
> #endif
> swapper_pg_dir = .;
> . += PAGE_SIZE;
> swapper_pg_end = .;
>
> So we must consider kernels earlier than Linux 4.16, especially
> kernels earlier than Linux 4.4, which lack commit
> 9d372c9fab34cd8803141871195141995f85c7f7.
But we really only need to consider kernels that are earlier than Linux 4.4,
because Linux 4.4 and later kernels will always be able to determine the page
size by reading the kernel flags (if there is no vmcoreinfo). So the code below
that you are patching will only be executed if:
(1) there is no vmcoreinfo, and
(2) no kernel flags (in kernels earlier than Linux 4.4):
That being the case, I don't see how it would be possible for the
"else" section below to ever be executed:
+ if (THIS_KERNEL_VERSION < LINUX(4,16,0)) {
+ value = symbol_value("swapper_pg_dir") -
+ symbol_value("idmap_pg_dir");
+ } else {
+ if (kernel_symbol_exists("tramp_pg_dir"))
+ value = symbol_value("tramp_pg_dir");
+ else if (kernel_symbol_exists("reserved_ttbr0"))
+ value = symbol_value("reserved_ttbr0");
+ else
+ value = symbol_value("swapper_pg_dir");
+
+ value -= symbol_value("idmap_pg_dir");
+ }
I was going to suggest checking for (THIS_KERNEL_VERSION < LINUX(4,4,0)),
but I don't think that's even necessary given that the code sequence above
will *only* be executed if the kernel is earlier than Linux 4.4. So the
"else" section has become dead code.
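The ordering argued above (vmcoreinfo first, then the kernel image header flags, and only then symbol arithmetic) can be sketched as follows. This is a minimal illustration, not the crash utility's code: the struct, its field names, and pick_page_size() are hypothetical, and the accepted symbol deltas (2 or 3 pages of 4K or 64K) are an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of what a dumpfile/kernel pair can offer. */
struct page_size_sources {
    unsigned long vmcoreinfo_pagesize; /* 0 if there is no PAGESIZE entry */
    unsigned long flags_pagesize;      /* 0 if the image header has no page size flags */
    unsigned long symbol_delta;        /* swapper_pg_dir - idmap_pg_dir, 0 if unknown */
};

/* Return the page size from the first source that provides one, else 0. */
static unsigned long pick_page_size(const struct page_size_sources *s)
{
    assert(s != NULL);

    if (s->vmcoreinfo_pagesize)        /* step 1: vmcoreinfo */
        return s->vmcoreinfo_pagesize;
    if (s->flags_pagesize)             /* step 2: kernel image header flags */
        return s->flags_pagesize;

    /*
     * Step 3: only reached with no vmcoreinfo and no header flags,
     * which per the discussion means a kernel earlier than Linux 4.4.
     * The accepted multiples below are an assumption for illustration.
     */
    switch (s->symbol_delta) {
    case 4096 * 2:
    case 4096 * 3:
        return 4096;
    case 65536 * 2:
    case 65536 * 3:
        return 65536;
    }
    return 0;                          /* cannot determine page size */
}
```

With this shape, a version check inside step 3 for Linux 4.16 or later has nothing to select, because any kernel that new would already have returned at step 2.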
Dave
> Best regards,
> Qiwu
>
>
> -----Original Message-----
> From: Dave Anderson <anderson(a)redhat.com>
> Sent: Tuesday, November 12, 2019 11:34 PM
> To: 陈启武 <chenqiwu(a)xiaomi.com>
> Cc: Discussion list for crash utility usage, maintenance and development
> <crash-utility(a)redhat.com>
> Subject: [External Mail]Re: Fix for the determination of the ARM64 page size
>
>
> ----- Original Message -----
> > Hi Dave,
> > There is a bug in the determination of the ARM64 page size that occurs
> > with a kernel 3.18 kdump.
>
> If it is a kdump, there should be a PAGESIZE vmcoreinfo entry.
> As far as I can tell, the PAGE_SIZE has always been exported as the second
> item for all architectures here:
>
> static int __init crash_save_vmcoreinfo_init(void)
> {
> VMCOREINFO_OSRELEASE(init_uts_ns.name.release);
> VMCOREINFO_PAGESIZE(PAGE_SIZE);
> ...
>
> What does "help -D" show for the vmcoreinfo data on your dumpfile?
>
>
> > The crash session failed immediately with the error message "crash: cannot
> > determine page size" because the page size cannot be determined from the
> > kernel image header flags field or by subtracting symbol addresses.
> > ffffffc0024df000 A idmap_pg_dir
> > ffffffc0024e2000 A swapper_pg_dir
> > ffffffc0024e4000 A tramp_pg_dir
> > So value = symbol_value("tramp_pg_dir") - symbol_value("idmap_pg_dir") =
> > 5 * PAGE_SIZE. The resulting value is determined by the order of the
> > symbol addresses:
> > [kernel-3.18/arch/arm64/kernel/vmlinux.lds.S]
> > BSS_SECTION(0, 0, 0)
> >
> > . = ALIGN(PAGE_SIZE);
> > idmap_pg_dir = .;
> > . += IDMAP_DIR_SIZE;
> > swapper_pg_dir = .;
> > . += SWAPPER_DIR_SIZE;
> >
> > #ifdef CONFIG_ARM64_SW_TTBR0_PAN
> > reserved_ttbr0 = .;
> > . += RESERVED_TTBR0_SIZE;
> > #endif
> >
> > #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> > tramp_pg_dir = .;
> > . += PAGE_SIZE;
> > #endif
> >
> > Linux 4.16 and later kernels changed the order of the symbol
> > definitions because they contain commit
> > 1e1b8c04fa3451e2b7190930adae43c95f0fae31.
> > Crash utility upstream commit
> > 764e2d09978bb3f87dfaff4c6a59d4a5cc00f277 handles that change, but it ignores
> > the determination of the ARM64 page size on kernels earlier than Linux 4.16.
> >
> > So I recommend this patch to fix it.
>
> I have several old arm64 dumpfiles, with kernel versions 3.19, 4.2, 4.4, 4.5,
> 4.7,
> 4.9 and 4.14. However, none of them reach your patch because the code
> section that you are patching is only used as a third option after:
>
> (1) checking the vmcoreinfo data, and
> (2) checking the kernel image header for the flags field.
>
> In Linux 4.4, this patch added the page size to the kernel image header:
>
> commit 9d372c9fab34cd8803141871195141995f85c7f7
> Author: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
> Date: Mon Oct 19 14:19:36 2015 +0100
>
> arm64: Add page size to the kernel image header
>
> This patch adds the page size to the arm64 kernel image header
> so that one can infer the PAGESIZE used by the kernel. This will
> be helpful to diagnose failures to boot the kernel with page size
> not supported by the CPU.
>
> And later on in Linux 4.6, "_kernel_flags_le" was replaced by
> "_kernel_flags_le_lo32" and "_kernel_flags_le_hi32":
>
> commit 6ad1fe5d9077a1ab40bf74b61994d2e770b00b14
> Author: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
> Date: Sat Dec 26 13:48:02 2015 +0100
>
> arm64: avoid R_AARCH64_ABS64 relocations for Image header fields
>
> Unfortunately, the current way of using the linker to emit build time
> constants into the Image header will no longer work once we switch to
> the use of PIE executables. The reason is that such constants are emitted
> into the binary using R_AARCH64_ABS64 relocations, which are resolved at
> runtime, not at build time, and the places targeted by those relocations
> will contain zeroes before that.
>
> So refactor the endian swapping linker script constant generation code so
> that it emits the upper and lower 32-bit words separately.
>
> Anyway, given that the page size flags entry was introduced in Linux 4.4, I
> don't believe your patch checking for LINUX(4,16,0) is correct:
>
> +if (THIS_KERNEL_VERSION < LINUX(4,16,0)) {
> +value = symbol_value("swapper_pg_dir") -
> +symbol_value("idmap_pg_dir");
> +} else {
>
> Do you agree?
>
> Dave
>
5 years
Re: [Crash-utility] Fix for the determination of the ARM64 page size
by Dave Anderson
----- Original Message -----
> Hi Dave,
> There is a bug in the determination of the ARM64 page size that occurs with a kernel 3.18 kdump.
If it is a kdump, there should be a PAGESIZE vmcoreinfo entry.
As far as I can tell, the PAGE_SIZE has always been exported
as the second item for all architectures here:
static int __init crash_save_vmcoreinfo_init(void)
{
VMCOREINFO_OSRELEASE(init_uts_ns.name.release);
VMCOREINFO_PAGESIZE(PAGE_SIZE);
...
What does "help -D" show for the vmcoreinfo data on your dumpfile?
> The crash session failed immediately with the error message "crash: cannot determine page size" because the page size cannot be determined from the kernel image header flags field or by subtracting symbol addresses.
> ffffffc0024df000 A idmap_pg_dir
> ffffffc0024e2000 A swapper_pg_dir
> ffffffc0024e4000 A tramp_pg_dir
> So value = symbol_value("tramp_pg_dir") - symbol_value("idmap_pg_dir") = 5 * PAGE_SIZE. The resulting value is determined by the order of the symbol addresses:
> [kernel-3.18/arch/arm64/kernel/vmlinux.lds.S]
> BSS_SECTION(0, 0, 0)
>
> . = ALIGN(PAGE_SIZE);
> idmap_pg_dir = .;
> . += IDMAP_DIR_SIZE;
> swapper_pg_dir = .;
> . += SWAPPER_DIR_SIZE;
>
> #ifdef CONFIG_ARM64_SW_TTBR0_PAN
> reserved_ttbr0 = .;
> . += RESERVED_TTBR0_SIZE;
> #endif
>
> #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> tramp_pg_dir = .;
> . += PAGE_SIZE;
> #endif
>
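The "5 * PAGE_SIZE" figure in the report can be reproduced from the three symbol addresses it lists. The sketch below assumes a 4K page size, which is what the report's arithmetic implies; pages_between() and the macro names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Symbol addresses quoted in the report. */
#define IDMAP_PG_DIR   UINT64_C(0xffffffc0024df000)
#define SWAPPER_PG_DIR UINT64_C(0xffffffc0024e2000)
#define TRAMP_PG_DIR   UINT64_C(0xffffffc0024e4000)

/*
 * Distance from base to sym in whole pages of page_size bytes.
 * Returns 0 when the distance is not a whole number of pages.
 */
static uint64_t pages_between(uint64_t sym, uint64_t base, uint64_t page_size)
{
    assert(page_size != 0 && sym >= base);

    uint64_t delta = sym - base;
    return (delta % page_size) ? 0 : delta / page_size;
}
```

Under that assumption, tramp_pg_dir sits 0x5000 bytes (5 pages) above idmap_pg_dir, while swapper_pg_dir sits 0x3000 bytes (3 pages) above it, so which symbol is subtracted decides whether the result is a multiple the page size check can recognize.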
> Linux 4.16 and later kernels changed the order of the symbol definitions because they contain commit 1e1b8c04fa3451e2b7190930adae43c95f0fae31.
> Crash utility upstream commit 764e2d09978bb3f87dfaff4c6a59d4a5cc00f277 handles that change, but it ignores the determination of the ARM64 page size on kernels earlier than
> Linux 4.16.
>
> So I recommend this patch to fix it.
I have several old arm64 dumpfiles, with kernel versions 3.19, 4.2, 4.4, 4.5, 4.7,
4.9 and 4.14. However, none of them reach your patch because the code section that
you are patching is only used as a third option after:
(1) checking the vmcoreinfo data, and
(2) checking the kernel image header for the flags field.
In Linux 4.4, this patch added the page size to the kernel image header:
commit 9d372c9fab34cd8803141871195141995f85c7f7
Author: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
Date: Mon Oct 19 14:19:36 2015 +0100
arm64: Add page size to the kernel image header
This patch adds the page size to the arm64 kernel image header
so that one can infer the PAGESIZE used by the kernel. This will
be helpful to diagnose failures to boot the kernel with page size
not supported by the CPU.
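For readers unfamiliar with that header field, here is a rough sketch of how a page size could be read back from the flags value. It assumes the layout described in the kernel's arm64 booting documentation, where bits 1-2 of the flags field encode the page size (0 = unspecified, 1 = 4K, 2 = 16K, 3 = 64K); the function name is hypothetical and this is not the crash utility's own code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Decode the page size from an arm64 Image header flags value.
 * Bit 0 is the endianness bit and is ignored here; bits 1-2 select
 * the page size. Returns 0 when the page size is unspecified.
 */
static unsigned long image_flags_page_size(uint64_t flags)
{
    switch ((flags >> 1) & 0x3) {
    case 1:
        return 4096;
    case 2:
        return 16384;
    case 3:
        return 65536;
    default:
        return 0;
    }
}
```

A quick check such as assert(image_flags_page_size(0x2) == 4096) exercises the 4K case. A return value of 0 corresponds to the situation discussed in this thread, where the flags give no answer and another source is needed.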
And later on in Linux 4.6, "_kernel_flags_le" was replaced by
"_kernel_flags_le_lo32" and "_kernel_flags_le_hi32":
commit 6ad1fe5d9077a1ab40bf74b61994d2e770b00b14
Author: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
Date: Sat Dec 26 13:48:02 2015 +0100
arm64: avoid R_AARCH64_ABS64 relocations for Image header fields
Unfortunately, the current way of using the linker to emit build time
constants into the Image header will no longer work once we switch to
the use of PIE executables. The reason is that such constants are emitted
into the binary using R_AARCH64_ABS64 relocations, which are resolved at
runtime, not at build time, and the places targeted by those relocations
will contain zeroes before that.
So refactor the endian swapping linker script constant generation code so
that it emits the upper and lower 32-bit words separately.
Anyway, given that the page size flags entry was introduced in Linux 4.4, I don't
believe your patch checking for LINUX(4,16,0) is correct:
+ if (THIS_KERNEL_VERSION < LINUX(4,16,0)) {
+ value = symbol_value("swapper_pg_dir") -
+ symbol_value("idmap_pg_dir");
+ } else {
Do you agree?
Dave
5 years
[PATCH] Fix a potential segfault for the ARM64 "bt -S <stack-address>" command
by 陈启武
Hi Dave,
I'm working on an arm64 kdump with crash-7.2.7++.
There is a potential segmentation violation caused by an invalid exception frame before
transitioning to the process stack when using the bt command's "-S <stack-address>" option.
For example, take the sp argument from the log:
[ 84.048650] pc : _raw_spin_lock+0x30/0x88
[ 84.048658] lr : lowmem_scan+0x45c/0xbd0
[ 84.048661] sp : ffffff800c42ba00 pstate : 20c00145
A segmentation violation is generated as below:
crash> bt -S ffffff800c42ba00 108
PID: 108 TASK: ffffffcd74122000 CPU: 5 COMMAND: "rtmm_reclaim"
bt: WARNING: cannot determine starting stack frame for task ffffffcd74122000
Program received signal SIGSEGV, Segmentation fault.
0x000055555572de2b in arm64_is_kernel_exception_frame (bt=0x7fffffffd640, stkptr=18446743644915693792) at arm64.c:1785
warning: Source file is more recent than executable.
1785 if (INSTACK(regs->sp, bt) && INSTACK(regs->regs[29], bt) &&
(gdb) bt
#0 0x000055555572de2b in arm64_is_kernel_exception_frame (bt=0x7fffffffd640, stkptr=18446743644915693792) at arm64.c:1785
#1 0x000055555572ffaf in arm64_back_trace_cmd (bt=0x7fffffffd640) at arm64.c:2594
#2 0x00005555556ef0c4 in back_trace (bt=0x7fffffffd640) at kernel.c:3164
#3 0x00005555556ed624 in cmd_bt () at kernel.c:2833
#4 0x000055555564a73b in exec_command () at main.c:879
#5 0x000055555564a515 in main_loop () at main.c:826
#6 0x00005555558b5b43 in captured_command_loop (data=data@entry=0x0) at main.c:258
#7 0x00005555558b46ca in catch_errors (func=func@entry=0x5555558b5b30 <captured_command_loop>, func_args=func_args@entry=0x0,
errstring=errstring@entry=0x555555b0b728 "", mask=mask@entry=6) at exceptions.c:557
#8 0x00005555558b6c42 in captured_main (data=data@entry=0x7fffffffdfb0) at main.c:1064
#9 0x00005555558b46ca in catch_errors (func=func@entry=0x5555558b5e70 <captured_main>, func_args=func_args@entry=0x7fffffffdfb0,
errstring=errstring@entry=0x555555b0b728 "", mask=mask@entry=6) at exceptions.c:557
#10 0x00005555558b702e in gdb_main (args=0x7fffffffdfb0) at main.c:1079
#11 gdb_main_entry (argc=<optimized out>, argv=<optimized out>) at main.c:1099
#12 0x000055555570fc53 in gdb_main_loop (argc=2, argv=0x7fffffffe148) at gdb_interface.c:76
#13 0x000055555564a1e0 in main (argc=4, argv=0x7fffffffe148) at main.c:707
(gdb) p /x *(struct bt_info *) 0x7fffffffd640
$4 = {task = 0xffffffcd74122000, flags = 0x0, instptr = 0x0, stkptr = 0xffffff800c42ba00, bptr = 0x0, stackbase = 0xffffff800c428000,
stacktop = 0xffffff800c42c000, stackbuf = 0x555555f23ae0, tc = 0x5555596e1778, hp = 0x7fffffffd5f0, textlist = 0x0, ref = 0x0, frameptr = 0x0,
call_target = 0x0, machdep = 0x0, debug = 0x0, eframe_ip = 0x0, radix = 0x0, cpumask = 0x0}
The stackframe.fp (0xffffff9c29e4f8e0) is above the stacktop address, so accessing regs->sp causes the segmentation violation:
(gdb) p /x 18446743644915693792//stkptr
$5 = 0xffffff9c29e4f8e0
(gdb) p /x 0xffffff9c29e4f8e0-0xffffff800c428000//STACK_OFFSET_TYPE(stkptr)
$6 = 0x1c1da278e0
(gdb) p /x regs
$7 = 0x55717394b3c0
(gdb) p *(struct arm64_pt_regs *) 0x55717394b3c0
Cannot access memory at address 0x55717394b3c0
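The numbers in the gdb session can be checked with a small bounds test. This sketch only mirrors the intent of an in-stack check using the stackbase and stacktop values printed above; struct stack_bounds, in_stack() and stack_offset() are hypothetical names for illustration and are not the crash utility's INSTACK macro or its bt_info structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down view of the two stack limits used here. */
struct stack_bounds {
    uint64_t stackbase;
    uint64_t stacktop;
};

/* True when addr lies in the half-open range [stackbase, stacktop). */
static bool in_stack(uint64_t addr, const struct stack_bounds *sb)
{
    assert(sb != NULL && sb->stackbase < sb->stacktop);
    return addr >= sb->stackbase && addr < sb->stacktop;
}

/* Offset of addr from stackbase; only meaningful when in_stack() is true. */
static uint64_t stack_offset(uint64_t addr, const struct stack_bounds *sb)
{
    assert(sb != NULL);
    return addr - sb->stackbase;
}
```

With stackbase 0xffffff800c428000 and stacktop 0xffffff800c42c000, the address passed to "bt -S" (0xffffff800c42ba00) is inside the stack, but 0xffffff9c29e4f8e0 is not: its offset is 0x1c1da278e0, far beyond the 0x4000-byte stack buffer, which is why indexing the buffer with it faults.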
To fix this, I think a condition "arm64_in_exception_text(stackframe.pc) && INSTACK(stackframe.fp, bt)" must be added to avoid
using an invalid exception frame before transitioning to the process stack.
The patch file is attached.
Thanks for your review. I'm looking forward to your reply.
Best regards,
Qiwu
5 years
crash and makedumpfile with 5.3 missing memory in dump
by Andi Kleen
Hi,
[I'm not sure whether this is a crash or makedumpfile problem]
I'm trying to use crash to read a makedumpfile vmcore from 5.3, but I always end up with an error when opening the dump.
I'm using the latest github crash
crash 7.2.7++
...
crash: page excluded: kernel virtual address: ffffffff82110370 type: "possible"
WARNING: cannot read cpu_possible_map
crash: page excluded: kernel virtual address: ffffffff82110360 type: "present"
WARNING: cannot read cpu_present_map
crash: page excluded: kernel virtual address: ffffffff82110368 type: "online"
WARNING: cannot read cpu_online_map
crash: page excluded: kernel virtual address: ffffffff82110358 type: "active"
WARNING: cannot read cpu_active_map
crash: page excluded: kernel virtual address: ffffffff82011544 type: "init_uts_ns"
crash: page excluded: kernel virtual address: ffffffff82110360 type: "cpu_present_map"
crash: page excluded: kernel virtual address: ffffffff82110360 type: "cpu_present_map"
WARNING: ORC unwinder: cannot read lookup_num_blocks
crash: seek error: kernel virtual address: ffff88822dffb000 type: "memory section root table"
The dump is created with the latest makedumpfile release
makedumpfile: version 1.6.6 (released on 27 Jun 2019)
It complains that it doesn't support the kernel
Any ideas?
-Andi
5 years