[PATCHv2] crash add log dmesg PRINTK_CALLER id support
by Edward Chron
Submission to Project: crash
Component: dmesg
Files: kernel.c printk.c symbols.c help.c defs.h
Code level patch applied against: 8.0.4++ - latest code pulled from
https://github.com/crash-utility/crash.git
crash Issue #164
Patch Version #2: per review from Hagio Kazuhito <k-hagio-ab(a)nec.com>
Tested with Kernel version and makedumpfile version:
Linux Kernel Testing: Linux catalina 6.6.6 #4 SMP PREEMPT_DYNAMIC
Tue Dec 12 23:11:30 PST 2023 x86_64 GNU/Linux
Linux 5.4.264 #9 SMP
Thu Dec 21 07:00:08 PST 2023
makedumpfile Testing: makedumpfile: version 1.7.4++
(released on 6 Nov 2023)
Issue 13 for makedumpfile: adds support for
dmesg PRINTK_CALLER id field patch applied
dmesg Testing: util-linux 2.39.3++
Issue 2609 for sys-utils dmesg: adds support for
dmesg PRINTK_CALLER id field to standard
dmesg kmsg interface patch applied
Add support so that dmesg entries include the optional Linux Kernel
debug CONFIG option PRINTK_CALLER which adds an optional dmesg field
that contains the Thread Id or CPU Id that is issuing the printk to
add the message to the kernel ring buffer. If enabled, this CONFIG
option makes debugging simpler as dmesg entries for a specific
thread or CPU can be recognized.
The dmesg command already supports printing the PRINTK_CALLER field
with the old syslog interface (dmesg -S). Support for the /dev/kmsg
interface was added recently under util-linux Issue #2609, for which we
submitted a commit upstream that is under review.
We've upstreamed a patch for makedumpfile that adds support for
the PRINTK_CALLER id field so it will be available with the
commands:
makedumpfile --dump-dmesg /proc/vmcore dmesgfile
makedumpfile --dump-dmesg -x vmlinux /proc/vmcore dmesgfile
The additional field provided by PRINTK_CALLER is only present
if it was configured for the Linux kernel on the running system.
PRINTK_CALLER is a debug option and is not configured by default, so
dmesg output only changes for kernels built with the option enabled.
For users who went to the trouble of configuring PRINTK_CALLER to have
the extra field available for debugging, having dmesg and makedumpfile
print the field is very helpful, and crash support would be equally
useful for dump analysis.
The size of the PRINTK_CALLER field is determined by the maximum number
of tasks that can run on the system, which is limited by the value of
/proc/sys/kernel/pid_max, as pid values range from 0 to value - 1.
This value determines the number of id digits needed by the caller id.
The PRINTK_CALLER field is printed as T<id> for a Task Id or C<id>
for a CPU Id for a printk in CPU context. The values are left-padded
with spaces and enclosed in square brackets, such as:
[ T123] or [ C16]
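The caller id layout described above can be sketched in C as follows.
This is a minimal illustration and not the patch itself: the helper names
caller_id_width() and format_caller_id() are hypothetical, while the
0x80000000 CPU flag and the 6/7/8 width thresholds are taken from the
diff in this posting.

```c
#include <stdio.h>
#include <stddef.h>

/* Bit 31 set means the record was logged from CPU context
 * (same mask as the 0x80000000 constant used in the diff). */
#define CALLER_ID_CPU_FLAG 0x80000000u

/* Hypothetical helper: characters reserved for "T<id>" / "C<id>",
 * derived from pid_max with the thresholds used in the diff. */
static int caller_id_width(unsigned int pid_max)
{
	if (pid_max <= 99999)
		return 6;
	if (pid_max <= 999999)
		return 7;
	return 8;
}

/* Hypothetical helper: write "[<padding>T123] " style text into out.
 * Returns the number of characters snprintf() would have written. */
static int format_caller_id(char *out, size_t outlen,
			    unsigned int caller_id, int width)
{
	char id[16];

	snprintf(id, sizeof(id), "%c%u",
		 (caller_id & CALLER_ID_CPU_FLAG) ? 'C' : 'T',
		 caller_id & ~CALLER_ID_CPU_FLAG);
	/* "%*s" right-justifies, so the id is padded with spaces on the left. */
	return snprintf(out, outlen, "[%*s] ", width, id);
}
```

Calling format_caller_id() with the value 123 yields a T-prefixed id, and
with 0x80000010 a C-prefixed id for CPU 16, each padded on the left to the
chosen width.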
Displaying the PRINTK_CALLER field in the log/dmesg record output:
-----------------------------------------------------------------
Given the layout of log/dmesg records printed by crash, for example:
crash> log -m
...
[ 0.000000] <7>e820: remove [mem 0xff000000-0xffffffff] reserved
[ 0.000000] <6>SMBIOS 3.4.0 present.
...
[ 0.014179] <6>Secure boot disabled
[ 0.014179] <6>RAMDISK: [mem 0x3cf4f000-0x437bbfff]
...
[ 663.328848] <6>sysrq: Trigger a crash
[ 663.328859] <0>Kernel panic - not syncing: sysrq triggered crash
Our patch adds the PRINTK_CALLER field after the timestamp if the
printk_caller log / dmesg option (-c) is selected:
crash> log -m -c
...
[ 0.014179] [ T1] <6>Secure boot disabled
[ 0.014179] [ T29] <6>RAMDISK: [mem 0x3cf4f000-0x437bbfff]
...
This placement is consistent with dmesg and makedumpfile.
To produce dmesg output with the PRINTK_CALLER id included, we add
a new log / dmesg command option: -c
The PRINTK_CALLER id field is printed only if the -c option is selected.
The description of the log -c option that is seen in the help is:
crash> help log
log
dump system message buffer
[-Ttdmasc]
...
...
-c Display the caller id field that identifies either the thread id or
the CPU id (if in CPU context) that called printk(), if available.
Generally available on Linux 5.1 to 5.9 kernels configured with
CONFIG_PRINTK_CALLER or Linux 5.10 and later kernels.
Also seen in the help text:
Display the caller id that identifies the thread id of the task (begins
with 'T') or the processor id (begins with 'C' for in CPU context) that
called printk(), if available.
crash> log -c
...
[ 0.014179] [ T1] Secure boot disabled
[ 0.014179] [ T29] RAMDISK: [mem 0x3cf4f000-0x437bbfff]
[ 0.198789] [ C0] DMAR: DRHD: handling fault status reg 3
...
Signed-off-by: Ivan Delalande <colona(a)arista.com>
Signed-off-by: Edward Chron <echron(a)arista.com>
---
defs.h | 18 ++++++++++++------
help.c | 19 +++++++++++++++++--
kernel.c | 25 ++++++++++++++++++++++++-
printk.c | 34 ++++++++++++++++++++++++++++++++++
symbols.c | 2 ++
6 files changed, 98 insertions(+), 18 deletions(-)
diff --git a/defs.h b/defs.h
index 2a29c07..488214f 100644
--- a/defs.h
+++ b/defs.h
@@ -2228,8 +2228,13 @@ struct offset_table { /* stash of commonly-used offsets */
long irq_data_irq;
long zspage_huge;
long zram_comp_algs;
+ long log_caller_id;
};
+/* caller_id default and max character sizes based on pid field size */
+#define PID_CHARS_MAX 16 /* Max Number of PID characters */
+#define PID_CHARS_DEFAULT 8 /* Default number of PID characters */
+
struct size_table { /* stash of commonly-used sizes */
long page;
long free_area_struct;
@@ -6044,12 +6049,13 @@ void dump_log(int);
void parse_kernel_version(char *);
#define LOG_LEVEL(v) ((v) & 0x07)
-#define SHOW_LOG_LEVEL (0x1)
-#define SHOW_LOG_DICT (0x2)
-#define SHOW_LOG_TEXT (0x4)
-#define SHOW_LOG_AUDIT (0x8)
-#define SHOW_LOG_CTIME (0x10)
-#define SHOW_LOG_SAFE (0x20)
+#define SHOW_LOG_LEVEL (0x1)
+#define SHOW_LOG_DICT (0x2)
+#define SHOW_LOG_TEXT (0x4)
+#define SHOW_LOG_AUDIT (0x8)
+#define SHOW_LOG_CTIME (0x10)
+#define SHOW_LOG_SAFE (0x20)
+#define SHOW_LOG_CALLER (0x40)
void set_cpu(int);
void clear_machdep_cache(void);
struct stack_hook *gather_text_list(struct bt_info *);
diff --git a/help.c b/help.c
index a4319dd..ae02a57 100644
--- a/help.c
+++ b/help.c
@@ -4023,7 +4023,7 @@ NULL
char *help_log[] = {
"log",
"dump system message buffer",
-"[-Ttdmas]",
+"[-Ttdmasc]",
" This command dumps the kernel log_buf contents in chronological order. The",
" command supports the older log_buf formats, which may or may not contain a",
" timestamp inserted prior to each message, as well as the newer variable-length",
@@ -4046,7 +4046,11 @@ char *help_log[] = {
" been copied out to the user-space audit daemon.",
" -s Dump the printk logs remaining in kernel safe per-CPU buffers that",
" have not been flushed out to log_buf.",
-" ",
+" -c Display the caller id field that identifies either the thread id or",
+" the CPU id (if in CPU context) that called printk(), if available.",
+" Generally available on Linux 5.1 to 5.9 kernels configured with",
+" CONFIG_PRINTK_CALLER or Linux 5.10 and later kernels.",
+" ",
"\nEXAMPLES",
" Dump the kernel message buffer:\n",
" %s> log",
@@ -4214,6 +4218,17 @@ char *help_log[] = {
" CPU: 0 ADDR: ffff8ca4fbc1ad00 LEN: 0 MESSAGE_LOST: 0",
" (empty)",
" ...",
+" ",
+" Display the caller id that identifies the thread id of the task (begins",
+" with 'T') or the processor id (begins with 'C' for in CPU context) that",
+" called printk(), if available.\n",
+" %s> log -c",
+" ...",
+" [ 0.014179] [ T1] Secure boot disabled",
+" [ 0.014179] [ T29] RAMDISK: [mem 0x3cf4f000-0x437bbfff]",
+" [ 0.198789] [ C0] DMAR: DRHD: handling fault status reg 3",
+" ...",
+
NULL
};
diff --git a/kernel.c b/kernel.c
index 6dcf414..bcd10f9 100644
--- a/kernel.c
+++ b/kernel.c
@@ -5089,7 +5089,7 @@ cmd_log(void)
msg_flags = 0;
- while ((c = getopt(argcnt, args, "Ttdmas")) != EOF) {
+ while ((c = getopt(argcnt, args, "Ttdmasc")) != EOF) {
switch(c)
{
case 'T':
@@ -5110,6 +5110,9 @@ cmd_log(void)
case 's':
msg_flags |= SHOW_LOG_SAFE;
break;
+ case 'c':
+ msg_flags |= SHOW_LOG_CALLER;
+ break;
default:
argerrs++;
break;
@@ -5369,6 +5372,24 @@ dump_log_entry(char *logptr, int msg_flags)
fprintf(fp, "%s", buf);
}
+ /* The PRINTK_CALLER id field was introduced with Linux-5.1 so if
+ * requested, Kernel version >= 5.1 and field exists print caller_id.
+ */
+ if (msg_flags & SHOW_LOG_CALLER &&
+ VALID_MEMBER(log_caller_id)) {
+ const unsigned int cpuid = 0x80000000;
+ char cbuf[PID_CHARS_MAX];
+ unsigned int cid;
+
+ /* Get id type, isolate just id value in cid for print */
+ cid = UINT(logptr + OFFSET(log_caller_id));
+ sprintf(cbuf, "%c%d", (cid & cpuid) ? 'C' : 'T', cid & ~cpuid);
+ sprintf(buf, "[%*s] ", PID_CHARS_DEFAULT, cbuf);
+
+ ilen += strlen(buf);
+ fprintf(fp, "%s", buf);
+ }
+
level = LOG_LEVEL(level);
if (msg_flags & SHOW_LOG_LEVEL) {
@@ -5424,6 +5445,8 @@ dump_variable_length_record_log(int msg_flags)
* from log to printk_log. See 62e32ac3505a0cab.
*/
log_struct_name = "printk_log";
+ MEMBER_OFFSET_INIT(log_caller_id, "printk_log",
+ "caller_id");
} else
log_struct_name = "log";
diff --git a/printk.c b/printk.c
index 8658016..ae3fa4f 100644
--- a/printk.c
+++ b/printk.c
@@ -9,6 +9,7 @@ struct prb_map {
unsigned long desc_ring_count;
char *descs;
char *infos;
+ unsigned int pid_max_chars;
char *text_data_ring;
unsigned long text_data_ring_size;
@@ -162,6 +163,24 @@ dump_record(struct prb_map *m, unsigned long id, int msg_flags)
fprintf(fp, "%s", buf);
}
+ /*
+ * The lockless ringbuffer introduced in Linux-5.10 always has
+ * the caller_id field available, so if requested, print it.
+ */
+ if (msg_flags & SHOW_LOG_CALLER) {
+ const unsigned int cpuid = 0x80000000;
+ char cbuf[PID_CHARS_MAX];
+ unsigned int cid;
+
+ /* Get id type, isolate id value in cid for print */
+ cid = UINT(info + OFFSET(printk_info_caller_id));
+ sprintf(cbuf, "%c%d", (cid & cpuid) ? 'C' : 'T', cid & ~cpuid);
+ sprintf(buf, "[%*s] ", m->pid_max_chars, cbuf);
+
+ ilen += strlen(buf);
+ fprintf(fp, "%s", buf);
+ }
+
if (msg_flags & SHOW_LOG_LEVEL) {
level = UCHAR(info + OFFSET(printk_info_level)) >> 5;
sprintf(buf, "<%x>", level);
@@ -262,6 +281,21 @@ dump_lockless_record_log(int msg_flags)
goto out_text_data;
}
+ /* If caller_id was requested, get the pid_max value for print */
+ if (msg_flags & SHOW_LOG_CALLER) {
+ unsigned int pidmax;
+
+ get_symbol_data("pid_max", sizeof(pidmax), &pidmax);
+ if (pidmax <= 99999)
+ m.pid_max_chars = 6;
+ else if (pidmax <= 999999)
+ m.pid_max_chars = 7;
+ else
+ m.pid_max_chars = PID_CHARS_DEFAULT;
+ } else {
+ m.pid_max_chars = PID_CHARS_DEFAULT;
+ }
+
/* ready to go */
tail_id = ULONG(m.desc_ring + OFFSET(prb_desc_ring_tail_id) +
diff --git a/symbols.c b/symbols.c
index 88a3fd1..554d109 100644
--- a/symbols.c
+++ b/symbols.c
@@ -11524,6 +11524,8 @@ dump_offset_table(char *spec, ulong makestruct)
OFFSET(log_level));
fprintf(fp, " log_flags_level: %ld\n",
OFFSET(log_flags_level));
+ fprintf(fp, " log_caller_id: %ld\n",
+ OFFSET(log_caller_id));
fprintf(fp, " printk_info_seq: %ld\n", OFFSET(printk_info_seq));
fprintf(fp, " printk_info_ts_nseq: %ld\n", OFFSET(printk_info_ts_nsec));
--
2.43.0
10 months, 1 week
[PATCH] symbols: skip the module if the given address is not within its address range
by Tao Liu
Previously, to find a module symbol and its offset for an arbitrary address,
all symbols within the module were iterated in ascending address order
until the last symbol with a smaller address was found.
However, if the address is not within the module's address range, e.g.
it is higher than the address of the module's last symbol, then
the module can safely be skipped, because iterating its symbols is
unnecessary. This speeds up kernel module symbol lookup and improves
the overall performance.
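The idea can be sketched in C as a range check performed before the
per-module symbol walk. This is only an illustration under simplifying
assumptions: struct sym, struct module_syms and find_module_symbol() are
hypothetical stand-ins, not crash's real data structures, and each
module's symbols are assumed to be a contiguous array sorted by address.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for per-module symbol tables. */
struct sym {
	unsigned long value;	/* symbol address */
	const char *name;
};

struct module_syms {
	const struct sym *first;	/* lowest-address symbol */
	const struct sym *last;		/* highest-address symbol */
};

/* Return the closest symbol at or below addr, or NULL when no
 * module's symbol range covers addr. */
static const struct sym *
find_module_symbol(const struct module_syms *mods, size_t nmods,
		   unsigned long addr)
{
	for (size_t i = 0; i < nmods; i++) {
		const struct sym *sp, *best = NULL;

		/* Range check first: skip the module without walking it. */
		if (addr < mods[i].first->value || addr > mods[i].last->value)
			continue;

		/* Only now pay for the linear walk over this module. */
		for (sp = mods[i].first; sp <= mods[i].last; sp++) {
			if (sp->value > addr)
				break;
			best = sp;
		}
		return best;
	}
	return NULL;
}
```

With many loaded modules, most of them are rejected by the two
comparisons and never have their symbols iterated.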
Without the patch:
$ time echo "bt 8993" | ~/crash-dev/crash vmcore vmlinux
crash> bt 8993
PID: 8993 TASK: ffff927569cc2100 CPU: 2 COMMAND: "WriterPool0"
#0 [ffff927569cd76f0] __schedule at ffffffffb3db78d8
#1 [ffff927569cd7758] schedule_preempt_disabled at ffffffffb3db8bf9
#2 [ffff927569cd7768] __mutex_lock_slowpath at ffffffffb3db6ca7
#3 [ffff927569cd77c0] mutex_lock at ffffffffb3db602f
#4 [ffff927569cd77d8] ucache_retrieve at ffffffffc0cf4409 [secfs2]
...snip the stacktrace of the same module...
#11 [ffff927569cd7ba0] cskal_path_vfs_getattr_nosec at ffffffffc05cae76 [falcon_kal]
...snip...
#13 [ffff927569cd7c40] _ZdlPv at ffffffffc086e751 [falcon_lsm_serviceable]
...snip...
#20 [ffff927569cd7ef8] unload_network_ops_symbols at ffffffffc06f11c0 [falcon_lsm_pinned_14713]
#21 [ffff927569cd7f50] system_call_fastpath at ffffffffb3dc539a
RIP: 00007f2b28ed4023 RSP: 00007f2a45fe7f80 RFLAGS: 00000206
RAX: 0000000000000012 RBX: 00007f2a68302e00 RCX: 00007f2a682546d8
RDX: 0000000000000826 RSI: 00007eb57ea6a000 RDI: 00000000000000e3
RBP: 00007eb57ea6a000 R8: 0000000000000826 R9: 00000002670bdfd2
R10: 00000002670bdfd2 R11: 0000000000000293 R12: 00000002670bdfd2
R13: 00007f29d501a480 R14: 0000000000000826 R15: 00000002670bdfd2
ORIG_RAX: 0000000000000012 CS: 0033 SS: 002b
crash>
real 7m14.826s
user 7m12.502s
sys 0m1.091s
With the patch:
$ time echo "bt 8993" | ~/crash-dev/crash vmcore vmlinux
crash> bt 8993
PID: 8993 TASK: ffff927569cc2100 CPU: 2 COMMAND: "WriterPool0"
#0 [ffff927569cd76f0] __schedule at ffffffffb3db78d8
#1 [ffff927569cd7758] schedule_preempt_disabled at ffffffffb3db8bf9
...snip the same output...
crash>
real 0m8.827s
user 0m7.896s
sys 0m0.938s
Signed-off-by: Tao Liu <ltao(a)redhat.com>
---
symbols.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/symbols.c b/symbols.c
index 5d91991..88a3fd1 100644
--- a/symbols.c
+++ b/symbols.c
@@ -5561,7 +5561,7 @@ value_search_module_6_4(ulong value, ulong *offset)
sp = lm->symtable[t];
sp_end = lm->symend[t];
- if (value < sp->value)
+ if (value < sp->value || value > sp_end->value)
continue;
splast = NULL;
@@ -5646,6 +5646,9 @@ retry:
if (sp->value > value) /* invalid -- between modules */
break;
+ if (sp_end->value < value) /* not within the module */
+ continue;
+
/*
* splast will contain the last module symbol encountered.
* Note: "__insmod_"-type symbols will be set in splast only
--
2.40.1
10 months, 1 week
[PATCH] Fix for "bt pid" command not printing enough stack trace
by Lianbo Jiang
Currently, the "bt pid" command may not print the full stack trace on
x86_64; the remaining frames are truncated. For example:
Without the patch:
crash> bt 493113
PID: 493113 TASK: ff2e34ecbd3ca2c0 CPU: 27 COMMAND: "sriov_fec_daemo"
#0 [ff77abc4e81cfb08] __schedule at ffffffff81b239cb
#1 [ff77abc4e81cfb70] schedule at ffffffff81b23e2d
#2 [ff77abc4e81cfb88] schedule_timeout at ffffffff81b2c9e8
RIP: 000000000047cdbb RSP: 000000c0000975a8 RFLAGS: 00000216
RAX: ffffffffffffffda RBX: 000000c00004e000 RCX: 000000000047cdbb
RDX: 000000000000000c RSI: 000000c000097798 RDI: 0000000000000009
RBP: 000000c0000975f8 R8: 0000000000000001 R9: 000000c00098d680
R10: 000000000000000c R11: 0000000000000216 R12: 000000c000097688
R13: 0000000000000000 R14: 000000c0006c3520 R15: 00007f5e359946b7
ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b
With the patch:
crash> bt 493113
PID: 493113 TASK: ff2e34ecbd3ca2c0 CPU: 27 COMMAND: "sriov_fec_daemo"
#0 [ff77abc4e81cfb08] __schedule at ffffffff81b239cb
#1 [ff77abc4e81cfb70] schedule at ffffffff81b23e2d
#2 [ff77abc4e81cfb88] schedule_timeout at ffffffff81b2c9e8
#3 [ff77abc4e81cfc68] vfio_unregister_group_dev at ffffffffc10e76ae [vfio]
#4 [ff77abc4e81cfca8] vfio_pci_core_unregister_device at ffffffffc11bb599 [vfio_pci_core]
#5 [ff77abc4e81cfcc0] vfio_pci_remove at ffffffffc103e045 [vfio_pci]
#6 [ff77abc4e81cfcd0] pci_device_remove at ffffffff815d7513
#7 [ff77abc4e81cfcf0] device_release_driver_internal at ffffffff81708baa
#8 [ff77abc4e81cfd20] unbind_store at ffffffff81705f6f
#9 [ff77abc4e81cfd50] kernfs_fop_write_iter at ffffffff81454bf1
#10 [ff77abc4e81cfd88] new_sync_write at ffffffff813aad8c
#11 [ff77abc4e81cfe20] vfs_write at ffffffff813adb36
#12 [ff77abc4e81cfe58] ksys_write at ffffffff813adeb2
#13 [ff77abc4e81cfe90] do_syscall_64 at ffffffff81b17159
#14 [ff77abc4e81cff50] entry_SYSCALL_64_after_hwframe at ffffffff81c0009b
RIP: 000000000047cdbb RSP: 000000c0000975a8 RFLAGS: 00000216
RAX: ffffffffffffffda RBX: 000000c00004e000 RCX: 000000000047cdbb
RDX: 000000000000000c RSI: 000000c000097798 RDI: 0000000000000009
RBP: 000000c0000975f8 R8: 0000000000000001 R9: 000000c00098d680
R10: 000000000000000c R11: 0000000000000216 R12: 000000c000097688
R13: 0000000000000000 R14: 000000c0006c3520 R15: 00007f5e359946b7
ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b
Let's add a check for a function that jumps to schedule_timeout(), just
like the schedule_timeout_*() checks in x86_64_function_called_by().
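The name substitution can be sketched in isolation as follows. This is an
illustrative fragment, not crash code: canonical_callee() is a
hypothetical helper, and the three names are the ones that appear in the
diff of this posting.

```c
#include <string.h>

/* Hypothetical helper: return the function name the unwinder should
 * treat the callee as. Wrappers that end up in schedule_timeout()
 * are mapped to it; every other name is returned unchanged. */
static const char *canonical_callee(const char *name)
{
	static const char *const wrappers[] = {
		"schedule_timeout_interruptible",
		"schedule_timeout_uninterruptible",
		"wait_for_completion_interruptible_timeout",
	};
	size_t i;

	for (i = 0; i < sizeof(wrappers) / sizeof(wrappers[0]); i++)
		if (strcmp(name, wrappers[i]) == 0)
			return "schedule_timeout";
	return name;
}
```

Keeping the wrapper names in one table makes it easy to see which
functions are treated as schedule_timeout() callers.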
Signed-off-by: Lianbo Jiang <lijiang(a)redhat.com>
---
x86_64.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/x86_64.c b/x86_64.c
index 42ade4817ad9..16850d98dc2d 100644
--- a/x86_64.c
+++ b/x86_64.c
@@ -4487,7 +4487,8 @@ x86_64_function_called_by(ulong rip)
*/
if (sp) {
if ((STREQ(sp->name, "schedule_timeout_interruptible") ||
- STREQ(sp->name, "schedule_timeout_uninterruptible")))
+ STREQ(sp->name, "schedule_timeout_uninterruptible") ||
+ STREQ(sp->name, "wait_for_completion_interruptible_timeout")))
sp = symbol_search("schedule_timeout");
if (STREQ(sp->name, "__cond_resched"))
--
2.41.0
10 months, 2 weeks
Re: Google Container OS and crash 8.0.4
by Matt Suiche
Is there an update available for this?
Thanks,
From: Matt Suiche <matt.suiche(a)magnetforensics.com>
Date: Wednesday, November 29, 2023 at 3:26 PM
To: HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab(a)nec.com>, devel(a)lists.crash-utility.osci.io <devel(a)lists.crash-utility.osci.io>
Subject: Re: [Crash-utility] Google Container OS and crash 8.0.4
Apparently, CONFIG_KALLSYMS_ALL is not set in COS kernel
Sent from my mobile device.
________________________________
From: Matt Suiche <matt.suiche(a)magnetforensics.com>
Sent: Wednesday, November 29, 2023 4:40:55 PM
To: HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab(a)nec.com>; devel(a)lists.crash-utility.osci.io <devel(a)lists.crash-utility.osci.io>
Subject: Re: [Crash-utility] Google Container OS and crash 8.0.4
Yes, it would probably make more sense. You can probably also use _stext instead of module_load_offset to compare the values as an assertion check.
Sent from my mobile device.
________________________________
From: HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab(a)nec.com>
Sent: Wednesday, November 29, 2023 4:29 AM
To: Matt Suiche <matt.suiche(a)magnetforensics.com>; devel(a)lists.crash-utility.osci.io <devel(a)lists.crash-utility.osci.io>
Subject: Re: [Crash-utility] Google Container OS and crash 8.0.4
On 2023/11/22 18:04, Matt Suiche wrote:
> Sounds like this is the issue. Module_load_offset is not present, same
> with init_task though.
>
> root@instance-2:~# grep -e _stext -e module_load_offset -e init_task
> /proc/kallsyms
> ffffffff89000000 T _stext
> ffffffff8909e280 t ptrace_init_task
> ffffffff891c6af0 T ftrace_graph_init_task
> ffffffff89245ea0 T perf_event_init_task
> ffffffff8aba3b46 T rcu_init_tasks_generic
> root@instance-2:~#
Yes, but I don't see the reason why it's not present in /proc/kallsyms,
although it's present in the vmlinux.
Recent kernels have vmcoreinfo in /proc/kcore, so maybe we can use the
KERNELOFFSET value instead of the module_load_offset symbol to determine
whether KASLR is enabled. I might try it when I have time.
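A rough sketch of that idea in C, assuming the vmcoreinfo note has
already been read into a NUL-terminated text buffer of KEY=value lines
(how it is extracted from /proc/kcore is not shown). The helper names are
hypothetical, and the hexadecimal encoding of KERNELOFFSET is an
assumption of this sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: look for a "KERNELOFFSET=<hex>" line in the
 * vmcoreinfo text. Returns 1 and stores the offset when found,
 * 0 when the key is absent. */
static int vmcoreinfo_kernel_offset(const char *note, unsigned long *offset)
{
	static const char key[] = "KERNELOFFSET=";
	const char *p = note;

	while (p && *p) {
		if (strncmp(p, key, sizeof(key) - 1) == 0) {
			*offset = strtoul(p + sizeof(key) - 1, NULL, 16);
			return 1;
		}
		/* Advance to the start of the next line, if any. */
		p = strchr(p, '\n');
		if (p)
			p++;
	}
	return 0;
}

/* Treat KASLR as active when a non-zero offset is reported. */
static int kaslr_enabled(const char *note)
{
	unsigned long off;

	return vmcoreinfo_kernel_offset(note, &off) && off != 0;
}
```

This avoids depending on the module_load_offset symbol being listed in
/proc/kallsyms.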
Thanks,
Kazu
>
> *From: *HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab(a)nec.com>
> *Date: *Wednesday, November 22, 2023 at 12:01 PM
> *To: *Matt Suiche <matt.suiche(a)magnetforensics.com>,
> devel(a)lists.crash-utility.osci.io <devel(a)lists.crash-utility.osci.io>
> *Subject: *EXTERNAL SENDER Re: [Crash-utility] Google Container OS and
> crash 8.0.4
>
> On 2023/11/22 15:41, Matt Suiche wrote:
>> Good point, enough the --kaslr=auto option worked well. Same when I passed --kaslr=0x8000000
>
> Good news.
>
> apparently module_load_offset symbol is needed in /proc/kallsyms to
> enable the KASLR detection. I see it in the vmlinux.
>
> $ nm vmlinux-cos-5.15.133+ | grep module_load_offset
> ffffffff82d83350 b module_load_offset
>
> Is it (and _stext) found in /proc/kallsyms? like
>
> # grep -e _stext -e module_load_offset /proc/kallsyms
> ffffffffa0e00000 T _stext
> ffffffffa3aafab8 b module_load_offset
>
>
> PS. I will be out for the rest of this week, back next week.
>
> Thanks,
> Kazu
>
> This email including any attachments may contain confidential material
> for the sole use of the intended recipient. If you are not the intended
> recipient please immediately notify the sender by reply email,
> permanently delete this message and do not forward it or any part of it
> to anyone else.
>
10 months, 2 weeks
Re: [PATCH] symbols: skip the module if the given address is not within its address range
by Lianbo Jiang
Hi, Tao
Thank you for the fix.
On 1/4/24 23:27, devel-request(a)lists.crash-utility.osci.io wrote:
> Date: Thu, 4 Jan 2024 09:20:27 +0800
> From: Tao Liu<ltao(a)redhat.com>
> Subject: [Crash-utility] [PATCH] symbols: skip the module if the given
> address is not within its address range
> To:devel@lists.crash-utility.osci.io
> Cc: Tao Liu<ltao(a)redhat.com>
> Message-ID:<20240104012027.3893-1-ltao(a)redhat.com>
> Content-Type: text/plain; charset="US-ASCII"; x-default=true
>
> Previously, to find a module symbol and its offset by an arbitrary address,
> all symbols within the module will be iterated by address ascending order
> until the last symbol with a smaller address been noticed.
>
> However if the address is not within the module address range, e.g.
> the address is higher than the module's last symbol's address, then
> the module can be surely skipped, because its symbol iteration is
> unnecessary. This can speed up the kernel module symbols finding and improve
> the overall performance.
>
> Without the patch:
> $ time echo "bt 8993" | ~/crash-dev/crash vmcore vmlinux
> crash> bt 8993
> PID: 8993 TASK: ffff927569cc2100 CPU: 2 COMMAND: "WriterPool0"
> #0 [ffff927569cd76f0] __schedule at ffffffffb3db78d8
> #1 [ffff927569cd7758] schedule_preempt_disabled at ffffffffb3db8bf9
> #2 [ffff927569cd7768] __mutex_lock_slowpath at ffffffffb3db6ca7
> #3 [ffff927569cd77c0] mutex_lock at ffffffffb3db602f
> #4 [ffff927569cd77d8] ucache_retrieve at ffffffffc0cf4409 [secfs2]
> ...snip the stacktrace of the same module...
> #11 [ffff927569cd7ba0] cskal_path_vfs_getattr_nosec at ffffffffc05cae76 [falcon_kal]
> ...snip...
> #13 [ffff927569cd7c40] _ZdlPv at ffffffffc086e751 [falcon_lsm_serviceable]
> ...snip...
> #20 [ffff927569cd7ef8] unload_network_ops_symbols at ffffffffc06f11c0 [falcon_lsm_pinned_14713]
> #21 [ffff927569cd7f50] system_call_fastpath at ffffffffb3dc539a
> RIP: 00007f2b28ed4023 RSP: 00007f2a45fe7f80 RFLAGS: 00000206
> RAX: 0000000000000012 RBX: 00007f2a68302e00 RCX: 00007f2a682546d8
> RDX: 0000000000000826 RSI: 00007eb57ea6a000 RDI: 00000000000000e3
> RBP: 00007eb57ea6a000 R8: 0000000000000826 R9: 00000002670bdfd2
> R10: 00000002670bdfd2 R11: 0000000000000293 R12: 00000002670bdfd2
> R13: 00007f29d501a480 R14: 0000000000000826 R15: 00000002670bdfd2
> ORIG_RAX: 0000000000000012 CS: 0033 SS: 002b
> crash>
> real 7m14.826s
> user 7m12.502s
> sys 0m1.091s
>
> With the patch:
> $ time echo "bt 8993" | ~/crash-dev/crash vmcore vmlinux
> crash> bt 8993
> PID: 8993 TASK: ffff927569cc2100 CPU: 2 COMMAND: "WriterPool0"
> #0 [ffff927569cd76f0] __schedule at ffffffffb3db78d8
> #1 [ffff927569cd7758] schedule_preempt_disabled at ffffffffb3db8bf9
> ...snip the same output...
> crash>
> real 0m8.827s
> user 0m7.896s
> sys 0m0.938s
>
> Signed-off-by: Tao Liu<ltao(a)redhat.com>
> ---
> symbols.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
This looks good to me, so: Ack.
Thanks
Lianbo
> diff --git a/symbols.c b/symbols.c
> index 5d91991..88a3fd1 100644
> --- a/symbols.c
> +++ b/symbols.c
> @@ -5561,7 +5561,7 @@ value_search_module_6_4(ulong value, ulong *offset)
> sp = lm->symtable[t];
> sp_end = lm->symend[t];
>
> - if (value < sp->value)
> + if (value < sp->value || value > sp_end->value)
> continue;
>
> splast = NULL;
> @@ -5646,6 +5646,9 @@ retry:
> if (sp->value > value) /* invalid -- between modules */
> break;
>
> + if (sp_end->value < value) /* not within the module */
> + continue;
> +
> /*
> * splast will contain the last module symbol encountered.
> * Note: "__insmod_"-type symbols will be set in splast only
> -- 2.40.1
10 months, 2 weeks
Re: [PATCH v3 00/10] add LoongArch64 platform support
by Lianbo Jiang
Hi, Ming
Thank you for the update.
On 12/28/23 19:47, devel-request(a)lists.crash-utility.osci.io wrote:
> Date: Thu, 28 Dec 2023 19:46:24 +0800
> From: Ming Wang <wangming01(a)loongson.cn>
> Subject: [Crash-utility] [Crash-utility][PATCH v3 00/10] add
> LoongArch64 platform support
> To: devel(a)lists.crash-utility.osci.io, lijiang(a)redhat.com,
> k-hagio-ab(a)nec.com
> Cc: gaojuxin(a)loongson.cn, liweihao(a)loongson.cn
> Message-ID: <20231228114634.1085279-1-wangming01(a)loongson.cn>
>
> This patch set are for Crash-utility tool, it make crash tool support on
> loongarch64 architecture and the common commands(bt, p, rd, mod, log, set,
> dis, and so on).
>
> The patch sets were tested on a loongArch64 Loongson-3C5000 processor. Can
> successfully enter the crash command line and support for common command.
>
> ...
> KERNEL: vmlinux
> DUMPFILE: /proc/kcore
> CPUS: 16
> DATE: Thu Jul 27 19:51:21 CST 2023
> UPTIME: 06:35:11
> LOAD AVERAGE: 0.15, 0.03, 0.01
> TASKS: 257
> NODENAME: localhost.localdomain
> RELEASE: 5.10.0-60.102.0.128.oe2203.loongarch64
> VERSION: #1 SMP Fri Jul 14 04:17:09 UTC 2023
> MACHINE: loongarch64 (2200 Mhz)
> MEMORY: 64 GB
> PID: 2964
> COMMAND: "crash"
> TASK: 9000000098805500 [THREAD_INFO: 9000000094d48000]
> CPU: 6
> STATE: TASK_RUNNING (ACTIVE)
> crash>
> crash> dis -l start_kernel
> /linux-loongarch64/init/main.c: 883
> 0x9000000001030818 <start_kernel>: 0x0141ee40
> /linux-loongarch64/init/main.c: 879
> 0x900000000103081c <start_kernel+4>: 0x90000000
> /linux-loongarch64/init/main.c: 883
> 0x9000000001030820 <start_kernel+8>: addu16i.d $zero, $t8, 8179(0x1ff3)
> /linux-loongarch64/init/main.c: 879
> ...
>
> About the LoongArch64 Architecture:
> https://www.kernel.org/doc/html/latest/arch/loongarch/introduction.html
>
> Changes between v2 and v3:
> - Fix some compilation warnings.
> - Fix build errors when without target.
> - Some minor code adjustments.
>
> Thanks and regards,
> Ming
>
> Ming Wang (10):
> Add LoongArch64 framework code support
> LoongArch64: Make the crash tool successfully enter the crash command
> line
> LoongArch64: Add 'pte' command support
> LoongArch64: Add 'mach' command support
> LoongArch64: Add 'bt' command support
> LoongArch64: Add 'help -m/M' command support
> LoongArch64: Add 'help -r' command support
> LoongArch64: Add 'irq' command support
> LoongArch64: Add "--kaslr" command line option support
> LoongArch64: Add LoongArch64 architecture support information
>
> Makefile | 7 +-
> README | 4 +-
> configure.c | 43 +-
> crash.8 | 2 +-
> defs.h | 164 +-
> diskdump.c | 24 +-
> gdb-10.2.patch | 12822 +++++++++++++++++++++++++++++++++++++++++-
This is a really big change for gdb: there are 12822 lines of code.
:-) Is it possible to support the LoongArch64 feature with a much smaller
change in the gdb patch? For example, by removing redundant
(unnecessary) changes, as I mentioned for V1.
Or is it hard to implement this feature based on gdb-10.2? It looks
like all the code changes are backported from the latest gdb?
Thanks
Lianbo
> help.c | 13 +-
> lkcd_vmdump_v1.h | 2 +-
> lkcd_vmdump_v2_v3.h | 5 +-
> loongarch64.c | 1368 +++++
> main.c | 3 +-
> netdump.c | 27 +-
> ramdump.c | 2 +
> symbols.c | 33 +-
> 15 files changed, 14493 insertions(+), 26 deletions(-)
> create mode 100644 loongarch64.c
>
>
> base-commit: 53d2577cef98b76b122aade94349637a11e06138
10 months, 2 weeks
[PATCH 0/3] RISCV64: enhance bt command
by Song Shuai
Hi,
This series enhances the bt command for RISCV64 by making bt aware of
- possible kernel and user exception frames (bt -e)
- per-cpu IRQ stacks (bt for vmcore crashed in irq stack, bt -E)
- per-cpu overflow stacks (bt for vmcore caused by kernel stack overflow)
This series is based on the "RISCV64: Fix 'bt' output when no ra on
the stack top" patch sent a few days ago. You can get them all from my Github repo:
https://github.com/sugarfillet/crash/commits/rv64-bt-enhance/
The detailed test report can be found in each patch's commit-msg and
all these tests passed on the RV64 Qemu-virt and with RISC-V Linux v6.6.
Here is the abstract of these three patches:
Patch1 :
[Crash-utility] RISCV64: Add support for 'bt -e' option
With this patch we can search the stack for possible kernel and user
mode exception frames via 'bt -e' command.
Patch2 :
[Crash-utility] RISCV64: Add per-cpu IRQ stacks support
This patch introduces per-cpu IRQ stacks for RISCV64 so that
"bt" can backtrace on them and 'bt -E' can search them for eframes;
the 'help -m' command displays the address of each
per-cpu IRQ stack.
Patch3:
[Crash-utility] RISCV64: Add per-cpu overflow stacks support
The patch introduces per-cpu overflow stacks for RISCV64 so that
"bt" can backtrace on them; the 'help -m' command displays the
address of each per-cpu overflow stack.
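To illustrate what making bt aware of per-cpu stacks involves, here is a
minimal C sketch that decides which CPU's stack, if any, contains a given
address. The struct percpu_stacks layout and the stack_owner_cpu() helper
are hypothetical and are not taken from the patches.

```c
#include <stddef.h>

/* Hypothetical description of one kind of per-cpu stack (IRQ or
 * overflow): base[cpu] is the lowest address of that CPU's stack,
 * or 0 when the stack was not found for that CPU. */
struct percpu_stacks {
	const unsigned long *base;
	size_t ncpus;
	unsigned long size;	/* size of each stack in bytes */
};

/* Return the CPU whose stack contains addr, or -1 if none does. */
static int stack_owner_cpu(const struct percpu_stacks *s, unsigned long addr)
{
	for (size_t cpu = 0; cpu < s->ncpus; cpu++) {
		unsigned long lo = s->base[cpu];

		if (lo && addr >= lo && addr < lo + s->size)
			return (int)cpu;
	}
	return -1;
}
```

An unwinder could run such a check on the stack pointer to decide whether
to walk an IRQ or overflow stack instead of the task's own stack.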
Song Shuai (3):
[Crash-utility] RISCV64: Add support for 'bt -e' option
[Crash-utility] RISCV64: Add per-cpu IRQ stacks support
[Crash-utility] RISCV64: Add per-cpu overflow stacks support
defs.h | 30 +++-
help.c | 2 +-
riscv64.c | 492 +++++++++++++++++++++++++++++++++++++++++++++++++++---
3 files changed, 494 insertions(+), 30 deletions(-)
--
2.20.1
10 months, 2 weeks
[PATCH v5 0/5] Improve stack unwind on ppc64
by Aditya Gupta
The Problem:
============
Currently crash is unable to show function arguments and local variables, as
gdb can do. And functionality for moving between frames ('up'/'down') is not
working in crash.
Crash has 'gdb passthroughs' for things gdb can do, but the gdb passthroughs
'bt', 'frame', 'info locals', 'up', 'down' are not working either, because
gdb does not get the register values from `crash_target::fetch_registers`,
which in turn uses `machdep->get_cpu_reg`, which is not implemented for PPC64.
==================
Fix the gdb passthroughs by implementing "machdep->get_cpu_reg" for PPC64.
This way, "gdb mode in crash" will support this feature for both ELF and
kdump-compressed vmcore formats, while "gdb" would only have supported the
ELF format.
Other features of 'gdb', such as seeing
backtraces/registers/variables/arguments/local variables and moving up and
down stack frames, can then be used with any ppc64 vmcore, irrespective of
whether it is in ELF format or kdump-compressed format.
Note: This doesn't support live debugging on ppc64, since registers are not
available to be read.
Implications on Architectures:
====================================
No architecture other than PPC64 is affected, except in the case of the
'frame' command.
As mentioned in patch #2, since frame will no longer be prohibited, it will print:
crash> frame
#0 <unavailable> in ?? ()
Instead of the earlier 'prohibited' message:
crash> frame
crash: prohibited gdb command: frame
The major change will be in 'gdb mode' on PPC64: instead of failing with
errors saying that there is no frame or that the PC could not be read, it
will print the frames and local variables.
Testing:
========
Git tree with this patch series applied:
https://github.com/adi-g15-ibm/crash/tree/stack-unwind-v5-smaller-gran
To test various gdb passthroughs:
(crash) set
(crash) set gdb on
gdb> thread
gdb> bt
gdb> info threads
gdb> info threads
gdb> info locals
gdb> info variables irq_rover_lock
gdb> info args
gdb> thread 2
gdb> set gdb off
(crash) set
(crash) set -c 6
(crash) gdb thread
(crash) bt
(crash) gdb bt
(crash) frame
(crash) up
(crash) down
(crash) info locals
Known Issues:
=============
1. In gdb mode, 'bt' might fail to show a backtrace in a few vmcores collected
from older kernels. This is a known issue due to a register mismatch. It can
also cause some 'invalid kernel virtual address' errors while gdb is
unwinding the stack. The fix has been merged upstream:
Commit: https://github.com/torvalds/linux/commit/b684c09f09e7a6af3794d4233ef78581...
Fixing GDB passthroughs on other architectures
==============================================
Much of the work needed to make gdb passthroughs like 'gdb bt', 'gdb
thread', 'gdb info locals' etc. work has been done by the patches introducing
'machdep->get_cpu_reg' and by this series, which fixes some issues in them.
Other architectures should be able to fix these gdb functionalities by
simply implementing 'machdep->get_cpu_reg (cpu, regno, ...)'.
The reasoning behind that is explained with a diagram in the commit
description of patch #1.
I will assist with my findings/observations from fixing it on ppc64 whenever
needed.
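As a rough illustration of the callback shape described above, here is a
minimal, self-contained sketch in C. It is not the code from this series:
the 'demo_*' names, the register numbering, the per-CPU snapshot array and
the trailing '(name, size, value)' parameters are all assumptions made for
the example. Only the idea of a per-architecture 'get_cpu_reg (cpu, regno,
...)' hook that the register-fetch path calls comes from the text above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NR_CPUS 4
#define DEMO_NR_REGS 3

/* Hypothetical register numbers; a real port would use the numbering
 * that gdb expects for the architecture. */
enum demo_regnum { DEMO_SP = 0, DEMO_PC = 1, DEMO_LR = 2 };

/* Hypothetical per-CPU snapshot, standing in for the registers that a
 * real implementation would read out of the vmcore. */
struct demo_regs {
    int valid;
    uint64_t reg[DEMO_NR_REGS];
};

static struct demo_regs demo_saved_regs[DEMO_NR_CPUS];

/* Simplified stand-in for the machdep table: only the one hook. */
struct demo_machdep {
    int (*get_cpu_reg)(int cpu, int regno, const char *name,
                       int size, void *value);
};

/* Arch callback: copy one saved register into 'value'.
 * Returns 1 on success, 0 if the register is not available. */
static int demo_get_cpu_reg(int cpu, int regno, const char *name,
                            int size, void *value)
{
    (void)name; /* unused in this sketch */

    if (cpu < 0 || cpu >= DEMO_NR_CPUS || !demo_saved_regs[cpu].valid)
        return 0;
    if (regno < 0 || regno >= DEMO_NR_REGS)
        return 0;
    if (size != (int)sizeof(uint64_t))
        return 0;

    memcpy(value, &demo_saved_regs[cpu].reg[regno], (size_t)size);
    return 1;
}

static struct demo_machdep demo_machdep_table = {
    .get_cpu_reg = demo_get_cpu_reg,
};

/* Caller side: what a fetch_registers-like path would do. With no hook
 * installed it reports the register as unavailable. */
int demo_fetch_register(int cpu, int regno, uint64_t *out)
{
    if (!demo_machdep_table.get_cpu_reg)
        return 0;
    return demo_machdep_table.get_cpu_reg(cpu, regno, NULL,
                                          (int)sizeof(*out), out);
}

/* Test helper: fill the snapshot for one CPU. */
void demo_seed_cpu(int cpu, uint64_t sp, uint64_t pc, uint64_t lr)
{
    assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
    demo_saved_regs[cpu].valid = 1;
    demo_saved_regs[cpu].reg[DEMO_SP] = sp;
    demo_saved_regs[cpu].reg[DEMO_PC] = pc;
    demo_saved_regs[cpu].reg[DEMO_LR] = lr;
}
```

In this sketch a CPU with no saved registers simply makes the callback
return 0, so the caller can treat the register as unavailable instead of
failing outright.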
Changelog:
==========
V5:
+ changes in patch #1: made ppc64_get_cpu_reg static, and removed unreachable
code
+ changes in patch #3: fixed typo 'ppc64_renum' (should be 'ppc64_regnum'),
removed unneeded if condition
+ changes in patch #5: refresh regcache per thread, instead of for all
threads at once
V4:
+ fix segmentation fault in live debugging (change in patch #1)
+ mention live debugging not supported in cover letter and patch #1
+ fixed some checkpatch warnings (change in patch #5)
V3:
+ default gdb thread will be the crashing thread, instead of being
thread '0'
+ synchronise crash cpu and gdb thread context
+ fix bug in gdb_interface, that replaced gdb's output stream, losing
output in some cases, such as info threads and extra output in info
variables
+ fix 'info threads'
RFC V2:
- removed patch implementing 'frame', 'up', 'down' in crash
- updated the cover letter by removing the mention of those commands other
than the respective gdb passthrough
Aditya Gupta (5):
ppc64: correct gdb passthroughs by implementing machdep->get_cpu_reg
remove 'frame' from prohibited commands list
synchronise cpu context changes between crash/gdb
fix gdb_interface: restore gdb's output streams at end of
gdb_interface
fix 'info threads' command
crash_target.c | 44 ++++++++++++++++
defs.h | 130 +++++++++++++++++++++++++++++++++++++++++++++++-
gdb-10.2.patch | 110 +++++++++++++++++++++++++++++++++++++++-
gdb_interface.c | 2 +-
kernel.c | 47 +++++++++++++++--
ppc64.c | 95 +++++++++++++++++++++++++++++++++--
task.c | 14 ++++++
tools.c | 2 +-
8 files changed, 434 insertions(+), 10 deletions(-)
--
2.41.0
10 months, 3 weeks
Re: [PATCH 0/3] RISCV64: enhance bt command
by Lianbo Jiang
Hi, Song
On 12/14/23 13:07, devel-request(a)lists.crash-utility.osci.io wrote:
> Date: Wed, 13 Dec 2023 17:45:05 +0800
> From: Song Shuai <songshuaishuai(a)tinylab.org>
> Subject: [Crash-utility] [PATCH 0/3] RISCV64: enhance bt command
> To: k-hagio-ab@nec.com, xianting.tian@linux.alibaba.com
> Cc: devel@lists.crash-utility.osci.io, Song Shuai
> <songshuaishuai(a)tinylab.org>
> Message-ID: <20231213094508.693236-1-songshuaishuai(a)tinylab.org>
>
> hi,
>
> This series enhances the bt command for RISCV64 by making bt aware of
> - possible kernel and user exception frames (bt -e)
> - per-cpu IRQ stacks (bt for vmcore crashed in irq stack, bt -E)
> - per-cpu overflow stacks (bt for vmcore caused by kernel stack overflow)
>
> This series is based on the "RISCV64: Fix 'bt' output when no ra on
> the stack top" patch sent a few days ago. You can get them all from my Github repo:
>
> https://github.com/sugarfillet/crash/commits/rv64-bt-enhance/
Could you please provide a link to the patched kexec-tools for riscv64? If
possible, please also give the test steps.
That would help me run some tests quickly.
Thanks.
Lianbo
>
> The detailed test report can be found in each patch's commit-msg and
> all these tests passed on the RV64 Qemu-virt and with RISC-V Linux v6.6.
>
> Here is the abstract of these three patches:
>
> Patch1 :
> [Crash-utility] RISCV64: Add support for 'bt -e' option
>
> With this patch we can search the stack for possible kernel and user
> mode exception frames via 'bt -e' command.
>
> Patch2 :
> [Crash-utility] RISCV64: Add per-cpu IRQ stacks support
>
> This patch introduces per-cpu IRQ stacks for RISCV64 to let
> "bt" do backtrace on it and 'bt -E' search eframes on it,
> and the 'help -m' command displays the addresses of each
> per-cpu IRQ stack.
>
> Patch3:
> [Crash-utility] RISCV64: Add per-cpu overflow stacks support
>
> The patch introduces per-cpu overflow stacks for RISCV64 to let
> "bt" do backtrace on it and the 'help -m' command displays the
> addresses of each per-cpu overflow stack.
>
> Song Shuai (3):
> [Crash-utility] RISCV64: Add support for 'bt -e' option
> [Crash-utility] RISCV64: Add per-cpu IRQ stacks support
> [Crash-utility] RISCV64: Add per-cpu overflow stacks support
>
> defs.h | 30 +++-
> help.c | 2 +-
> riscv64.c | 492 +++++++++++++++++++++++++++++++++++++++++++++++++++---
> 3 files changed, 494 insertions(+), 30 deletions(-)
>
> -- 2.20.1
10 months, 4 weeks