[PATCH] speed up "ps -r" by storing the length of rlim array
by Kazuhito Hagio
Without this patch, the "ps -r" command takes one minute or more per 1,000
tasks. The cause is that the length of the {task,signal}_struct.rlim array
is looked up again for every task, and each lookup is slow.
This patch stores the length once it has been determined, which cuts the
time to about 0.5 seconds per 1,000 tasks.
Signed-off-by: Kazuhito Hagio <k-hagio(a)ab.jp.nec.com>
---
defs.h | 2 ++
symbols.c | 8 ++++++++
task.c | 6 ++++--
3 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/defs.h b/defs.h
index adddb9f..4b6ebc2 100644
--- a/defs.h
+++ b/defs.h
@@ -2191,6 +2191,8 @@ struct array_table {
int kmem_cache_cpu_slab;
int rt_prio_array_queue;
int height_to_maxnodes;
+ int task_struct_rlim;
+ int signal_struct_rlim;
};
/*
diff --git a/symbols.c b/symbols.c
index 638800a..e934e7f 100644
--- a/symbols.c
+++ b/symbols.c
@@ -8440,6 +8440,10 @@ builtin_array_length(char *s, int len, int *two_dim)
lenptr = &array_table.kmem_cache_cpu_slab;
else if (STREQ(s, "rt_prio_array.queue"))
lenptr = &array_table.rt_prio_array_queue;
+ else if (STREQ(s, "task_struct.rlim"))
+ lenptr = &array_table.task_struct_rlim;
+ else if (STREQ(s, "signal_struct.rlim"))
+ lenptr = &array_table.signal_struct_rlim;
if (!lenptr) /* not stored */
return(len);
@@ -10520,6 +10524,10 @@ dump_offset_table(char *spec, ulong makestruct)
ARRAY_LENGTH(kmem_cache_cpu_slab));
fprintf(fp, " rt_prio_array_queue: %d\n",
ARRAY_LENGTH(rt_prio_array_queue));
+ fprintf(fp, " task_struct_rlim: %d\n",
+ ARRAY_LENGTH(task_struct_rlim));
+ fprintf(fp, " signal_struct_rlim: %d\n",
+ ARRAY_LENGTH(signal_struct_rlim));
if (spec) {
int in_size_table, in_array_table, arrays, offsets, sizes;
diff --git a/task.c b/task.c
index 560adfa..2418e4c 100644
--- a/task.c
+++ b/task.c
@@ -4027,12 +4027,14 @@ show_task_rlimit(struct task_context *tc)
in_task_struct = in_signal_struct = FALSE;
if (VALID_MEMBER(task_struct_rlim)) {
- rlimit_index = get_array_length("task_struct.rlim", NULL, 0);
+ rlimit_index = (i = ARRAY_LENGTH(task_struct_rlim)) ?
+ i : get_array_length("task_struct.rlim", NULL, 0);
in_task_struct = TRUE;
} else if (VALID_MEMBER(signal_struct_rlim)) {
if (!VALID_MEMBER(task_struct_signal))
error(FATAL, "cannot determine rlimit array location\n");
- rlimit_index = get_array_length("signal_struct.rlim", NULL, 0);
+ rlimit_index = (i = ARRAY_LENGTH(signal_struct_rlim)) ?
+ i : get_array_length("signal_struct.rlim", NULL, 0);
in_signal_struct = TRUE;
}
--
1.8.3.1
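
The idea behind the patch above can be sketched in isolation: keep the array length in a slot that is zero until the first lookup, and only fall back to the slow query when the slot is empty. This is a minimal sketch, not crash's code. `slow_array_length()`, `slow_lookups` and `cached_rlim_len` are hypothetical names, and the length 16 is an assumed value used only for illustration.

```c
#include <string.h>

/* Hypothetical stand-in for the slow debuginfo query; counts its calls. */
static int slow_lookups;

static int slow_array_length(const char *name)
{
    slow_lookups++;
    /* Assumed length, for illustration only. */
    return strcmp(name, "task_struct.rlim") == 0 ? 16 : 0;
}

/* Cache slot: 0 means "not stored yet", mirroring the patch's zero test. */
static int cached_rlim_len;

/* Return the cached length, doing the slow lookup only on a cache miss. */
static int rlim_array_length(void)
{
    if (cached_rlim_len == 0)
        cached_rlim_len = slow_array_length("task_struct.rlim");
    return cached_rlim_len;
}
```

Calling `rlim_array_length()` once per task then costs one slow lookup in total instead of one per task, which is where the per-1,000-task saving comes from.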
[PATCH 0/4] speed up handling of dumps with many tasks
by Greg Thelen
This series decreases crash startup and 'ps' processing time when handling dumps
with many tasks. Prior to the series, a dump with 1M tasks took 45 minutes to
load and another 45 minutes to run ps. Once patched, startup plus ps time drops
below 40 seconds.
Greg Thelen (4):
refactor store_context => add_context
refactor task_to_pid
remove unreachable (and slow) code
index task_context by task
defs.h | 2 +
task.c | 191 +++++++++++++++++++++++++++------------------------------
2 files changed, 92 insertions(+), 101 deletions(-)
--
2.17.0.484.g0c8726318c-goog
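
The cover letter does not say how the series indexes task_context by task, so the following is only one plausible shape for such an index: an array of pointers sorted by task address and searched with bsearch(), replacing a linear scan with an O(log n) lookup. `struct task_ctx`, `build_task_index()` and `find_by_task()` are hypothetical names, not crash's own.

```c
#include <stdint.h>
#include <stdlib.h>

/* Simplified, hypothetical stand-in for crash's struct task_context. */
struct task_ctx {
    uint64_t task;   /* task_struct address */
    uint64_t pid;
};

/* Order two index entries (pointers to task_ctx) by task address. */
static int cmp_by_task(const void *a, const void *b)
{
    const struct task_ctx *x = *(const struct task_ctx *const *)a;
    const struct task_ctx *y = *(const struct task_ctx *const *)b;

    return (x->task > y->task) - (x->task < y->task);
}

/* Sort the index once, after all task contexts have been gathered. */
static void build_task_index(struct task_ctx **index, size_t n)
{
    qsort(index, n, sizeof(*index), cmp_by_task);
}

/* Binary-search the sorted index; returns NULL if the task is unknown. */
static struct task_ctx *find_by_task(struct task_ctx **index, size_t n,
                                     uint64_t task)
{
    struct task_ctx key = { .task = task, .pid = 0 };
    const struct task_ctx *keyp = &key;
    struct task_ctx **hit;

    hit = bsearch(&keyp, index, n, sizeof(*index), cmp_by_task);
    return hit ? *hit : NULL;
}
```

With a million tasks, a sorted index needs about 20 comparisons per lookup where a linear scan needs up to a million, which is the kind of difference that separates minutes from seconds.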
Re: [Crash-utility] [crash patch] Compute init_thread_union size
by Dave Anderson
----- Original Message -----
> Greetings,
>
> I know absolutely nothing about how crash maintenance is done, and very
> damn little about crash's gizzard, so please consider the below a bug
> report, a patch.. or bloody annoying spam, as you see fit.
Hi Mike,
No, it's most definitely appreciated. Normally patches are posted on the
crash utility mailing list (crash-utility(a)redhat.com), but this is fine.
And speaking of the mailing list, there was a bug report and subsequent
thread yesterday concerning this issue:
https://www.redhat.com/archives/crash-utility/2018-April/msg00000.html
It was unresolved because the thread_union still exists in the most
recent upstream sources, and I can still see the union declaration
in the most recent Fedora kernel. It's there now, but maybe the x86
kernel doesn't reference it so it doesn't get picked up in the debuginfo
data? Not sure I understand, but regardless, this patch looks good to me.
I'm also forwarding this email to the mailing list and the original bug
reporter.
Thanks again,
Dave
>
> If the latter, listen closely, and you'll hear "Sorry 'bout that" coming
> from the bottom of your trashcan :)
>
> -Mike
>
> ---
>
> As of kernel commit 0500871f21b2, init_thread_union size became zero,
> leaving thread_union and machdep->stacksize undetermined, breaking bt.
>
> crash> bt 1
> PID: 1 TASK: ffff9bf444c02200 CPU: 1 COMMAND: "systemd"
> #0 [ffffadc8428c3d50] __schedule at ffffffffbd704790
> bt: invalid RSP: ffffadc8428c3d50 bt->stackbase/stacktop: ffffadc8428c0000/ffffadc8428c2000 cpu: 1
> crash>
>
> Fall back to computing size via __end_init_task - __start_init_task.
>
> crash> bt 1
> PID: 1 TASK: ffff9bf444c02200 CPU: 1 COMMAND: "systemd"
> #0 [ffffadc8428c3d50] __schedule at ffffffffbd704790
> #1 [ffffadc8428c3dd0] schedule at ffffffffbd704bd0
> #2 [ffffadc8428c3de8] schedule_hrtimeout_range_clock at ffffffffbd707a66
> #3 [ffffadc8428c3e50] ep_poll at ffffffffbd29bac0
> #4 [ffffadc8428c3ef8] sys_epoll_wait at ffffffffbd29d612
> #5 [ffffadc8428c3f30] do_syscall_64 at ffffffffbd001b79
> #6 [ffffadc8428c3f50] entry_SYSCALL_64_after_hwframe at ffffffffbd80009f
> RIP: 00007f987b26d463 RSP: 00007fff36092e40 RFLAGS: 00000293
> RAX: ffffffffffffffda RBX: 000055a96c5accd0 RCX: 00007f987b26d463
> RDX: 000000000000005e RSI: 00007fff36092e50 RDI: 0000000000000004
> RBP: 00007fff360933c0 R8: 21ad2c5bde36816b R9: 000055a96a66b9e0
> R10: 00000000ffffffff R11: 0000000000000293 R12: 0000000000000001
> R13: 00007fff36092e50 R14: ffffffffffffffff R15: 0000000000000000
> ORIG_RAX: 00000000000000e8 CS: 0033 SS: 002b
> crash>
>
> Signed-off-by: Mike Galbraith <efault(a)gmx.de>
> ---
> task.c | 15 ++++++++++++++-
> 1 file changed, 14 insertions(+), 1 deletion(-)
>
> --- a/task.c
> +++ b/task.c
> @@ -438,8 +438,21 @@ task_init(void)
> len = SIZE(task_union));
> machdep->stacksize = len;
> } else if (VALID_SIZE(thread_union) &&
> - ((len = SIZE(thread_union)) != STACKSIZE()))
> + ((len = SIZE(thread_union)) != STACKSIZE())) {
> machdep->stacksize = len;
> + } else {
> + /*
> + * Post kernel commit 0500871f21b2, init_thread_union size
> + * became zero. Use __end_init_task - __start_init_task.
> + */
> + if (kernel_symbol_exists("__start_init_task") &&
> + kernel_symbol_exists("__end_init_task")) {
> + len = symbol_value("__end_init_task");
> + len -= symbol_value("__start_init_task");
> + ASSIGN_SIZE(thread_union) = len;
> + machdep->stacksize = len;
> + }
> + }
>
> MEMBER_OFFSET_INIT(pid_namespace_idr, "pid_namespace", "idr");
> MEMBER_OFFSET_INIT(idr_idr_rt, "idr", "idr_rt");
>
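
The fallback in the patch above boils down to subtracting two linker symbols. A standalone sketch of that computation follows, assuming a small hypothetical symbol table (`struct ksym`, `ksym_find()`) in place of crash's kernel_symbol_exists() and symbol_value(); the addresses used in the usage note are made up.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical symbol-table entry; crash itself uses symbol_value(). */
struct ksym {
    const char *name;
    unsigned long long value;
};

/* Return a pointer to the named symbol, or NULL if it is absent. */
static const struct ksym *ksym_find(const struct ksym *tab, size_t n,
                                    const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0)
            return &tab[i];
    return NULL;
}

/*
 * Derive the stack size from the two symbols that bracket the init task's
 * stack. Returns 0 when either symbol is missing or the range is empty,
 * so the caller can keep its previous default.
 */
static unsigned long long stack_size_from_symbols(const struct ksym *tab,
                                                  size_t n)
{
    const struct ksym *start = ksym_find(tab, n, "__start_init_task");
    const struct ksym *end = ksym_find(tab, n, "__end_init_task");

    if (!start || !end || end->value <= start->value)
        return 0;
    return end->value - start->value;
}
```

For example, a table holding `__start_init_task` at 0xffffffff82000000 and `__end_init_task` at 0xffffffff82004000 (both invented here) yields 0x4000, i.e. a 16 KB stack.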
Can't read stack contents from qemu dump
by Nikolay Borisov
Hello,
I tried running crash-head (HEAD: 5d172b230cf4) against today's Linus
master on a dump obtained via dump-guest-memory in qemu, and I got the
following when the image was loaded:
please wait... (determining panic task)
bt: read error: kernel virtual address: fffffe0000007000 type: "stack contents"
KERNEL: vmlinux
DUMPFILE: memory-verbatim.img
CPUS: 1
DATE: Wed Apr 4 16:36:47 2018
UPTIME: 00:27:48
LOAD AVERAGE: 31.11, 17.80, 10.43
TASKS: 145
NODENAME: ubuntu-virtual
RELEASE: 4.16.0-rc7-nbor
VERSION: #570 SMP Wed Apr 4 16:03:44 EEST 2018
MACHINE: x86_64 (3392 Mhz)
MEMORY: 4 GB
PANIC: ""
PID: 0
COMMAND: "swapper/0"
TASK: ffffffff82016500 [THREAD_INFO: ffffffff82016500]
CPU: 0
STATE: TASK_RUNNING
WARNING: panic task not found
crash> bt
PID: 0 TASK: ffffffff82016500 CPU: 0 COMMAND: "swapper/0"
#0 [ffffffff82003dc8] __schedule at ffffffff817ea059
bt: invalid RSP: ffffffff82003dc8 bt->stackbase/stacktop: ffffffff82000000/ffffffff82002000 cpu: 0
The kernel was compiled with gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0
20160609, which has retpoline enabled.
I have KASLR disabled (# CONFIG_RANDOMIZE_BASE is not set), and the kernel
is compiled with CONFIG_FRAME_POINTER=y.
This scenario used to work around the 4.10 timeframe. Am I doing
something wrong, or does crash still need time to catch up with the latest
upstream kernel code?
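
The "invalid RSP" message in the report above can be checked by hand: the printed stackbase and stacktop are 0x2000 bytes apart, and the RSP sits 0x3dc8 bytes above the base, so it falls outside that range. The sketch below shows the range test. `rsp_in_stack()` is a hypothetical helper, and the 16 KB size mentioned in the note afterwards is an assumption, not something the report states.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when rsp lies inside [stackbase, stackbase + stacksize). */
static bool rsp_in_stack(uint64_t rsp, uint64_t stackbase, uint64_t stacksize)
{
    return rsp >= stackbase && rsp - stackbase < stacksize;
}
```

With the reported values, `rsp_in_stack(0xffffffff82003dc8, 0xffffffff82000000, 0x2000)` is false, while the same call with an assumed size of 0x4000 is true. That is consistent with the stack size having been underestimated rather than the stack contents being corrupt.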