[ANNOUNCE] crash version 5.1.4 is available
by Dave Anderson
- Fix for RT kernels in which the schedule() function has become a
wrapper function that calls the __schedule() function, and where
other functions may call __schedule() directly. Without the patch,
a warning message indicating "crash: cannot determine thread return
address" is displayed during invocation on x86_64 machines, and
backtraces of blocked tasks may have missing or invalid frames.
(anderson(a)redhat.com)
- Fix for running against live x86 kernels that were configured with
CONFIG_PHYSICAL_START containing a value that is greater than its
CONFIG_PHYSICAL_ALIGN value, and where the first symbol listed by
/proc/kallsyms is not "_text". Without the patch, the crash session
fails during invocation with the error message "crash: vmlinux and
/dev/mem do not match!" (or "/dev/crash" if applicable). As a
workaround, "/proc/kallsyms" can be entered on the command line, or
the "--reloc=<size>" option can be used, but the fix obviates that
requirement for live systems. Note that dumpfiles of kernels
configured that way still require "/proc/kallsyms", or a copy of it,
or alternatively the "--reloc=<size>" option, to be entered on the
command line, as detailed in this changelog entry:
http://people.redhat.com/anderson/crash.changelog.html#4_0_4_5
(anderson(a)redhat.com)
- Unlike other extension modules, the "sial.so" module must be built
within a pre-built crash source tree because it uses header files
from the embedded gdb module. Therefore, if a crash source tree is
laid down and "make extensions" is entered without first building
the crash utility, the sial.so build spews numerous error messages.
To avoid that, the sial.mk file has been modified to check whether
the embedded gdb build has been completed, and if it has not, to
display only "sial.so: build failed: requires the crash gdb-7.0
module".
(anderson(a)redhat.com)
- If an extension module does not have its own <module>.mk file,
and is built using the extensions/Makefile, then it will be compiled
with the -Wall flag.
(anderson(a)redhat.com)
- The "trace.so" extension module has been improved to use "trace-cmd"
to implement the "trace show" option, instead of maintaining a
redundant code base within the module itself. The trace-cmd command
is better, more mature, and continually maintained. The new "trace
show" option works as follows:
(1) builds trace.dat from the core file and dumps it to /tmp.
(2) execs "trace-cmd report" upon the trace.dat file.
(3) splices the output of trace-cmd to the user and unlinks the
temporary file.
(laijs(a)cn.fujitsu.com)
- Updates to the "trace.so" extension module to extract trace_bprintk()
formats from a kernel core dump. It handles both the current format
and a new format that will be pushed out after the merge window has
closed for Linux 2.6.40. The new format is required for the kernel
debugfs to export the same bprintk data as well. This means that the
trace.so extension module will be able to extract more information
than trace-cmd itself can on a running kernel.
(rostedt(a)goodmis.org)
- Fix for the "gdb" command, or any command that resolves to a gdb
command, to not strip quotation marks from the input line. Without
the patch, any gdb command whose arguments contain quotation marks
(e.g. "printf") would fail because the quotation marks were
incorrectly stripped from the input line.
(anderson(a)redhat.com)
- Fix for the "p" command if its symbolic argument is a "char *" that
points to a static data string containing a "%" character. Without
the patch, the command results in a segmentation violation.
(anderson(a)redhat.com)
- Fix for the "sys -c" option to display an error message if a known
sys_call_table entry is not a valid system call address. Without
the patch, the compromised system call entry is not displayed unless
the crash debug mode is set to 1 or greater. With the patch, the
system call number will be followed by an error message indicating
"invalid sys_call_table entry: <address> (<symbol-name>)". This
change is only applicable on architectures/kernels where the index of
the sys_call_table array can be confirmed by debuginfo data, i.e.,
is not a loose calculation based upon the next kernel symbol.
(anderson(a)redhat.com)
- Print a warning message if there is any inconsistency between the
kernel version strings found in the vmlinux file vs. the dumpfile
or live memory. If a System.map file is used to correct the virtual
addresses found in the vmlinux file, the message is not displayed.
(anderson(a)redhat.com)
- Fix for "kmem -v", and all other commands that search through the
kernel's mapped virtual address list, in x86_64 kernel versions from
2.6.0 to 2.6.11. Those kernels contained a "vmlist" and a separate
"mod_vmlist" list header, each of which pointed to a list of
vm_structs that described each contiguous block of mapped kernel
memory. 2.6.12 and later x86_64 kernels consolidated both lists onto
the "vmlist". Without the patch, the list headed by "mod_vmlist" was
not searched.
(anderson(a)redhat.com)
- Clarify the help page documentation for the "struct -l offset" option
so that it does not imply that the address argument is necessarily
an embedded list_head pointer. The "-l offset" option essentially
provides the capability of the kernel's container_of() macro, such
that the address of an embedded data structure can be used to display
its containing data structure.
(anderson(a)redhat.com)
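As an illustration of the container_of() behavior mentioned in the
"struct -l offset" entry above, here is a minimal user-space sketch.
The macro is a simplified form of the kernel's, and the structure
names (list_node, demo_task) and the helper task_from_node() are
hypothetical, invented only for this example. The offset given to
"-l" plays the same role as offsetof(type, member) does here.

```c
#include <stddef.h>

/*
 * Simplified container_of(): step back from the address of an embedded
 * member to the start of the structure that contains it.
 */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical embedded structure (need not be a list_head). */
struct list_node {
	struct list_node *next, *prev;
};

/* Hypothetical containing structure. */
struct demo_task {
	int pid;
	char comm[16];
	struct list_node tasks;		/* embedded member */
};

/* Given the address of the embedded member, return its container. */
static struct demo_task *task_from_node(struct list_node *node)
{
	return container_of(node, struct demo_task, tasks);
}
```

Passing the address of any demo_task's "tasks" member to
task_from_node() yields the address of that demo_task itself, which
is the relationship the "-l offset" option relies upon.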
Download from: http://people.redhat.com/anderson
[PATCH] trace: Improve "trace show" command
by Lai Jiangshan
The code was also applied to:
git://github.com/laijs/tracing-extension-module-for-crash.git
Documentation and man pages may be added within two weeks.
Dave Anderson, could you add a "Requires" entry to its RPM spec file?
The module requires the trace-cmd RPM once this patch is applied.
Thanks,
Lai
Subject: [PATCH] trace: Improve "trace show" command
Use trace-cmd to implement the "trace show" command and remove
the related code. It is not a good idea to maintain another
set of code with the same functionality. trace-cmd is better
and more mature.
How the new "trace show" works:
1) build trace.dat from the core file and dump it to /tmp.
2) exec "trace-cmd report" upon the trace.dat file.
3) splice the output of trace-cmd to the user and unlink the temp file.
Signed-off-by: Lai Jiangshan <laijs(a)cn.fujitsu.com>
---
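For readers who want the three steps above in isolation, here is a
minimal stand-alone sketch. It is not the code from this patch: the
names write_trace_data() and show_via_helper() are hypothetical, the
payload is a placeholder rather than real trace.dat content, and the
temp file is created with mkstemp() where the patch itself calls
mktemp() followed by open(). The reporter command is a parameter, so
"trace-cmd report" would be passed in real use, while any command that
takes a file argument (such as "cat") can be used to try the sketch.

```c
/* Ask for POSIX.1-2008 declarations (mkstemp, popen) in strict ISO C modes. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-in for the module's trace.dat writer. */
static int write_trace_data(int fd)
{
	static const char payload[] = "placeholder trace data\n";
	ssize_t n = write(fd, payload, sizeof(payload) - 1);

	return n == (ssize_t)(sizeof(payload) - 1) ? 0 : -1;
}

/*
 * Step 1: create a temp file and fill it.
 * Step 2: run "<reporter> <tmpfile>".
 * Step 3: copy the command's output to 'out', then unlink the temp file.
 * Returns the number of bytes copied, or -1 on failure.
 */
static long show_via_helper(const char *reporter, FILE *out)
{
	char tmp[] = "/tmp/demo.trace_dat.XXXXXX";
	char cmd[4096];
	char buf[4096];
	FILE *child;
	size_t n;
	long total = -1;
	int fd;

	fd = mkstemp(tmp);	/* creates and opens the file in one step */
	if (fd < 0)
		return -1;

	if (write_trace_data(fd) < 0)
		goto out;

	snprintf(cmd, sizeof(cmd), "%s %s", reporter, tmp);
	child = popen(cmd, "r");
	if (!child)
		goto out;

	total = 0;
	while ((n = fread(buf, 1, sizeof(buf), child)) > 0) {
		fwrite(buf, 1, n, out);
		total += (long)n;
	}
	pclose(child);
out:
	close(fd);
	unlink(tmp);
	return total;
}
```

Calling show_via_helper("cat", stdout) echoes the placeholder payload,
and the temp file is removed on every path once it has been created.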
trace.c | 1566 ++--------------------------------------------------------------
1 file changed, 50 insertions(+), 1516 deletions(-)
diff --git a/extensions/trace.c b/extensions/trace.c
index 714765e..057459b 100755
--- a/extensions/trace.c
+++ b/extensions/trace.c
@@ -96,8 +96,6 @@ static const char *current_tracer_name;
static void ftrace_destroy_event_types(void);
static int ftrace_init_event_types(void);
-static int ftrace_show_init(void);
-static void ftrace_show_destroy(void);
/* at = ((struct *)ptr)->member */
#define read_value(at, ptr, struct, member) \
@@ -489,13 +487,8 @@ static int ftrace_init(void)
if (ftrace_init_current_tracer() < 0)
goto out_2;
- if (ftrace_show_init() < 0)
- goto out_3;
-
return 0;
-out_3:
- free(current_tracer_name);
out_2:
ftrace_destroy_event_types();
out_1:
@@ -511,7 +504,6 @@ out_0:
static void ftrace_destroy(void)
{
- ftrace_show_destroy();
free(current_tracer_name);
ftrace_destroy_event_types();
@@ -611,37 +603,21 @@ out_fail:
return -1;
}
-typedef uint64_t u64;
-typedef int64_t s64;
-typedef uint32_t u32;
-
#define MAX_CACHE_ID 256
-struct ftrace_field;
-typedef u64 (*access_op)(struct ftrace_field *aop, void *data);
-static void ftrace_field_access_init(struct ftrace_field *f);
-
struct ftrace_field {
const char *name;
const char *type;
- access_op op;
int offset;
int size;
int is_signed;
};
-struct event_type;
-struct format_context;
-typedef void (*event_printer)(struct event_type *t, struct format_context *fc);
-
- /* SIGH, we cann't get "print fmt" from core-file */
-
struct event_type {
struct event_type *next;
const char *system;
const char *name;
int plugin;
- event_printer printer;
const char *print_fmt;
int id;
int nfields;
@@ -655,16 +631,6 @@ static int nr_event_types;
static struct ftrace_field *ftrace_common_fields;
static int ftrace_common_fields_count;
-/*
- * TODO: implement event_generic_print_fmt_print() when the print fmt
- * in tracing/events/$SYSTEM/$TRACE/format becomes a will-defined
- * language.
- */
-static void event_generic_print_fmt_print(struct event_type *t,
- struct format_context *fc);
-static void event_default_print(struct event_type *t,
- struct format_context *fc);
-
static int syscall_get_enter_fields(ulong call, ulong *fields)
{
static int inited;
@@ -887,7 +853,6 @@ static int ftrace_init_event_fields(ulong fields_head, int *pnfields,
goto out_fail;
}
- ftrace_field_access_init(&fields[nfields]);
nfields++;
/* Advance to the next field */
@@ -1186,7 +1151,6 @@ static int ftrace_init_event_types(void)
aevent_type->plugin = 1;
else
aevent_type->plugin = 0;
- aevent_type->printer = event_default_print;
/* Add a event type */
event_types[nr_event_types++] = aevent_type;
@@ -1212,57 +1176,6 @@ out_fail:
return -1;
}
-static
-struct ftrace_field *find_event_field(struct event_type *t, const char *name)
-{
- int i;
- struct ftrace_field *f;
-
- for (i = 0; i < ftrace_common_fields_count; i++) {
- f = ftrace_common_fields + i;
- if (!strcmp(name, f->name))
- return f;
- }
-
- for (i = 0; i < t->nfields; i++) {
- f = &t->fields[i];
- if (!strcmp(name, f->name))
- return f;
- }
-
- return NULL;
-}
-
-static struct event_type *find_event_type(int id)
-{
- int i;
-
- if ((unsigned int)id < MAX_CACHE_ID)
- return event_type_cache[id];
-
- for (i = 0; i < nr_event_types; i++) {
- if (event_types[i]->id == id)
- return event_types[i];
- }
-
- return NULL;
-}
-
-static
-struct event_type *find_event_type_by_name(const char *system, const char *name)
-{
- int i;
-
- for (i = 0; i < nr_event_types; i++) {
- if (system && strcmp(system, event_types[i]->system))
- continue;
- if (!strcmp(name, event_types[i]->name))
- return event_types[i];
- }
-
- return NULL;
-}
-
#define default_common_field_count 5
static int ftrace_dump_event_type(struct event_type *t, const char *path)
@@ -1364,381 +1277,6 @@ static int ftrace_dump_event_types(const char *events_path)
return 0;
}
-struct ring_buffer_per_cpu_stream {
- struct ring_buffer_per_cpu *cpu_buffer;
- void *curr_page;
- int curr_page_indx;
-
- uint64_t ts;
- uint32_t *offset;
- uint32_t *commit;
-};
-
-static
-int ring_buffer_per_cpu_stream_init(struct ring_buffer_per_cpu *cpu_buffer,
- unsigned pages, struct ring_buffer_per_cpu_stream *s)
-{
- s->cpu_buffer = cpu_buffer;
- s->curr_page = malloc(PAGESIZE());
- if (s->curr_page == NULL)
- return -1;
-
- s->curr_page_indx = -1;
- return 0;
-}
-
-static
-void ring_buffer_per_cpu_stream_destroy(struct ring_buffer_per_cpu_stream *s)
-{
- free(s->curr_page);
-}
-
-struct ftrace_event {
- uint64_t ts;
- int length;
- void *data;
-};
-
-struct event {
- u32 type_len:5, time_delta:27;
-};
-
-#define RINGBUF_TYPE_PADDING 29
-#define RINGBUF_TYPE_TIME_EXTEND 30
-#define RINGBUF_TYPE_TIME_STAMP 31
-#define RINGBUF_TYPE_DATA 0 ... 28
-
-#define sizeof_local_t (sizeof(ulong))
-#define PAGE_HEADER_LEN (8 + sizeof_local_t)
-
-static
-int ring_buffer_per_cpu_stream_get_page(struct ring_buffer_per_cpu_stream *s)
-{
- ulong raw_page;
-
- read_value(raw_page, s->cpu_buffer->linear_pages[s->curr_page_indx],
- buffer_page, page);
-
- if (!readmem(raw_page, KVADDR, s->curr_page, PAGESIZE(),
- "get page context", RETURN_ON_ERROR))
- return -1;
-
- s->ts = *(u64 *)s->curr_page;
- s->offset = s->curr_page + PAGE_HEADER_LEN;
- s->commit = s->offset + *(ulong *)(s->curr_page + 8) / 4;
-
- return 0;
-
-out_fail:
- return -1;
-}
-
-static
-int ring_buffer_per_cpu_stream_pop_event(struct ring_buffer_per_cpu_stream *s,
- struct ftrace_event *res)
-{
- struct event *event;
-
- res->data = NULL;
-
- if (s->curr_page_indx >= s->cpu_buffer->nr_linear_pages)
- return -1;
-
-again:
- if ((s->curr_page_indx == -1) || (s->offset >= s->commit)) {
- s->curr_page_indx++;
-
- if (s->curr_page_indx == s->cpu_buffer->nr_linear_pages)
- return -1;
-
- if (ring_buffer_per_cpu_stream_get_page(s) < 0) {
- s->curr_page_indx = s->cpu_buffer->nr_linear_pages;
- return -1;
- }
-
- if (s->offset >= s->commit)
- goto again;
- }
-
- event = (void *)s->offset;
-
- switch (event->type_len) {
- case RINGBUF_TYPE_PADDING:
- if (event->time_delta)
- s->offset += 1 + ((*(s->offset + 1) + 3) / 4);
- else
- s->offset = s->commit;
- goto again;
-
- case RINGBUF_TYPE_TIME_EXTEND:
- s->ts +=event->time_delta;
- s->ts += ((u64)*(s->offset + 1)) << 27;
- s->offset += 2;
- goto again;
-
- case RINGBUF_TYPE_TIME_STAMP:
- /* FIXME: not implemented */
- s->offset += 4;
- goto again;
-
- case RINGBUF_TYPE_DATA:
- if (!event->type_len) {
- res->data = s->offset + 2;
- res->length = *(s->offset + 1) - 4;
-
- s->offset += 1 + ((*(s->offset + 1) + 3) / 4);
- } else {
- res->data = s->offset + 1;
- res->length = event->type_len * 4;
-
- s->offset += 1 + event->type_len;
- }
-
- if (s->offset > s->commit) {
- fprintf(fp, "corrupt\n");
- res->data = NULL;
- goto again;
- }
-
- s->ts += event->time_delta;
- res->ts = s->ts;
-
- return 0;
-
- default:;
- }
-
- return -1;
-}
-
-struct ring_buffer_stream {
- struct ring_buffer_per_cpu_stream *ss;
- struct ftrace_event *es;
- u64 ts;
- int popped_cpu;
- int pushed;
-};
-
-static void __rbs_destroy(struct ring_buffer_stream *s, int *cpulist, int nr)
-{
- int cpu;
-
- for (cpu = 0; cpu < nr; cpu++) {
- if (!s->ss[cpu].cpu_buffer)
- continue;
-
- ring_buffer_per_cpu_stream_destroy(s->ss + cpu);
- }
-
- free(s->ss);
- free(s->es);
-}
-
-static
-int ring_buffer_stream_init(struct ring_buffer_stream *s, int *cpulist)
-{
- int cpu;
-
- s->ss = malloc(sizeof(*s->ss) * nr_cpu_ids);
- if (s->ss == NULL)
- return -1;
-
- s->es = malloc(sizeof(*s->es) * nr_cpu_ids);
- if (s->es == NULL) {
- free(s->ss);
- return -1;
- }
-
- for (cpu = 0; cpu < nr_cpu_ids; cpu++) {
- s->ss[cpu].cpu_buffer = NULL;
- s->es[cpu].data = NULL;
-
- if (!global_buffers[cpu].kaddr)
- continue;
-
- if (cpulist && !cpulist[cpu])
- continue;
-
- if (ring_buffer_per_cpu_stream_init(global_buffers + cpu,
- global_pages, s->ss + cpu) < 0) {
- __rbs_destroy(s, cpulist, cpu);
- return -1;
- }
- }
-
- s->ts = 0;
- s->popped_cpu = nr_cpu_ids;
- s->pushed = 0;
-
- return 0;
-}
-
-static
-void ring_buffer_stream_destroy(struct ring_buffer_stream *s, int *cpulist)
-{
- __rbs_destroy(s, cpulist, nr_cpu_ids);
-}
-
-/* make current event be returned again at next pop */
-static void ring_buffer_stream_push_current_event(struct ring_buffer_stream *s)
-{
- if ((s->popped_cpu < 0) || (s->popped_cpu == nr_cpu_ids))
- return;
-
- s->pushed = 1;
-}
-
-/* return the cpu# of this event */
-static int ring_buffer_stream_pop_event(struct ring_buffer_stream *s,
- struct ftrace_event *res)
-{
- int cpu, min_cpu = -1;
- u64 ts, min_ts;
-
- res->data = NULL;
-
- if (s->popped_cpu < 0)
- return -1;
-
- if (s->popped_cpu == nr_cpu_ids) {
- for (cpu = 0; cpu < nr_cpu_ids; cpu++) {
- if (!s->ss[cpu].cpu_buffer)
- continue;
-
- ring_buffer_per_cpu_stream_pop_event(s->ss + cpu,
- s->es + cpu);
-
- if (s->es[cpu].data == NULL)
- continue;
-
- /*
- * We do not have start point of time,
- * determine the min_ts with heuristic way.
- */
- ts = s->es[cpu].ts;
- if (min_cpu < 0 || (s64)(ts - min_ts) < 0) {
- min_ts = ts;
- min_cpu = cpu;
- }
- }
-
- s->pushed = 0;
- goto done;
- }
-
- if (s->pushed) {
- s->pushed = 0;
- *res = s->es[s->popped_cpu];
- return s->popped_cpu;
- }
-
- ring_buffer_per_cpu_stream_pop_event(&s->ss[s->popped_cpu],
- &s->es[s->popped_cpu]);
-
- for (cpu = 0; cpu < nr_cpu_ids; cpu++) {
- if (s->es[cpu].data == NULL)
- continue;
-
- /* we have start point of time(s->ts) */
- ts = s->es[cpu].ts - s->ts;
- if (min_cpu < 0 || ts < min_ts) {
- min_ts = ts;
- min_cpu = cpu;
- }
- }
-
-done:
- s->popped_cpu = min_cpu;
-
- if (min_cpu < 0)
- return -1;
-
- s->ts = s->es[min_cpu].ts;
- *res = s->es[min_cpu];
-
- return min_cpu;
-}
-
-static u64 access_error(struct ftrace_field *f, void *data)
-{
- return 0;
-}
-
-static u64 access_8(struct ftrace_field *f, void *data)
-{
- return *(int8_t *)(data + f->offset);
-}
-
-static u64 access_16(struct ftrace_field *f, void *data)
-{
- return *(int16_t *)(data + f->offset);
-}
-
-static u64 access_32(struct ftrace_field *f, void *data)
-{
- return *(int32_t *)(data + f->offset);
-}
-
-static u64 access_64(struct ftrace_field *f, void *data)
-{
- return *(int64_t *)(data + f->offset);
-}
-
-static u64 access_string_local(struct ftrace_field *f, void *data)
-{
- int offset;
-
- if (f->size == 2)
- offset = *(int16_t *)(data + f->offset);
- else
- offset = *(int32_t *)(data + f->offset) & 0xFFFF;
-
- return (long)(data + offset);
-}
-
-static u64 access_string(struct ftrace_field *f, void *data)
-{
- return (long)(data + f->offset);
-}
-
-static u64 access_other_local(struct ftrace_field *f, void *data)
-{
- return access_string_local(f, data);
-}
-
-static u64 access_other(struct ftrace_field *f, void *data)
-{
- return (long)(data + f->offset);
-}
-
-static void ftrace_field_access_init(struct ftrace_field *f)
-{
- /* guess whether it is string array */
- if (!strncmp(f->type, "__data_loc", sizeof("__data_loc") - 1)) {
- if (f->size != 2 && f->size != 4) {
- /* kernel side may be changed, need fix here */
- f->op = access_error;
- } else if (strstr(f->type, "char")) {
- f->op = access_string_local;
- } else {
- f->op = access_other_local;
- }
- } else if (strchr(f->type, '[')) {
- if (strstr(f->type, "char"))
- f->op = access_string;
- else
- f->op = access_other;
- } else {
- switch (f->size) {
- case 1: f->op = access_8; break;
- case 2: f->op = access_16; break;
- case 4: f->op = access_32; break;
- case 8: f->op = access_64; break;
- default: f->op = access_other; break;
- }
- }
-}
-
static void show_basic_info(void)
{
fprintf(fp, "current tracer is %s\n", current_tracer_name);
@@ -1881,334 +1419,58 @@ static void ftrace_dump(int argc, char *argv[])
}
}
-static char show_event_buf[4096];
-static int show_event_pos;
-
-#define INVALID_ACCESS_FIELD 1
-static jmp_buf show_event_env;
-
-struct format_context {
- struct ring_buffer_stream stream;
- struct ftrace_event event;
- int cpu;
-};
-
-static struct format_context format_context;
-
-/* !!!! @event_type and @field_name should be const for every call */
-#define access_field(event_type, data, field_name) \
-({ \
- static struct ftrace_field *__access_field##_field; \
- \
- if (__access_field##_field == NULL) { \
- __access_field##_field = find_event_field(event_type, \
- field_name); \
- } \
- \
- if (__access_field##_field == NULL) { \
- event_type->printer = event_default_print; \
- ring_buffer_stream_push_current_event(&format_context.stream);\
- longjmp(show_event_env, INVALID_ACCESS_FIELD); \
- } \
- \
- __access_field##_field->op(__access_field##_field, data); \
-})
-
-static int ftrace_event_get_id(void *data)
-{
- return access_field(event_types[0], data, "common_type");
-}
-
-static int ftrace_event_get_pid(void *data)
-{
- return access_field(event_types[0], data, "common_pid");
-}
-
-#define event_printf(fmt, args...) \
-do { \
- show_event_pos += snprintf(show_event_buf + show_event_pos, \
- sizeof(show_event_buf) - show_event_pos, \
- fmt, ##args); \
-} while (0)
-
-
-static void event_field_print(struct ftrace_field *f, void *data)
+static void ftrace_show(int argc, char *argv[])
{
- u64 value = f->op(f, data);
+ char buf[4096];
+ char tmp[] = "/tmp/crash.trace_dat.XXXXXX";
+ char *trace_cmd = "trace-cmd", *env_trace_cmd = getenv("TRACE_CMD");
+ int fd;
+ FILE *file;
+ size_t ret;
- if (f->op == access_error) {
- event_printf("<Error>");
- } else if (f->op == access_8) {
- if (f->is_signed)
- event_printf("%d", (int8_t)value);
- else
- event_printf("%u", (uint8_t)value);
- } else if (f->op == access_16) {
- if (f->is_signed)
- event_printf("%d", (int16_t)value);
- else
- event_printf("%u", (uint16_t)value);
- } else if (f->op == access_32) {
- if (f->is_signed)
- event_printf("%d", (int32_t)value);
- else
- event_printf("%u", (uint32_t)value);
- } else if (f->op == access_64) {
- if (f->is_signed)
- event_printf("%lld", (long long)value);
+ /* check trace-cmd */
+ if (env_trace_cmd)
+ trace_cmd = env_trace_cmd;
+ if (!(file = popen(trace_cmd, "r")))
+ return;
+ ret = fread(buf, 1, sizeof(buf), file);
+ buf[ret < sizeof(buf) ? ret : sizeof(buf) - 1] = 0;
+ if (!strstr(buf, "trace-cmd version")) {
+ if (env_trace_cmd)
+ fprintf(fp, "Invalid environment TRACE_CMD: %s\n",
+ env_trace_cmd);
else
- event_printf("%llu", (unsigned long long)value);
- } else if (f->op == access_string_local) {
- int size = 0;
+ fprintf(fp, "\"trace show\" requires trace-cmd.\n"
+ "please set the environment TRACE_CMD "
+ "if you installed it in a special path\n"
+ );
+ return;
+ }
- if (f->size == 4)
- size = *(int32_t *)(data + f->offset) >> 16;
+ /* dump trace.dat to the temp file */
+ mktemp(tmp);
+ fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
+ if (trace_cmd_data_output(fd) < 0)
+ goto out;
- if (size)
- event_printf("%.*s", size, (char *)(long)value);
- else
- event_printf("%s", (char *)(long)value);
- } else if (f->op == access_string) {
- event_printf("%.*s", f->size, (char *)(long)value);
- } else if (f->op == access_other) {
- /* TODO */
- } else if (f->op == access_other_local) {
- /* TODO */
- } else {
- /* TODO */
+ /* splice the output of trace-cmd to user */
+ snprintf(buf, sizeof(buf), "%s report %s", trace_cmd, tmp);
+ if (!(file = popen(buf, "r")))
+ goto out;
+ for (;;) {
+ ret = fread(buf, 1, sizeof(buf), file);
+ if (ret == 0)
+ break;
+ fwrite(buf, 1, ret, fp);
}
+ pclose(file);
+out:
+ close(fd);
+ unlink(tmp);
+ return;
}
-static void get_comm_from_pid(int pid, char *comm)
-{
- int li, hi;
- struct task_context *tc;
-
- if (pid == 0) {
- strcpy(comm, "<swapper>");
- return;
- }
-
- tc = FIRST_CONTEXT();
-
- li = 0;
- hi = RUNNING_TASKS();
- while (li < hi) {
- int mid = (li + hi) / 2;
-
- if (tc[mid].pid > pid)
- hi = mid;
- else if (tc[mid].pid < pid)
- li = mid + 1;
- else {
- strcpy(comm, tc[mid].comm);
- return;
- }
- }
-
- strcpy(comm, "<...>");
-}
-
-static void event_context_print(struct event_type *t, struct format_context *fc)
-{
- u64 time = fc->event.ts / 1000;
- ulong sec = time / 1000000;
- ulong usec = time % 1000000;
- int pid = ftrace_event_get_pid(fc->event.data);
- char comm[20];
-
- get_comm_from_pid(pid, comm);
- event_printf("%16s-%-5d [%03d] %5lu.%06lu: ",
- comm, pid, fc->cpu, sec, usec);
-}
-
-static int gopt_context_info;
-static int gopt_sym_offset;
-static int gopt_sym_addr;
-
-static int gopt_graph_print_duration;
-static int gopt_graph_print_overhead;
-static int gopt_graph_print_abstime;
-static int gopt_graph_print_cpu;
-static int gopt_graph_print_proc;
-static int gopt_graph_print_overrun;
-
-static void set_all_flags_default(void)
-{
- gopt_context_info = 1;
- gopt_sym_offset = 0;
- gopt_sym_addr = 0;
-
- gopt_graph_print_duration = 1;
- gopt_graph_print_overhead = 1;
- gopt_graph_print_abstime = 0;
- gopt_graph_print_cpu = 1;
- gopt_graph_print_proc = 0;
- gopt_graph_print_overrun = 0;
-}
-
-static void set_clear_flag(const char *flag_name, int set)
-{
- if (!strcmp(flag_name, "context_info"))
- gopt_context_info = set;
- else if (!strcmp(flag_name, "sym_offset"))
- gopt_sym_offset = set;
- else if (!strcmp(flag_name, "sym_addr"))
- gopt_sym_addr = set;
- else if (!strcmp(flag_name, "graph_print_duration"))
- gopt_graph_print_duration = set;
- else if (!strcmp(flag_name, "graph_print_overhead"))
- gopt_graph_print_overhead = set;
- else if (!strcmp(flag_name, "graph_print_abstime"))
- gopt_graph_print_abstime = set;
- else if (!strcmp(flag_name, "graph_print_cpu"))
- gopt_graph_print_cpu = set;
- else if (!strcmp(flag_name, "graph_print_proc"))
- gopt_graph_print_proc = set;
- else if (!strcmp(flag_name, "graph_print_overrun"))
- gopt_graph_print_overrun = set;
- /* invalid flage_name is omitted. */
-}
-
-static int tracer_no_event_context;
-
-static void ftrace_show_function_graph_init(void);
-static void ftrace_show_function_init(void);
-static void ftrace_show_trace_event_init(void);
-
-static int ftrace_show_init(void)
-{
- /* ftrace_event_get_id(), ftrace_event_get_pid() should not failed. */
- if (find_event_field(event_types[0], "common_type") == NULL)
- return -1;
-
- if (find_event_field(event_types[0], "common_pid") == NULL)
- return -1;
-
- ftrace_show_function_init();
- ftrace_show_function_graph_init();
- ftrace_show_trace_event_init();
-
- return 0;
-}
-
-void show_event(struct format_context *fc)
-{
- struct event_type *etype;
- int id;
-
- id = ftrace_event_get_id(fc->event.data);
- etype = find_event_type(id);
-
- if (etype == NULL) {
- event_printf("<Unknown event type>\n");
- return;
- }
-
- if (!tracer_no_event_context && gopt_context_info)
- event_context_print(etype, fc);
- if (!etype->plugin)
- event_printf("%s: ", etype->name);
- etype->printer(etype, fc);
-}
-
-static int parse_cpulist(int *cpulist, const char *cpulist_str, int len)
-{
- unsigned a, b;
- const char *s = cpulist_str;
-
- memset(cpulist, 0, sizeof(int) * len);
-
- do {
- if (!isdigit(*s))
- return -1;
- b = a = strtoul(s, (char **)&s, 10);
- if (*s == '-') {
- s++;
- if (!isdigit(*s))
- return -1;
- b = strtoul(s, (char **)&s, 10);
- }
- if (!(a <= b))
- return -1;
- if (b >= len)
- return -1;
- while (a <= b) {
- cpulist[a] = 1;
- a++;
- }
- if (*s == ',')
- s++;
- } while (*s != '\0' && *s != '\n');
-
- return 0;
-}
-
-static void ftrace_show_function_graph_start(void);
-
-static void ftrace_show(int argc, char *argv[])
-{
- int c;
- int *cpulist = NULL;
-
- set_all_flags_default();
- ftrace_show_function_graph_start();
-
- while ((c = getopt(argc, argv, "f:c:")) != EOF) {
- switch(c)
- {
- case 'f':
- if (optarg[0] == 'n' && optarg[1] == 'o')
- set_clear_flag(optarg + 2, 0);
- else
- set_clear_flag(optarg, 1);
- break;
- case 'c':
- if (cpulist)
- goto err_arg;
-
- cpulist = malloc(sizeof(int) * nr_cpu_ids);
- if (cpulist == NULL) {
- error(INFO, "malloc() fail\n");
- return;
- }
-
- if (parse_cpulist(cpulist, optarg, nr_cpu_ids) < 0)
- goto err_arg;
- break;
- default:
- goto err_arg;
- }
- }
-
- if (argc - optind != 0) {
-err_arg:
- cmd_usage(pc->curcmd, SYNOPSIS);
- free(cpulist);
- return;
- }
-
- ring_buffer_stream_init(&format_context.stream, cpulist);
-
- /* Ignore setjmp()'s return value, no special things to do. */
- setjmp(show_event_env);
-
- for (;;) {
- show_event_pos = 0;
- format_context.cpu = ring_buffer_stream_pop_event(
- &format_context.stream, &format_context.event);
- if (format_context.cpu < 0)
- break;
-
- show_event(&format_context);
- fprintf(fp, "%s", show_event_buf);
- }
-
- ring_buffer_stream_destroy(&format_context.stream, cpulist);
- free(cpulist);
-}
-
-static void cmd_ftrace(void)
+static void cmd_ftrace(void)
{
if (argcnt == 1)
show_basic_info();
@@ -2216,729 +1478,12 @@ static void cmd_ftrace(void)
ftrace_dump(argcnt - 1, args + 1);
else if (!strcmp(args[1], "show"))
ftrace_show(argcnt - 1, args + 1);
+ else if (!strcmp(args[1], "report"))
+ ftrace_show(argcnt - 1, args + 1);
else
cmd_usage(pc->curcmd, SYNOPSIS);
}
-static void event_default_print(struct event_type *t, struct format_context *fc)
-{
- int i;
-
- /* Skip the common types */
- for (i = t->nfields - 6; i >= 0; i--) {
- struct ftrace_field *f;
-
- f = &t->fields[i];
- event_printf("%s=", f->name);
- event_field_print(f, fc->event.data);
- if (i)
- event_printf(", ");
- }
-
- event_printf("\n");
-}
-
-static void sym_print(ulong sym, int opt_offset)
-{
- if (!sym) {
- event_printf("0");
- } else {
- ulong offset;
- struct syment *se;
-
- se = value_search(sym, &offset);
- if (se) {
- event_printf("%s", se->name);
- if (opt_offset)
- event_printf("+%lu", offset);
- }
- }
-}
-
-static void event_fn_print(struct event_type *t, struct format_context *fc)
-{
- unsigned long ip = access_field(t, fc->event.data, "ip");
- unsigned long parent_ip = access_field(t, fc->event.data, "parent_ip");
-
- sym_print(ip, gopt_sym_offset);
- if (gopt_sym_addr)
- event_printf("<%lx>", ip);
-
- event_printf(" <-");
-
- sym_print(parent_ip, gopt_sym_offset);
- if (gopt_sym_addr)
- event_printf("<%lx>", parent_ip);
-
- event_printf("\n");
-}
-
-static void ftrace_show_function_init(void)
-{
- struct event_type *t = find_event_type_by_name("ftrace", "function");
-
- if (t)
- t->printer = event_fn_print;
-}
-
-#if 0
-/* simple */
-static void event_fn_entry_print(struct event_type *t, struct format_context *fc)
-{
- ulong func = access_field(t, fc->event.data, "graph_ent.func");
- int depth = access_field(t, fc->event.data, "graph_ent.depth");
-
- event_printf("%*s", depth, " ");
- sym_print(func, gopt_sym_offset);
- if (gopt_sym_addr)
- event_printf("<%lx>", func);
- event_printf("() {");
-}
-
-static void event_fn_return_print(struct event_type *t, struct format_context *fc)
-{
- ulong func = access_field(t, fc->event.data, "ret.func");
- u64 calltime = access_field(t, fc->event.data, "ret.calltime");
- u64 rettime = access_field(t, fc->event.data, "ret.rettime");
- int depth = access_field(t, fc->event.data, "ret.depth");
-
- event_printf("%*s} %lluns", depth, " ",
- (unsigned long long)(rettime - calltime));
-}
-
-static void ftrace_show_function_graph_init(void)
-{
- struct event_type *t1 = find_event_type_by_name(
- "ftrace", "funcgraph_entry");
- struct event_type *t2 = find_event_type_by_name(
- "ftrace", "funcgraph_exit");
-
- if (t1 == NULL || t2 == NULL)
- return;
-
- t1->printer = event_fn_entry_print;
- t2->printer = event_fn_return_print;
-}
-#endif
-
-
-#define TRACE_GRAPH_PROCINFO_LENGTH 14
-#define TRACE_GRAPH_INDENT 2
-
-static int max_bytes_for_cpu;
-static int *cpus_prev_pid;
-
-static int function_graph_entry_id;
-static int function_graph_return_id;
-static struct event_type *function_graph_entry_type;
-static struct event_type *function_graph_return_type;
-
-static void ftrace_show_function_graph_start(void)
-{
- int i;
-
- if (cpus_prev_pid == NULL)
- return;
-
- for (i = 0; i < nr_cpu_ids; i++)
- cpus_prev_pid[i] = -1;
-}
-
-static void fn_graph_proc_print(int pid)
-{
- int pid_strlen, comm_strlen;
- char pid_str[20];
- char comm[20] = "<...>";
-
- pid_strlen = sprintf(pid_str, "%d", pid);
- comm_strlen = TRACE_GRAPH_PROCINFO_LENGTH - 1 - pid_strlen;
-
- get_comm_from_pid(pid, comm);
- event_printf("%*.*s-%s", comm_strlen, comm_strlen, comm, pid_str);
-}
-
-/* If the pid changed since the last trace, output this event */
-static void fn_graph_proc_switch_print(int pid, int cpu)
-{
- int prev_pid = cpus_prev_pid[cpu];
-
- cpus_prev_pid[cpu] = pid;
- if ((prev_pid == pid) || (prev_pid == -1))
- return;
-
-/*
- * Context-switch trace line:
-
- ------------------------------------------
- | 1) migration/0--1 => sshd-1755
- ------------------------------------------
-
- */
-
- event_printf(" ------------------------------------------\n");
- event_printf(" %*d) ", max_bytes_for_cpu, cpu);
- fn_graph_proc_print(prev_pid);
- event_printf(" => ");
- fn_graph_proc_print(pid);
- event_printf("\n ------------------------------------------\n\n");
-}
-
-/* Signal a overhead of time execution to the output */
-static void fn_graph_overhead_print(unsigned long long duration)
-{
- const char *s = " ";
-
- /* If duration disappear, we don't need anything */
- if (!gopt_graph_print_duration)
- return;
-
- /* duration == -1 is for non nested entry or return */
- if ((duration != -1) && gopt_graph_print_overhead) {
- /* Duration exceeded 100 msecs */
- if (duration > 100000ULL)
- s = "! ";
- /* Duration exceeded 10 msecs */
- else if (duration > 10000ULL)
- s = "+ ";
- }
-
- event_printf(s);
-}
-
-static void fn_graph_abstime_print(u64 ts)
-{
- u64 time = ts / 1000;
- unsigned long sec = time / 1000000;
- unsigned long usec = time % 1000000;
-
- event_printf("%5lu.%06lu | ", sec, usec);
-}
-
-static void fn_graph_irq_print(int type)
-{
- /* TODO: implement it. */
-}
-
-static void fn_graph_duration_print(unsigned long long duration)
-{
- /* log10(ULONG_MAX) + '\0' */
- char msecs_str[21];
- char nsecs_str[5];
- int len;
- unsigned long nsecs_rem = duration % 1000;
-
- duration = duration / 1000;
- len = sprintf(msecs_str, "%lu", (unsigned long) duration);
-
- /* Print msecs */
- event_printf("%s", msecs_str);
-
- /* Print nsecs (we don't want to exceed 7 numbers) */
- if (len < 7) {
- snprintf(nsecs_str, 8 - len, "%03lu", nsecs_rem);
- event_printf(".%s", nsecs_str);
-
- len += strlen(nsecs_str);
- }
-
- if (len > 7)
- len = 7;
-
- event_printf(" us %*s| ", 7 - len, "");
-}
-
-/* Case of a leaf function on its call entry */
-static void fn_graph_entry_leaf_print(void *entry_data, void *exit_data)
-{
- struct event_type *t = function_graph_return_type;
-
- u64 calltime = access_field(t, exit_data, "ret.calltime");
- u64 rettime = access_field(t, exit_data, "ret.rettime");
- u64 duration = rettime - calltime;
- int depth = access_field(t, exit_data, "ret.depth");
- ulong func = access_field(t, exit_data, "ret.func");
-
- fn_graph_overhead_print(duration);
- if (gopt_graph_print_duration)
- fn_graph_duration_print(duration);
-
- event_printf("%*s", depth * TRACE_GRAPH_INDENT, "");
- sym_print(func, 0);
- event_printf("();\n");
-}
-
-static void fn_graph_entry_nested_print(struct event_type *t, void *data)
-{
- int depth = access_field(t, data, "graph_ent.depth");
- ulong func = access_field(t, data, "graph_ent.func");
-
- fn_graph_overhead_print(-1);
-
- /* No time */
- if (gopt_graph_print_duration)
- event_printf(" | ");
-
- event_printf("%*s", depth * TRACE_GRAPH_INDENT, "");
- sym_print(func, 0);
- event_printf("() {\n");
-}
-
-static void fn_graph_prologue_print(int cpu, u64 ts, int pid, int type)
-{
- fn_graph_proc_switch_print(pid, cpu);
-
- if (type)
- fn_graph_irq_print(type);
-
- if (gopt_graph_print_abstime)
- fn_graph_abstime_print(ts);
-
- if (gopt_graph_print_cpu)
- event_printf(" %*d) ", max_bytes_for_cpu, cpu);
-
- if (gopt_graph_print_proc) {
- fn_graph_proc_print(pid);
- event_printf(" | ");
- }
-}
-
-static void *get_return_for_leaf(struct event_type *t,
- struct format_context *fc, void *curr_data)
-{
- int cpu;
- struct ftrace_event next;
- ulong entry_func, exit_func;
-
- cpu = ring_buffer_stream_pop_event(&fc->stream, &next);
-
- if (cpu < 0)
- goto not_leaf;
-
- if (ftrace_event_get_id(next.data) != function_graph_return_id)
- goto not_leaf;
-
- if (ftrace_event_get_pid(curr_data) != ftrace_event_get_pid(next.data))
- goto not_leaf;
-
- entry_func = access_field(t, curr_data, "graph_ent.func");
- exit_func = access_field(function_graph_return_type, next.data,
- "ret.func");
-
- if (entry_func != exit_func)
- goto not_leaf;
-
- return next.data;
-
-not_leaf:
- ring_buffer_stream_push_current_event(&fc->stream);
- return NULL;
-}
-
-static
-void event_fn_entry_print(struct event_type *t, struct format_context *fc)
-{
- void *leaf_ret_data = NULL, *curr_data = fc->event.data, *data;
- int pid = ftrace_event_get_pid(curr_data);
-
- fn_graph_prologue_print(fc->cpu, fc->event.ts, pid, 1);
-
- data = alloca(fc->event.length);
- if (data) {
- memcpy(data, fc->event.data, fc->event.length);
- curr_data = data;
- leaf_ret_data = get_return_for_leaf(t, fc, curr_data);
- }
-
- if (leaf_ret_data)
- return fn_graph_entry_leaf_print(curr_data, leaf_ret_data);
- else
- return fn_graph_entry_nested_print(t, curr_data);
-}
-
-static
-void event_fn_return_print(struct event_type *t, struct format_context *fc)
-{
- void *data = fc->event.data;
- int pid = ftrace_event_get_pid(data);
-
- u64 calltime = access_field(t, data, "ret.calltime");
- u64 rettime = access_field(t, data, "ret.rettime");
- u64 duration = rettime - calltime;
- int depth = access_field(t, data, "ret.depth");
-
- fn_graph_prologue_print(fc->cpu, fc->event.ts, pid, 0);
- fn_graph_overhead_print(duration);
-
- if (gopt_graph_print_duration)
- fn_graph_duration_print(duration);
-
- event_printf("%*s}\n", depth * TRACE_GRAPH_INDENT, "");
-
- if (gopt_graph_print_overrun) {
- unsigned long overrun = access_field(t, data, "ret.overrun");
- event_printf(" (Overruns: %lu)\n", overrun);
- }
-
- fn_graph_irq_print(0);
-}
-
-static void ftrace_show_function_graph_init(void)
-{
- if (strcmp(current_tracer_name, "function_graph"))
- return;
-
- function_graph_entry_type = find_event_type_by_name(
- "ftrace", "funcgraph_entry");
- function_graph_return_type = find_event_type_by_name(
- "ftrace", "funcgraph_exit");
-
- if (!function_graph_entry_type || !function_graph_return_type)
- return;
-
- /*
- * Because of get_return_for_leaf(), the exception handling
- * of access_field() is not work for function_graph. So we need
- * to ensure access_field() will not failed for these fields.
- *
- * I know these will not failed. I just ensure it.
- */
-
- if (!find_event_field(function_graph_entry_type, "graph_ent.func"))
- return;
-
- if (!find_event_field(function_graph_entry_type, "graph_ent.depth"))
- return;
-
- if (!find_event_field(function_graph_return_type, "ret.func"))
- return;
-
- if (!find_event_field(function_graph_return_type, "ret.calltime"))
- return;
-
- if (!find_event_field(function_graph_return_type, "ret.rettime"))
- return;
-
- if (!find_event_field(function_graph_return_type, "ret.overrun"))
- return;
-
- if (!find_event_field(function_graph_return_type, "ret.depth"))
- return;
-
- cpus_prev_pid = malloc(sizeof(int) * nr_cpu_ids);
- if (!cpus_prev_pid)
- return;
-
- max_bytes_for_cpu = snprintf(NULL, 0, "%d", nr_cpu_ids - 1);
-
- function_graph_entry_id = function_graph_entry_type->id;
- function_graph_return_id = function_graph_return_type->id;
-
- /* OK, set the printer for function_graph. */
- tracer_no_event_context = 1;
- function_graph_entry_type->printer = event_fn_entry_print;
- function_graph_return_type->printer = event_fn_return_print;
-}
-
-static void event_sched_kthread_stop_print(struct event_type *t,
- struct format_context *fc)
-{
- event_printf("task %s:%d\n",
- (char *)(long)access_field(t, fc->event.data, "comm"),
- (int)access_field(t, fc->event.data, "pid"));
-}
-
-static void event_sched_kthread_stop_ret_print(struct event_type *t,
- struct format_context *fc)
-{
- event_printf("ret %d\n", (int)access_field(t, fc->event.data, "ret"));
-}
-
-static void event_sched_wait_task_print(struct event_type *t,
- struct format_context *fc)
-{
- event_printf("task %s:%d [%d]\n",
- (char *)(long)access_field(t, fc->event.data, "comm"),
- (int)access_field(t, fc->event.data, "pid"),
- (int)access_field(t, fc->event.data, "prio"));
-}
-
-static void event_sched_wakeup_print(struct event_type *t,
- struct format_context *fc)
-{
- event_printf("task %s:%d [%d] success=%d\n",
- (char *)(long)access_field(t, fc->event.data, "comm"),
- (int)access_field(t, fc->event.data, "pid"),
- (int)access_field(t, fc->event.data, "prio"),
- (int)access_field(t, fc->event.data, "success"));
-}
-
-static void event_sched_wakeup_new_print(struct event_type *t,
- struct format_context *fc)
-{
- event_printf("task %s:%d [%d] success=%d\n",
- (char *)(long)access_field(t, fc->event.data, "comm"),
- (int)access_field(t, fc->event.data, "pid"),
- (int)access_field(t, fc->event.data, "prio"),
- (int)access_field(t, fc->event.data, "success"));
-}
-
-static void event_sched_switch_print(struct event_type *t,
- struct format_context *fc)
-{
- char *prev_comm = (char *)(long)access_field(t, fc->event.data,
- "prev_comm");
- int prev_pid = access_field(t, fc->event.data, "prev_pid");
- int prev_prio = access_field(t, fc->event.data, "prev_prio");
-
- int prev_state = access_field(t, fc->event.data, "prev_state");
-
- char *next_comm = (char *)(long)access_field(t, fc->event.data,
- "next_comm");
- int next_pid = access_field(t, fc->event.data, "next_pid");
- int next_prio = access_field(t, fc->event.data, "next_prio");
-
- event_printf("task %s:%d [%d] (", prev_comm, prev_pid, prev_prio);
-
- if (prev_state == 0) {
- event_printf("R");
- } else {
- if (prev_state & 1)
- event_printf("S");
- if (prev_state & 2)
- event_printf("D");
- if (prev_state & 4)
- event_printf("T");
- if (prev_state & 8)
- event_printf("t");
- if (prev_state & 16)
- event_printf("Z");
- if (prev_state & 32)
- event_printf("X");
- if (prev_state & 64)
- event_printf("x");
- if (prev_state & 128)
- event_printf("W");
- }
-
- event_printf(") ==> %s:%d [%d]\n", next_comm, next_pid, next_prio);
-}
-
-static void event_sched_migrate_task_print(struct event_type *t,
- struct format_context *fc)
-{
- event_printf("task %s:%d [%d] from: %d to: %d\n",
- (char *)(long)access_field(t, fc->event.data, "comm"),
- (int)access_field(t, fc->event.data, "pid"),
- (int)access_field(t, fc->event.data, "prio"),
- (int)access_field(t, fc->event.data, "orig_cpu"),
- (int)access_field(t, fc->event.data, "dest_cpu"));
-}
-
-static void event_sched_process_free_print(struct event_type *t,
- struct format_context *fc)
-{
- event_printf("task %s:%d [%d]\n",
- (char *)(long)access_field(t, fc->event.data, "comm"),
- (int)access_field(t, fc->event.data, "pid"),
- (int)access_field(t, fc->event.data, "prio"));
-}
-
-static void event_sched_process_exit_print(struct event_type *t,
- struct format_context *fc)
-{
- event_printf("task %s:%d [%d]\n",
- (char *)(long)access_field(t, fc->event.data, "comm"),
- (int)access_field(t, fc->event.data, "pid"),
- (int)access_field(t, fc->event.data, "prio"));
-}
-
-static void event_sched_process_wait_print(struct event_type *t,
- struct format_context *fc)
-{
- event_printf("task %s:%d [%d]\n",
- (char *)(long)access_field(t, fc->event.data, "comm"),
- (int)access_field(t, fc->event.data, "pid"),
- (int)access_field(t, fc->event.data, "prio"));
-}
-
-static void event_sched_process_fork_print(struct event_type *t,
- struct format_context *fc)
-{
- char *parent_comm = (char *)(long)access_field(t, fc->event.data,
- "parent_comm");
- int parent_pid = access_field(t, fc->event.data, "parent_pid");
-
- char *child_comm = (char *)(long)access_field(t, fc->event.data,
- "child_comm");
- int child_pid = access_field(t, fc->event.data, "child_pid");
-
- event_printf("parent %s:%d child %s:%d\n", parent_comm, parent_pid,
- child_comm, child_pid);
-}
-
-static void event_sched_signal_send_print(struct event_type *t,
- struct format_context *fc)
-{
- event_printf("sig: %d task %s:%d\n",
- (int)access_field(t, fc->event.data, "sig"),
- (char *)(long)access_field(t, fc->event.data, "comm"),
- (int)access_field(t, fc->event.data, "pid"));
-}
-
-static void event_kmalloc_print(struct event_type *t,
- struct format_context *fc)
-{
- event_printf("call_site=%lx ptr=%p bytes_req=%zu bytes_alloc=%zu "
- "gfp_flags=%lx\n",
- (long)access_field(t, fc->event.data, "call_site"),
- (void *)(long)access_field(t, fc->event.data, "ptr"),
- (size_t)access_field(t, fc->event.data, "bytes_req"),
- (size_t)access_field(t, fc->event.data, "bytes_alloc"),
- (long)access_field(t, fc->event.data, "gfp_flags"));
-}
-
-static void event_kmem_cache_alloc_print(struct event_type *t,
- struct format_context *fc)
-{
- event_printf("call_site=%lx ptr=%p bytes_req=%zu bytes_alloc=%zu "
- "gfp_flags=%lx\n",
- (long)access_field(t, fc->event.data, "call_site"),
- (void *)(long)access_field(t, fc->event.data, "ptr"),
- (size_t)access_field(t, fc->event.data, "bytes_req"),
- (size_t)access_field(t, fc->event.data, "bytes_alloc"),
- (long)access_field(t, fc->event.data, "gfp_flags"));
-}
-
-static void event_kmalloc_node_print(struct event_type *t,
- struct format_context *fc)
-{
- event_printf("call_site=%lx ptr=%p bytes_req=%zu bytes_alloc=%zu "
- "gfp_flags=%lx node=%d\n",
- (long)access_field(t, fc->event.data, "call_site"),
- (void *)(long)access_field(t, fc->event.data, "ptr"),
- (size_t)access_field(t, fc->event.data, "bytes_req"),
- (size_t)access_field(t, fc->event.data, "bytes_alloc"),
- (long)access_field(t, fc->event.data, "gfp_flags"),
- (int)access_field(t, fc->event.data, "node"));
-}
-
-static void event_kmem_cache_alloc_node_print(struct event_type *t,
- struct format_context *fc)
-{
- event_printf("call_site=%lx ptr=%p bytes_req=%zu bytes_alloc=%zu "
- "gfp_flags=%lx node=%d\n",
- (long)access_field(t, fc->event.data, "call_site"),
- (void *)(long)access_field(t, fc->event.data, "ptr"),
- (size_t)access_field(t, fc->event.data, "bytes_req"),
- (size_t)access_field(t, fc->event.data, "bytes_alloc"),
- (long)access_field(t, fc->event.data, "gfp_flags"),
- (int)access_field(t, fc->event.data, "node"));
-}
-
-static void event_kfree_print(struct event_type *t,
- struct format_context *fc)
-{
- event_printf("call_site=%lx ptr=%p\n",
- (long)access_field(t, fc->event.data, "call_site"),
- (void *)(long)access_field(t, fc->event.data, "ptr"));
-}
-
-static void event_kmem_cache_free_print(struct event_type *t,
- struct format_context *fc)
-{
- event_printf("call_site=%lx ptr=%p\n",
- (long)access_field(t, fc->event.data, "call_site"),
- (void *)(long)access_field(t, fc->event.data, "ptr"));
-}
-
-static void event_workqueue_insertion_print(struct event_type *t,
- struct format_context *fc)
-{
- char *thread_comm = (char *)(long)access_field(t, fc->event.data,
- "thread_comm");
- int thread_pid = access_field(t, fc->event.data, "thread_pid");
- ulong func = access_field(t, fc->event.data, "func");
-
- event_printf("thread=%s:%d func=", thread_comm, thread_pid);
- sym_print(func, 1);
- event_printf("\n");
-}
-
-static void event_workqueue_execution_print(struct event_type *t,
- struct format_context *fc)
-{
- char *thread_comm = (char *)(long)access_field(t, fc->event.data,
- "thread_comm");
- int thread_pid = access_field(t, fc->event.data, "thread_pid");
- ulong func = access_field(t, fc->event.data, "func");
-
- event_printf("thread=%s:%d func=", thread_comm, thread_pid);
- sym_print(func, 1);
- event_printf("\n");
-}
-
-static void event_workqueue_creation_print(struct event_type *t,
- struct format_context *fc)
-{
- char *thread_comm = (char *)(long)access_field(t, fc->event.data,
- "thread_comm");
- int thread_pid = access_field(t, fc->event.data, "thread_pid");
- int cpu = access_field(t, fc->event.data, "cpu");
-
- event_printf("thread=%s:%d cpu=%d\n", thread_comm, thread_pid, cpu);
-}
-
-static void event_workqueue_destruction_print(struct event_type *t,
- struct format_context *fc)
-{
- char *thread_comm = (char *)(long)access_field(t, fc->event.data,
- "thread_comm");
- int thread_pid = access_field(t, fc->event.data, "thread_pid");
-
- event_printf("thread=%s:%d\n", thread_comm, thread_pid);
-}
-
-static void ftrace_show_trace_event_init(void)
-{
-#define init_trace_event(system, name) \
-do { \
- struct event_type *t = find_event_type_by_name(#system, #name); \
- if (t) \
- t->printer = event_ ## name ## _print; \
-} while (0)
-
- init_trace_event(sched, sched_kthread_stop);
- init_trace_event(sched, sched_kthread_stop_ret);
- init_trace_event(sched, sched_wait_task);
- init_trace_event(sched, sched_wakeup);
- init_trace_event(sched, sched_wakeup_new);
- init_trace_event(sched, sched_switch);
- init_trace_event(sched, sched_migrate_task);
- init_trace_event(sched, sched_process_free);
- init_trace_event(sched, sched_process_exit);
- init_trace_event(sched, sched_process_wait);
- init_trace_event(sched, sched_process_fork);
- init_trace_event(sched, sched_signal_send);
-
- init_trace_event(kmem, kmalloc);
- init_trace_event(kmem, kmem_cache_alloc);
- init_trace_event(kmem, kmalloc_node);
- init_trace_event(kmem, kmem_cache_alloc_node);
- init_trace_event(kmem, kfree);
- init_trace_event(kmem, kmem_cache_free);
-
- init_trace_event(workqueue, workqueue_insertion);
- init_trace_event(workqueue, workqueue_execution);
- init_trace_event(workqueue, workqueue_creation);
- init_trace_event(workqueue, workqueue_destruction);
-#undef init_trace_event
-}
-
-static void ftrace_show_destroy(void)
-{
- free(cpus_prev_pid);
-}
-
static char *help_ftrace[] = {
"trace",
"show or dump the tracing info",
@@ -2946,22 +1491,11 @@ static char *help_ftrace[] = {
"trace",
" shows the current tracer and other informations.",
"",
-"trace show [ -c <cpulist> ] [ -f [no]<flagename> ]",
+"trace show",
" shows all events with readability text(sorted by timestamp)",
-" -c: only shows specified CPUs' events.",
-" ex. trace show -c 1,2 - only shows cpu#1 and cpu#2 's events.",
-" trace show -c 0,2-7 - only shows cpu#0, cpu#2...cpu#7's events.",
-" -f: set or clear a flag",
-" available flags default",
-" context_info true",
-" sym_offset false",
-" sym_addr false",
-" graph_print_duration true",
-" graph_print_overhead true",
-" graph_print_abstime false",
-" graph_print_cpu true",
-" graph_print_proc false",
-" graph_print_overrun false",
+"",
+"trace report",
+" the same as \"trace show\"",
"",
"trace dump [-sm] <dest-dir>",
" dump ring_buffers to dest-dir. Then you can parse it",
13 years, 8 months
Re: [Crash-utility] Issues with ps -a/vtop
by Steven Soulen
Hi Dave,
The user pages have been filtered out. Is this documented anywhere?
On a side note, does anyone have suggestions for documentation or books I
should read on debugging crash dumps? (I've already read the
whitepaper on using crash.)
Thanks again
Steven Soulen
Unless otherwise indicated, this message is intended only for the personal and confidential use of the designated recipient(s) named above. If you are not the intended recipient of this message you are hereby notified that any review, dissemination, distribution or copying of this message is strictly prohibited. This communication is for information purposes only and should not be regarded as an offer to sell or as a solicitation of an offer to buy any financial product or service, an official confirmation of any transaction, or as an official statement of the entity sending this message. Email transmission cannot be guaranteed to be secure or error-free. Therefore, we do not represent that this information is complete or accurate and it should not be relied upon as such. All information is subject to change without notice.
13 years, 8 months
[PATCH 0/3 v2] crash-trace-command: Handle trace_bprintk() for modules
by Steven Rostedt
These are updated patches to have crash extract the trace_bprintk()
formats from a kernel core dump. It handles the current format
and a new format that I'll be pushing out after the merge window
has closed for Linux 2.6.40. The new format is required for the
kernel debugfs to export the same bprintk data as well.
This means that crash will be able to extract more information than
trace-cmd itself can on a running kernel :)
v2:
o Removed unused 'len' variable.
o Moved count == 0 fix into patch 2
-- Steve
13 years, 8 months
[PATCH 0/3] crash-trace-command: Handle trace_bprintk() for modules
by Steven Rostedt
Hi,
These are updated patches to have crash extract the trace_bprintk()
formats from a kernel core dump. It handles the current format
and a new format that I'll be pushing out after the merge window
has closed for Linux 2.6.40. The new format is required for the
kernel debugfs to export the same bprintk data as well.
This means that crash will be able to extract more information than
trace-cmd itself can on a running kernel :)
-- Steve
13 years, 8 months
Re: [Crash-utility] Issues with ps -a/vtop
by Steven Soulen
Hi Dave,
Thanks for the quick response. That was the issue I was having. I am
now able to use vtop.
crash> vtop -c 23619 7fff7efa16a6
VIRTUAL PHYSICAL
7fff7efa16a6 201d126a6
PML: 2a948d7f8 => 3f031a067
PUD: 3f031afe8 => 1ed481067
PMD: 1ed481fb8 => 2c9af2067
PTE: 2c9af2d08 => 201d12005
PAGE: 201d12000
PTE PHYSICAL FLAGS
201d12005 201d12000 (PRESENT|USER)
VMA START END FLAGS FILE
ffff81041195dce8 7fff7efa0000 7fff7efa2000 100173
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
ffff810107065bf0 201d12000 ffff8101eee276f1 7fffffffe 1 200100000000278
So how do I get at the information that is provided by ps -a? What I'm
really after are the arguments of the command.
Thanks again.
Steven Soulen
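
For readers following the vtop output above, here is a minimal sketch of the arithmetic behind it, assuming standard x86_64 4-level paging with 4 KiB pages. The helper names table_offset() and pte_to_phys() are hypothetical and are not part of crash. Each table index is a 9-bit slice of the virtual address, and the physical address is the page frame from the PTE combined with the low 12 bits of the virtual address.

```c
#include <assert.h>
#include <stdint.h>

/* x86_64 4-level paging, 4 KiB pages: 512 eight-byte entries per table. */
#define INDEX_MASK  0x1ffULL               /* 9 index bits per level */
#define ENTRY_SIZE  8ULL                   /* bytes per table entry */
#define OFFSET_MASK 0xfffULL               /* low 12 bits: offset in page */
#define FRAME_MASK  0x000ffffffffff000ULL  /* PTE bits 12..51: page frame */

/*
 * Byte offset, within one page table, of the entry that `va` selects.
 * `shift` is 39 for the PML4, 30 for the PUD, 21 for the PMD, 12 for the PTE.
 */
static uint64_t table_offset(uint64_t va, unsigned shift)
{
    assert(shift < 64);
    return ((va >> shift) & INDEX_MASK) * ENTRY_SIZE;
}

/* Combine the page frame stored in a PTE with the page offset of `va`. */
static uint64_t pte_to_phys(uint64_t pte, uint64_t va)
{
    return (pte & FRAME_MASK) | (va & OFFSET_MASK);
}
```

With the address from the output above, table_offset(0x7fff7efa16a6, 12) gives 0xd08, which is the low part of the PTE address 2c9af2d08, and pte_to_phys(0x201d12005, 0x7fff7efa16a6) gives 0x201d126a6, the PHYSICAL value that vtop reports.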
13 years, 8 months
Issues with ps -a/vtop
by Steven Soulen
Hello All,
I ran into a problem where ps -a/vtop can't access user space.
Looking at the changelog, it appears a fix was put in as of 4.0-3.7, but
I'm seeing this issue with 5.1.3.
crash> ps -a 23591
PID: 23591 TASK: ffff8103813be7a0 CPU: 5 COMMAND: "python"
ps: cannot access user stack address: 7fff7d89a75f
crash> vtop 7fff7d89a75f
VIRTUAL PHYSICAL
7fff7d89a75f (not accessible)
This is on RHEL 5.4 with kernel 2.6.18-194.11.3.el5. What information
will be needed to debug this? Or am I using these commands incorrectly?
Thanks in advance for any help you can give me.
Steven Soulen
13 years, 8 months
How to find core utility implementation (source code) file in Linux kernel?
by VL Chowdary
Hi,
This is VL Chowdary, a software engineer in Bangalore, India. I am currently
working on a project that deals with "Creation of Postmortem data log file
on Linux platform".
The project is on Fedora 12 with kernel version 2.6.31.5.
For that I need to do the following:
1. Find the location of the core utility in the kernel.
2. Find the source file that contains the actual implementation of
the core utility (on Fedora's kernel 2.6.31.5).
3. Based on that, implement a core file with my own project
requirements.
4. Build the kernel with the newly implemented core file replacing the
existing one.
I went through many documents, and they say that the location where
core files are generated is in /proc/sys/kernel.
Is that correct?
I couldn't find the source file that contains the implementation of the core
utility.
Please help me proceed with the project.
Thanks in advance to all who reply.
Thank you,
vlc
13 years, 8 months
[RFC][PATCH] use the value of register in the vmcore when we do not find panic task
by Wen Congyang
We have new hardware that can take a dump, and we use makedumpfile to
generate the vmcore. The hardware works even when the OS is out of
control (for example, stuck in a dead loop). When we use crash to analyze
such a vmcore, bt does not work, because there is no panic task.
We provide the register values in the vmcore (the format is
elf_prstatus, the same as in a normal kdump vmcore), so crash can use
them when it does not find a panic task.
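
As an illustration of the note handling described above, here is a minimal standalone sketch of walking a buffer of ELF notes and locating the NT_PRSTATUS entries. It is not the code from this patch: find_prstatus_notes(), struct note_hdr and round_up4() are hypothetical names, and the header struct is declared locally on the assumption that Elf32_Nhdr and Elf64_Nhdr share the same layout of three 32-bit words. The length computation mirrors the roundup-by-4 steps used in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NOTE_TYPE_PRSTATUS 1u   /* value of NT_PRSTATUS in <elf.h> */

/* Local stand-in for Elf32_Nhdr / Elf64_Nhdr (assumed identical layout). */
struct note_hdr {
    uint32_t n_namesz;   /* length of the name, including the NUL */
    uint32_t n_descsz;   /* length of the descriptor payload */
    uint32_t n_type;     /* note type, e.g. NT_PRSTATUS */
};

static size_t round_up4(size_t n)
{
    return (n + 3u) & ~(size_t)3u;
}

/*
 * Walk `size` bytes of ELF notes and record the offset of each
 * NT_PRSTATUS note in `offsets` (at most `max` entries).
 * Returns the number of offsets stored. A note that claims to extend
 * past the end of the buffer is treated as broken and ends the walk.
 */
static size_t find_prstatus_notes(const unsigned char *buf, size_t size,
                                  size_t *offsets, size_t max)
{
    size_t pos = 0, found = 0;

    assert(buf != NULL && offsets != NULL);

    while (size - pos >= sizeof(struct note_hdr)) {
        struct note_hdr h;
        size_t len;

        memcpy(&h, buf + pos, sizeof(h));    /* avoid unaligned reads */
        len = round_up4(sizeof(h) + h.n_namesz);
        len = round_up4(len + h.n_descsz);
        if (len > size - pos)
            break;                           /* truncated or corrupt note */
        if (h.n_type == NOTE_TYPE_PRSTATUS && found < max)
            offsets[found++] = pos;
        pos += len;
    }
    return found;
}
```

The offsets come back in file order, so a caller would still have to map them onto online CPUs separately, as map_cpus_to_prstatus() does in netdump.c.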
---
defs.h | 10 +++-
diskdump.c | 118 +++++++++++++++++++++++++++++++++++++++++---
netdump.c | 161 ++++++++++++++++++++++++++++++++++++++++++++++-------------
task.c | 6 ++-
x86.c | 8 +++
x86_64.c | 6 ++
6 files changed, 263 insertions(+), 46 deletions(-)
diff --git a/defs.h b/defs.h
index af3c8ed..b926725 100644
--- a/defs.h
+++ b/defs.h
@@ -1426,6 +1426,7 @@ struct offset_table { /* stash of commonly-used offsets */
long prio_array_queue;
long user_regs_struct_ebp;
long user_regs_struct_esp;
+ long user_regs_struct_eip;
long user_regs_struct_rip;
long user_regs_struct_cs;
long user_regs_struct_eflags;
@@ -4567,13 +4568,16 @@ int xen_major_version(void);
int xen_minor_version(void);
int get_netdump_arch(void);
void *get_regs_from_elf_notes(struct task_context *);
-void map_cpus_to_prstatus(void);
+void map_cpus_to_prstatus(void **, size_t);
int arm_kdump_phys_base(ulong *);
int is_proc_kcore(char *, ulong);
int proc_kcore_init(FILE *);
int read_proc_kcore(int, void *, int, ulong, physaddr_t);
int write_proc_kcore(int, void *, int, ulong, physaddr_t);
int kcore_memory_dump(FILE *);
+void **netdump_get_prstatus(void);
+size_t netdump_get_num_prstatus(void);
+
/*
* diskdump.c
@@ -4595,6 +4599,10 @@ ulong *diskdump_flags;
int is_partial_diskdump(void);
int dumpfile_is_split(void);
void show_split_dumpfiles(void);
+int KDUMP_CMPRS_DUMPFILE(void);
+void **diskdump_get_prstatus(void);
+size_t diskdump_get_num_prstatus(void);
+void *diskdump_get_prstatus_percpu(int);
/*
* makedumpfile.c
diff --git a/diskdump.c b/diskdump.c
index 9a2f37f..074f1d7 100644
--- a/diskdump.c
+++ b/diskdump.c
@@ -49,6 +49,7 @@ struct diskdump_data {
int byte, bit;
char *compressed_page; /* copy of compressed page data */
char *curbufptr; /* ptr to uncompressed page buffer */
+ unsigned char *notes_buf; /* copy of notes */
/* page cache */
struct page_cache_hdr { /* header for each cached page */
@@ -67,6 +68,8 @@ struct diskdump_data {
static struct diskdump_data diskdump_data = { 0 };
static struct diskdump_data *dd = &diskdump_data;
+static unsigned char *nt_prstatus_percpu[NR_CPUS];
+static size_t num_prstatus_notes = 0;
static int get_dump_level(void);
ulong *diskdump_flags = &diskdump_data.flags;
@@ -185,12 +188,86 @@ static int open_dump_file(char *file)
return TRUE;
}
+static size_t dump_elf_note(unsigned char *note_buf, size_t max_size, int machine)
+{
+ Elf64_Nhdr *note64;
+ Elf32_Nhdr *note32;
+ size_t len;
+ unsigned int type;
+ int i;
+
+ if (machine == EM_X86_64) {
+ note64 = (Elf64_Nhdr *)note_buf;
+ len = sizeof(Elf64_Nhdr);
+ len = roundup(len + note64->n_namesz, 4);
+ len = roundup(len + note64->n_descsz, 4);
+ type = note64->n_type;
+ } else {
+ note32 = (Elf32_Nhdr *)note_buf;
+ len = sizeof(Elf32_Nhdr);
+ len = roundup(len + note32->n_namesz, 4);
+ len = roundup(len + note32->n_descsz, 4);
+ type = note32->n_type;
+ }
+ if (len > max_size) {
+ /* this note is broken, do not restore it */
+ return max_size;
+ }
+
+ if (type != NT_PRSTATUS) {
+ /* This note segment does not contain copy of prstatus struct */
+ return len;
+ }
+
+ for (i = 0; i < NR_CPUS; i++) {
+ if (!nt_prstatus_percpu[i]) {
+ nt_prstatus_percpu[i] = (void *)note_buf;
+ num_prstatus_notes++;
+ break;
+ }
+ }
+ return len;
+}
+
+static void process_elf_notes(unsigned char *notes_buf, size_t size, int machine)
+{
+ size_t len = 0;
+ void **nt_ptr;
+ int online, i, j, nrcpus;
+ size_t buf_size;
+
+ while (len < size) {
+ len += dump_elf_note(notes_buf + len, size - len, machine);
+ }
+
+ /* copy from map_cpus_to_prstatus from netdump.c */
+ if (!(online = get_cpus_online()) || (online == kt->cpus))
+ return;
+
+ buf_size = NR_CPUS * sizeof(void *);
+
+ nt_ptr = (void **)GETBUF(buf_size);
+ BCOPY(nt_prstatus_percpu, nt_ptr, buf_size);
+ BZERO(nt_prstatus_percpu, buf_size);
+
+ /*
+ * Re-populate the array with the notes mapping to online cpus
+ */
+ nrcpus = (kt->kernel_NR_CPUS ? kt->kernel_NR_CPUS : NR_CPUS);
+
+ for (i = 0, j = 0; i < nrcpus; i++) {
+ if (in_cpu_map(ONLINE, i))
+ nt_prstatus_percpu[i] = nt_ptr[j++];
+ }
+
+ FREEBUF(nt_ptr);
+}
+
static int read_dump_header(char *file)
{
struct disk_dump_header *header = NULL;
struct disk_dump_sub_header *sub_header = NULL;
struct kdump_sub_header *sub_header_kdump = NULL;
- unsigned char *notes_buf = NULL;
size_t size;
int bitmap_len;
int block_size = (int)sysconf(_SC_PAGESIZE);
@@ -394,16 +471,18 @@ restart:
/* process elf notes data */
if (KDUMP_CMPRS_VALID() && (dd->header->header_version >= 4) &&
(sub_header_kdump->offset_note) &&
- (sub_header_kdump->size_note) && (machdep->process_elf_notes)) {
+ (sub_header_kdump->size_note) &&
+ (machdep->process_elf_notes || dd->machine_type == EM_X86_64 ||
+ dd->machine_type == EM_386)) {
size = sub_header_kdump->size_note;
offset = sub_header_kdump->offset_note;
- if ((notes_buf = malloc(size)) == NULL)
+ if ((dd->notes_buf = malloc(size)) == NULL)
error(FATAL, "compressed kdump: cannot malloc notes"
" buffer\n");
if (FLAT_FORMAT()) {
- if (!read_flattened_format(dd->dfd, offset, notes_buf, size)) {
+ if (!read_flattened_format(dd->dfd, offset, dd->notes_buf, size)) {
error(INFO, "compressed kdump: cannot read notes data"
"\n");
goto err;
@@ -413,14 +492,19 @@ restart:
error(INFO, "compressed kdump: cannot lseek notes data\n");
goto err;
}
- if (read(dd->dfd, notes_buf, size) < size) {
+ if (read(dd->dfd, dd->notes_buf, size) < size) {
error(INFO, "compressed kdump: cannot read notes data"
"\n");
goto err;
}
}
- machdep->process_elf_notes(notes_buf, size);
+ if (machdep->process_elf_notes) {
+ /* s390 */
+ machdep->process_elf_notes(dd->notes_buf, size);
+ } else if (dd->machine_type == EM_X86_64 || dd->machine_type == EM_386) {
+ process_elf_notes(dd->notes_buf, size, dd->machine_type);
+ }
}
/* For split dumpfile */
@@ -468,8 +552,6 @@ err:
free(sub_header);
if (sub_header_kdump)
free(sub_header_kdump);
- if (notes_buf)
- free(notes_buf);
if (dd->bitmap)
free(dd->bitmap);
if (dd->dumpable_bitmap)
@@ -1226,3 +1308,23 @@ show_split_dumpfiles(void)
fprintf(fp, "\n");
}
}
+
+int KDUMP_CMPRS_DUMPFILE(void)
+{
+ return KDUMP_CMPRS_VALID();
+}
+
+void **diskdump_get_prstatus(void)
+{
+ return (void **)nt_prstatus_percpu;
+}
+
+size_t diskdump_get_num_prstatus(void)
+{
+ return num_prstatus_notes;
+}
+
+void *diskdump_get_prstatus_percpu(int cpu)
+{
+ return nt_prstatus_percpu[cpu];
+}
diff --git a/netdump.c b/netdump.c
index 7916df1..6c1cf13 100644
--- a/netdump.c
+++ b/netdump.c
@@ -59,7 +59,7 @@ static int proc_kcore_init_64(FILE *fp);
* to remap the NT_PRSTATUS notes only to the online cpus.
*/
void
-map_cpus_to_prstatus(void)
+map_cpus_to_prstatus(void **nt_prstatus, size_t num_prstatus)
{
void **nt_ptr;
int online, i, j, nrcpus;
@@ -71,13 +71,13 @@ map_cpus_to_prstatus(void)
if (CRASHDEBUG(1))
error(INFO,
"cpus: %d online: %d NT_PRSTATUS notes: %d (remapping)\n",
- kt->cpus, online, nd->num_prstatus_notes);
+ kt->cpus, online, num_prstatus);
size = NR_CPUS * sizeof(void *);
nt_ptr = (void **)GETBUF(size);
- BCOPY(nd->nt_prstatus_percpu, nt_ptr, size);
- BZERO(nd->nt_prstatus_percpu, size);
+ BCOPY(nt_prstatus, nt_ptr, size);
+ BZERO(nt_prstatus, size);
/*
* Re-populate the array with the notes mapping to online cpus
@@ -86,12 +86,22 @@ map_cpus_to_prstatus(void)
for (i = 0, j = 0; i < nrcpus; i++) {
if (in_cpu_map(ONLINE, i))
- nd->nt_prstatus_percpu[i] = nt_ptr[j++];
+ nt_prstatus[i] = nt_ptr[j++];
}
FREEBUF(nt_ptr);
}
+void **netdump_get_prstatus(void)
+{
+ return nd->nt_prstatus_percpu;
+}
+
+size_t netdump_get_num_prstatus(void)
+{
+ return nd->num_prstatus_notes;
+}
+
/*
* Determine whether a file is a netdump/diskdump/kdump creation,
* and if TRUE, initialize the vmcore_data structure.
@@ -2170,60 +2180,71 @@ get_netdump_regs(struct bt_info *bt, ulong *eip, ulong *esp)
}
}
-struct x86_64_user_regs_struct {
- unsigned long r15,r14,r13,r12,rbp,rbx,r11,r10;
- unsigned long r9,r8,rax,rcx,rdx,rsi,rdi,orig_rax;
- unsigned long rip,cs,eflags;
- unsigned long rsp,ss;
- unsigned long fs_base, gs_base;
- unsigned long ds,es,fs,gs;
-};
+/* get regs from elf note, and return the address of user_regs. */
+static char * get_x86_regs_from_note(char *note, ulong *eip, ulong *esp)
+{
+ Elf32_Nhdr *note32;
+ size_t len;
+ char *user_regs;
+
+ note32 = (Elf32_Nhdr *)note;
+ len = sizeof(Elf32_Nhdr);
+ len = roundup(len + note32->n_namesz, 4);
+ len = roundup(len + note32->n_descsz, 4);
+
+ user_regs = note + len - SIZE(user_regs_struct) - sizeof(long);
+ *esp = ULONG(user_regs + OFFSET(user_regs_struct_esp));
+ *eip = ULONG(user_regs + OFFSET(user_regs_struct_eip));
+
+ return user_regs;
+}
+
+static char * get_x86_64_regs_from_note(char *note, ulong *rip, ulong *rsp)
+{
+ Elf64_Nhdr *note64;
+ size_t len;
+ char *user_regs;
+
+ note64 = (Elf64_Nhdr *)note;
+ len = sizeof(Elf64_Nhdr);
+ len = roundup(len + note64->n_namesz, 4);
+ len = roundup(len + note64->n_descsz, 4);
+
+ user_regs = note + len - SIZE(user_regs_struct) - sizeof(long);
+ *rsp = ULONG(user_regs + OFFSET(user_regs_struct_rsp));
+ *rip = ULONG(user_regs + OFFSET(user_regs_struct_rip));
+
+ return user_regs;
+}
void
get_netdump_regs_x86_64(struct bt_info *bt, ulong *ripp, ulong *rspp)
{
Elf64_Nhdr *note;
- size_t len;
char *user_regs;
- ulong regs_size, rsp_offset, rip_offset;
+ ulong rip, rsp;
if (is_task_active(bt->task))
bt->flags |= BT_DUMPFILE_SEARCH;
if (((NETDUMP_DUMPFILE() || KDUMP_DUMPFILE()) &&
VALID_STRUCT(user_regs_struct) && (bt->task == tt->panic_task)) ||
- (KDUMP_DUMPFILE() && (kt->flags & DWARF_UNWIND) &&
- (bt->flags & BT_DUMPFILE_SEARCH))) {
+ (KDUMP_DUMPFILE() && (bt->flags & BT_DUMPFILE_SEARCH))) {
if (nd->num_prstatus_notes > 1)
note = (Elf64_Nhdr *)
nd->nt_prstatus_percpu[bt->tc->processor];
else
note = (Elf64_Nhdr *)nd->nt_prstatus;
- len = sizeof(Elf64_Nhdr);
- len = roundup(len + note->n_namesz, 4);
- len = roundup(len + note->n_descsz, 4);
-
- regs_size = VALID_STRUCT(user_regs_struct) ?
- SIZE(user_regs_struct) :
- sizeof(struct x86_64_user_regs_struct);
- rsp_offset = VALID_MEMBER(user_regs_struct_rsp) ?
- OFFSET(user_regs_struct_rsp) :
- offsetof(struct x86_64_user_regs_struct, rsp);
- rip_offset = VALID_MEMBER(user_regs_struct_rip) ?
- OFFSET(user_regs_struct_rip) :
- offsetof(struct x86_64_user_regs_struct, rip);
-
- user_regs = ((char *)note + len) - regs_size - sizeof(long);
+ user_regs = get_x86_64_regs_from_note((char *)note, &rip, &rsp);
if (CRASHDEBUG(1))
netdump_print("ELF prstatus rsp: %lx rip: %lx\n",
- ULONG(user_regs + rsp_offset),
- ULONG(user_regs + rip_offset));
+ rsp, rip);
if (KDUMP_DUMPFILE()) {
- *rspp = ULONG(user_regs + rsp_offset);
- *ripp = ULONG(user_regs + rip_offset);
+ *rspp = rsp;
+ *ripp = rip;
if (*ripp && *rspp)
bt->flags |= BT_KDUMP_ELF_REGS;
@@ -2232,6 +2253,28 @@ get_netdump_regs_x86_64(struct bt_info *bt, ulong *ripp, ulong *rspp)
bt->machdep = (void *)user_regs;
}
+ if (DISKDUMP_DUMPFILE() && (bt->flags & BT_DUMPFILE_SEARCH)) {
+ note = (Elf64_Nhdr *)diskdump_get_prstatus_percpu(bt->tc->processor);
+ if (note == NULL)
+ /* The version of makedumpfile may be less than 4 */
+ goto skip_notes;
+
+ user_regs = get_x86_64_regs_from_note((char *)note, &rip, &rsp);
+
+ if (CRASHDEBUG(1))
+ netdump_print("ELF prstatus rsp: %lx rip: %lx\n",
+ rsp, rip);
+
+ *rspp = rsp;
+ *ripp = rip;
+
+ if (*ripp && *rspp)
+ bt->flags |= BT_KDUMP_ELF_REGS;
+
+ bt->machdep = (void *)user_regs;
+ }
+
+skip_notes:
machdep->get_stack_frame(bt, ripp, rspp);
}
@@ -2423,6 +2466,52 @@ next_sysrq:
goto retry;
}
+ if (KDUMP_DUMPFILE()) {
+ Elf32_Nhdr *note;
+ char *user_regs;
+ ulong ip, sp;
+
+ if (nd->num_prstatus_notes > 1)
+ note = (Elf32_Nhdr *)
+ nd->nt_prstatus_percpu[bt->tc->processor];
+ else
+ note = (Elf32_Nhdr *)nd->nt_prstatus;
+
+ user_regs = get_x86_regs_from_note((char *)note, &ip, &sp);
+ bt->flags |= BT_KDUMP_ELF_REGS;
+ if (is_kernel_text(ip) &&
+ (((sp >= GET_STACKBASE(bt->task)) &&
+ (sp < GET_STACKTOP(bt->task))) ||
+ in_alternate_stack(bt->tc->processor, sp))) {
+ *eip = ip;
+ *esp = sp;
+ bt->flags |= BT_KERNEL_SPACE;
+ return;
+ }
+ }
+ if (DISKDUMP_DUMPFILE()) {
+ Elf32_Nhdr *note = (Elf32_Nhdr *)diskdump_get_prstatus_percpu(bt->tc->processor);
+ char *user_regs;
+ ulong ip, sp;
+
+ if (note == NULL)
+ /* The version of makedumpfile may be less than 4 */
+ goto skip_notes;
+
+ user_regs = get_x86_regs_from_note((char *)note, &ip, &sp);
+ bt->flags |= BT_KDUMP_ELF_REGS;
+ if (is_kernel_text(ip) &&
+ (((sp >= GET_STACKBASE(bt->task)) &&
+ (sp < GET_STACKTOP(bt->task))) ||
+ in_alternate_stack(bt->tc->processor, sp))) {
+ *eip = ip;
+ *esp = sp;
+ bt->flags |= BT_KERNEL_SPACE;
+ return;
+ }
+ }
+
+skip_notes:
if (CRASHDEBUG(1))
error(INFO,
"get_netdump_regs_x86: cannot find anything useful (task: %lx)\n", bt->task);
diff --git a/task.c b/task.c
index 290463f..de77033 100644
--- a/task.c
+++ b/task.c
@@ -457,7 +457,11 @@ task_init(void)
}
else {
if (KDUMP_DUMPFILE())
- map_cpus_to_prstatus();
+ map_cpus_to_prstatus(netdump_get_prstatus(),
+ netdump_get_num_prstatus());
+ if (KDUMP_CMPRS_DUMPFILE())
+ map_cpus_to_prstatus(diskdump_get_prstatus(),
+ diskdump_get_num_prstatus());
please_wait("determining panic task");
set_context(get_panic_context(), NO_PID);
please_wait_done();
diff --git a/x86.c b/x86.c
index ab2e7f3..e6fa2e2 100644
--- a/x86.c
+++ b/x86.c
@@ -1798,6 +1798,12 @@ x86_init(int when)
else
MEMBER_OFFSET_INIT(user_regs_struct_esp,
"user_regs_struct", "sp");
+ if (MEMBER_EXISTS("user_regs_struct", "eip"))
+ MEMBER_OFFSET_INIT(user_regs_struct_eip,
+ "user_regs_struct", "eip");
+ else
+ MEMBER_OFFSET_INIT(user_regs_struct_eip,
+ "user_regs_struct", "ip");
if (!VALID_STRUCT(user_regs_struct)) {
/* Use this hardwired version -- sometimes the
* debuginfo doesn't pick this up even though
@@ -1818,6 +1824,8 @@ x86_init(int when)
offsetof(struct x86_user_regs_struct, ebp);
ASSIGN_OFFSET(user_regs_struct_esp) =
offsetof(struct x86_user_regs_struct, esp);
+ ASSIGN_OFFSET(user_regs_struct_eip) =
+ offsetof(struct x86_user_regs_struct, eip);
}
MEMBER_OFFSET_INIT(thread_struct_cr3, "thread_struct", "cr3");
STRUCT_SIZE_INIT(cpuinfo_x86, "cpuinfo_x86");
diff --git a/x86_64.c b/x86_64.c
index 853a1aa..48864f9 100644
--- a/x86_64.c
+++ b/x86_64.c
@@ -4009,6 +4009,12 @@ x86_64_get_dumpfile_stack_frame(struct bt_info *bt_in, ulong *rip, ulong *rsp)
goto skip_stage;
}
}
+ } else if (bt->machdep) {
+ user_regs = bt->machdep;
+ ur_rip = ULONG(user_regs +
+ OFFSET(user_regs_struct_rip));
+ ur_rsp = ULONG(user_regs +
+ OFFSET(user_regs_struct_rsp));
}
panic = FALSE;
--
1.7.1
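For readers following the patch, the offset arithmetic shared by get_x86_regs_from_note() and get_x86_64_regs_from_note() can be sketched in standalone C. This is only a minimal sketch under stated assumptions, not crash code: struct demo_nhdr, round_up(), note_total_len() and note_regs_offset() are hypothetical stand-ins for Elf64_Nhdr, crash's roundup() and the SIZE()/OFFSET() lookups, and the register-block and trailer sizes are passed in by the caller instead of being read from debuginfo.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Elf64_Nhdr: three 32-bit words. */
struct demo_nhdr {
    uint32_t n_namesz;  /* length of the note name, including its NUL */
    uint32_t n_descsz;  /* length of the descriptor payload */
    uint32_t n_type;    /* note type, e.g. NT_PRSTATUS */
};

/* Round x up to a multiple of align; align must be a power of two. */
static size_t round_up(size_t x, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (x + align - 1) & ~(align - 1);
}

/* Total note length: header, then padded name, then padded descriptor. */
static size_t note_total_len(const struct demo_nhdr *n)
{
    size_t len = sizeof(*n);

    len = round_up(len + n->n_namesz, 4);
    len = round_up(len + n->n_descsz, 4);
    return len;
}

/*
 * Byte offset of the register block from the start of the note.
 * Assumption mirrored from the patch: the register block sits at the
 * end of the descriptor, followed only by a trailer of trailer_size
 * bytes (sizeof(long) in the patch).
 */
static size_t note_regs_offset(const struct demo_nhdr *n,
                               size_t regs_size, size_t trailer_size)
{
    size_t len = note_total_len(n);

    assert(len >= regs_size + trailer_size);
    return len - regs_size - trailer_size;
}
```

With assumed x86_64 values (a 5-byte name, a 336-byte descriptor, a 216-byte register block and an 8-byte trailer), note_total_len() gives 356 and note_regs_offset() gives 132, so the register block would start 132 bytes into the note. Working backwards from the end of the note, as the patch does, avoids having to know the layout of the descriptor fields that precede the registers.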
13 years, 8 months
Infinite loop during gathering of kmem slab cache data
by Shawn Rhode
I am seeing an issue when opening a dump file from RHEL AS 2.1 running the last 2.1 kernel released: crash sits at "please wait... (gathering kmem slab cache data)" forever, using 100% of the CPU. With debugging turned on at level 5, crash prints the following sequence over and over without stopping:
<readmem: f79d52cc, KVADDR, "kmem_cache buffer", 244, (ROE), 8585500>
<readmem: f79d5340, KVADDR, "cpudata array", 128, (ROE), ffffcbd0>
<readmem: f79d5344, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d5344, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d534c, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d534c, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d5354, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d5354, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d535c, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d535c, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d5364, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d5364, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d536c, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d536c, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d5374, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d5374, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d537c, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d537c, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d5384, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d5384, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d538c, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d538c, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d5394, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d5394, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d539c, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d539c, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d53a4, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d53a4, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d53ac, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d53ac, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d53b4, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d53b4, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d53bc, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d53bc, KVADDR, "cpucache limit", 4, (ROE), ffffcbcc>
<readmem: f79d52cc, KVADDR, "kmem_cache buffer", 244, (ROE), 8585500>
....
I am on the latest version of crash, 5.1.3:
[srhode@crash 1642489]$ crash32 -v
crash32 5.1.3
Copyright (C) 2002-2011 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.0
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=x86_64-unknown-linux-gnu --target=i686-pc-linux-gnu".
If I open the dump file using the "--no_kmem_cache" option, it opens, but I am not sure whether this will cause other issues during analysis of the dump file.
Shawn Rhode
Principal Support Engineer - Egenera Global Technical Support
13 years, 8 months