Fine. If you remove that message then I see no problems with your patch.
Jan
Jan Karlsson
Senior Software Engineer
System Assurance
Sony Mobile Communications
Tel: +46 703 062 174
jan.karlsson(a)sonymobile.com
sonymobile.com
-----Original Message-----
From: crash-utility-bounces(a)redhat.com [mailto:crash-utility-bounces@redhat.com] On Behalf
Of Dave Anderson
Sent: 22 October 2014 15:02
To: Discussion list for crash utility usage, maintenance and development
Subject: Re: [Crash-utility] Crash in crash
----- Original Message -----
> Hi
>
> Your patch works but I get a "strange" error message:
>
> please wait... (determining panic task)
> bt: bsearch for tgid failed: task: ffffffc01cfed400 tgid: 5040
>
> KERNEL: vmlinux
> DUMPFILE: vmcore
> ....
>
> This message does not occur with my patch.
>
> Jan
Yeah, that message will be removed in crash-7.0.9:
https://github.com/crash-utility/crash/commit/a3a441aeabd6c5c3c86b4793a28...
The point is to avoid doing the initial sort entirely, and with it the RSS
gathering and associated readmem() calls for all tasks, during the last-ditch
panic-task search that your dumpfile requires.
Dave
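The task address in the "bsearch for tgid failed" message above is the same task that appears in decimal in the gdb backtrace of the original report (task=18446743799318107136). A minimal sketch of the conversion, assuming only standard C; the helper name task_to_hex is hypothetical and not part of crash:

```c
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

/*
 * Hypothetical helper: format a 64-bit task address as bare lowercase
 * hex (no 0x prefix), matching how it is shown in the bt: message.
 * buf must have room for 16 hex digits plus the terminating NUL.
 */
static void task_to_hex(uint64_t task, char *buf, size_t len)
{
    snprintf(buf, len, "%" PRIx64, task);
}
```

Calling task_to_hex(18446743799318107136ULL, buf, sizeof buf) with a 17-byte buffer yields "ffffffc01cfed400", so the message refers to the task whose lookup originally crashed.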
-----Original Message-----
From: crash-utility-bounces(a)redhat.com
[mailto:crash-utility-bounces@redhat.com] On Behalf Of Dave Anderson
Sent: 21 October 2014 16:32
To: Discussion list for crash utility usage, maintenance and
development
Subject: Re: [Crash-utility] Crash in crash
Hi Jan,
Good catch. As far as a fix goes, it would be more efficient if
tgid_quick_search() just returned NULL in that case. Try the
attached patch.
Thanks,
Dave
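For illustration, the NULL-return approach could be sketched as below. This is not the attached patch, which is not reproduced in this thread. The structures are simplified stand-ins: the names tt, last_tgid, tgid_searches and tgid_cache_hits follow the source quoted in the original report, while task_table_mock, the struct layouts and the fall-through return are assumptions made so the example compiles on its own.

```c
#include <stddef.h>

typedef unsigned long ulong;

/* Simplified stand-ins for crash's structures (layouts are assumed). */
struct tgid_context {
    ulong tgid;
    ulong task;
};

struct task_table {
    struct tgid_context *last_tgid;  /* NULL until the tgid array is sorted */
    ulong tgid_searches;
    ulong tgid_cache_hits;
};

static struct task_table task_table_mock = { NULL, 0, 0 };
static struct task_table *tt = &task_table_mock;

static struct tgid_context *
tgid_quick_search(ulong tgid)
{
    struct tgid_context *last;

    tt->tgid_searches++;

    /*
     * Guard: if the tgid array has not been sorted yet there is no
     * cached entry to compare against, so report "not found" instead
     * of dereferencing a NULL pointer.
     */
    if (tt->last_tgid == NULL)
        return NULL;

    last = tt->last_tgid;
    if (tgid == last->tgid) {
        tt->tgid_cache_hits++;
        return last;
    }

    /* The remaining lookup is elided in the thread; assumed miss here. */
    return NULL;
}
```

With this shape the caller has to treat a NULL result as "no tgid context available", which avoids sorting the array during the early panic-task search.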
----- Original Message -----
> Hi Dave
>
> I have a vmcore file for ARM64 that crashes Crash during startup.
> The core file was created by a hardware watchdog (I believe), so there
> is no panic message or anything similar in the log.
>
> This is the printout from Crash running under gdb, after the
> copyright and configuration information:
>
> please wait... (determining panic task)
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x000000000047ed40 in tgid_quick_search (tgid=5040) at memory.c:4114
> 4114            if (tgid == last->tgid) {
>
> (gdb) bt
> #0  0x000000000047ed40 in tgid_quick_search (tgid=5040) at memory.c:4114
> #1  0x000000000047f046 in get_task_mem_usage (task=18446743799318107136, tm=0x7fffffff6f40)
>     at memory.c:4186
> #2  0x000000000047c679 in vm_area_dump (task=18446743799318107136, flag=10, vaddr=0, ref=0x0)
>     at memory.c:3671
> #3  0x000000000047ec08 in in_user_stack (task=18446743799318107136, vaddr=0) at memory.c:4063
> #4  0x00000000004fd9fe in arm64_get_dumpfile_stackframe (frame=<synthetic pointer>,
>     bt=<optimized out>) at arm64.c:1077
> #5  arm64_get_stack_frame (bt=0x7fffffffc690, pcp=0x7fffffff9560, spp=0x7fffffff9568)
>     at arm64.c:1103
> #6  0x00000000004de409 in back_trace (bt=0x7fffffffc690) at kernel.c:2533
> #7  0x00000000004d1563 in foreach (fd=0x7fffffffc7c0) at task.c:6161
> #8  0x00000000004d2bbd in panic_search () at task.c:6425
> #9  0x00000000004d4454 in get_panic_context () at task.c:5364
> #10 task_init () at task.c:491
> #11 0x000000000046146e in main_loop () at main.c:801
> #12 0x00000000006467a3 in captured_command_loop (data=<optimized out>) at main.c:258
> #13 0x000000000064535b in catch_errors (func=0x646790 <captured_command_loop>, func_args=0x0,
>     errstring=0x873235 "", mask=6) at exceptions.c:557
> #14 0x0000000000647726 in captured_main (data=<optimized out>) at main.c:1064
> #15 0x000000000064535b in catch_errors (func=0x646aa0 <captured_main>, func_args=0x7fffffffe030,
>     errstring=0x873235 "", mask=6) at exceptions.c:557
> #16 0x0000000000647a84 in gdb_main (args=<optimized out>) at main.c:1079
> #17 0x0000000000647abe in gdb_main_entry (argc=<optimized out>, argv=<optimized out>)
>     at main.c:1099
> #18 0x000000000045f61f in main (argc=3, argv=0x7fffffffe188) at main.c:758
>
> (gdb) p tt->last_tgid
> $1 = (struct tgid_context *) 0x0
>
> Source code for tgid_quick_search:
>
> static struct tgid_context *
> tgid_quick_search(ulong tgid)
> {
>         struct tgid_context *last, *next;
>
>         tt->tgid_searches++;
>
>         last = tt->last_tgid;
>         if (tgid == last->tgid) {
>                 tt->tgid_cache_hits++;
>                 return last;
>         }
>         ....
> }
>
> So 'last' is NULL, and dereferencing it causes the crash.
>
> After some more investigation I found that "tt->last_tgid" is
> initialized in the function sort_tgid_array() in task.c, but that
> function seems to be called at a later stage.
>
> By adding a line in tgid_quick_search:
>
> static struct tgid_context *
> tgid_quick_search(ulong tgid)
> {
>         struct tgid_context *last, *next;
>
>         tt->tgid_searches++;
>
>         if (tt->last_tgid == 0) sort_tgid_array(); // added line
>         last = tt->last_tgid;
>         if (tgid == last->tgid) {
>                 tt->tgid_cache_hits++;
>                 return last;
>         }
>         ...
>
> I can run Crash on this core file. However, I do not know if this is
> the best way to fix the problem.
>
> Jan
>
> --
> Crash-utility mailing list
> Crash-utility(a)redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility