> Dave Anderson <anderson redhat com> [2007-10-22 15:32]:
>> Troy Heber wrote:
>>> On 10/19/07 12:23, Dave Anderson wrote:
>>>> So my biggest worry would be if this somehow breaks
>>>> backwards-compatibility, but I'm presuming that you took
>>>> that into account. But anyway, I leave this all up
>>>> to Troy.
>>> I just did a quick sanity check on a couple of old IA64 LKCD dumps and
>>> everything seems to work, so I'm happy.
>>> Troy
> Troy, thanks for checking this!
>> Bernhard, can you post a cleaned-up patch for queueing?
> Here it is (attached). I didn't see any warnings in the crash code
> with 'make warn' now. I have used your own definition of offsetof()
> but moved it into the header file.
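For readers without the attachment, here is a minimal sketch of what a
header-level offsetof() fallback can look like. It is not the definition
from the patch, which is not shown in this message. The macro name
MY_OFFSETOF, the struct example_dump_header, and its fields other than
dh_version are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>   /* standard offsetof(), used here only as a cross-check */
#include <stdint.h>

/*
 * Hypothetical fallback macro, guarded so it can live in a shared header
 * without clashing with an earlier definition. This is the traditional
 * null-pointer idiom; new code should normally prefer <stddef.h>.
 */
#ifndef MY_OFFSETOF
#define MY_OFFSETOF(TYPE, MEMBER) ((size_t)&((TYPE *)0)->MEMBER)
#endif

/* Illustrative layout only; NOT the real LKCD dump header. */
struct example_dump_header {
    uint64_t dh_magic_number;
    uint32_t dh_version;
    uint32_t dh_header_size;
};

/* Returns 1 if the fallback agrees with the compiler's offsetof() for every member. */
static int offsets_agree(void)
{
    return MY_OFFSETOF(struct example_dump_header, dh_magic_number) ==
               offsetof(struct example_dump_header, dh_magic_number) &&
           MY_OFFSETOF(struct example_dump_header, dh_version) ==
               offsetof(struct example_dump_header, dh_version) &&
           MY_OFFSETOF(struct example_dump_header, dh_header_size) ==
               offsetof(struct example_dump_header, dh_header_size);
}
```

Keeping the macro in one header means every file that reads header
fields by offset uses the same definition.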
My biggest worry came true, so I'm going to have to NAK
this patch in its current state.
We have a major customer who uses an older version
of LKCD (the dh_version in the header shows version 2).
Because of that, I would not have expected your patch
to affect them at all. In any case, it's the *only*
LKCD dumpfile that I test with each new crash release.
They run both x86 and x86_64.
With 4.0-4.7, the backtrace of the x86 panic task shows this:
crash> bt
PID: 12727 TASK: c086c000 CPU: 0 COMMAND: "httpd"
#0 [c086da80] dump_execute at f5728f42
#1 [c086da84] do_dump at f572928d
#2 [c086db2c] die at c010798a
#3 [c086db44] do_invalid_op at c0107c5a
#4 [c086dc00] error_code (via invalid_op) at c010750e
EAX: 0000001d EBX: c0293cd6 ECX: c0330148 EDX: 0011062b EBP: c086dc4c
DS: 0018 ESI: c086dc9c ES: 0018 EDI: c086c000
CS: 0010 EIP: c011db63 ERR: ffffffff EFLAGS: 00010002
#5 [c086dc3c] panic at c011db63
#6 [c086dc50] XXXXXXX_nmi_check at c010811b (company name removed...)
#7 [c086dc64] do_nmi at c0108254
#8 [c086dc90] nmi at c0107595
EAX: 000003dc EBX: 00000000 ECX: 00000064 EDX: c086dcec EBP: c086dd10
DS: 0018 ESI: 000000f0 ES: 0018 EDI: 00000001
CS: 0010 EIP: c0261440 ERR: 000003dc EFLAGS: 00000286
#9 [c086dccc] stext_lock (via prune_icache) at c0261440
#10 [c086dd14] shrink_icache_memory at c015f7dd
#11 [c086dd20] do_try_to_free_pages at c013f402
#12 [c086dd4c] try_to_free_pages at c013f8d2
#13 [c086dd64] _wrapped_alloc_pages at c01406bd
#14 [c086dd88] __alloc_pages at c014079d
#15 [c086dda8] __get_free_pages at c014083e
#16 [c086ddb0] kmem_cache_grow at c013a77b
#17 [c086dde8] kmalloc at c013ad8b
#18 [c086de20] skbmem_grow_bucket at f638cdd5
#19 [c086de3c] skbmemalloc at f638cfa0
#20 [c086de58] alloc_skb at c01f5770
#21 [c086de74] sock_alloc_send_skb at c01f4c15
#22 [c086de90] unix_stream_sendmsg at c02395c3
#23 [c086dee0] sock_sendmsg at c01f23c6
#24 [c086df34] sock_write at c01f25d0
#25 [c086df7c] sys_write at c0148d06
#26 [c086dfc0] system_call at c010740c
EAX: 00000004 EBX: 0000000a ECX: be1fd8fc EDX: 00000004
DS: 002b ESI: 00000004 ES: 002b EDI: be1fd8fc
SS: 002b ESP: be1fd8a4 EBP: be1fd8d4
CS: 0023 EIP: 4024f214 ERR: 00000004 EFLAGS: 00000296
crash>
With your patch applied, it shows this:
crash> bt
PID: 12727 TASK: c086c000 CPU: 0 COMMAND: "httpd"
bt: cannot resolve stack trace:
bt: Task in user space -- no backtrace
crash>
and in fact, "bt -a" shows the same thing for all
active tasks:
crash> bt -a
PID: 12727 TASK: c086c000 CPU: 0 COMMAND: "httpd"
bt: cannot resolve stack trace:
bt: Task in user space -- no backtrace
PID: 0 TASK: cdccc000 CPU: 1 COMMAND: "swapper"
bt: cannot resolve stack trace:
bt: Task in user space -- no backtrace
PID: 9959 TASK: ce01a000 CPU: 2 COMMAND: "httpd"
bt: cannot resolve stack trace:
bt: Task in user space -- no backtrace
PID: 0 TASK: cdcde000 CPU: 3 COMMAND: "swapper"
bt: cannot resolve stack trace:
bt: Task in user space -- no backtrace
PID: 16444 TASK: dc4d8000 CPU: 1 COMMAND: "httpd"
bt: cannot resolve stack trace:
bt: Task in user space -- no backtrace
PID: 5874 TASK: d3920000 CPU: 0 COMMAND: "httpd"
bt: cannot resolve stack trace:
bt: Task in user space -- no backtrace
crash>
The backtraces of the non-active tasks are OK.
Any ideas on what's wrong, and how to address this?
Dave