I've been puzzling over why the regs formatted with
a backtrace on an IA32 dump are invalid. Here's what I mean:
PID: 2692   TASK: f4656630  CPU: 0   COMMAND: "rmmod"
 #0 [f463ce54] crash_kexec at c044a1f7
 #1 [f463ce9c] die at c040651a
 #2 [f463ced4] do_page_fault at c0603107
 #3 [f463cf14] error_code (via page_fault) at c060190a
    EAX: 00000018  EBX: f8b43400  ECX: f8b4304f  EDX: 00200000
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000000
    SS:  304f      ESP: f8b4302b  EBP: f463c000
    CS:  0060      EIP: f8b43004  ERR: ffffffff  EFLAGS: 00210286
They are supposed to represent a valid set of regs
that are presented to do_page_fault, which I presume are meant to be valid
at the time the exception occurred.
Of course they can never be a valid set of regs, for the simple
reason that the CPL is 0 (CS=60) and the RPL of SS is 3, which is an
automatic GPF.
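The mismatch can be checked mechanically. Here is a minimal sketch in C,
assuming only the architectural rule that the low two bits of a selector
hold its privilege level. The helper names are hypothetical and are not
taken from crash or the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Low two bits of a segment selector: the RPL (for CS, effectively the CPL). */
static unsigned selector_rpl(uint16_t sel)
{
    return sel & 3u;
}

/*
 * Hypothetical helper: a CS/SS pair can only be live together if both
 * selectors carry the same privilege level.
 */
static bool cs_ss_consistent(uint16_t cs, uint16_t ss)
{
    return selector_rpl(cs) == selector_rpl(ss);
}

/* Values taken from the register dump above. */
static void check_dump_selectors(void)
{
    assert(selector_rpl(0x0060) == 0);           /* CS: ring 0 */
    assert(selector_rpl(0x304f) == 3);           /* SS: ring 3 */
    assert(!cs_ss_consistent(0x0060, 0x304f));   /* cannot both be real */
}
```

A check along these lines would let bt flag a frame whose SS cannot
belong with its CS.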
Since I manufactured the exception that caused this
dump, by causing an unrecoverable page fault in ring 0, I know that CS
is correct but SS is bogus.
Furthermore, the error code (ERR), which is stored
by the processor as part of the exception stack frame, uses only bits 0-2
for page faults and at most bits 0-15 for other exceptions; the unused
bit positions are zero. So ERR is also bogus.
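The same argument can be put as a plausibility test. This is only a
sketch: the masks encode the bit ranges quoted in the paragraph above
rather than an exhaustive architectural list, and the names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit ranges as stated in the text above (assumption, not a full spec). */
#define PF_ERR_MASK   0x00000007u   /* page fault: bits 0-2 */
#define EXC_ERR_MASK  0x0000ffffu   /* other exceptions: at most bits 0-15 */

/*
 * Hypothetical helper: true if err has no bits set outside the range
 * the processor could have written for this kind of exception.
 */
static bool err_code_plausible(uint32_t err, bool is_page_fault)
{
    uint32_t mask = is_page_fault ? PF_ERR_MASK : EXC_ERR_MASK;

    return (err & ~mask) == 0;
}

/* ERR from the dump above fails under either interpretation. */
static void check_dump_err(void)
{
    assert(!err_code_plausible(0xffffffffu, true));
    assert(!err_code_plausible(0xffffffffu, false));
}
```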
On looking at the code in entry.S at page_fault and
the other exception entry points I see no attempt to save regs to create
a pt_regs struct. The fact that do_page_fault takes pt_regs as the first
arg is a hack to get at CS:EIP and SS:ESP at the time of exception. Furthermore
error_code loads the exception error code into edx then wipes it out from
the stack by storing -1 into this location. I can't actually see a good
reason for wiping out the error code. By convention exceptions and interrupts
have a -ve integer stored at the error-code location to distinguish them
from system calls, but I don't think this is used. signal.c seems to be
the only place that looks for an error code >=0, but I don't see how an
exception affects signal.c.
Can anyone confirm whether setting the error code
to -1 is essential? If it isn't, then I think we should consider leaving
it in place.
The long and short of it is: the only things that have
any meaning are CS, EIP and EFLAGS, all of which are saved by the processor.
SS and ESP are only saved when the exception occurred at a privilege
level >0, but those exceptions can never generate a panic.
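The rule in the paragraph above could be written down roughly as
follows. This is a sketch only: the flag names and the function are
hypothetical, not existing crash code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flags naming the registers a backtrace could print. */
enum frame_reg {
    REG_EIP    = 1 << 0,
    REG_CS     = 1 << 1,
    REG_EFLAGS = 1 << 2,
    REG_ESP    = 1 << 3,
    REG_SS     = 1 << 4
};

/*
 * Return the set of registers the processor itself pushed for an
 * exception taken while running with the given CS selector.
 */
static unsigned hw_saved_regs(uint16_t cs)
{
    unsigned regs = REG_EIP | REG_CS | REG_EFLAGS;

    if ((cs & 3u) != 0)              /* CPL > 0: SS:ESP are pushed as well */
        regs |= REG_SS | REG_ESP;
    return regs;
}

/* CS=0060 in the dump above is ring 0, so SS and ESP are not in the frame. */
static void check_dump_frame(void)
{
    assert(hw_saved_regs(0x0060) == (REG_EIP | REG_CS | REG_EFLAGS));
}
```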
I'd recommend that we change the bt output to format
only the three valid regs (possibly SS and ESP, if CPL at time of exception
>0). Is there any reason why this shouldn't be changed?
Richard