Re: [Crash-utility] invalid regs display in bt

Thursday, 27 September 2007

Richard, the SS is "bogus" because it is NOT saved by the processor unless there
is a privilege level change with the exception, and in this case there was no privilege
change. I think you'll find SS is valid when a fault occurs in user land, resulting in
a priv change as we enter the kernel.
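
To make the rule above concrete, here is a minimal Python sketch of which frame slots the processor itself fills in for a protected-mode exception. The function name is hypothetical and is not taken from crash or the kernel; it keys only on the low two bits of the saved CS and ignores virtual-8086 mode.

```python
# Sketch only: hardware_saved_regs is an illustrative name, not a crash API.
# Assumes a protected-mode exception frame and ignores virtual-8086 mode.

def hardware_saved_regs(saved_cs):
    """Return the registers the CPU pushed for one exception frame.

    saved_cs is the CS value found in the frame; its low two bits give
    the privilege level of the interrupted code.
    """
    regs = ["EIP", "CS", "EFLAGS"]
    if saved_cs & 0x3 != 0:
        # The fault came from a less privileged level, so the CPU switched
        # stacks on entry and pushed the old SS:ESP as well.
        regs += ["ESP", "SS"]
    return regs


print(hardware_saved_regs(0x60))  # fault taken in ring 0
print(hardware_saved_regs(0x73))  # any selector with RPL 3
```

For the dump quoted below (CS = 0060) this yields only EIP, CS and EFLAGS, so the SS and ESP slots hold whatever happened to be on the kernel stack.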

And NO.. I don't want to see a different format for priv-level-change vs
non-priv-level-change exceptions, as this would make it harder to post-process
with perl, etc.
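
As an illustration of why a fixed format helps post-processing, here is a small sketch of the kind of script that depends on it. It is written in Python rather than perl, the names are made up for this example, and it assumes it is fed only the register lines of a bt frame (the regex would also match header fields such as "PID: 2692").

```python
import re

# One "NAME: hexvalue" pair, e.g. "EAX: 00000018" or "DS:  007b".
REG_PAIR = re.compile(r"\b([A-Z]{2,6}):\s+([0-9a-f]+)\b")


def parse_bt_regs(text):
    """Collect register name -> integer value from bt register lines."""
    return {name: int(value, 16) for name, value in REG_PAIR.findall(text)}


# Register lines copied from the dump quoted in this thread.
sample = """\
    EAX: 00000018  EBX: f8b43400  ECX: f8b4304f  EDX: 00200000
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000000
    SS:  304f      ESP: f8b4302b  EBP: f463c000
    CS:  0060      EIP: f8b43004  ERR: ffffffff  EFLAGS: 00210286
"""

regs = parse_bt_regs(sample)
print(hex(regs["CS"]), hex(regs["EIP"]))
```

A script like this keeps working as long as every exception frame prints the same set of fields, whether or not all of them are meaningful.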

As for error-code, I don't know why it would be replaced with -1.

 - jim

On Tue, 25 Sep 2007 22:58:44 +0100
Richard J Moore <richardj_moore(a)uk.ibm.com> wrote:

...
 I've been puzzling over why the regs formatted with a backtrace on an IA32
 dump are invalid. Here's what I mean:

 PID: 2692   TASK: f4656630  CPU: 0   COMMAND: "rmmod"
  #0 [f463ce54] crash_kexec at c044a1f7
  #1 [f463ce9c] die at c040651a
  #2 [f463ced4] do_page_fault at c0603107
  #3 [f463cf14] error_code (via page_fault) at c060190a
     EAX: 00000018  EBX: f8b43400  ECX: f8b4304f  EDX: 00200000 
     DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000000
     SS:  304f      ESP: f8b4302b  EBP: f463c000
     CS:  0060      EIP: f8b43004  ERR: ffffffff  EFLAGS: 00210286 

 They are supposed to represent a valid set of regs that are presented to 
 do_page_fault, which I presume are meant to be valid at the time the 
 exception occurred.
 Of course they can never be a set of valid regs, for the simple reason that
 the CPL is 0 (CS=60) and the RPL of SS is 3, which is an automatic GPF.
 Since I manufactured the exception that caused this dump, by causing an
 unrecoverable page fault in ring 0, I know the CS is correct but SS is
 bogus.
 Furthermore the error code (ERR), which is stored by the processor as
 part of the exception stack frame, uses only bits 0-2 for page faults and
 at most bits 0-15 for other exceptions; the unused bit positions are zero.
 So ERR is also bogus.
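
The CPL/RPL mismatch described above can be checked by splitting the two selectors from the dump into their fields. This is a minimal sketch of the standard x86 selector layout; the function name is illustrative only.

```python
def decode_selector(sel):
    """Split a 16-bit x86 segment selector into its three fields."""
    sel &= 0xFFFF
    return {
        "rpl": sel & 0x3,        # bits 0-1: requested privilege level
        "ti": (sel >> 2) & 0x1,  # bit 2: 0 = GDT, 1 = LDT
        "index": sel >> 3,       # bits 3-15: descriptor table index
    }


cs = decode_selector(0x0060)  # CS from the dump
ss = decode_selector(0x304F)  # SS from the dump
print(cs)  # {'rpl': 0, 'ti': 0, 'index': 12}
print(ss)  # {'rpl': 3, 'ti': 1, 'index': 1545}
```

The low two bits of CS say the faulting code ran at privilege level 0, while the displayed SS carries RPL 3 and points into an LDT, which cannot be the stack segment of code running in ring 0.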

 On looking at the code in entry.S at page_fault and the other exception 
 entry points I see no attempt to save regs to create a pt_regs struct. The 
 fact that do_page_fault takes pt_regs as the first arg is a hack to get at 
 CS:EIP and SS:ESP at the time of exception. Furthermore error_code loads 
 the exception error code into edx then wipes it out from the stack by 
 storing -1 into this location. I can't actually see a good reason for 
 wiping out the error code. By convention exceptions and interrupts have a 
 -ve integer stored at the error-code location to distinguish them from 
 system calls, but I don't think this is used. signal.c seems to be the 
 only place to look for an error code >=0, but I don't see how an exception
 affects signal.c.
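
The ERR value in the dump fits this reading. The sketch below (illustrative names, using only the bit ranges stated in this message) shows that ffffffff is -1 as a signed 32-bit value and cannot be an error code pushed by the processor.

```python
def to_signed32(value):
    """Interpret a 32-bit register value as a two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def decode_page_fault_bits(err):
    """Decode bits 0-2 of an x86 page-fault error code."""
    return {
        "protection_violation": bool(err & 0x1),  # bit 0: 0 = page not present
        "write": bool(err & 0x2),                 # bit 1: 0 = read access
        "user_mode": bool(err & 0x4),             # bit 2: 0 = supervisor mode
    }


def could_be_hardware_error_code(err):
    """True if no bit above bit 15 is set, the widest case mentioned above."""
    return (err & 0xFFFF0000) == 0


err = 0xFFFFFFFF  # ERR from the dump
print(to_signed32(err))                   # -1
print(could_be_hardware_error_code(err))  # False
print(decode_page_fault_bits(0x2))        # a kernel-mode write to a non-present page
```

So the displayed ERR is the -1 marker written by error_code, not the value the processor originally pushed.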

 Can anyone confirm whether setting the error code to -1 is essential? If
 it isn't then I think we should consider leaving it in place.

 The long and short of it is: the only things that have any meaning are CS,
 EIP and EFLAGS, all of which are saved by the processor. SS and ESP are
 only saved when the exception occurred at a privilege level >0, but these
 can never generate a panic.

 I'd recommend that we change the bt output to format only the three valid 
 regs (possibly SS and ESP, if CPL at time of exception >0). Is there any 
 reason why this shouldn't be changed?

 Richard

 Unless stated otherwise above:
 IBM United Kingdom Limited - Registered in England and Wales with number 
 741598. 
 Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
