----- "Hu Tao" <hutao(a)cn.fujitsu.com> wrote:
> > On Tue, Oct 19, 2010 at 09:06:33AM -0400, Dave Anderson wrote:
> > >
> > > ----- "Hu Tao" <hutao cn fujitsu com> wrote:
> > >
> > > > Hi Dave,
> > > >
> > > > These are updated patches tested with SMP system and panic task.
> > > >
> > > > When testing an x86 guest, I found another bug in reading CPU
> > > > registers from the dumpfile. The QEMU-emulated system is x86_64
> > > > (qemu-system-x86_64), and the guest OS is x86. When crash reads the
> > > > CPU registers from the dumpfile, it uses cpu_load_32(), which reads
> > > > the gp registers with get_be_long(fp, 32), that is, it treats them
> > > > as 32 bits wide. But in fact, qemu-system-x86_64 saves 64 bits for
> > > > each of them (although the guest OS uses only the lower 32 bits).
> > > > As a result, crash gets wrong CPU gp register values.
> > >
> > > As I understand it, you're running a 32-bit guest on a 64-bit host.
> >
> > Yes.
> >
> > > If you were to read 64-bit register values instead of 32-bit register
> > > values, wouldn't that cause the subsequent get_xxx() calls in
> > > cpu_load() to read from the wrong file offsets? And then
> > > that would leave the ending file offset incorrect, such that the
> > > qemu_load() loop would fail to find the next device?
> > >
> > > In other words, the cpu_load() function, which is used for both
> > > 32-bit and 64-bit guests, must be reading the correct amount of
> > > data from the "cpu" device, or else qemu_load() would fail to
> > > find the next device in the next location in the dumpfile.
> >
> > True. In fact, in my case, if 32-bit registers are read, the following
> > devices are found:
> > block, ram, kvm-tpr-opt, kvmclock, timer, cpu_common, cpu.
> > If 64-bit registers are read, the following devices are found:
> > block, ram, kvm-tpr-opt, kvmclock, timer, cpu_common, cpu, apic, fw_cfg
>
> Right -- so it got "lost" after incorrectly gathering the data for the
> first "cpu" device instance.
>
> > > > Is there any way we can know from the dumpfile whether these gp
> > > > registers (and similar registers) are 32 bits or 64 bits?
> > >
> > > I don't know. If what you say is true, when would those registers
> > > ever be 32-bit values?
> >
> > I did tests on a 64-bit machine. Result is:
> >
> > machine   OS    guest machine            guest OS   saved gp regs
> > -----------------------------------------------------------------
> > 64-bit    x86   qemu-kvm (kvm enabled)   x86        64 bits
> > 64-bit    x86   qemu (kvm disabled)      x86        32 bits
>
> I don't understand what you mean when you say that the guest machine
> is "kvm enabled" or "kvm disabled"?
Sorry for being vague. "kvm enabled" means using qemu-kvm to bring up
guest machine and this enables KVM hardware virtualization on host.
"kvm disabled" means using qemu to bring up guest machine and this
disables KVM hardware virtualization on host.
>
> And if your host machine is running a 32-bit x86 OS (on 64-bit hardware),
> that's something I've never seen given that Red Hat only allows 64-bit
> kernels as KVM hosts.
I did the test on Fedora 13 i686. Just tried rhel6 i386, as you said,
there is no kvm support.

Hello Hu,

Your supposition that the "cpu" device layout depends upon the
host kernel type is correct, but unfortunately there's no readily evident
way to determine what type of kernel the host was running. This is Paolo's
response to the question:

  So the question is:

    Can it be determined from something in the dumpfile header that
    the *host* machine was running a 32-bit kernel?

  It's not an exact science, but you can do some trial-and-error. I
  suggest measuring the distance between the cpu and apic blocks
  (which you can do using code from your "workaround" explained below, I
  guess) and deciding based on the size of the CPU block.

  A 64-bit image I have lying around takes 987 bytes; I'd guess that
  anything above 850 is 64-bit. Maybe you can start searching after the
  first 250 bytes, since the registers are at the beginning and if you're
  going to get a false match you're going to get it there.
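
For illustration only, the size test Paolo describes might be sketched in
C as follows. The function name, the offset parameters, and the use of
850 bytes as a hard cutoff are assumptions based on his guess above; none
of this is existing crash code:

```c
#include <assert.h>

/* Assumption: Paolo's guess that a "cpu" block larger than 850 bytes
 * was written by a 64-bit host (his 64-bit sample was 987 bytes). */
enum { CPU_BLOCK_64BIT_MIN = 850 };

/*
 * Hypothetical helper: given the file offsets at which the "cpu" and
 * "apic" device sections start, guess whether the host was 32- or
 * 64-bit from the distance between them.
 */
static int guess_kvm_host_bits(long cpu_offset, long apic_offset)
{
    long cpu_block_size;

    assert(apic_offset > cpu_offset);
    cpu_block_size = apic_offset - cpu_offset;

    return cpu_block_size > CPU_BLOCK_64BIT_MIN ? 64 : 32;
}
```

As he says, this is trial-and-error rather than an exact test, so the
cutoff would need checking against real 32-bit-host dumpfiles.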

The "workaround" he's referring to is this, which will be in the next
release:

  Re: [Crash-utility] [patch] crash on a KVM-generated dump
  https://www.redhat.com/archives/crash-utility/2010-October/msg00034.html

But it's not a particularly graceful solution in this case, because it
would require walking through all of the "block" and "ram" devices
to find the first "cpu" device -- but at that point the 32-vs-64 bit
device has already been selected. I suppose another alternative would
be to always start reading the "cpu" data in cpu_load() as if it were
created by a 64-bit host, determine somewhere along the way that the
data being read is bogus and that the 32-bit device mechanism should be
used instead, seek back, and call the other function?

I don't know -- either option would be really ugly...
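
To make the second alternative concrete, here is a minimal sketch in C of
"read as 64-bit, notice it's bogus, seek back, read as 32-bit". It is not
crash code: read_be() is a hypothetical stand-in for the get_be_long()
style readers, the count of eight gp registers is an assumption, and the
"bogus" test (a 32-bit guest should never have non-zero upper halves) is
just one possible heuristic:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define NREGS 8   /* assumption: eight gp registers for an x86 guest */

/* Hypothetical big-endian reader; bytes past EOF are read as zero. */
static uint64_t read_be(FILE *fp, int bits)
{
    uint64_t v = 0;
    int i, c;

    for (i = 0; i < bits / 8; i++) {
        c = fgetc(fp);
        v = (v << 8) | (uint64_t)(c == EOF ? 0 : c);
    }
    return v;
}

/*
 * Try the 64-bit layout first. If any register has bits set above
 * bit 31, assume the data is really 32-bit, seek back to where we
 * started, and re-read. Returns the width that was used (64 or 32).
 */
static int load_gp_regs(FILE *fp, uint64_t regs[NREGS])
{
    long start;
    int i;

    assert(fp != NULL);
    start = ftell(fp);

    for (i = 0; i < NREGS; i++) {
        regs[i] = read_be(fp, 64);
        if (regs[i] >> 32)
            break;              /* implausible for a 32-bit guest */
    }
    if (i == NREGS)
        return 64;

    fseek(fp, start, SEEK_SET);
    for (i = 0; i < NREGS; i++)
        regs[i] = read_be(fp, 32);
    return 32;
}
```

The heuristic can be fooled (for example, if every other 32-bit register
happens to be zero), which is part of why this approach is ugly.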

Anyway, given that the use of 32-bit KVM hosts should be fairly rare,
what would you think of handling it this way:

 (1) use the 64-bit functions by default
 (2) add a crash command line option like "--kvmhost 32" to force the
     use of the 32-bit functions

And of course, even if the new option were *not* used on a 32-bit dumpfile,
it would still behave as it does now -- crash still comes up OK -- but it
just wouldn't be able to use the registers from the header.
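
As a rough sketch of what (1) and (2) could look like, the following C
fragment parses a "--kvmhost" option with getopt_long() and defaults to
64. The option does not exist yet, and the function name and return
convention are hypothetical:

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical parser for the proposed "--kvmhost <32|64>" option.
 * Returns 64 by default, 32 or 64 if the option is given, and -1 on
 * an unknown option or an unsupported value.
 */
static int parse_kvmhost(int argc, char **argv)
{
    static const struct option long_opts[] = {
        { "kvmhost", required_argument, NULL, 'k' },
        { NULL, 0, NULL, 0 }
    };
    int c, bits = 64;           /* (1) 64-bit functions by default */

    assert(argv != NULL);
    optind = 1;                 /* restart scanning at argv[1] */

    while ((c = getopt_long(argc, argv, "", long_opts, NULL)) != -1) {
        if (c != 'k')
            return -1;
        if (strcmp(optarg, "32") == 0)
            bits = 32;          /* (2) force the 32-bit functions */
        else if (strcmp(optarg, "64") == 0)
            bits = 64;
        else
            return -1;
    }
    return bits;
}
```

With this, "crash --kvmhost 32 ..." would select the 32-bit functions,
and omitting the option would leave the 64-bit default in place.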

What do you think?

Dave