----- Original Message -----
Hello,
We have a CoreOS VM (46 vCPU, 60GB RAM) freeze issue and are hoping to find out
what is going on in it at the time of the freeze. When the VM froze, we had no
access to it via ssh, and ping worked only intermittently. So we
suspended the VM, which created vmem and vmss files.
Since this is a CoreOS VM, I have used toolbox to install and run crash.
When trying to read these files using the crash utility, I'm getting the
message below:
<read_vmware_vmss: addr: ffffffff81c00100 paddr: 1c00100 cnt: 8>
<readmem: ffffffff81c00100, KVADDR, "read_string characters", 1499, (ROE|Q), 7ffcf595cd70>
<read_vmware_vmss: addr: ffffffff81c00100 paddr: 1c00100 cnt: 1499>
linux_banner:
-ش????kB??C???Ã͞}&k?Xb?8/?ν?fF??&v;?Š???? ??
It would have been helpful to see the full crash -d# log, but I'm presuming
that the utsname data and the cpus_[possible/present/online/active]_mask output
that gets displayed just before the linux_banner output are also nonsensical?
Typically this kind of problem occurs because phys_base cannot be determined
or, if KASLR is enabled, because the KASLR offset cannot be determined. Those two
items are encoded into the dumpfile header for kdump dumpfiles, but there
is no such information in a vmss dumpfile header.
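
To illustrate why those two values matter, here is a minimal Python sketch of
the x86_64 kernel-text translation. It is not crash's actual code; the function
name is hypothetical, and the only constant assumed is the standard x86_64
__START_KERNEL_map base. With both values left at 0 it reproduces the
"addr: ffffffff81c00100 paddr: 1c00100" pair from the debug output above, which
would read the wrong physical memory if the real phys_base or KASLR offset is
non-zero.

```python
# Sketch only: how a kernel-text virtual address maps to a physical address
# on x86_64. The function name is hypothetical, not part of crash.
START_KERNEL_MAP = 0xffffffff80000000  # x86_64 __START_KERNEL_map


def kernel_text_vtop(vmlinux_vaddr, phys_base=0, kaslr_offset=0):
    """Translate a symbol address taken from vmlinux to a physical address.

    kaslr_offset shifts the link-time address to the runtime virtual address;
    phys_base is the physical load offset of the kernel.
    """
    runtime_vaddr = vmlinux_vaddr + kaslr_offset
    return runtime_vaddr - START_KERNEL_MAP + phys_base


linux_banner = 0xffffffff81c00100

# Both values assumed to be 0: matches the paddr shown in the debug output.
print(hex(kernel_text_vtop(linux_banner)))

# Same symbol with a non-zero phys_base: a different physical address.
print(hex(kernel_text_vtop(linux_banner, phys_base=0x129800000)))
```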
Can you run crash live on the machine? You can see whether the phys_base
and KASLR offset are non-zero on the live system by entering:
crash> help -m | grep phys_base
phys_base: 129800000
crash> help -k | grep relocate
relocate: ffffffffcf400000 (KASLR offset: 30c00000 / 780MB)
crash>
If relocate is 0 (KASLR not enabled), then the phys_base value can
be applied to your dumpfile by entering, for example, "--machdep phys_base=780m"
on the crash command line (substituting your own phys_base value). If KASLR is enabled,
then I'm not sure how you can figure it out with a raw dumpfile.
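
As a rough sketch of that decision, the following Python helper takes the two
output lines shown above and builds the command-line option. The helper names
are hypothetical, and it assumes that both values are printed in hexadecimal
without a 0x prefix (as in the sample output) and that the "m" suffix means
megabytes, as in the "780m" example.

```python
import re

MB = 1 << 20


def parse_hex_field(line, name):
    """Return the hex value following 'name:' in a crash output line."""
    match = re.search(rf"\b{re.escape(name)}:\s*([0-9a-fA-F]+)", line)
    if match is None:
        raise ValueError(f"no {name} value found in: {line!r}")
    return int(match.group(1), 16)


def machdep_arg(phys_base_line, relocate_line):
    """Build a --machdep option, or return None if KASLR is enabled."""
    phys_base = parse_hex_field(phys_base_line, "phys_base")
    relocate = parse_hex_field(relocate_line, "relocate")
    if relocate != 0:
        return None  # KASLR enabled: phys_base alone is not sufficient
    if phys_base % MB != 0:
        raise ValueError("phys_base is not megabyte-aligned")
    return f"--machdep phys_base={phys_base // MB}m"


# KASLR disabled on the live system: the option can be built.
print(machdep_arg("phys_base: 129800000", "relocate: 0"))

# KASLR enabled, as in the sample output above: no option is produced.
print(machdep_arg("phys_base: 129800000",
                  "relocate: ffffffffcf400000 (KASLR offset: 30c00000 / 780MB)"))
```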
The VMware developer of the vmware_vmss.c and vmware_vmss.h files
(hfu(a)vmware.com) does not seem to be on this list. I've added him to the
cc list in case he has run into this issue.
Dave