Dave,
patch updated. please review, thanks.
* converted all the FATAL to INFO.
* the reason to "break" rather than "return FALSE" is that we might
still
be able to read the "memory" group which is the only relevant checkpoint
group we care about.
Hi Dyno,
This patch looks OK, except for one minor thing.
I see that you removed the two CRASHDEBUG(1) messages in is_vmware_vmss()
as well. In the case of the failure to fopen() or fread() the passed-in
filename, it is OK to call error(INFO, ...), but they are essentially both
no-ops. Given that main() will reject any non-readable file, and several
earlier dumpfile-checking functions will call error(FATAL, ...) if the
dumpfile cannot be opened, those messages will never be displayed.
However, in this case:
@@ -47,8 +53,7 @@ is_vmware_vmss(char *filename)
hdr.id != CPTDUMP_PARTIAL_MAGIC_NUMBER &&
hdr.id != CPTDUMP_RESTORED_MAGIC_NUMBER &&
hdr.id != CPTDUMP_NORESTORE_MAGIC_NUMBER) {
- if (CRASHDEBUG(1))
- error(INFO, LOGPRX"Unrecognized .vmss file (magic
%x).\n", hdr.id);
+ error(INFO, LOGPRX"Unrecognized .vmss file (magic %x).\n",
hdr.id);
return FALSE;
}
The CRASHDEBUG(1) absolutely should be there. The is_vmware_vmss() function
is simply there to verify whether the passed-in filename is a .vmss file.
If it is not -- based upon the header contents -- then it should quietly return
FALSE. There will be other file formats in the future, and most likely will be placed
following the is_vmware_vmss() call in main(), and we won't want that irrelevant
message. So I've restored that particular CRASHDEBUG(1) qualifier.
Aside from that, the patch is queued for crash-7.1.1:
rgds,
Dyno
On 3/30/15 12:14 PM, Dave Anderson wrote:
>
> ----- Original Message -----
>> Dave,
>> I've removed the assert now. please find the updated patch in
>> attachment.
>> thanks.
>> rgds,
>> Dyno
> Hi Dyno,
>
> I should have combined this actual code review with my last
> response re: the assert() calls, but there are just a few minor
> things that still need adjustment.
>
> As I mentioned in an earlier response, during session initialization,
> which is before RUNTIME gets set in pc->flags in main_loop(), any
> error(FATAL, ...) call will print the error message and the crash
> session immediately exits. And that's OK if that's what you want
> to happen.
>
> But in vmware_vmss_init() you changed many of the previous error messages
> that used fprintf(vmss.ofp, ...) to use a construct using FATAL error
> messages like this:
>
> + error(FATAL, LOGPRX"%s: %s\n", filename,
strerror(errno));
> + result = FALSE;
> + goto exit;
>
> But that doesn't make sense since the error message will get printed and
> crash will exit immediately, and therefore will not return back here in
> memory_sources_init():
>
> + } else if (pc->flags & VMWARE_VMSS) {
> + if (!vmware_vmss_init(pc->dumpfile, fp))
> + error(FATAL, "%s: initialization
failed\n",
> + pc->dumpfile);
> + }
>
> So preferably the calls should be changed to error(INFO, ...) so that the
> user
> will get the specific message followed by the general "initialization
> failed"
> message.
>
> Also, it would make sense to add an error(INFO, ...) message here as you
> have done for the other error scenarios:
>
> + if (fread(&hdr, sizeof(cptdumpheader), 1, fp) != 1) {
> + result = FALSE;
> + goto exit;
> + }
>
> If that initial fread() of the header fails, then the session will exit
> with
> just the single somewhat-confusing "initialization failed" message.
>
> I also have a general question about vmware_vmss_init() which is not
> specific to this patch. In the inner-most "for(;;)" loop, it appears
> that it only legitimately breaks if it runs into a TAG_NULL:
>
> if (tag == NULL_TAG)
> break;
>
> But there are a number of fread() failures in the loop that do an
> fprintf(ofp, ...) and simply "break". If any of those failures occur,
> it looks like vmware_vmss_init() will return success. But should it?
>
> Anyway, if you can clean up the error message handling, we can get this
> patch
> checked in.
>
> Thanks,
> Dave
>
>
>> On 3/27/15 11:05 AM, Dave Anderson wrote:
>>> Dyno,
>>>
>>> I've got the sample multi-region dumpfile -- thanks for that.
>>>
>>> This latest patch tests OK on all three dumpfiles. But can you please
>>> remove the remaining assert() calls as I requested in my previous email?
>>>
>>> Thanks,
>>> Dave
>>>
>>>
>>> ----- Original Message -----
>>>> Dave,
>>>>
>>>> An assert i put caused the crash, workstation VM memory has more
block
>>>> item
>>>> than i thought.
>>>> fixed it in the patch and tested against all the dump i have. i will
try
>>>> to
>>>> provide a 4GB memory dump later.
>>>>
>>>> ESX VM 3G Memory
>>>> ================
>>>> - Group: memory pos=0x1f6f6 size=0xc000090c
>>>> ------------------------------------
>>>> align_mask[0, 0] => 0x00ffff
>>>> regionsCount => 0x000000
>>>> Memory[0, 0] => BLOCK, pos=0x20000,
>>>> size=0xc0000000
>>>>
>>>> ESX VM 4G Memory
>>>> ================
>>>> - Group: memory pos=0x1f6f6 size=0x10000090c
>>>> -----------------------------------
>>>> align_mask[0, 0] => 0x00ffff
>>>> regionsCount => 0x000002
>>>> regionPageNum[0] => 0x000000
>>>> regionPPN[0] => 0x000000
>>>> regionSize[0] => 0x0c0000
>>>> regionPageNum[1] => 0x0c0000
>>>> regionPPN[1] => 0x100000
>>>> regionSize[1] => 0x040000
>>>> Memory[0, 0] => BLOCK, pos=0x20000,
>>>> size=0x100000000
>>>>
>>>> WS VM Memory
>>>> ============
>>>> - Group: memory pos=0x93b1 size=0x10098
>>>> ----------------------------------------
>>>> align_mask[0, 0] => 0x00ffff
>>>> regionsCount => 0x000000
>>>> hotSetSize => 0x040000
>>>> hotSet => BLOCK, pos=0x9405,
>>>> size=0x8000
>>>> MainMemPageZeroStateSize => 0x040000
>>>> MainMemKnownZero => BLOCK, pos=0x11447,
>>>> size=0x8000
>>>>
>>>> rgds,
>>>> Dyno
>>>>
>>>> On 3/26/15 1:12 PM, Dave Anderson wrote:
>>>>> ----- Original Message -----
>>>>>> Dave,
>>>>>> updated the patch and please review. thanks.
>>>>>> - the page_size/page_shift problem.
>>>>>> - change type cast to union.
>>>>>> - the read_vmware_vmss() regression.
>>>>>>
>>>>>> rgds,
>>>>>> Dyno
>>>>> Dyno,
>>>>>
>>>>> Since you cannot make any additional sample vmss2core-generated
>>>>> dumpfiles available to me, I can only test this latest patch
>>>>> on the two dumpfile/kernel pairs that you gave me in February:
>>>>>
>>>>> vmlinux-2.6.32-431.el6 CentOS6.5-11bd56db.vmss
>>>>>
>>>>> and
>>>>>
>>>>> vmlinux-3.13.0-39-generic Ubuntu1404_64bit-65993542.vmss
>>>>> (with its companion Ubuntu1404_64bit-65993542.vmem file)
>>>>>
>>>>> The CentOS kernel works OK with your patch:
>>>>>
>>>>> $ crash vmlinux-2.6.32-431.el6 CentOS6.5-11bd56db.vmss
>>>>>
>>>>> crash 7.1.1rc13
>>>>> Copyright (C) 2002-2014 Red Hat, Inc.
>>>>> Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
>>>>> Copyright (C) 1999-2006 Hewlett-Packard Co
>>>>> Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
>>>>> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
>>>>> Copyright (C) 2005, 2011 NEC Corporation
>>>>> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
>>>>> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux,
Inc.
>>>>> This program is free software, covered by the GNU General Public
>>>>> License,
>>>>> and you are welcome to change it and/or distribute copies of it
under
>>>>> certain conditions. Enter "help copying" to see the
conditions.
>>>>> This program has absolutely no warranty. Enter "help
warranty" for
>>>>> details.
>>>>>
>>>>> GNU gdb (GDB) 7.6
>>>>> Copyright (C) 2013 Free Software Foundation, Inc.
>>>>> License GPLv3+: GNU GPL version 3 or later
>>>>>
<
https://urldefense.proofpoint.com/v2/url?u=http-3A__gnu.org_licenses_gpl....
>>>>> >
>>>>> This is free software: you are free to change and redistribute
it.
>>>>> There is NO WARRANTY, to the extent permitted by law. Type
"show
>>>>> copying"
>>>>> and "show warranty" for details.
>>>>> This GDB was configured as
"x86_64-unknown-linux-gnu"...
>>>>>
>>>>> KERNEL: vmlinux-2.6.32-431.el6
>>>>> DUMPFILE: CentOS6.5-11bd56db.vmss
>>>>> CPUS: 4
>>>>> DATE: Tue Feb 3 18:22:03 2015
>>>>> UPTIME: 00:01:06
>>>>> LOAD AVERAGE: 0.71, 0.22, 0.07
>>>>> TASKS: 302
>>>>> NODENAME:
promd-1s-dhcp37.eng.vmware.com
>>>>> RELEASE: 2.6.32-431.el6.x86_64
>>>>> VERSION: #1 SMP Fri Nov 22 03:15:09 UTC 2013
>>>>> MACHINE: x86_64 (2394 Mhz)
>>>>> MEMORY: 511.5 MB
>>>>> PANIC: ""
>>>>> PID: 0
>>>>> COMMAND: "swapper"
>>>>> TASK: ffffffff81a8d020 (1 of 4) [THREAD_INFO:
>>>>> ffffffff81a00000]
>>>>> CPU: 0
>>>>> STATE: TASK_RUNNING (ACTIVE)
>>>>> WARNING: panic task not found
>>>>>
>>>>> crash>
>>>>>
>>>>> But the Ubuntu1404_64bit-65993542.vmss fails miserably:
>>>>>
>>>>> $ crash Ubuntu1404_64bit-65993542.vmss vmlinux-3.13.0-39-generic
>>>>>
>>>>> crash 7.1.1rc13
>>>>> Copyright (C) 2002-2014 Red Hat, Inc.
>>>>> Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
>>>>> Copyright (C) 1999-2006 Hewlett-Packard Co
>>>>> Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
>>>>> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
>>>>> Copyright (C) 2005, 2011 NEC Corporation
>>>>> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
>>>>> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux,
Inc.
>>>>> This program is free software, covered by the GNU General Public
>>>>> License,
>>>>> and you are welcome to change it and/or distribute copies of it
under
>>>>> certain conditions. Enter "help copying" to see the
conditions.
>>>>> This program has absolutely no warranty. Enter "help
warranty" for
>>>>> details.
>>>>>
>>>>> crash: vmware_vmss.c:169: vmware_vmss_init: Assertion
`__extension__
>>>>> ({
>>>>> size_t __s1_len, __s2_len; (__builtin_constant_p (name)
&&
>>>>> __builtin_constant_p ("Memory") && (__s1_len =
strlen (name),
>>>>> __s2_len
>>>>> =
>>>>> strlen ("Memory"), (!((size_t)(const void *)((name) + 1)
-
>>>>> (size_t)(const void *)(name) == 1) || __s1_len >= 4)
&&
>>>>> (!((size_t)(const void *)(("Memory") + 1) -
(size_t)(const void
>>>>> *)("Memory") == 1) || __s2_len >= 4)) ?
__builtin_strcmp (name,
>>>>> "Memory") : (__builtin_constant_p (name) &&
((size_t)(const void
>>>>> *)((name) + 1) - (size_t)(const void *)(name) == 1) &&
(__s1_len =
>>>>> strlen (name), __s1_len < 4) ? (__builtin_constant_p
("Memory") &&
>>>>> ((size_t)(const void *)(("Memory") + 1) - (size_t)(const
void
>>>>> *)("Memory") == 1) ? __builtin_strcmp (name,
"Memory") :
>>>>> (__extension__
>>>>> ({ __const unsigned char *__s2 = (__const unsigned char *)
(__const
>>>>> char
>>>>> *) ("Memory"); register int __result = (((__const
unsigned char *)
>>>>> (__const char *) (name))[0] - __s2[0]); if (__s1_len > 0
&& __result
>>>>> ==
>>>>> 0)
>>>> { __result = (((__const unsigned char *) (__const char *) (name))[1] -
>>>> __s2[1]); if (__s1_len > 1 && __result == 0) { __result =
(((__const
>>>> unsigned char *) (__const char *) (name))[2] - __s2[2]); if (__s1_len
>
>>>> 2
>>>> &&
>>>> __result == 0) __result = (((__const unsigned char *) (__const char *)
>>>> (name))[3] - __s2[3]); } } __result; }))) : (__builtin_constant_p
>>>> ("Memory")
>>>> && ((size_t)(const void *)(("Memory") + 1) -
(size_t)(const void
>>>> *)("Memory") == 1) && (__s2_len = strlen
("Memory"), __s2_len < 4) ?
>>>> (__builtin_constant_p (name) && ((size_t)(const void *)((name) +
1) -
>>>> (size_t)(const void *)(name) == 1) ? __builtin_strcmp (name,
"Memory") :
>>>> (__extension__ ({ __const unsigned char *__s1 = (__const unsigned char
>>>> *)
>>>> (__const char *) (name); register int __result = __s1[0] - ((__const
>>>> unsigned char *) (__const char *) ("Memory"))[0]; if (__s2_len
> 0 &&
>>>> __result == 0) { __result = (__s1[1] - ((__const unsigned char *)
>>>> (__const
>>>> char *) ("Memory"))[1]); if (__s2_len > 1 &&
__result == 0) { __resul
>>>> t = (__s1[2] - ((__const unsigned char *) (__const char *)
>>>> ("Memory"))[2]);
>>>> if (__s2_len > 2 && __result == 0) __result = (__s1[3] -
((__const
>>>> unsigned
>>>> char *) (__const char *) ("Memory"))[3]); } } __result; })))
:
>>>> __builtin_strcmp (name, "Memory")))); }) == 0' failed.
>>>>> Aborted (core dumped)
>>>>> $
>>>>>
>>>>> Note that crash-7.1.0 works OK:
>>>>>
>>>>> $ /usr/bin/crash Ubuntu1404_64bit-65993542.vmss
>>>>> vmlinux-3.13.0-39-generic
>>>>>
>>>>> crash 7.1.0
>>>>> Copyright (C) 2002-2014 Red Hat, Inc.
>>>>> Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
>>>>> Copyright (C) 1999-2006 Hewlett-Packard Co
>>>>> Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
>>>>> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
>>>>> Copyright (C) 2005, 2011 NEC Corporation
>>>>> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
>>>>> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux,
Inc.
>>>>> This program is free software, covered by the GNU General Public
>>>>> License,
>>>>> and you are welcome to change it and/or distribute copies of it
under
>>>>> certain conditions. Enter "help copying" to see the
conditions.
>>>>> This program has absolutely no warranty. Enter "help
warranty" for
>>>>> details.
>>>>>
>>>>> vmw: Memory dump is not part of this vmss file.
>>>>> vmw: Try to locate the companion vmem file ...
>>>>> vmw: vmem file: Ubuntu1404_64bit-65993542.vmem
>>>>>
>>>>> GNU gdb (GDB) 7.6
>>>>> Copyright (C) 2013 Free Software Foundation, Inc.
>>>>> License GPLv3+: GNU GPL version 3 or later
>>>>>
<
https://urldefense.proofpoint.com/v2/url?u=http-3A__gnu.org_licenses_gpl....
>>>>> >
>>>>> This is free software: you are free to change and redistribute
it.
>>>>> There is NO WARRANTY, to the extent permitted by law. Type
"show
>>>>> copying"
>>>>> and "show warranty" for details.
>>>>> This GDB was configured as
"x86_64-unknown-linux-gnu"...
>>>>>
>>>>> KERNEL: vmlinux-3.13.0-39-generic
>>>>> DUMPFILE: Ubuntu1404_64bit-65993542.vmss
>>>>> CPUS: 1
>>>>> DATE: Thu Nov 13 14:10:53 2014
>>>>> UPTIME: 2 days, 03:40:33
>>>>> LOAD AVERAGE: 0.00, 0.01, 0.05
>>>>> TASKS: 669
>>>>> NODENAME: ubuntu
>>>>> RELEASE: 3.13.0-39-generic
>>>>> VERSION: #66-Ubuntu SMP Tue Oct 28 13:30:27 UTC 2014
>>>>> MACHINE: x86_64 (2693 Mhz)
>>>>> MEMORY: 1 GB
>>>>> PANIC: ""
>>>>> PID: 0
>>>>> COMMAND: "swapper/0"
>>>>> TASK: ffffffff81c15480 [THREAD_INFO: ffffffff81c00000]
>>>>> CPU: 0
>>>>> STATE: TASK_RUNNING
>>>>> WARNING: panic task not found
>>>>>
>>>>> crash>
>>>>>
>>>>> Anyway, besides fixing whatever the problem is, please remove the
>>>>> assert()
>>>>> calls entirely. They did not exist in the original vmware_vmss.c
file,
>>>>> and shouldn't be added now.
>>>>>
>>>>> assert() is not used by the top-level crash sources (except for
qemu.c
>>>>> and qemu-load.c, which came from a 3rd party, and I didn't feel
like
>>>>> changing
>>>>> all of them). Instead, please use the crash convention by testing
for
>>>>> the
>>>>> anomoly, and then send an appropriate error message to error(FATAL,
>>>>> ...),
>>>>> which will kill the crash session if it's during session
>>>>> initialization,
>>>>> or kill the current command if it's during crash runtime.
>>>>>
>>>>> Thanks,
>>>>> Dave
>>>>>
>>