Dave,
patch updated. please review, thanks.
* converted all the FATAL to INFO.
* the reason to "break" rather than "return FALSE" is that we might
still
be able to read the "memory" group which is the only relevant checkpoint
group we care about.
rgds,
Dyno
On 3/30/15 12:14 PM, Dave Anderson wrote:
----- Original Message -----
> Dave,
> I've removed the assert now. please find the updated patch in attachment.
> thanks.
> rgds,
> Dyno
Hi Dyno,
I should have combined this actual code review with my last
response re: the assert() calls, but there are just a few minor
things that still need adjustment.
As I mentioned in an earlier response, during session initialization,
which is before RUNTIME gets set in pc->flags in main_loop(), any
error(FATAL, ...) call will print the error message and the crash
session immediately exits. And that's OK if that's what you want
to happen.
But in vmware_vmss_init() you changed many of the previous error messages
that used fprintf(vmss.ofp, ...) to use a construct using FATAL error
messages like this:
+ error(FATAL, LOGPRX"%s: %s\n", filename, strerror(errno));
+ result = FALSE;
+ goto exit;
But that doesn't make sense since the error message will get printed and
crash will exit immediately, and therefore will not return back here in
memory_sources_init():
+ } else if (pc->flags & VMWARE_VMSS) {
+ if (!vmware_vmss_init(pc->dumpfile, fp))
+ error(FATAL, "%s: initialization failed\n",
+ pc->dumpfile);
+ }
So preferably the calls should be changed to error(INFO, ...) so that the user
will get the specific message followed by the general "initialization failed"
message.
Also, it would make sense to add an error(INFO, ...) message here as you
have done for the other error scenarios:
+ if (fread(&hdr, sizeof(cptdumpheader), 1, fp) != 1) {
+ result = FALSE;
+ goto exit;
+ }
If that initial fread() of the header fails, then the session will exit with
just the single somewhat-confusing "initialization failed" message.
I also have a general question about vmware_vmss_init() which is not
specific to this patch. In the inner-most "for(;;)" loop, it appears
that it only legitimately breaks if it runs into a TAG_NULL:
if (tag == NULL_TAG)
break;
But there are a number of fread() failures in the loop that do an
fprintf(ofp, ...) and simply "break". If any of those failures occur,
it looks like vmware_vmss_init() will return success. But should it?
Anyway, if you can clean up the error message handling, we can get this patch
checked in.
Thanks,
Dave
> On 3/27/15 11:05 AM, Dave Anderson wrote:
>> Dyno,
>>
>> I've got the sample multi-region dumpfile -- thanks for that.
>>
>> This latest patch tests OK on all three dumpfiles. But can you please
>> remove the remaining assert() calls as I requested in my previous email?
>>
>> Thanks,
>> Dave
>>
>>
>> ----- Original Message -----
>>> Dave,
>>>
>>> An assert i put caused the crash, workstation VM memory has more block
>>> item
>>> than i thought.
>>> fixed it in the patch and tested against all the dump i have. i will try
>>> to
>>> provide a 4GB memory dump later.
>>>
>>> ESX VM 3G Memory
>>> ================
>>> - Group: memory pos=0x1f6f6 size=0xc000090c
>>> ------------------------------------
>>> align_mask[0, 0] => 0x00ffff
>>> regionsCount => 0x000000
>>> Memory[0, 0] => BLOCK, pos=0x20000,
>>> size=0xc0000000
>>>
>>> ESX VM 4G Memory
>>> ================
>>> - Group: memory pos=0x1f6f6 size=0x10000090c
>>> -----------------------------------
>>> align_mask[0, 0] => 0x00ffff
>>> regionsCount => 0x000002
>>> regionPageNum[0] => 0x000000
>>> regionPPN[0] => 0x000000
>>> regionSize[0] => 0x0c0000
>>> regionPageNum[1] => 0x0c0000
>>> regionPPN[1] => 0x100000
>>> regionSize[1] => 0x040000
>>> Memory[0, 0] => BLOCK, pos=0x20000,
>>> size=0x100000000
>>>
>>> WS VM Memory
>>> ============
>>> - Group: memory pos=0x93b1 size=0x10098
>>> ----------------------------------------
>>> align_mask[0, 0] => 0x00ffff
>>> regionsCount => 0x000000
>>> hotSetSize => 0x040000
>>> hotSet => BLOCK, pos=0x9405,
size=0x8000
>>> MainMemPageZeroStateSize => 0x040000
>>> MainMemKnownZero => BLOCK, pos=0x11447,
>>> size=0x8000
>>>
>>> rgds,
>>> Dyno
>>>
>>> On 3/26/15 1:12 PM, Dave Anderson wrote:
>>>> ----- Original Message -----
>>>>> Dave,
>>>>> updated the patch and please review. thanks.
>>>>> - the page_size/page_shift problem.
>>>>> - change type cast to union.
>>>>> - the read_vmware_vmss() regression.
>>>>>
>>>>> rgds,
>>>>> Dyno
>>>> Dyno,
>>>>
>>>> Since you cannot make any additional sample vmss2core-generated
>>>> dumpfiles available to me, I can only test this latest patch
>>>> on the two dumpfile/kernel pairs that you gave me in February:
>>>>
>>>> vmlinux-2.6.32-431.el6 CentOS6.5-11bd56db.vmss
>>>>
>>>> and
>>>>
>>>> vmlinux-3.13.0-39-generic Ubuntu1404_64bit-65993542.vmss
>>>> (with its companion Ubuntu1404_64bit-65993542.vmem file)
>>>>
>>>> The CentOS kernel works OK with your patch:
>>>>
>>>> $ crash vmlinux-2.6.32-431.el6 CentOS6.5-11bd56db.vmss
>>>>
>>>> crash 7.1.1rc13
>>>> Copyright (C) 2002-2014 Red Hat, Inc.
>>>> Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
>>>> Copyright (C) 1999-2006 Hewlett-Packard Co
>>>> Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
>>>> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
>>>> Copyright (C) 2005, 2011 NEC Corporation
>>>> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
>>>> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
>>>> This program is free software, covered by the GNU General Public
>>>> License,
>>>> and you are welcome to change it and/or distribute copies of it under
>>>> certain conditions. Enter "help copying" to see the
conditions.
>>>> This program has absolutely no warranty. Enter "help
warranty" for
>>>> details.
>>>>
>>>> GNU gdb (GDB) 7.6
>>>> Copyright (C) 2013 Free Software Foundation, Inc.
>>>> License GPLv3+: GNU GPL version 3 or later
>>>>
<
https://urldefense.proofpoint.com/v2/url?u=http-3A__gnu.org_licenses_gpl....
>>>> >
>>>> This is free software: you are free to change and redistribute it.
>>>> There is NO WARRANTY, to the extent permitted by law. Type "show
>>>> copying"
>>>> and "show warranty" for details.
>>>> This GDB was configured as "x86_64-unknown-linux-gnu"...
>>>>
>>>> KERNEL: vmlinux-2.6.32-431.el6
>>>> DUMPFILE: CentOS6.5-11bd56db.vmss
>>>> CPUS: 4
>>>> DATE: Tue Feb 3 18:22:03 2015
>>>> UPTIME: 00:01:06
>>>> LOAD AVERAGE: 0.71, 0.22, 0.07
>>>> TASKS: 302
>>>> NODENAME:
promd-1s-dhcp37.eng.vmware.com
>>>> RELEASE: 2.6.32-431.el6.x86_64
>>>> VERSION: #1 SMP Fri Nov 22 03:15:09 UTC 2013
>>>> MACHINE: x86_64 (2394 Mhz)
>>>> MEMORY: 511.5 MB
>>>> PANIC: ""
>>>> PID: 0
>>>> COMMAND: "swapper"
>>>> TASK: ffffffff81a8d020 (1 of 4) [THREAD_INFO:
>>>> ffffffff81a00000]
>>>> CPU: 0
>>>> STATE: TASK_RUNNING (ACTIVE)
>>>> WARNING: panic task not found
>>>>
>>>> crash>
>>>>
>>>> But the Ubuntu1404_64bit-65993542.vmss fails miserably:
>>>>
>>>> $ crash Ubuntu1404_64bit-65993542.vmss vmlinux-3.13.0-39-generic
>>>>
>>>> crash 7.1.1rc13
>>>> Copyright (C) 2002-2014 Red Hat, Inc.
>>>> Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
>>>> Copyright (C) 1999-2006 Hewlett-Packard Co
>>>> Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
>>>> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
>>>> Copyright (C) 2005, 2011 NEC Corporation
>>>> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
>>>> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
>>>> This program is free software, covered by the GNU General Public
>>>> License,
>>>> and you are welcome to change it and/or distribute copies of it under
>>>> certain conditions. Enter "help copying" to see the
conditions.
>>>> This program has absolutely no warranty. Enter "help
warranty" for
>>>> details.
>>>>
>>>> crash: vmware_vmss.c:169: vmware_vmss_init: Assertion `__extension__
({
>>>> size_t __s1_len, __s2_len; (__builtin_constant_p (name) &&
>>>> __builtin_constant_p ("Memory") && (__s1_len = strlen
(name), __s2_len
>>>> =
>>>> strlen ("Memory"), (!((size_t)(const void *)((name) + 1) -
>>>> (size_t)(const void *)(name) == 1) || __s1_len >= 4) &&
>>>> (!((size_t)(const void *)(("Memory") + 1) - (size_t)(const
void
>>>> *)("Memory") == 1) || __s2_len >= 4)) ? __builtin_strcmp
(name,
>>>> "Memory") : (__builtin_constant_p (name) &&
((size_t)(const void
>>>> *)((name) + 1) - (size_t)(const void *)(name) == 1) &&
(__s1_len =
>>>> strlen (name), __s1_len < 4) ? (__builtin_constant_p
("Memory") &&
>>>> ((size_t)(const void *)(("Memory") + 1) - (size_t)(const
void
>>>> *)("Memory") == 1) ? __builtin_strcmp (name,
"Memory") : (__extension__
>>>> ({ __const unsigned char *__s2 = (__const unsigned char *) (__const
>>>> char
>>>> *) ("Memory"); register int __result = (((__const unsigned
char *)
>>>> (__const char *) (name))[0] - __s2[0]); if (__s1_len > 0 &&
__result ==
>>>> 0)
>>> { __result = (((__const unsigned char *) (__const char *) (name))[1] -
>>> __s2[1]); if (__s1_len > 1 && __result == 0) { __result =
(((__const
>>> unsigned char *) (__const char *) (name))[2] - __s2[2]); if (__s1_len > 2
>>> &&
>>> __result == 0) __result = (((__const unsigned char *) (__const char *)
>>> (name))[3] - __s2[3]); } } __result; }))) : (__builtin_constant_p
>>> ("Memory")
>>> && ((size_t)(const void *)(("Memory") + 1) - (size_t)(const
void
>>> *)("Memory") == 1) && (__s2_len = strlen
("Memory"), __s2_len < 4) ?
>>> (__builtin_constant_p (name) && ((size_t)(const void *)((name) + 1)
-
>>> (size_t)(const void *)(name) == 1) ? __builtin_strcmp (name,
"Memory") :
>>> (__extension__ ({ __const unsigned char *__s1 = (__const unsigned char *)
>>> (__const char *) (name); register int __result = __s1[0] - ((__const
>>> unsigned char *) (__const char *) ("Memory"))[0]; if (__s2_len >
0 &&
>>> __result == 0) { __result = (__s1[1] - ((__const unsigned char *) (__const
>>> char *) ("Memory"))[1]); if (__s2_len > 1 && __result ==
0) { __resul
>>> t = (__s1[2] - ((__const unsigned char *) (__const char *)
>>> ("Memory"))[2]);
>>> if (__s2_len > 2 && __result == 0) __result = (__s1[3] -
((__const
>>> unsigned
>>> char *) (__const char *) ("Memory"))[3]); } } __result; }))) :
>>> __builtin_strcmp (name, "Memory")))); }) == 0' failed.
>>>> Aborted (core dumped)
>>>> $
>>>>
>>>> Note that crash-7.1.0 works OK:
>>>>
>>>> $ /usr/bin/crash Ubuntu1404_64bit-65993542.vmss
>>>> vmlinux-3.13.0-39-generic
>>>>
>>>> crash 7.1.0
>>>> Copyright (C) 2002-2014 Red Hat, Inc.
>>>> Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
>>>> Copyright (C) 1999-2006 Hewlett-Packard Co
>>>> Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
>>>> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
>>>> Copyright (C) 2005, 2011 NEC Corporation
>>>> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
>>>> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
>>>> This program is free software, covered by the GNU General Public
>>>> License,
>>>> and you are welcome to change it and/or distribute copies of it under
>>>> certain conditions. Enter "help copying" to see the
conditions.
>>>> This program has absolutely no warranty. Enter "help
warranty" for
>>>> details.
>>>>
>>>> vmw: Memory dump is not part of this vmss file.
>>>> vmw: Try to locate the companion vmem file ...
>>>> vmw: vmem file: Ubuntu1404_64bit-65993542.vmem
>>>>
>>>> GNU gdb (GDB) 7.6
>>>> Copyright (C) 2013 Free Software Foundation, Inc.
>>>> License GPLv3+: GNU GPL version 3 or later
>>>>
<
https://urldefense.proofpoint.com/v2/url?u=http-3A__gnu.org_licenses_gpl....
>>>> >
>>>> This is free software: you are free to change and redistribute it.
>>>> There is NO WARRANTY, to the extent permitted by law. Type "show
>>>> copying"
>>>> and "show warranty" for details.
>>>> This GDB was configured as "x86_64-unknown-linux-gnu"...
>>>>
>>>> KERNEL: vmlinux-3.13.0-39-generic
>>>> DUMPFILE: Ubuntu1404_64bit-65993542.vmss
>>>> CPUS: 1
>>>> DATE: Thu Nov 13 14:10:53 2014
>>>> UPTIME: 2 days, 03:40:33
>>>> LOAD AVERAGE: 0.00, 0.01, 0.05
>>>> TASKS: 669
>>>> NODENAME: ubuntu
>>>> RELEASE: 3.13.0-39-generic
>>>> VERSION: #66-Ubuntu SMP Tue Oct 28 13:30:27 UTC 2014
>>>> MACHINE: x86_64 (2693 Mhz)
>>>> MEMORY: 1 GB
>>>> PANIC: ""
>>>> PID: 0
>>>> COMMAND: "swapper/0"
>>>> TASK: ffffffff81c15480 [THREAD_INFO: ffffffff81c00000]
>>>> CPU: 0
>>>> STATE: TASK_RUNNING
>>>> WARNING: panic task not found
>>>>
>>>> crash>
>>>>
>>>> Anyway, besides fixing whatever the problem is, please remove the
>>>> assert()
>>>> calls entirely. They did not exist in the original vmware_vmss.c file,
>>>> and shouldn't be added now.
>>>>
>>>> assert() is not used by the top-level crash sources (except for qemu.c
>>>> and qemu-load.c, which came from a 3rd party, and I didn't feel like
>>>> changing
>>>> all of them). Instead, please use the crash convention by testing for
>>>> the
>>>> anomoly, and then send an appropriate error message to error(FATAL,
...),
>>>> which will kill the crash session if it's during session
initialization,
>>>> or kill the current command if it's during crash runtime.
>>>>
>>>> Thanks,
>>>> Dave
>>>>
>