On 04/28/15 17:05, Dave Anderson wrote:
----- Original Message -----
> Using a bubble sort is slow, switch to an insertion sort.
>
> bubble sort: real 3m45.168s
>
> insertion sort: real 0m3.164s
>
> Signed-off-by: Don Slutz <dslutz(a)verizon.com>
> ---
> I do have a big file (32G in size, 357M when gzipped);
> let me know if you want it.
Hi Don,
I'm running some sanity tests with a couple 94GB flat vmcores by comparing
the data output of some of the more verbose commands. But upon invocation,
I do see some significant speed increases, for example:
vmcore.1 2m45.545s -> 1m31.428s
vmcore.2 2m42.806s -> 0m46.624s
The numbers change from invocation to invocation, but always in the
same direction. Probably file page caching is affecting the times.
Yes, file cache size is important here.
Anyway, I've never taken the time to actually understand how the flat format
is laid out, and I would normally ask the author to sign off on any major
changes to it. But I see that Ken'ichi Ohmichi is no longer a member of
this list, so for all practical purposes, it's currently "unmaintained".
I'll continue testing further tomorrow, but if by chance you can whip up a
simplified explanation of the flat dumpfile layout, I'd appreciate it.
Here is my understanding.
The file starts with a header:
hyper-0-21-52:~/cores/15.04.27-13:26:11>hexdump -C vmcore.flat | more
00000000  6d 61 6b 65 64 75 6d 70  66 69 6c 65 00 7f 00 00  |makedumpfile....|
00000010  00 00 00 00 00 00 00 01  00 00 00 00 00 00 00 01  |................|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
(from makedumpfile.h):
struct makedumpfile_header {
char signature[SIG_LEN_MDF]; /* = "makedumpfile" */
int64_t type;
int64_t version;
};
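As a rough illustration (this is not code from makedumpfile or crash), the
header shown above could be checked as in the sketch below. The 16-byte
signature field is an assumption read off the dump, where the type field
begins at offset 0x10. The names parse_flat_header, be64_decode and
SIG_FIELD_LEN are made up for this sketch, and it assumes type and version
are stored big endian, as the dump suggests.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed from the hexdump: the type field starts at offset 0x10. */
#define SIG_FIELD_LEN 16

/* Decode a big-endian 64-bit value from 8 raw bytes. */
static int64_t be64_decode(const unsigned char *p)
{
    uint64_t v = 0;
    int i;

    for (i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    /* Conversion to signed relies on two's complement behaviour. */
    return (int64_t)v;
}

/*
 * Hypothetical helper: returns 0 and fills *type and *version when the
 * buffer starts with the "makedumpfile" signature, -1 otherwise.
 */
static int parse_flat_header(const unsigned char *buf, size_t len,
                             int64_t *type, int64_t *version)
{
    static const char sig[] = "makedumpfile";

    if (len < SIG_FIELD_LEN + 16)
        return -1;
    if (memcmp(buf, sig, sizeof(sig) - 1) != 0)
        return -1;
    *type = be64_decode(buf + SIG_FIELD_LEN);
    *version = be64_decode(buf + SIG_FIELD_LEN + 8);
    return 0;
}
```

Fed the first 32 bytes of the dump above, this yields type 1 and version 1.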
The 1st data block starts at 4096:
00001000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 40  |...............@|
It begins with a data header made of two 64-bit integers (from makedumpfile.h):
struct makedumpfile_data_header {
int64_t offset;
int64_t buf_size;
};
The offset is the "real file offset", i.e. where this data block belongs in
the normal (rearranged) file. In the example above, the block holds the
first 0x40 bytes of the normal file.
The data header just keeps repeating:
00001050  00 00 00 00 00 00 09 a8  00 00 00 00 00 00 1d 00  |................|
00002d60  00 00 00 00 00 00 26 a8  00 00 00 00 00 01 00 00  |......&.........|
00012d70  00 00 00 00 00 01 26 a8  00 00 00 00 00 01 00 00  |......&.........|
This continues until a data header with an offset of -1 is reached.
Note: The data header is stored in big endian format.
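To make the walk concrete, here is a minimal sketch that steps through the
data headers of a flat image held in memory. It is only an illustration under
the layout described above: struct flat_block, be64_decode and
walk_data_headers are names invented here, and a real reader would work on a
file descriptor rather than a buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record describing one data block found in the flat image. */
struct flat_block {
    int64_t offset;      /* where the block belongs in the normal file */
    int64_t buf_size;    /* payload length in bytes */
    size_t payload_pos;  /* where the payload sits in the flat image */
};

/* Decode a big-endian 64-bit value from 8 raw bytes. */
static int64_t be64_decode(const unsigned char *p)
{
    uint64_t v = 0;
    int i;

    for (i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return (int64_t)v;
}

/*
 * Walk the data headers starting at `pos`. Returns the number of blocks
 * seen before the end marker (offset == -1), or -1 if the data is
 * truncated or inconsistent. At most max_out entries are stored in `out`.
 */
static long walk_data_headers(const unsigned char *flat, size_t len,
                              size_t pos, struct flat_block *out,
                              size_t max_out)
{
    size_t n = 0;

    while (pos <= len && len - pos >= 16) {
        int64_t offset = be64_decode(flat + pos);
        int64_t size = be64_decode(flat + pos + 8);

        if (offset == -1)
            return (long)n;                 /* end marker reached */
        if (offset < 0 || size < 0 || (uint64_t)size > len - pos - 16)
            return -1;                      /* corrupt or truncated */
        if (n < max_out) {
            out[n].offset = offset;
            out[n].buf_size = size;
            out[n].payload_pos = pos + 16;
        }
        n++;
        pos += 16 + (size_t)size;           /* skip header and payload */
    }
    return -1;                              /* no end marker found */
}
```

For the file dumped above, the walk would start at pos = 4096, where the
first data header sits.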
While this works for any data type, I have only seen "normal" elf data:
00001010  7f 45 4c 46 02 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
00001020  04 00 3e 00 01 00 00 00  00 00 00 00 00 00 00 00  |..>.............|
00001030  40 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |@...............|
00001040  00 00 00 00 40 00 38 00  2b 00 00 00 00 00 00 00  |....@.8.+.......|
00001060  05 00 00 00 50 01 00 00  01 00 00 00 43 4f 52 45  |....P.......CORE|
00001070  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000010e0  00 00 00 00 03 00 00 00  00 00 00 00 e8 0d 33 80  |..............3.|
...
So basically this is a streaming way of delivering a file: the writer can
fill in (for an ELF file) the set of PT_LOAD program headers at the end, and
does not need 2 passes (or a local file) to store the result.
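The idea can be sketched as follows: each block is copied to its real offset,
so blocks may arrive in any order. This is an in-memory illustration only,
not what makedumpfile or crash actually do; rearrange_flat and be64_decode
are hypothetical names, and a real tool would read and write files instead
of buffers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode a big-endian 64-bit value from 8 raw bytes. */
static int64_t be64_decode(const unsigned char *p)
{
    uint64_t v = 0;
    int i;

    for (i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return (int64_t)v;
}

/*
 * Copy every data block of the flat image (starting at `pos`) into `out`
 * at the offset named in its data header. Returns 0 once the end marker
 * (offset == -1) is reached, -1 on truncated or out-of-range data.
 */
static int rearrange_flat(const unsigned char *flat, size_t len, size_t pos,
                          unsigned char *out, size_t out_len)
{
    while (pos <= len && len - pos >= 16) {
        int64_t offset = be64_decode(flat + pos);
        int64_t size = be64_decode(flat + pos + 8);

        if (offset == -1)
            return 0;
        if (offset < 0 || size < 0)
            return -1;
        if ((uint64_t)size > len - pos - 16)
            return -1;                      /* payload runs past the input */
        if ((uint64_t)offset > out_len ||
            (uint64_t)size > out_len - (uint64_t)offset)
            return -1;                      /* block does not fit in out */
        memcpy(out + offset, flat + pos + 16, (size_t)size);
        pos += 16 + (size_t)size;
    }
    return -1;                              /* no end marker found */
}
```

A block written late in the stream can land at an early offset, which is
how the program headers can be emitted after the data they describe.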
Hope this helps.
-Don Slutz
Thanks,
Dave
> It is the same as the xen rename of dom0
>
> makedumpfile.c | 56 +++++++++++++++++++++++++-------------------------------
> 1 file changed, 25 insertions(+), 31 deletions(-)
>
> diff --git a/makedumpfile.c b/makedumpfile.c
> index f6834b9..c76e22c 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -56,8 +56,10 @@ store_flat_data_array(char *file, struct flat_data **fda)
> {
> int result = FALSE, fd;
> int64_t offset_fdh;
> + int64_t offset_report = 0;
> unsigned long long num_allocated = 0;
> unsigned long long num_stored = 0;
> + unsigned long long sort_idx;
> unsigned long long size_allocated;
> struct flat_data *ptr = NULL, *cur, *new;
> struct makedumpfile_data_header fdh;
> @@ -100,11 +102,34 @@ store_flat_data_array(char *file, struct flat_data **fda)
> break;
> }
> cur = ptr + num_stored;
> + sort_idx = num_stored;
> + while (sort_idx) {
> + new = ptr + --sort_idx;
> + if (new->off_rearranged >= fdh.offset) {
> + cur->off_flattened = new->off_flattened;
> + cur->off_rearranged = new->off_rearranged;
> + cur->buf_size = new->buf_size;
> + cur = new;
> + } else {
> + if (CRASHDEBUG(1) && sort_idx + 1 != num_stored) {
> + fprintf(fp,
> + "makedumpfile: Moved from %lld to %lld\n",
> + num_stored, sort_idx + 1);
> + }
> + break;
> + }
> + }
> cur->off_flattened = offset_fdh + sizeof(fdh);
> cur->off_rearranged = fdh.offset;
> cur->buf_size = fdh.buf_size;
> num_stored++;
>
> + if (CRASHDEBUG(1) && (fdh.offset >> 30) > (offset_report >> 30)) {
> + fprintf(fp, "makedumpfile: At %lld GiB\n",
> + fdh.offset >> 30);
> + offset_report = fdh.offset;
> + }
> +
> /* seek for next makedumpfile_data_header. */
> if (lseek(fd, fdh.buf_size, SEEK_CUR) < 0) {
> error(INFO, "%s: seek error (flat format)\n", file);
> @@ -121,35 +146,6 @@ store_flat_data_array(char *file, struct flat_data **fda)
> return num_stored;
> }
>
> -static void
> -sort_flat_data_array(struct flat_data **fda, unsigned long long num_fda)
> -{
> - unsigned long long i, j;
> - struct flat_data tmp, *cur_i, *cur_j;
> -
> - for (i = 0; i < num_fda - 1; i++) {
> - for (j = i + 1; j < num_fda; j++) {
> - cur_i = *fda + i;
> - cur_j = *fda + j;
> -
> - if (cur_i->off_rearranged < cur_j->off_rearranged)
> - continue;
> -
> - tmp.off_flattened = cur_i->off_flattened;
> - tmp.off_rearranged = cur_i->off_rearranged;
> - tmp.buf_size = cur_i->buf_size;
> -
> - cur_i->off_flattened = cur_j->off_flattened;
> - cur_i->off_rearranged = cur_j->off_rearranged;
> - cur_i->buf_size = cur_j->buf_size;
> -
> - cur_j->off_flattened = tmp.off_flattened;
> - cur_j->off_rearranged = tmp.off_rearranged;
> - cur_j->buf_size = tmp.buf_size;
> - }
> - }
> -}
> -
> static int
> read_all_makedumpfile_data_header(char *file)
> {
> @@ -161,8 +157,6 @@ read_all_makedumpfile_data_header(char *file)
> if (retval < 0)
> return FALSE;
>
> - sort_flat_data_array(&fda, num);
> -
> afd.num_array = num;
> afd.array = fda;
>
> --
> 1.8.4
>
>
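For readers skimming the patch, the core of the change can be sketched in
isolation: each new entry is inserted into an array that is already sorted
by off_rearranged, shifting larger entries up by one slot. This is a
standalone illustration, not the patched function; insert_sorted is a
made-up name, and the int64_t field types are an assumption (only the field
names come from the patch).

```c
#include <stddef.h>
#include <stdint.h>

/* Field names follow the patch; the types are assumed for this sketch. */
struct flat_data {
    int64_t off_flattened;
    int64_t off_rearranged;
    int64_t buf_size;
};

/*
 * Insert `item` into arr[0..n), which is kept in ascending order of
 * off_rearranged. The caller must provide room for n + 1 entries.
 * Returns the new entry count.
 */
static size_t insert_sorted(struct flat_data *arr, size_t n,
                            struct flat_data item)
{
    size_t i = n;

    /* Shift entries that sort at or after the new one up by one slot. */
    while (i > 0 && arr[i - 1].off_rearranged >= item.off_rearranged) {
        arr[i] = arr[i - 1];
        i--;
    }
    arr[i] = item;
    return n + 1;
}
```

When the headers arrive mostly in ascending offset order, the loop usually
exits after a single comparison, so building the array costs close to one
pass over the headers.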