Hi Dave,
On 01/10/2012 09:24 AM, Dave Anderson wrote:
----- Original Message -----
> Re-posting the patch as an attachment. I'm sorry for the mess on the
> first pass.
>
> --
> David.
Hi Dave,
Your patch has merit, but I'm going to pare it down a bit. I use CRASHDEBUG(8)
fairly frequently, but with your changes as-is, it's far too verbose. It's also
a bit redundant -- here's an example of a CRASHDEBUG(8) kdump read currently:
crash> rd ffff88003c1a5cc0
<addr: ffff88003c1a5cc0 count: 1 flag: 490 (KVADDR)>
<readmem: ffff88003c1a5cc0, KVADDR, "64-bit KVADDR", 8, (FOE),
7fff0096b5a8>
addr: ffff88003c1a5cc0 paddr: 3c1a5cc0 cnt: 8
ffff88003c1a5cc0: 0000000000000000 ........
text hit rate: 65% (3505 of 5378)
crash>
And then with your patch:
crash> rd ffff88003c1a5cc0
<addr: ffff88003c1a5cc0 count: 1 flag: 490 (KVADDR)>
<readmem: ffff88003c1a5cc0, KVADDR, "64-bit KVADDR", 8, (FOE),
7fff1c97b6e8>
addr: ffff88003c1a5cc0 paddr: 3c1a5cc0 cnt: 8
rd: read_kdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0
rd: read_kdump trying read_netdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0
rd: read_netdump(-1, 7fff1c97b6e8, 8, ffff88003c1a5cc0, 3c1a5cc0)
rd: read KDUMP or NETDUMP_ELF64 for multi-segment: 3c195fb8
rd: seek and read offset: 3c195fb8
ffff88003c1a5cc0: 0000000000000000 ........
text hit rate: 65% (3505 of 5378)
crash>
The "<readmem: ...>" and subsequent "addr: ..." lines come from
readmem().
I don't understand the need for the next 3 lines, as they are completely
redundant:
Except that one intention was to show program flow (failures that didn't
happen) as well as continuation of program state.
* read_kdump() entered
rd: read_kdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0
* None of the conditional fatal error exits in read_kdump()
rd: read_kdump trying read_netdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0
* read_netdump() entered
rd: read_netdump(-1, 7fff1c97b6e8, 8, ffff88003c1a5cc0, 3c1a5cc0)
* switch (DUMPFILE_FORMAT(nd->flags)) VIA case KDUMP_ELF64
* show calculated offset
rd: read KDUMP or NETDUMP_ELF64 for multi-segment: 3c195fb8
* Not the if (FLAT_FORMAT()) read case
rd: seek and read offset: 3c195fb8
rd: read_kdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0
rd: read_kdump trying read_netdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0
rd: read_netdump(-1, 7fff1c97b6e8, 8, ffff88003c1a5cc0, 3c1a5cc0)
I don't argue with the displays of the resultant file offsets, but I may move
them to CRASHDEBUG(9):
I wouldn't object and can re-post a CRASHDEBUG(9) version myself if
you'd prefer.
rd: read KDUMP or NETDUMP_ELF64 for multi-segment: 3c195fb8
rd: seek and read offset: 3c195fb8
And the same goes for compressed kdumps:
crash> rd ffff81005286e0c0
<addr: ffff81005286e0c0 count: 1 flag: 490 (KVADDR)>
<readmem: ffff81005286e0c0, KVADDR, "64-bit KVADDR", 8, (FOE),
7fff11d16958>
addr: ffff81005286e0c0 paddr: 5286e0c0 cnt: 8
ffff81005286e0c0: 0000000000000000 ........
text hit rate: 66% (239 of 357)
crash>
With the patch, aside from the "pfn" displays, it's redundant information:
Except for the implied flow and continuation of state. I've been asked
to look into alleged crash bugs that were no more than corrupt or
incomplete dumps that required either hexdump style deciphering of the
structure or a method of crash exposing its own deciphering of the
structure because the program starts and exits with no more than an
error code that can't always identify a unique point of failure and
you'd need to find the cause in source anyway. For example, a case where
an incomplete crash dump has page-present flags showing the page not
present for kernel data structures (every single page present flag is
zero in-fact). IOW, no crash bug because kdump created an incomplete
crash dump.
crash> rd ffff81005286e0c0
<addr: ffff81005286e0c0 count: 1 flag: 490 (KVADDR)>
<readmem: ffff81005286e0c0, KVADDR, "64-bit KVADDR", 8, (FOE),
7fffd3ca4478>
addr: ffff81005286e0c0 paddr: 5286e0c0 cnt: 8
rd: read_diskdump(-1, 7fffd3ca4478, 8, ffff81005286e0c0, 5286e0c0), pfn=5286e
rd: Buffering page with pfn=5286e and current physaddr=5286e000
ffff81005286e0c0: 0000000000000000 ........
text hit rate: 66% (239 of 357)
crash>
For starters, I think the patch can be simplified by modifying the "addr: ..."
line that is displayed by readmem() just prior to issuing it to the configured
pc->readmem function so that it shows the receiving function. So that line,
which may be displayed several times if the read crosses a page boundary, could
be reworked to show something like this:
crash> rd ffff81005286e0c0
<addr: ffff81005286e0c0 count: 1 flag: 490 (KVADDR)>
<readmem: ffff81005286e0c0, KVADDR, "64-bit KVADDR", 8, (FOE),
7fffd3ca4478>
<read_netdump: addr: ffff81005286e0c0 paddr: 5286e0c0 cnt: 8
...
and then a line or two from the relevant read functions would show
the file offset that was resolved from the physical address.
Also, the new function that translates the pc->readmem function address to a
string for display could also be used in a single location in main.c instead of
having to put debug statements in each location where where pc->readmem is
initialized.
Isn't there at least one case, e.g. x86_64_xen_kdump_p2m_create() that
replaces and restores the pc->readmem value at runtime?
So let me play around with this a bit...
You are welcome too but I would be happy to take responsibility and
re-work it too. If you do go ahead with a re-work then please consider
my goal of monitoring program flow.
Thanks,
David