Hi,
2007/03/23 09:26:02 +0900, "Ken'ichi Ohmichi"
<oomichi(a)mxs.nes.nec.co.jp> wrote:
>> 1) Makedumpfile patch: Ken'ichi Ohmichi's email of Wed, 7 Mar 2007
>> 10:43:38 +0900 contained the patch "point_same_zero_page.patch". That
>> patch contains the nice solution to remove redundant zero page images
>> from the diskdump dump file by pointing the page descriptors of zero
>> pages to a common zero image. I suggest that this patch should be
>> applied to makedumpfile as soon as possible, without waiting on a
>> possible solution to the ELF situation. As described in my report, ELF
>> and diskdump dump files have not shown identical behavior in the past.
>> This patch makes diskdump dump files more accurate, and leaves ELF dump
>> files at the same level of accuracy that they have always had.
I agree with Bob. I will merge the patch "point_same_zero_page.patch" into
a new makedumpfile. But this change is very important, so I want to verify
that it is correct by running many tests first.
I will release a new makedumpfile by next weekend.
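For readers who have not looked at the patch, the idea Bob describes above (pointing the page descriptors of all zero pages to one common zero image) can be sketched roughly as below. This is only an illustration, not the code in "point_same_zero_page.patch": the names write_pages and read_page, the (offset, size) descriptor layout, and the 4096-byte page size are assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def write_pages(pages):
    """Store pages so that every all-zero page shares one stored image.

    Hypothetical layout: returns (descriptors, data), where each descriptor
    is an (offset, size) pair pointing into the byte string `data`.
    """
    zero_page = bytes(PAGE_SIZE)
    descriptors = []
    data = bytearray()
    zero_offset = None  # offset of the single shared zero image, once written

    for page in pages:
        if page == zero_page:
            if zero_offset is None:
                # First zero page: write its image once and remember where.
                zero_offset = len(data)
                data += page
            # Every zero page points at the same stored image.
            descriptors.append((zero_offset, PAGE_SIZE))
        else:
            descriptors.append((len(data), PAGE_SIZE))
            data += page
    return descriptors, bytes(data)


def read_page(descriptors, data, index):
    """Return the contents of page `index` by following its descriptor."""
    offset, size = descriptors[index]
    return data[offset:offset + size]
```

With this layout a reader still gets the correct contents for every page, while the file holds at most one zero image regardless of how many zero pages exist.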
I checked whether this change is correct as follows
(the patches mentioned below are attached to this mail):
- makedumpfile-1.1.2 with "point_same_zero_page2.patch" creates a dumpfile.
- crash-4.0-3.21 with "not-access-excluded-page.patch" analyzes the dumpfile.
- The analysis result of the dumpfile is compared with /proc/vmcore's.
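The comparison in the last step can be done with any diff tool. As a minimal sketch, assuming the crash output for each case has already been saved as a list of lines, it could look like the following. The function name diff_results is made up for this example; the file labels are the ones used in this test.

```python
import difflib


def diff_results(vmcore_lines, dumpfile_lines):
    """Return the unified diff between two lists of crash output lines.

    An empty list means both analyses produced identical output.
    """
    return list(difflib.unified_diff(
        vmcore_lines,
        dumpfile_lines,
        fromfile="result-vmcore.txt",
        tofile="result-dumpfile-d16.txt",
        lineterm="",  # input lines carry no trailing newline
    ))
```

An empty result means the dumpfile and /proc/vmcore were analyzed identically. Any line starting with "+" or "-" marks a place where they differ.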
On i386 linux-2.6.19, the "foreach bt" subcommand showed a difference
between the result for the dumpfile (excluding free pages) and the result
for /proc/vmcore.
With crash-4.0-3.21 without "not-access-excluded-page.patch", there is
no difference. In short, the difference appears because the patch makes
crash regard the excluded pages as inaccessible pages.
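To make the two behaviors concrete, here is a rough sketch of a physical-address read under both policies. It is not the code of crash or of "not-access-excluded-page.patch". The dictionary of stored pages, the name read_paddr, the strict flag, and the assumption that the unpatched reader returns zero-filled data for an excluded page are all made up for illustration, and reads that cross a page boundary are not handled.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class ExcludedPageError(Exception):
    """Raised when a read touches a page that was excluded from the dump."""


def read_paddr(pages, paddr, size, strict):
    """Read `size` bytes at physical address `paddr`.

    pages:  dict mapping page frame number -> page contents (bytes);
            a missing key means the page was excluded from the dump.
    strict: True  -> treat excluded pages as inaccessible (raise an error),
            False -> treat excluded pages as zero-filled (assumed behavior).
    """
    pfn, offset = divmod(paddr, PAGE_SIZE)
    page = pages.get(pfn)
    if page is None:
        if strict:
            raise ExcludedPageError("paddr(%x) excluded from dump" % paddr)
        page = bytes(PAGE_SIZE)
    return page[offset:offset + size]
```

With strict=False the caller never notices that a page is missing, which would explain why the output matches /proc/vmcore in that case. With strict=True every read of an excluded page surfaces as an error.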
The diff between the analysis result of /proc/vmcore and that of the
dumpfile is as follows:
--- result-vmcore.txt 2007-03-28 14:01:20.000000000 +0900
+++ result-dumpfile-d16.txt 2007-03-28 14:01:06.000000000 +0900
@@ -1,7 +1,24 @@
crash> foreach bt
PID: 0 TASK: c037c440 CPU: 0 COMMAND: "swapper"
+bt: diskdump: paddr(44b1ae) excluded from dump
+bt: diskdump: paddr(44b1ae) excluded from dump
+bt: diskdump: paddr(44b2a0) excluded from dump
+bt: diskdump: paddr(44b2a0) excluded from dump
+bt: diskdump: paddr(44b1ae) excluded from dump
+bt: diskdump: paddr(44b1ae) excluded from dump
+bt: diskdump: paddr(44b2a1) excluded from dump
+bt: diskdump: paddr(44b2a1) excluded from dump
+bt: cannot resolve stack trace:
+bt: diskdump: paddr(44b1ae) excluded from dump
+bt: diskdump: paddr(44b1ae) excluded from dump
+bt: diskdump: paddr(44b2a1) excluded from dump
+bt: diskdump: paddr(44b2a1) excluded from dump
#0 [c0446f54] schedule at c0311ef0
- #1 [c0446fcc] cpu_idle at c0102c8d
+bt: text symbols on stack:
+ [c0446fb0] mwait_idle_with_hints at c0102235
+ [c0446fcc] cpu_idle at c0102c92
+ [c0446fd4] start_kernel at c044b56f
+ [c0446fdc] unknown_bootoption at c044b577
PID: 0 TASK: c812f050 CPU: 1 COMMAND: "swapper"
#0 [c8127f3c] schedule at c0311ef0
The physical address 0x44b1ae belongs to the symbol start_kernel.
The text of start_kernel is freed by free_initmem() while the kernel is
booting, so it is not a problem that this text is excluded as free pages.
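As a quick check of the address relationship, the sketch below converts the start_kernel address shown in the backtrace (c044b56f) to a physical address and compares page frame numbers. It assumes the default i386 3G/1G split (PAGE_OFFSET 0xC0000000) and 4 KiB pages, and it is valid only for directly mapped kernel addresses. The helper names are made up for this example.

```python
PAGE_OFFSET = 0xC0000000  # assumed default i386 kernel/user split
PAGE_SHIFT = 12           # assumed 4 KiB pages


def virt_to_phys(vaddr):
    """Convert a directly mapped kernel virtual address to a physical one."""
    return vaddr - PAGE_OFFSET


def pfn(paddr):
    """Return the page frame number containing physical address `paddr`."""
    return paddr >> PAGE_SHIFT


start_kernel_paddr = virt_to_phys(0xC044B56F)  # 0x44b56f
# The excluded address reported by crash lies in the same page frame.
same_page = pfn(start_kernel_paddr) == pfn(0x44B1AE)  # True, both 0x44b
```

Under these assumptions the excluded addresses 0x44b1ae, 0x44b2a0 and 0x44b2a1 all fall in page frame 0x44b, the same frame as the start_kernel address on the stack.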
I have not investigated this problem in detail. I guess that the crash
utility expects to be able to read the text of each process.
I think that, in addition to changing how excluded pages are handled,
other changes to the crash utility are also necessary.
Thanks
Ken'ichi Ohmichi