----- Original Message -----
Hi Dave,
On 12/17/12 11:23, Dave Anderson wrote:
>>> Right -- I would never expect error() to be called while inside
>>> an open_tmpfile() operation. Normally the behind-the-scenes data
>>> is parsed, and if anything is to be displayed while open_tmpfile()
>>> is still in play, it would be fprintf()'ed using pc->saved_fp.
>>
>> I think the aesthetically pleasing solution is an "i_am_playing_with_tmpfile()"
>> call that says it isn't closed and crash functions shouldn't be using it.
>> Plus a parallel "i_am_done_with_tmpfile()" that gets implied by "close_tmpfile()".
>> I can supply a patch, if you like. Probably with less verbose function names.
>
> If pc->tmpfile is non-NULL, then open_tmpfile() is in use. What would be
> the purpose of the extra functions?
It would be to allow the client code that is processing that temp file to emit
warning/info messages without disrupting the reading of that file pointer.
To me, that doesn't seem unreasonable. You run some code that emits output
to a temp file and you reprocess those data. You surely do not want such
messages showing up in the file you are re-processing. And you cannot
call close_tmpfile() because it calls ftruncate().
So, what is your recommendation for how to reprocess diverted output
wherein you might occasionally want to say something during that
reprocessing?
Three solutions come to mind:
1. Juggle file pointers before and after the __error() function call
(please say, "No.")
No.
2. Create my own temporary file and fiddle the global "fp" and "pc" state so it
gets used while I am gathering data and crash code doesn't know about it later.
(I insist the answer must be, "No." because there is too much fiddling with
intricate crash state.)
No.
3. These two functions that I am suggesting:
void
resume_tmpfile(void)
{
        int ret ATTRIBUTE_UNUSED;

        if (pc->tmpfile)
                error(FATAL, "recursive temporary file usage\n");
        if (!pc->tmp_fp)
                error(FATAL, "temporary file not ready\n");
        rewind(pc->tmp_fp);
        pc->tmpfile = pc->tmp_fp;
        pc->saved_fp = fp;
        fp = pc->tmpfile;
}

void
sequester_tmpfile(void)
{
        int ret ATTRIBUTE_UNUSED;

        if (pc->tmpfile) {
                fflush(pc->tmpfile);
                rewind(pc->tmpfile);
                pc->tmpfile = NULL;
                fp = pc->saved_fp;
        } else
                error(FATAL, "trying to sequester an unopened temporary file\n");
}
And no...
When open_tmpfile() is in play and you want to print something, you can
always use fprintf(pc->saved_fp, ...) as is done everywhere now.
That being said, if you truly desire to use error() during an open_tmpfile()
operation, then that anomaly should be handled in the error() function.
So, if error() is called during open_tmpfile(), the message should
be displayed as it is done now, which is to pc->stdpipe (i.e., the current
more/less scroller if it is in effect), or to stdout if not:
        if (pc->stdpipe) {
                fprintf(pc->stdpipe, "%s%s%s %s%s",
                        new_line ? "\n" : "",
                        type == CONT ? spacebuf : pc->curcmd,
                        type == CONT ? " " : ":",
                        type == WARNING ? "WARNING: " :
                        type == NOTE ? "NOTE: " : "",
                        buf);
                fflush(pc->stdpipe);
        } else {
                fprintf(stdout, "%s%s%s %s%s",
                        new_line || end_of_line ? "\n" : "",
                        type == WARNING ? "WARNING" :
                        type == NOTE ? "NOTE" :
                        type == CONT ? spacebuf : pc->curcmd,
                        type == CONT ? " " : ":",
                        buf, end_of_line ? "\n" : "");
                fflush(stdout);
        }
and if the output is currently being redirected to a file or to a pipe,
then it is also issued to those end-points here:
        if ((fp != stdout) && (fp != pc->stdpipe)) {
                fprintf(fp, "%s%s%s %s", new_line ? "\n" : "",
                        type == WARNING ? "WARNING" :
                        type == NOTE ? "NOTE" :
                        type == CONT ? spacebuf : pc->curcmd,
                        type == CONT ? " " : ":",
                        buf);
                fflush(fp);
        }
It's that "duplication" above that you're seeing.
And I am simply suggesting that the if statement above should be:
        if ((fp != stdout) && (fp != pc->stdpipe) && (fp != pc->tmpfile)) {
because you obviously don't want the message intermingled with your open_tmpfile()
output.
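
A minimal, self-contained sketch of that extra test follows. The struct and the helper name should_echo_to_fp() are hypothetical stand-ins for illustration only; the real crash program context has many more members, and error() performs the check inline rather than through a helper.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the few program-context fields that matter here. */
struct prog_ctx {
    FILE *stdpipe;   /* scroller pipe, or NULL when no scroller is active */
    FILE *tmpfile;   /* non-NULL only while open_tmpfile() is in effect */
    FILE *saved_fp;  /* output stream saved by open_tmpfile() */
};

/*
 * Returns true when an error message should also be copied to fp,
 * i.e. fp is a user redirection (file or pipe) and not stdout,
 * the scroller, or the temporary file that is being filled.
 */
static bool should_echo_to_fp(const struct prog_ctx *pc, FILE *fp)
{
    return fp != stdout && fp != pc->stdpipe && fp != pc->tmpfile;
}
```

With this test, a message raised while fp points at the temporary file is still shown on stdout or the scroller, but never lands in the data being gathered.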
I sequester the file after doing the data gathering and resume it
after I am done reprocessing it. It might be worth putting in a little jig
to ensure that open/close_tmpfile work reasonably, too. (I would guess
that either would cancel the sequestration.)
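
To make the intended sequence concrete, here is a small standalone model of it: open, gather, sequester, reprocess while printing, resume, close. Everything prefixed mini_ is hypothetical, and the open and close steps are simplified stand-ins, not crash's actual open_tmpfile() and close_tmpfile().

```c
#include <stdio.h>

/* Hypothetical miniature of the program context: only the fields used here. */
struct mini_ctx {
    FILE *tmp_fp;    /* backing temporary file */
    FILE *tmpfile;   /* non-NULL only while output is diverted */
    FILE *saved_fp;  /* output stream in effect before diversion */
};

static struct mini_ctx ctx;
static struct mini_ctx *pc = &ctx;
static FILE *fp;     /* current output stream */

/* Stand-in for open_tmpfile(): create the file and divert fp to it. */
static int mini_open_tmpfile(void)
{
    if (pc->tmpfile || !(pc->tmp_fp = tmpfile()))
        return -1;
    pc->tmpfile = pc->tmp_fp;
    pc->saved_fp = fp;
    fp = pc->tmpfile;
    return 0;
}

/* Stop diverting, but keep the contents readable through pc->tmp_fp. */
static int mini_sequester_tmpfile(void)
{
    if (!pc->tmpfile)
        return -1;
    fflush(pc->tmpfile);
    rewind(pc->tmpfile);
    pc->tmpfile = NULL;
    fp = pc->saved_fp;
    return 0;
}

/* Divert output to the existing temporary file again. */
static int mini_resume_tmpfile(void)
{
    if (pc->tmpfile || !pc->tmp_fp)
        return -1;
    rewind(pc->tmp_fp);
    pc->tmpfile = pc->tmp_fp;
    pc->saved_fp = fp;
    fp = pc->tmpfile;
    return 0;
}

/* Stand-in for close_tmpfile(): restore fp and discard the file. */
static void mini_close_tmpfile(void)
{
    if (pc->tmpfile)
        fp = pc->saved_fp;
    if (pc->tmp_fp)
        fclose(pc->tmp_fp);
    pc->tmpfile = pc->tmp_fp = NULL;
}

/* Gather two lines, then count them while messages go to the original stream. */
static int demo_count_lines(FILE *out)
{
    char buf[128];
    int lines = 0;

    fp = out;
    if (mini_open_tmpfile() != 0)
        return -1;
    fprintf(fp, "first\nsecond\n");          /* diverted into the temp file */
    if (mini_sequester_tmpfile() != 0)
        return -1;
    while (fgets(buf, sizeof(buf), pc->tmp_fp)) {
        lines++;
        fprintf(fp, "note: saw %s", buf);    /* reaches out, not the temp file */
    }
    if (mini_resume_tmpfile() != 0)
        return -1;
    mini_close_tmpfile();
    return lines;
}
```

Calling demo_count_lines(stdout) prints one note per gathered line on stdout and leaves fp pointing back at stdout afterwards.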
>>> I'm not sure, other than it doesn't seem to be able to find ffffea001bb1d1e8
>>
>> I was able to figure that out. I also printed out the "kmem -v" table and
>> sorted the result. The result with "kmem -n"
>>
>> [...]
>> 66 ffff88087fffa420 ffffea0000000000 ffffea0007380000 2162688
>> 67 ffff88087fffa430 ffffea0000000000 ffffea0007540000 2195456
>> 132608 ffff88083c9bdb98 ffff88083c9bdd98 ffff8840e49bdd98 4345298944
>> 132609 ffff88083c9bdba8 ffff88083c9796c0 ffff8840e4b396c0 4345331712
>> [...]
>>
>> viz. it ain't there. Which is quite interesting, because if the lustre
>> cluster file system structure "cfs_trace_data" actually pointed off into
>> unmapped memory, it would have fallen over long, long before the point
>> where it did fall over.
>
> I don't see the vmemmap range in the "kmem -v" output. It is mapped
> kernel memory, but AFAIK it's not kept in the kernel's "vmlist" list.
> Do you see that range in your "kmem -v" output?
Also no. "kmem -v" and "kmem -n" both show the same memory mappings
(as best as _my_ memory serves, that is. For certain, neither has a mapping
for 0xffffea001bb1d1e8.)
> OK so you say you cannot get the mappings for it, but what
> does "vtop 0xffffea001bb1d1e8" show?
This:
> crash> vtop 0xffffea001bb1d1e8
> VIRTUAL PHYSICAL
> ffffea001bb1d1e8 879b1d1e8
>
> PML4 DIRECTORY: ffffffff817e7000
> PAGE DIRECTORY: 87fdf7067
> PUD: 87fdf7000 => 87fdf6067
> PMD: 87fdf66e8 => 8000000879a001e3
> PAGE: 879a00000 (2MB)
>
> PTE PHYSICAL FLAGS
> 8000000879a001e3 879a00000 (PRESENT|RW|ACCESSED|DIRTY|PSE|GLOBAL|NX)
But given:
> Sorry -- that's irrelevant. You want to access the physical
> memory that the odd vmemmap page address references (not the
> physical page behind the page structure itself).
Exactly right. I need to be able to see the binary bits for that page so I can
pull them in and write them back out to a file of just those bits. From there,
we'll be formatting a text file showing the lustre trace log.
Thank you so much! Regards, Bruce
Right... seems like it should be such a simple thing to do... :-(
I don't understand what's going on, but I'm presuming that even if the
vmemmap-type address doesn't fit into the "advertised" vmemmap range,
the kernel's __page_to_pfn() macro should still work to get the
pfn represented by the page:
#elif defined(CONFIG_SPARSEMEM)
/*
* Note: section's mem_map is encorded to reflect its start_pfn.
* section[i].section_mem_map == mem_map's address - start_pfn;
*/
#define __page_to_pfn(pg) \
({ const struct page *__pg = (pg); \
int __sec = page_to_section(__pg); \
(unsigned long)(__pg - __section_mem_map_addr(__nr_to_section(__sec))); \
})
Maybe you could play around with emulating that macro w/crash, and see what
comes up?
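
As a starting point, the pointer subtraction in that macro can be emulated with plain integer arithmetic. This is only a sketch under assumptions: the function names are made up, the caller must supply the section's decoded section_mem_map value (flag bits already masked off) and sizeof(struct page) as read from the dump, and none of this is existing crash code.

```c
#include <stdint.h>

/*
 * Emulates "__pg - __section_mem_map_addr(section)" from the macro:
 * in C that is pointer subtraction, so the byte difference is divided
 * by sizeof(struct page). Returns 0 and stores the pfn on success, or
 * -1 if the inputs cannot describe a valid struct page address.
 */
static int page_addr_to_pfn(uint64_t page_addr, uint64_t decoded_mem_map,
                            uint64_t page_struct_size, uint64_t *pfn)
{
    if (page_struct_size == 0)
        return -1;

    /* Unsigned arithmetic wraps, which matches the encoded base address. */
    uint64_t delta = page_addr - decoded_mem_map;

    /* A real struct page address sits on a struct-size boundary. */
    if (delta % page_struct_size != 0)
        return -1;

    *pfn = delta / page_struct_size;
    return 0;
}

/* Physical address of the first byte of the page frame. */
static uint64_t pfn_to_phys(uint64_t pfn, unsigned int page_shift)
{
    return pfn << page_shift;
}
```

The resulting physical address is what would then be read to get at the page contents, as opposed to the physical address behind the page structure itself.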
Dave