Hi Dave,
On 12/17/12 11:23, Dave Anderson wrote:
>> Right -- I would never expect error() to be called while inside
>> an open_tmpfile() operation. Normally the behind-the-scenes data
>> is parsed, and if anything is to be displayed while open_tmpfile()
>> is still in play, it would be fprintf()'ed using pc->saved_fp.
>
> I think the aesthetically pleasing solution is an "i_am_playing_with_tmpfile()"
> call that says it isn't closed and crash functions shouldn't be using it.
> Plus a parallel "i_am_done_with_tmpfile()" that gets implied by "close_tmpfile()".
> I can supply a patch, if you like. Probably with less verbose function names.
If pc->tmpfile is non-NULL, then open_tmpfile() is in use. What would be
the purpose of the extra functions?
It would be to allow the client code that is processing that temp file to emit
warning/info messages without disrupting the reading of that file pointer.
To me, that doesn't seem unreasonable. You run some code that emits output
to a temp file and you reprocess those data. You surely do not want such
messages showing up in the file you are re-processing. And you cannot
call close_tmpfile() because it calls ftruncate().
So, what is your recommendation for how to reprocess diverted output
wherein you might occasionally want to say something during that reprocessing?
Three solutions come to mind:
1. Juggle file pointers before and after the __error() function call (please say,
   "No.")
2. Create my own temporary file and fiddle the global "fp" and "pc" state so it
   gets used while I am gathering data and crash code doesn't know about it later.
   (I insist the answer must be, "No." because there is too much fiddling with
   intricate crash state.)
3. These two functions that I am suggesting:
void
resume_tmpfile(void)
{
        int ret ATTRIBUTE_UNUSED;

        if (pc->tmpfile)
                error(FATAL, "recursive temporary file usage\n");
        if (!pc->tmp_fp)
                error(FATAL, "temporary file not ready\n");
        rewind(pc->tmp_fp);
        pc->tmpfile = pc->tmp_fp;
        pc->saved_fp = fp;
        fp = pc->tmpfile;
}

void
sequester_tmpfile(void)
{
        int ret ATTRIBUTE_UNUSED;

        if (pc->tmpfile) {
                fflush(pc->tmpfile);
                rewind(pc->tmpfile);
                pc->tmpfile = NULL;
                fp = pc->saved_fp;
        } else
                error(FATAL, "trying to sequester an unopened temporary file\n");
}
I sequester the file after doing the data gathering and resume it
after I am done reprocessing it. It might be worth putting in a little jig
to ensure that open/close_tmpfile work reasonably, too. (I would guess
that either would cancel the sequestration.)
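To make the intended call order concrete, here is a minimal stand-alone model of the sequester/resume idea. It is only a sketch and not crash code: the struct is cut down to the three fields used above, sequester_model(), resume_model() and demo_reprocess() are hypothetical names, the FATAL error() calls are replaced by return codes, and the open and close steps are inlined stand-ins for open_tmpfile() and close_tmpfile().

```c
#include <stdio.h>

/* Hypothetical, cut-down stand-in for crash's program_context. */
struct program_context {
        FILE *tmpfile;          /* non-NULL while output is diverted */
        FILE *tmp_fp;           /* the underlying temporary stream */
        FILE *saved_fp;         /* where "fp" pointed before the diversion */
};

static struct program_context pcs, *pc = &pcs;
static FILE *fp;                /* current output stream */

/* Stop diverting output, but keep the temporary file readable. */
static int sequester_model(void)
{
        if (!pc->tmpfile)
                return -1;
        fflush(pc->tmpfile);
        rewind(pc->tmpfile);
        pc->tmpfile = NULL;
        fp = pc->saved_fp;
        return 0;
}

/* Divert output to the temporary file again. */
static int resume_model(void)
{
        if (pc->tmpfile || !pc->tmp_fp)
                return -1;
        rewind(pc->tmp_fp);
        pc->tmpfile = pc->tmp_fp;
        pc->saved_fp = fp;
        fp = pc->tmpfile;
        return 0;
}

/* Gather, sequester, reprocess with messages, resume, close.
 * Returns the number of lines reprocessed, or -1 on failure. */
int demo_reprocess(FILE *out)
{
        char buf[64];
        int lines = 0;

        fp = out;
        pc->tmp_fp = tmpfile();
        if (!pc->tmp_fp)
                return -1;

        /* Stand-in for open_tmpfile(): divert "fp". */
        pc->saved_fp = fp;
        pc->tmpfile = pc->tmp_fp;
        fp = pc->tmpfile;

        fprintf(fp, "line one\nline two\n");    /* lands in the temp file */

        if (sequester_model() != 0)
                return -1;
        while (fgets(buf, sizeof(buf), pc->tmp_fp)) {
                /* "fp" is the real output again, so this does not
                 * end up in the file being read. */
                fprintf(fp, "note: saw %s", buf);
                lines++;
        }
        if (resume_model() != 0)
                return -1;

        /* Stand-in for close_tmpfile(): undo the diversion. */
        fp = pc->saved_fp;
        pc->tmpfile = NULL;
        fclose(pc->tmp_fp);
        pc->tmp_fp = NULL;
        return lines;
}
```

Calling demo_reprocess(stdout) from a small main() prints one "note:" line per gathered line, and none of those notes are written into the temporary file while it is being read.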
>> I'm not sure, other than it doesn't seem to be able to find
>> ffffea001bb1d1e8
>
> I was able to figure that out. I also printed out the "kmem -v" table and
> sorted the result. The result with "kmem -n"
>
> [...]
> 66 ffff88087fffa420 ffffea0000000000 ffffea0007380000 2162688
> 67 ffff88087fffa430 ffffea0000000000 ffffea0007540000 2195456
> 132608 ffff88083c9bdb98 ffff88083c9bdd98 ffff8840e49bdd98 4345298944
> 132609 ffff88083c9bdba8 ffff88083c9796c0 ffff8840e4b396c0 4345331712
> [...]
>
> viz. it ain't there. Which is quite interesting, because if the lustre
> cluster file system structure "cfs_trace_data" actually pointed off into
> unmapped memory, it would have fallen over long, long before the point
> where it did fall over.
I don't see the vmemmap range in the "kmem -v" output. It is mapped
kernel memory, but AFAIK it's not kept in the kernel's "vmlist" list.
Do you see that range in your "kmem -v" output?
Also no. "kmem -v" and "kmem -n" both show the same memory mappings
(as best as _my_ memory serves, that is. For certain, neither has a mapping
for 0xffffea001bb1d1e8.)
OK so you say you cannot get the mappings for it, but what
does "vtop 0xffffea001bb1d1e8" show?
This:
crash> vtop 0xffffea001bb1d1e8
VIRTUAL           PHYSICAL
ffffea001bb1d1e8  879b1d1e8

PML4 DIRECTORY: ffffffff817e7000
PAGE DIRECTORY: 87fdf7067
   PUD: 87fdf7000 => 87fdf6067
   PMD: 87fdf66e8 => 8000000879a001e3
  PAGE: 879a00000  (2MB)

      PTE         PHYSICAL   FLAGS
8000000879a001e3  879a00000  (PRESENT|RW|ACCESSED|DIRTY|PSE|GLOBAL|NX)
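As a cross-check on the numbers above, the physical address can be recomputed by hand from the PMD entry. The sketch below assumes the usual x86_64 layout for a 2MB (PSE) mapping, with the page base in bits 21 through 51 of the entry and the low 21 bits of the virtual address as the offset. large_page_phys() is a hypothetical helper name, not a crash function.

```c
#include <stdint.h>

/* Physical address for a virtual address mapped by a 2MB PMD entry.
 * Assumes x86_64: bits 21..51 of the entry hold the page base, and
 * the flag bits (including NX in bit 63) are masked off. */
uint64_t large_page_phys(uint64_t pmd_entry, uint64_t vaddr)
{
        const uint64_t base_mask   = 0x000FFFFFFFE00000ULL;  /* bits 21..51 */
        const uint64_t offset_mask = 0x00000000001FFFFFULL;  /* low 21 bits */

        return (pmd_entry & base_mask) | (vaddr & offset_mask);
}
```

With the entry 8000000879a001e3 and the address ffffea001bb1d1e8, this gives page base 879a00000 plus offset 11d1e8, that is 879b1d1e8, which agrees with the PHYSICAL column.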
But given:
Sorry -- that's irrelevant. You want to access the physical
memory that the odd vmemmap page address references (not the
physical page behind the page structure itself).
Exactly right. I need to be able to see the binary bits for that page so I can
pull them in and write them back out to a file of just those bits. From there,
we'll be formatting a text file showing the lustre trace log.
Thank you so much! Regards, Bruce