(2012/06/20 1:20), Dave Anderson wrote:
OK, now I'm getting confused...
I note that currently __diskdump_memory_dump() does this:
for (i = 0; i< dd->num_prstatus_notes; i++) {
fprintf(fp, " notes[%d]: %lx\n",
i, (ulong)dd->nt_prstatus_percpu[i]);
}
But your patch does this:
nrcpus = kt->kernel_NR_CPUS ? kt->kernel_NR_CPUS : NR_CPUS;
for (i = 0; i< nrcpus; i++) {
if (dd->nt_prstatus_percpu[i] == NULL) {
fprintf(fp,
" notes[%d]: (not saved)\n",
i);
continue;
}
fprintf(fp, " notes[%d]: %lx\n",
i, (ulong)dd->nt_prstatus_percpu[i]);
}
Now surely we don't want to dump "NR_CPUS" notes pointers,
unless there actually are that many cpus in the system.
For example, in RHEL6 sets CONFIG_NR_CPUS to 4096 for x86_64,
but rarely do we see a dumpfile with that many cpus.
So I would prefer that the for loop continue to be bounded
by "dd->num_prstatus_notes".
I also bothered about condition whether loop should break
at dd->nt_prstatus_percpu or not, even though remaining online cpu's
notes can't display as "not saved".
However, I choised nrcpus way since my environment could break
with online cpu nums, not NR_CPUS.
Now, I check kernel_NR_CPUS validation.
if (STREQ(ln, "CONFIG_NR_CPUS")) {
kt->kernel_NR_CPUS = atoi(val);
I'm aware if ikconfig is not valid, break condition of loop is always
NR_CPUS without doubt and display gets noisy like as your counsel.
However, given that process_elf32_notes() and process_elf64_notes()
do a cursory test of each note by checking for the NT_PRSTATUS
type, I'm presuming that in your dumpfile, dd->num_prstatus_notes
must be equal to 1, correct?
Yes, dd->num_prstatus_notes in my dumpfile is 1.
void
process_elf64_notes(void *note_buf, unsigned long size_note)
{
Elf64_Nhdr *nt;
size_t index, len = 0;
int num = 0;
for (index = 0; index< size_note; index += len) {
nt = note_buf + index;
if(nt->n_type == NT_PRSTATUS) {
dd->nt_prstatus_percpu[num] = nt;
num++;
}
len = sizeof(Elf64_Nhdr);
len = roundup(len + nt->n_namesz, 4);
len = roundup(len + nt->n_descsz, 4);
}
if (num> 0) {
pc->flags2 |= ELF_NOTES;
dd->num_prstatus_notes = num;
}
return;
}
But then map_cpus_to_prstatus_kdump_cmprs() remaps the
notes without modifying the dd->num_prstatus_notes counter.
So then there's this:
void *
diskdump_get_prstatus_percpu(int cpu)
{
if ((cpu< 0) || (cpu>= dd->num_prstatus_notes))
return NULL;
return dd->nt_prstatus_percpu[cpu];
}
So in a case such as your example, where cpu 2 was the only cpu that saved
its notes, the function above would incorrectly return NULL when called
with diskdump_get_prstatus_percpu(2).
What am I missing?
No, you are entirely right and I could not condier about x86 parts.
I'm getting the feeling that there might be more affects for any other arches.
I reconsider different approach to fix NT_PRSTATUS up.
Thanks,
Toshi
Dave
--
Crash-utility mailing list
Crash-utility(a)redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility