----- "Toshikazu Nakayama" <nakayama.ts(a)ncos.nec.co.jp> wrote:
Hi Dave,
I did some wrong changes and would understand your mentions.
Updating __per_cpu_end itself makes original UI's corruption.
And for sure, I want __per_cpu_end to be bumped with your guess.
I fixed these bugs like attachments and tested again,
of course checked st->__per_cpu_end value advised from you.
One more update is to add is_per_cpu_range().
Because there are some conditions about percpu range in symbols.c,
I think it is better to use common function than inline.
The function returns 0 (not percpu), 1 (kernels) and 2 (modules)
although callers do not distinguish between 1 and 2 now.
Hello Toshi,
Unfortunately there are still problems with this patch design,
both with the "legacy" and "current" kernel types.
I modified your patch to display the new items added to the
structures in defs.h. For example, the new "st->percpu_mod_used"
value should be shown in dump_symbol_table(), along with the
effective (extended) end of percpu space that is used by your
is_per_cpu_range() function:
crash> help -s
...
__per_cpu_start: ffffffff80419000
__per_cpu_end: ffffffff8041f508
percpu_mod_used: 8508 (extended percpu end: ffffffff80427a10)
...
Anyway, as I mentioned in my review of your first patch, support
for the "legacy" module percpu symbols is not acceptable. In
the example above, note that the end of per-cpu space gets extended
from ffffffff8041f508 to ffffffff80427a10. But that value pushes
into legitimate symbol virtual address space above "__per_cpu_end":
crash> sym -l
...
ffffffff8041f4c0 (d) per_cpu__rt_cache_stat
ffffffff8041f500 (d) per_cpu____icmp_socket
ffffffff8041f508 (A) __per_cpu_end <-- original value
ffffffff80420000 (d) .data_nosave
ffffffff80420000 (A) __nosave_begin
ffffffff80420000 (D) in_suspend
ffffffff80421000 (b) .bss
ffffffff80421000 (A) __bss_start
ffffffff80421000 (A) __nosave_end
ffffffff80421000 (B) empty_zero_page
ffffffff80422000 (B) boot_cpu_stack
ffffffff80426000 (B) boot_exception_stacks <-- bumped to ffffffff80427a10
ffffffff8042c000 (B) idt_table
ffffffff8042d000 (B) boot_delay
ffffffff8042d008 (B) printk_delay_msec
...
So, for example, take the "in_suspend" variable above, which is a global
integer variable at address ffffffff80420000:
crash> whatis in_suspend
int in_suspend;
crash>
Without your patch, it gets printed correctly like this:
crash> p in_suspend
in_suspend = $2 = 0
crash>
But with your patch, is_per_cpu_range() thinks that ffffffff80420000 is
a per-cpu symbol value because it is located within the extended range.
So the command incorrectly does this:
crash> p in_suspend
PER-CPU DATA TYPE:
int in_suspend;
PER-CPU ADDRESSES:
[0]: ffff81000157a000
[1]: ffff810001582580
[2]: ffff81000158ab00
[3]: ffff810001593080
[4]: ffff81000159b600
[5]: ffff8100015a3b80
[6]: ffff8100015ac100
[7]: ffff8100015b4680
crash>
I also have my doubts about the current kernels which have the single
"pcpu_reserved_chunk_limit" symbol, where its contents are simply added
to the base kernel's "__per_cpu_end" value. It seemingly would work OK
with x86_64, because the symbols are very low offset values, and cannot
be misconstrued for regular symbol address values:
crash> sym -l
0 (D) __per_cpu_start
0 (V) irq_stack_union
4000 (V) gdt_page
5000 (V) exception_stacks
b000 (V) tlb_vector_offset
...
15740 (V) call_single_queue
15780 (V) csd_data
157c0 (V) softnet_data
15900 (D) __per_cpu_end
ffffffff81000000 (T) _text
ffffffff81000000 (T) startup_64
ffffffff810000b7 (t) ident_complete
ffffffff81000100 (T) secondary_startup_64
...
But, taking a 32-bit x86 2.6.34-rc5 kernel as an example, the per-cpu symbols
are as shown below:
crash> sym -l
...
c0a90000 (D) __per_cpu_start <==
c0a90000 (V) gdt_page
c0a91000 (V) xen_vcpu
c0a91020 (V) xen_vcpu_info
c0a91060 (V) idt_desc
...
c0a995c0 (V) cfd_data
c0a99600 (V) call_single_queue
c0a99640 (V) csd_data
c0a99654 (D) __per_cpu_end <==
c0a9a000 (r) .smp_locks
c0a9a000 (D) __init_end
c0a9a000 (R) __smp_locks
c0aa1000 (b) .bss
c0aa1000 (B) __bss_start
c0aa1000 (R) __smp_locks_end
c0aa1000 (b) swapper_pg_pmd
c0aa2000 (b) swapper_pg_fixmap
c0aa3000 (B) empty_zero_page
c0aa4000 (b) level1_ident_pgt
...
So similar to the case of the "legacy" x86_64 example, if the
"pcpu_reserved_chunk_limit" value is simply added to the value
of the base kernel's "__per_cpu_end" value of c0a99654, it will
extend into (and overlap) legitimate base kernel virtual address
space.
And I haven't even bothered to look into how it would affect
the operation of all of the other architectures...
Perhaps the patch can be re-done so that the code can handle it on a
per-module basis without inadvertently affecting everything else?
In any case, there's no way I can accept the patch as it is
currently designed.
Thanks,
Dave