Hi Dave,
Thanks a lot, I caught several issues about current patches.
> I did some wrong changes and would understand your mentions.
>
> Updating __per_cpu_end itself makes original UI's corruption.
> And for sure, I want __per_cpu_end to be bumped with your guess.
>
> I fixed these bugs like attachments and tested again,
> of course checked st->__per_cpu_end value advised from you.
>
> One more update is to add is_per_cpu_range().
> Because there are some conditions about percpu range in symbols.c,
> I think it is better to use common function than inline.
> The function returns 0 (not percpu), 1 (kernels) and 2 (modules)
> although callers do not distinguish between 1 and 2 now.
Hello Toshi,
Unfortunately there are still problems with this patch design,
both with the "legacy" and "current" kernel types.
So similar to the case of the "legacy" x86_64 example, if
the
"pcpu_reserved_chunk_limit" value is simply added to the value
of the base kernel's "__per_cpu_end" value of c0a99654, it will
extend into (and overlap) legitimate base kernel virtual address
space.
And I haven't even bothered to look into how it would affect
the operation of all of the other architectures...
Perhaps the patch can be re-done so that the code can handle it on a
per-module basis without inadvertently affecting everything else?
In any case, there's no way I can accept the patch as it is
currently designed.
I handled module symbol in kernel scope (target is not System.map).
In kernel, percpu data can't access directly with mapped virtual address.
Related kernel macros are sometimes arch depends or looks compiler depend.
(I can not understand enough...)
Since percpu data area is entirely dynamic allocation,
additionally some arch locate part of them at VMALLOC area.
They are not straight-mapped so some arch are probably overlapped.
Anyway with previous approach,
there is no possible solution in my idea at least,
but I might get really right way from your proposal.
Retake attached patches with per module space.
I use module.percpu which guarantee straight-mapping in every module space.
Also its size get from bfd_section_size() for legacy percpu implementation
although latest kernel gives percpu_size.
- I am sorry but I don't have i386 target environs now, not tested about i386.
(I think that module.percpu can work well even if it is VMALLOC area.)
But there was still more problem at gdb interface.
When p &(module percpu) is not resolved by syment,
gdb_pass_through() require pre-registration about module sections.
The result gave wrong symbol value of module percpu.
I append percpu section in gdb request from add_symbol_file_kallsyms()
although percpu is not in kallsyms...
Because multiple GNU_ADD_SYMBOL_FILE requests for one module did not work well.
I am not very well around these implementations.
Can I resolve it with other better way?
Thanks,
Toshi.
>I modified your patch to display the new items added to the
>structures in defs.h. For example, the new "st->percpu_mod_used"
>value should be shown in dump_symbol_table(), along with the
>effective (extended) end of percpu space that is used by your
>is_per_cpu_range() function:
>
> crash> help -s
> ...
> __per_cpu_start: ffffffff80419000
> __per_cpu_end: ffffffff8041f508
> percpu_mod_used: 8508 (extended percpu end: ffffffff80427a10)
> ...
>
>Anyway, as I mentioned in my review of your first patch, support
>for the "legacy" module percpu symbols is not acceptable. In
>the example above, note that the end of per-cpu space gets extended
>from ffffffff8041f508 to ffffffff80427a10. But that value pushes
>into legitimate symbol virtual address space above "__per_cpu_end":
>
> crash> sym -l
> ...
> ffffffff8041f4c0 (d) per_cpu__rt_cache_stat
> ffffffff8041f500 (d) per_cpu____icmp_socket
> ffffffff8041f508 (A) __per_cpu_end <-- original value
> ffffffff80420000 (d) .data_nosave
> ffffffff80420000 (A) __nosave_begin
> ffffffff80420000 (D) in_suspend
> ffffffff80421000 (b) .bss
> ffffffff80421000 (A) __bss_start
> ffffffff80421000 (A) __nosave_end
> ffffffff80421000 (B) empty_zero_page
> ffffffff80422000 (B) boot_cpu_stack
> ffffffff80426000 (B) boot_exception_stacks <-- bumped to ffffffff80427a10
> ffffffff8042c000 (B) idt_table
> ffffffff8042d000 (B) boot_delay
> ffffffff8042d008 (B) printk_delay_msec
> ...
>
>So, for example, take the "in_suspend" variable above, which is a global
>integer variable at address ffffffff80420000:
>
> crash> whatis in_suspend
> int in_suspend;
> crash>
>
>Without your patch, it gets printed correctly like this:
>
> crash> p in_suspend
> in_suspend = $2 = 0
> crash>
>
>But with your patch, is_per_cpu_range() thinks that ffffffff80420000 is
>a per-cpu symbol value because it is located within the extended range.
>So the command incorrectly does this:
>
> crash> p in_suspend
> PER-CPU DATA TYPE:
> int in_suspend;
> PER-CPU ADDRESSES:
> [0]: ffff81000157a000
> [1]: ffff810001582580
> [2]: ffff81000158ab00
> [3]: ffff810001593080
> [4]: ffff81000159b600
> [5]: ffff8100015a3b80
> [6]: ffff8100015ac100
> [7]: ffff8100015b4680
> crash>
>
>I also have my doubts about the current kernels which have the single
>"pcpu_reserved_chunk_limit" symbol, where its contents are simply added
>to the base kernel's "__per_cpu_end" value. It seemingly would work OK
>with x86_64, because the symbols are very low offset values, and cannot
>be misconstrued for regular symbol address values:
>
> crash> sym -l
> 0 (D) __per_cpu_start
> 0 (V) irq_stack_union
> 4000 (V) gdt_page
> 5000 (V) exception_stacks
> b000 (V) tlb_vector_offset
> ...
> 15740 (V) call_single_queue
> 15780 (V) csd_data
> 157c0 (V) softnet_data
> 15900 (D) __per_cpu_end
> ffffffff81000000 (T) _text
> ffffffff81000000 (T) startup_64
> ffffffff810000b7 (t) ident_complete
> ffffffff81000100 (T) secondary_startup_64
> ...
>
>But, taking a 32-bit x86 2.6.34-rc5 kernel as an example, the per-cpu symbols
>are as shown below:
>
> crash> sym -l
> ...
> c0a90000 (D) __per_cpu_start <==
> c0a90000 (V) gdt_page
> c0a91000 (V) xen_vcpu
> c0a91020 (V) xen_vcpu_info
> c0a91060 (V) idt_desc
> ...
> c0a995c0 (V) cfd_data
> c0a99600 (V) call_single_queue
> c0a99640 (V) csd_data
> c0a99654 (D) __per_cpu_end <==
> c0a9a000 (r) .smp_locks
> c0a9a000 (D) __init_end
> c0a9a000 (R) __smp_locks
> c0aa1000 (b) .bss
> c0aa1000 (B) __bss_start
> c0aa1000 (R) __smp_locks_end
> c0aa1000 (b) swapper_pg_pmd
> c0aa2000 (b) swapper_pg_fixmap
> c0aa3000 (B) empty_zero_page
> c0aa4000 (b) level1_ident_pgt
> ...
>
So similar to the case of the "legacy" x86_64 example, if
the
"pcpu_reserved_chunk_limit" value is simply added to the value
of the base kernel's "__per_cpu_end" value of c0a99654, it will
extend into (and overlap) legitimate base kernel virtual address
space.
And I haven't even bothered to look into how it would affect
the operation of all of the other architectures...
Perhaps the patch can be re-done so that the code can handle it on a
per-module basis without inadvertently affecting everything else?
In any case, there's no way I can accept the patch as it is
currently designed.
>
>Thanks,
> Dave