On Wed, 2005-10-26 at 17:48 -0400, Dave Anderson wrote:
Badari Pulavarty wrote:
> On Wed, 2005-10-26 at 16:27 -0400, Dave Anderson wrote:
> > Badari Pulavarty wrote:
> >
> > > On Wed, 2005-10-26 at 14:41 -0400, Dave Anderson wrote:
> > > > Sorry I've generated some unnecessary confusion re: my comments
> > > > about the use of DEFINE_PER_CPU and DECLARE_PER_CPU.
> > > > That's what I get for trying to multi-task...
> > > >
> > > > Stepping back, the init_tss array is defined in
> > > > "arch/x86_64/kernel/init_task.c".
> > > >
> > > > In 2.6.9, it's declared like so:
> > > >
> > > > /*
> > > >  * per-CPU TSS segments. Threads are completely 'soft' on Linux,
> > > >  * no more per-task TSS's. The TSS size is kept cacheline-aligned
> > > >  * so they are allowed to end up in the .data.cacheline_aligned
> > > >  * section. Since TSS's are completely CPU-local, we want them
> > > >  * on exact cacheline boundaries, to eliminate cacheline ping-pong.
> > > >  */
> > > > DEFINE_PER_CPU(struct tss_struct, init_tss) ____cacheline_maxaligned_in_smp;
> > > >
> > > > In 2.6.13, it's slightly different in that it is initialized to INIT_TSS:
> > > >
> > > > /*
> > > >  * per-CPU TSS segments. Threads are completely 'soft' on Linux,
> > > >  * no more per-task TSS's. The TSS size is kept cacheline-aligned
> > > >  * so they are allowed to end up in the .data.cacheline_aligned
> > > >  * section. Since TSS's are completely CPU-local, we want them
> > > >  * on exact cacheline boundaries, to eliminate cacheline ping-pong.
> > > >  */
> > > > DEFINE_PER_CPU(struct tss_struct, init_tss) ____cacheline_maxaligned_in_smp = INIT_TSS;
> > > >
> > > > Both kernels have the same DECLARE_PER_CPU in the
> > > > "x86_64/processor.h" header file:
> > > >
> > > > DECLARE_PER_CPU(struct tss_struct,init_tss);
> > > >
> > > > That being the case, and not seeing why the INIT_TSS initialization
> > > > should have anything to do with the problem at hand, I am officially
> > > > stumped at why the 2.6.14 kernel shows the problem with your patch.
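(For reference: a rough picture of what those two macros boil down to in kernels of
that era, which is why the INIT_TSS initializer should only change the initial image
of the object, not its symbol name or section. This is paraphrased from memory of
include/linux/percpu.h; the toy struct, the initializer, and the userspace main()
are mine, just so the sketch compiles standalone.)

/*
 * Toy, userspace sketch: DEFINE_PER_CPU/DECLARE_PER_CPU (SMP case) simply
 * emit an ordinary object named per_cpu__<name>, placed in .data.percpu.
 * Per-CPU copies are made from that template at boot (see the
 * setup_per_cpu_areas() loop further down in this mail).
 */
#include <stdio.h>

#define DEFINE_PER_CPU(type, name) \
	__attribute__((__section__(".data.percpu"))) __typeof__(type) per_cpu__##name
#define DECLARE_PER_CPU(type, name) \
	extern __typeof__(type) per_cpu__##name

struct tss_struct { unsigned long rsp0; };		/* stand-in for the real struct */

DECLARE_PER_CPU(struct tss_struct, init_tss);		/* what processor.h provides    */
DEFINE_PER_CPU(struct tss_struct, init_tss) = { 0 };	/* what init_task.c provides    */

int main(void)
{
	/* The linker-visible symbol is simply "per_cpu__init_tss". */
	printf("per_cpu__init_tss template at %p\n", (void *)&per_cpu__init_tss);
	return 0;
}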
> > >
> > > Okay, I thought so too. I will take a closer look at it and let you
> > > know what I find. I am tempted to go back to 2.6.10 and see if
> > > crash works. Do you know the last kernel release where crash is
> > > known to work?
> > >
> >
> > Sorry -- for x86_64, I can't say that I do know the last version
> > that worked. Maybe somebody else on the list who uses kernels other
> > than Red Hat's RHEL4 kernels does?
> >
> > Dave
> >
>
> Dave,
>
> I tried 2.6.10 and crash worked fine there. Here is what I found
> interesting. On 2.6.10 the values seem reasonable, but on 2.6.14 they
> have huge values.
>
> 2.6.10:
> cpunum: 0 data_offset 10084b80f60
> cpunum: 1 data_offset 10084b88f60
>
> 2.6.14-rc5:
>
> cpunum: 0 data_offset ffff810084af5f60
> cpunum: 1 data_offset ffff810084afdf60
>
> I got curious about the top "0xffff8" part and trimmed it off
> (basically I did data_offset & 0x00000fffffffffff).
Well that certainly needs further explanation...
>
> Now I run into the next problem :( I am missing something basic.
>
> crash: read error: kernel virtual address: ffff81000000fa90 type: "pglist_data node_next"
>
That's probably coming from node_table_init(). Could the pglist_data
list now be using per-cpu data structures? But again, I don't understand
the significance of the ffff8 at the top of the address.
I don't know either. I did some more digging around, adding printk()s
in the 2.6.10 and 2.6.14-rc5 kernels where data_offset gets set.
Interestingly, alloc_bootmem() returns different kinds of addresses in
2.6.10 and 2.6.14-rc5 (which is what causes these huge numbers).
arch/x86_64/kernel/setup.c: setup_per_cpu_areas()

	for_each_cpu_mask (i, cpu_possible_map) {
		char *ptr;

		if (!NODE_DATA(cpu_to_node(i))) {
			printk("cpu with no node %d, num_online_nodes %d\n",
			       i, num_online_nodes());
			ptr = alloc_bootmem(size);
		} else {
			ptr = alloc_bootmem_node(NODE_DATA(cpu_to_node(i)), size);
		}
		if (!ptr)
			panic("Cannot allocate cpu data for CPU %d\n", i);
		cpu_pda[i].data_offset = ptr - __per_cpu_start;
		printk("i %d ptr %p cpustart %p offset %lx\n",
		       i, ptr, __per_cpu_start, cpu_pda[i].data_offset);
		memcpy(ptr, __per_cpu_start, __per_cpu_end - __per_cpu_start);
	}
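Since the memcpy() above clones the template at __per_cpu_start out to ptr, a
per-CPU variable's copy for cpu i should simply be its link-time address plus
cpu_pda[i].data_offset (on x86_64 of that vintage, __per_cpu_offset(cpu) is, if
I remember right, just cpu_pda[cpu].data_offset, and a debugger has to add the
same thing). A minimal sketch of that arithmetic, using the cpu 0 numbers from
the 2.6.14-rc5 output below; the symbol address is hypothetical, just something
inside the template:

/*
 * Sketch: relate cpu_pda[cpu].data_offset to the per-CPU copy set up by
 * setup_per_cpu_areas().  per_cpu_start and ptr are the cpu 0 values from
 * the 2.6.14-rc5 printk output; "sym" is a made-up per-CPU symbol address.
 */
#include <inttypes.h>
#include <stdio.h>

int main(void)
{
	uint64_t per_cpu_start = 0xffffffff805f50a0ULL;	/* __per_cpu_start            */
	uint64_t ptr           = 0xffff8100050eb000ULL;	/* cpu 0 bootmem allocation   */
	uint64_t data_offset   = ptr - per_cpu_start;	/* what the loop above stores */
	uint64_t sym           = per_cpu_start + 0x100;	/* hypothetical per-CPU symbol */

	printf("data_offset       = %" PRIx64 "\n", data_offset);	 /* ffff810084af5f60     */
	printf("cpu 0 copy of sym = %" PRIx64 "\n", sym + data_offset); /* lands at ptr + 0x100 */
	return 0;
}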
Here is the output:
2.6.14-rc5
i 0 ptr ffff8100050eb000 cpustart ffffffff805f50a0 offset ffff810084af5f60
i 1 ptr ffff8100050f3000 cpustart ffffffff805f50a0 offset ffff810084afdf60
2.6.10
i 0 ptr 0000010005100000 cpustart ffffffff8057f0a0 offset 10084b80f60
i 1 ptr 0000010005108000 cpustart ffffffff8057f0a0 offset 10084b88f60
i 2 ptr 0000010005110000 cpustart ffffffff8057f0a0 offset 10084b90f60
i 3 ptr 0000010005118000 cpustart ffffffff8057f0a0 offset 10084b98f60
i 4 ptr 0000010005120000 cpustart ffffffff8057f0a0 offset 10084ba0f60
i 5 ptr 0000010005128000 cpustart ffffffff8057f0a0 offset 10084ba8f60
i 6 ptr 0000010005130000 cpustart ffffffff8057f0a0 offset 10084bb0f60
i 7 ptr 0000010005138000 cpustart ffffffff8057f0a0 offset 10084bb8f60
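And just to sanity-check the trimming mentioned earlier (data_offset &
0x00000fffffffffff): applying that mask to the 2.6.14-rc5 offsets does put them
into the same range as the 2.6.10 ones. Throwaway check, values copied from the
outputs above:

/* Mask the 2.6.14-rc5 data_offset values and compare with the 2.6.10 ones. */
#include <inttypes.h>
#include <stdio.h>

int main(void)
{
	uint64_t new_off[] = { 0xffff810084af5f60ULL, 0xffff810084afdf60ULL };	/* 2.6.14-rc5, cpu 0/1 */
	uint64_t old_off[] = { 0x10084b80f60ULL, 0x10084b88f60ULL };		/* 2.6.10,     cpu 0/1 */
	uint64_t mask      = 0x00000fffffffffffULL;

	for (int i = 0; i < 2; i++)
		printf("cpu %d: masked %" PRIx64 "  vs 2.6.10 %" PRIx64 "\n",
		       i, new_off[i] & mask, old_off[i]);
	return 0;
}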
I don't know enough about the x86-64 stuff. Maybe I can ask Andi Kleen
about this change?
Thanks,
Badari