----- Original Message -----
At 2012/4/11 22:50, Dave Anderson Wrote:
>
>
> ----- Original Message -----
>> Hello Dave,
>>
>> I cannot get all kernels at hand. So I have to ask you about the code.
>> Please show me.
>
> Why not? Just download the upstream kernels from here:
>
>
> http://www.kernel.org/pub/linux/kernel/v2.6/
>
>>>
>>> (a) On these kernel versions:
>>>
>>> 2.6.9-89.ELxenU
>>> 2.6.15-1.2054_FC5
>>> 2.6.16.33-xen
>>> 2.6.18-1.2714.el5xen
>>> 2.6.18-36.el5xen
>>> 2.6.18-58.el5xen
>>> 2.6.18-152.el5xen
>>> 2.6.31 uniprocessor kernel
>>>
>>> the command fails immediately with this error:
>>>
>>> ipcs: cannot resolve "hugetlbfs_file_operations"
>>>
>>>
>>> (b) On *all* RHEL5 2.6.18-era kernels, the message queue display
>>> always fails like this:
>>>
>>> ------ Message Queues --------
>>> KEY MSQID UID PERMS USED-BYTES MESSAGES
>>> ipcs: invalid structure member offset: kern_ipc_perm_id
>>> FILE: ipcs.c LINE: 899 FUNCTION: get_msg_info()
>>
>> I want to see struct msg_queue and struct kern_ipc_perm.
>
> Here is the output from a RHEL5 kernel:
>
> crash> msg_queue
> struct msg_queue {
> struct kern_ipc_perm q_perm;
> int q_id;
> time_t q_stime;
> time_t q_rtime;
> time_t q_ctime;
> long unsigned int q_cbytes;
> long unsigned int q_qnum;
> long unsigned int q_qbytes;
> pid_t q_lspid;
> pid_t q_lrpid;
> struct list_head q_messages;
> struct list_head q_receivers;
> struct list_head q_senders;
> }
> SIZE: 160
> crash> kern_ipc_perm
> struct kern_ipc_perm {
> spinlock_t lock;
> int deleted;
> key_t key;
> uid_t uid;
> gid_t gid;
> uid_t cuid;
> gid_t cgid;
> mode_t mode;
> long unsigned int seq;
> void *security;
> }
> SIZE: 48
> crash>
>
> which is the same as the upstream 2.6.18 kernel.
Ahh, I know the reason now: msg_queue_q_id is not initialized!
>
>>>
>>> (c) On this 2.6.36-0.16.rc3.git0.fc15 Fedora kernel, it shows:
>>>
>>> ------ Shared Memory Segments ------
>>> KEY SHMID UID PERMS BYTES NATTCH STATUS
>>> ipcs: invalid kernel virtual address: 10 type: "nsproxy.ipc_ns"
>>
>> What is struct nsproxy? Or is there any symbol referring to
>> ipc_ns?
>
> crash> nsproxy
> struct nsproxy {
> atomic_t count;
> struct uts_namespace *uts_ns;
> struct ipc_namespace *ipc_ns;
> struct mnt_namespace *mnt_ns;
> struct pid_namespace *pid_ns;
> struct net *net_ns;
> }
> SIZE: 48
> crash>
>
> It's the same as upstream 2.6.36, but it's not the offset that's
> invalid, it's the NULL "nsproxy" address.
I am surprised that nsproxy is NULL.
Each user task belongs to a namespace, so current_task.nsproxy should not
be NULL. I guess the current task may be a kernel thread in your test.

Thanks
Wen Congyang

Actually, even kernel threads have a valid task->nsproxy setting.
But checking into this a bit further, it's not a kernel thread,
but an exiting thread. Note the invocation-time warning that
the active (panic) task has been removed from the PID hash:
$ crash vmcore.2.6.36-0.16.rc3.git0.fc15.x86_64 vmlinux-2.6.36-0.16.rc3.git0.fc15.x86_64.gz
crash 6.0.6rc5
Copyright (C) 2002-2012 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.3.1
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
please wait... (determining panic task)
WARNING: active task ffff88001d190000 on cpu 0 not found in PID hash
KERNEL: vmlinux-2.6.36-0.16.rc3.git0.fc15.x86_64.gz
DUMPFILE: vmcore.2.6.36-0.16.rc3.git0.fc15.x86_64
CPUS: 1
DATE: Fri Sep 24 20:46:58 2010
UPTIME: 00:27:55
LOAD AVERAGE: 1.53, 1.80, 1.56
TASKS: 118
NODENAME: dyna0.home.front
RELEASE: 2.6.36-0.16.rc3.git0.fc15.x86_64
VERSION: #1 SMP Fri Sep 3 16:00:27 UTC 2010
MACHINE: x86_64 (1600 Mhz)
MEMORY: 510.7 MB
PANIC: ""
PID: 7124
COMMAND: "hardlink"
TASK: ffff88001d190000 [THREAD_INFO: ffff88001b17a000]
CPU: 0
STATE: EXIT_DEAD (PANIC)
crash>
Note that the "ipcs" command uses the current task, whose task_struct
address is ffff88001d190000 in this particular case, and therefore the
task_struct.nsproxy address is ffff88001d1905f0:
crash> task -R nsproxy
PID: 7124 TASK: ffff88001d190000 CPU: 0 COMMAND: "hardlink"
nsproxy = 0x0,
crash>
Resulting in the error:
crash> set debug 4
debug: 4
text hit rate: 62% (3143 of 5040)
crash> ipcs
------ Shared Memory Segments ------
KEY SHMID UID PERMS BYTES NATTCH STATUS
<readmem: ffff88001d1905f0, KVADDR, "task_struct.nsproxy", 8, (FOE), 7fffaf719e98>
<read_kdump: addr: ffff88001d1905f0 paddr: 1d1905f0 cnt: 8>
<readmem: 10, KVADDR, "nsproxy.ipc_ns", 8, (FOE), 7fffaf719e90>
ipcs: invalid kernel virtual address: 10 type: "nsproxy.ipc_ns"
text hit rate: 62% (3143 of 5040)
crash>
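
The bogus address 10 in the readmem line above is just the NULL nsproxy
pointer plus the offset of ipc_ns within struct nsproxy. A minimal sketch
of that arithmetic follows, assuming an x86_64 layout with 8-byte pointers.
The struct mock_nsproxy and the helpers member_address() and
ipc_ns_address() are hypothetical stand-ins written for illustration; they
are not part of crash or the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the 2.6.36 struct nsproxy shown earlier.
 * Pointers are modeled as uint64_t so the layout matches x86_64. */
struct mock_nsproxy {
    int32_t  count;   /* atomic_t: 4 bytes, followed by 4 bytes of padding */
    uint64_t uts_ns;  /* offset 8 */
    uint64_t ipc_ns;  /* offset 16 (0x10) */
    uint64_t mnt_ns;
    uint64_t pid_ns;
    uint64_t net_ns;
};

/* Address at which a member is read: structure base plus member offset. */
static uint64_t member_address(uint64_t base, size_t offset)
{
    return base + (uint64_t)offset;
}

/* Address of nsproxy.ipc_ns for a given value of task_struct.nsproxy. */
static uint64_t ipc_ns_address(uint64_t nsproxy)
{
    return member_address(nsproxy, offsetof(struct mock_nsproxy, ipc_ns));
}
```

With nsproxy = 0x0, ipc_ns_address(0) evaluates to 0x10, which is the
address rejected in the debug output. The same arithmetic explains the
first readmem: ffff88001d190000 + 0x5f0 = ffff88001d1905f0.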
The "ipcs" code may have to do something similar to what the "mount"
command does here in cmd_mount():
        /* find a context */
        pid = 1;
        while ((namespace_context = pid_to_context(pid)) == NULL)
                pid++;
where namespace_context is used later in get_mount_list():
        } else if (VALID_MEMBER(task_struct_nsproxy)) {
                tc = namespace_context;

                readmem(tc->task + OFFSET(task_struct_nsproxy), KVADDR,
                        &nsproxy, sizeof(void *), "task nsproxy",
                        FAULT_ON_ERROR);
                if (!readmem(nsproxy + OFFSET(nsproxy_mnt_ns), KVADDR,
                    &mnt_ns, sizeof(void *), "nsproxy mnt_ns",
                    RETURN_ON_ERROR|QUIET))
                        error(FATAL, "cannot determine mount list location!\n");
                if (!readmem(mnt_ns + OFFSET(mnt_namespace_root), KVADDR,
                    &root, sizeof(void *), "mnt_namespace root",
                    RETURN_ON_ERROR|QUIET))
                        error(FATAL, "cannot determine mount list location!\n");
Usually pid 1 would suffice, but as I recall, Bob Montgomery ran into
a vmcore where pid 1 wasn't found in the PID hash, so we added this loop
so that it keeps looking until it finds one.
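
For "ipcs", the same idea could be sketched roughly as below. This is a
standalone mock-up, not a patch: struct mock_task, mock_pid_to_context()
and find_namespace_context() are hypothetical names that only imitate the
shape of the cmd_mount() loop. Two things here are my own assumptions and
are not in the "mount" code: the scan also skips tasks whose nsproxy is
NULL (the exiting-task case above), and it stops at a caller-supplied
max_pid instead of looping forever.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified task record; not crash's struct task_context. */
struct mock_task {
    unsigned long pid;
    uint64_t nsproxy;   /* value of task_struct.nsproxy, 0 if NULL */
};

/* Mock pid lookup: returns the matching task, or NULL if the pid is absent. */
static const struct mock_task *
mock_pid_to_context(const struct mock_task *tasks, size_t ntasks,
                    unsigned long pid)
{
    for (size_t i = 0; i < ntasks; i++)
        if (tasks[i].pid == pid)
            return &tasks[i];
    return NULL;
}

/* Scan pids upward from 1 and return the first task that exists and has
 * a non-NULL nsproxy. Returns NULL if none is found up to max_pid
 * (the bound is an assumption added for this sketch). */
static const struct mock_task *
find_namespace_context(const struct mock_task *tasks, size_t ntasks,
                       unsigned long max_pid)
{
    for (unsigned long pid = 1; pid <= max_pid; pid++) {
        const struct mock_task *tc = mock_pid_to_context(tasks, ntasks, pid);

        if (tc != NULL && tc->nsproxy != 0)
            return tc;
    }
    return NULL;
}
```

The caller would then read ipc_ns through the returned task's nsproxy
rather than through the current task, and report an error if no usable
task is found.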
Dave