----- Original Message -----
Hi Dave,
Thanks for the help.
I have some doubts regarding kdump and crash utility.
I am analyzing a vmcore dump, caused by an oops at a customer location,
using the crash utility. The oops report is below:
[345132.723424] BUG: unable to handle kernel NULL pointer dereference at 0000000000000005
[345132.724928] IP: [<ffffffff811f03b3>] n_tty_read+0x58c/0x818
[345132.726100] PGD 2c8e03067 PUD 2cbd88067 PMD 0
[345132.727187] Oops: 0000 [#1] SMP
[345132.727879] last sysfs file: /sys/block/loop7/dev
[345132.728935] CPU 1
[345132.729396] Modules linked in: xt_tcpudp iptable_filter ip_tables x_tables strmfs_mod bond0 ipmi_devintf hpwdt sctp ipv6 crc32c libcrc32c loop ipmi_si tpm_tis ipmi_msghandler hpilo tpm tpm_bios psmouse serio_raw shpchp pci_hotplug container processor evdev ext3 jbd mbcache dm_mirror dm_region_hash dm_log dm_snapshot dm_mod sg sr_mod cdrom ide_pci_generic ide_core usbhid hid ata_piix ata_generic libata ehci_hcd bnx2 uhci_hcd e1000e cciss scsi_mod button thermal fan thermal_sys edd [last unloaded: scsi_wait_scan]
[345132.739511] Pid: 13366, comm: telnet Not tainted 2.6.32-cdma-18-amd64 #1 ProLiant DL380 G6
[345132.741423] RIP: 0010:[<ffffffff811f03b3>]  [<ffffffff811f03b3>] n_tty_read+0x58c/0x818
[345132.743220] RSP: 0018:ffff88031ce75da8 EFLAGS: 00010246
[345132.744469] RAX: 0000000000000000 RBX: ffff8802cbd54a68 RCX: 000000000061c044
[345132.746061] RDX: 0000000000000005 RSI: ffff88031ce75e87 RDI: ffff8802cbd54d1c
[345132.747726] RBP: ffff88031ce75eb8 R08: 0000000000000000 R09: 0000000000000000
[345132.749391] R10: 0000000000616680 R11: 0000000000000246 R12: 000000000061c044
[345132.750981] R13: ffff8802cbd54800 R14: 0000000000000000 R15: 7fffffffffffffff
[345132.752650] FS:  00007ffff7fee6f0(0000) GS:ffff880033020000(0000) knlGS:0000000000000000
[345132.754569] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[345132.755915] CR2: 0000000000000005 CR3: 000000030c408000 CR4: 00000000000006e0
[345132.757579] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[345132.759169] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[345132.760778] Process telnet (pid: 13366, threadinfo ffff88031ce74000, task ffff88031b60d580)
[345132.762707] Stack:
[345132.763162] ffff88031b60d580 ffff88031b60d580 ffff88031b60d580 ffff88031b60d580
[345132.764791] <0> 000000000061c02b 0000000000000000 0000000000000000 000000000061c02a
[345132.766510] <0> ffff8802de651a40 ffff8802cbd549c0 ffff8802cbd54c90 ffff8802cbd54d1c
[345132.768270] Call Trace:
[345132.768877] [<ffffffff81045f84>] ? default_wake_function+0x0/0xf
[345132.770309] [<ffffffff811ebf7e>] tty_read+0x7d/0xba
[345132.771526] [<ffffffff810ebcc8>] vfs_read+0xab/0x167
[345132.772541] [<ffffffff810ebe48>] sys_read+0x47/0x6f
[345132.773526] [<ffffffff8100bbc2>] system_call_fastpath+0x16/0x1b
[345132.774652] Code: 00 41 8b 85 5c 02 00 00 48 8b 9d 78 ff ff ff f0 0f b3 03 45 19 f6 49 63 95 5c 02 00 00 49 8b 85 50 02 00 00 48 8b bd 48 ff ff ff <0f> be 1c 10 e8 fc 6b 0e 00 48 89 c6 41 8b 85 5c 02 00 00 41 ff
[345132.778840] RIP [<ffffffff811f03b3>] n_tty_read+0x58c/0x818
[345132.780107] RSP <ffff88031ce75da8>
[345132.780969] CR2: 0000000000000005
[345132.781786] hpwdt: New timer passed in is 120 seconds.
[345132.782942] hpwdt: timer reset to 120 for kdump
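If I am decoding the Code: bytes correctly (this is my own analysis, so
please correct me if I am misreading it), the faulting instruction at the
RIP is:

    0f be 1c 10            movsbl (%rax,%rdx,1),%ebx

which corresponds to c = tty->read_buf[tty->read_tail], with RAX holding
read_buf and RDX holding read_tail. With RAX = 0000000000000000 and
RDX = 0000000000000005, the load address is 0x5, which matches CR2. So at
the instant of the fault, read_buf appears to have been NULL and
read_tail 5.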
After analysis, we figured out that the crash occurs in the function
n_tty_read() in kernel-source/drivers/char/n_tty.c. The oops occurred on
Linux kernel 2.6.32. Below is the code fragment where the page fault
occurred; the fault happens while executing the statement
c = tty->read_buf[tty->read_tail].
/* N.B. avoid overrun if nr == 0 */
while (nr && tty->read_cnt) {
        int eol;

        eol = test_and_clear_bit(tty->read_tail,
                                 tty->read_flags);
        c = tty->read_buf[tty->read_tail];  /* <-- page fault statement,
                                             * per our oops analysis */
        spin_lock_irqsave(&tty->read_lock, flags);
        tty->read_tail = ((tty->read_tail+1) &
                          (N_TTY_BUF_SIZE-1));
        tty->read_cnt--;
        if (eol) {
                /* this test should be redundant:
                 * we shouldn't be reading data if
                 * canon_data is 0
                 */
                if (--tty->canon_data < 0)
                        tty->canon_data = 0;
        }
        spin_unlock_irqrestore(&tty->read_lock, flags);
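My suspicion is that the window comes from sampling read_buf and
read_tail before read_lock is taken. Purely to illustrate what I mean
(a sketch of my own, not the actual upstream code or a proposed fix), a
version of the loop that takes the lock before touching the buffer would
look roughly like this:

    while (nr && tty->read_cnt) {
            int eol;

            spin_lock_irqsave(&tty->read_lock, flags);
            if (!tty->read_cnt) {
                    /* re-check under the lock: a concurrent flush may
                     * have emptied the buffer since the while() test */
                    spin_unlock_irqrestore(&tty->read_lock, flags);
                    break;
            }
            eol = test_and_clear_bit(tty->read_tail, tty->read_flags);
            c = tty->read_buf[tty->read_tail];
            tty->read_tail = ((tty->read_tail + 1) & (N_TTY_BUF_SIZE - 1));
            tty->read_cnt--;
            if (eol && --tty->canon_data < 0)
                    tty->canon_data = 0;
            spin_unlock_irqrestore(&tty->read_lock, flags);
            /* ... copy c to the user buffer as before ... */
    }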
Below are the contents of the structure tty_struct at the time of the
oops. This structure was passed as an argument to n_tty_read():
tty_struct ffff8802cbd54800
struct tty_struct { ...
magic = 21505,
driver = 0xffff88031b54ea00,
ops = 0xffffffff8130f650,
name = "pts9\000\...",
driver_data = 0xffff88029c8a9668,
icanon = 1 '\001',
read_buf = 0xffff8802cbfe6000 "",
read_head = 0,
read_tail = 0,
read_cnt = 0,
read_flags = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
              0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
              0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
              0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
canon_data = 0,
......................................
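For reference, this is how I extracted the structure above; the
individual fields can also be re-checked directly (the comma-separated
member list works on the crash versions I have used, but that may vary):

    crash> struct tty_struct ffff8802cbd54800
    crash> struct tty_struct.read_buf,read_tail,read_cnt,canon_data ffff8802cbd54800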
As per the crash utility, the field read_cnt was 0 when the kernel
oopsed. In that case, the condition while (nr && tty->read_cnt) in the
code fragment above should have evaluated false, and the faulting
statement should never have been reached. This leads me to think that
some other thread/task in the kernel updated the read_cnt field in
parallel. However, the crash utility reports the runqueues of all CPUs at
the time of the crash as idle, except for CPU 1, which was executing the
user program telnet in kernel context (in a system call). Below is the
runqueue output:
CPU 0 RUNQUEUE: ffff880033012d80
CURRENT: PID: 0 TASK: ffffffff814204b0 COMMAND: "swapper"
RT PRIO_ARRAY: ffff880033012e98
[no tasks queued]
CFS RB_ROOT: ffff880033012e10
[no tasks queued]
CPU 1 RUNQUEUE: ffff880033032d80
CURRENT: PID: 13366 TASK: ffff88031b60d580 COMMAND: "telnet"
RT PRIO_ARRAY: ffff880033032e98
[no tasks queued]
CFS RB_ROOT: ffff880033032e10
[no tasks queued]
CPU 2 RUNQUEUE: ffff880033052d80
CURRENT: PID: 0 TASK: ffff88031e0e3540 COMMAND: "swapper"
RT PRIO_ARRAY: ffff880033052e98
[no tasks queued]
CFS RB_ROOT: ffff880033052e10
[no tasks queued]
CPU 3 RUNQUEUE: ffff880033072d80
CURRENT: PID: 0 TASK: ffff88031e113580 COMMAND: "swapper"
RT PRIO_ARRAY: ffff880033072e98
[no tasks queued]
CFS RB_ROOT: ffff880033072e10
[no tasks queued]
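If it helps, I can also check which tasks ran most recently before the
crash. As far as I understand the crash utility (please correct me if
these options behave differently in other versions), tasks can be listed
by their last-run timestamps:

    crash> ps -l        tasks sorted by last_run/timestamp, most recently run first
    crash> ps -m        the same ordering, with timestamps shown as time deltas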
How is this logically possible? Crash reports that no other tasks were
running at the time. Or did some process/thread run before the oops
triggered and kdump captured the memory image, and update the data
structure in that window? I would like to know whether this scenario is
possible. I kindly request your suggestions/guidance, and please let me
know if you need any other details.
It's not clear to me, but I'm not at all familiar with this code area.
Maybe during hard or soft IRQ handling on this or another cpu? Presumably
there would be protection against that happening, and maybe it's of interest
that the very next instruction after the fault is a spin_lock_irqsave()
call, but that's just a wild guess on my part...
Anyway, the crash utility shows what the state of memory was at the point
when the "telnet" process (indirectly) issued NMI interrupts to all of
the other cpus. You can verify where the other cpus were (in idle) by
"bt -a", which shows/verifies the reception of the NMI shutdown interrupt.
Dave