On 24 June 2011 15:29, Dave Anderson <anderson(a)redhat.com> wrote:
 Yeah, although the contents of tty->read_buf are hard to explain.
 It gets allocated during n_tty_open() and freed during n_tty_close().
 And at the beginning of n_tty_read() there's:
        BUG_ON(!tty->read_buf);
 and the dump-time contents show a buffer allocated:
 crash> tty_struct ffff8802cbd54800
  struct tty_struct { ...
   magic = 21505,
   driver = 0xffff88031b54ea00,
   ops = 0xffffffff8130f650,
   name = "pts9\000\...",
   driver_data = 0xffff88029c8a9668,
   icanon = 1 '\001',
   read_buf = 0xffff8802cbfe6000 "",
   read_head = 0,
   read_tail = 0,
   read_cnt = 0,
   ...
 but it's a NULL pointer when read during the function?
 
Hmm, that is interesting. Assuming that we are in fact dealing with a
software bug in which this memory changed recently, the only explanation
I can see is that n_tty_close() was called while n_tty_read() was in
progress. I know that's not supposed to happen, but that would then be the
bug: a locking issue in the tty layer that allows a tty to be closed while
calls on it are still in progress. Sadly that code isn't as mature as it
should be, and a whole load of concurrency issues were fixed in the late
2.6.30s; I don't know the details, but this might be one of those.
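
To make the interleaving I have in mind concrete, here is a minimal
userspace sketch. It is not kernel code: struct model_tty and the model_*
functions are hypothetical stand-ins I made up for illustration, and the
"concurrent" close is simulated by calling it in the middle of the read so
that the ordering is deterministic. It only assumes what is stated above,
that the buffer is allocated on open, freed on close, and checked once at
the start of the read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the one tty_struct field of interest. */
struct model_tty {
    char *read_buf;   /* allocated on open, freed on close */
};

enum read_result {
    READ_OK,          /* buffer stayed valid for the whole read */
    READ_ENTRY_BUG,   /* NULL at entry: where BUG_ON() would fire */
    READ_NULL_MIDWAY  /* passed the entry check, NULL afterwards */
};

/* Models the open path: allocate the read buffer. Returns 0 on success. */
static int model_open(struct model_tty *tty)
{
    tty->read_buf = calloc(1, 4096);
    return tty->read_buf ? 0 : -1;
}

/* Models the close path: free the buffer and clear the pointer. */
static void model_close(struct model_tty *tty)
{
    free(tty->read_buf);
    tty->read_buf = NULL;
}

/*
 * Models the read path. If close_midway is true, a close is injected
 * after the entry check, which is the ordering a missing lock would
 * permit between two tasks.
 */
static enum read_result model_read(struct model_tty *tty, bool close_midway)
{
    if (!tty->read_buf)
        return READ_ENTRY_BUG;

    if (close_midway)
        model_close(tty);      /* simulated concurrent close */

    if (!tty->read_buf)
        return READ_NULL_MIDWAY;

    return READ_OK;
}
```

Calling model_read(&tty, true) on an opened model_tty returns
READ_NULL_MIDWAY: the entry check passes, yet the pointer is NULL by the
time the body of the read would use it.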
Alternatively, it's a regular old hardware failure and we're just looking at
junk.
I think that's about as far as we can go on the information available (and
also about as far as it's relevant to this mailing list).