Hi Dave,

 

In a recurring series of crashes on kernel 3.10.0-693.17.1.el7.x86_64 (RHEL 7.4), the problem was triggered when a process with this stack trace deleted a file:

 

#0 [ffff88115b0ef798] __schedule at ffffffff816ab2ac

#1 [ffff88115b0ef828] schedule at ffffffff816ab8a9

#2 [ffff88115b0ef838] jbd2_log_wait_commit at ffffffffc0177455 [jbd2]

#3 [ffff88115b0ef8b0] jbd2_log_do_checkpoint at ffffffffc017405f [jbd2]

#4 [ffff88115b0ef918] __jbd2_log_wait_for_space at ffffffffc017445f [jbd2]

#5 [ffff88115b0ef960] add_transaction_credits at ffffffffc016e3d3 [jbd2]

#6 [ffff88115b0ef9c0] start_this_handle at ffffffffc016e5e1 [jbd2]

#7 [ffff88115b0efa58] jbd2__journal_restart at ffffffffc016ec9e [jbd2]

#8 [ffff88115b0efa98] jbd2_journal_restart at ffffffffc016ed13 [jbd2]

#9 [ffff88115b0efaa8] ext4_truncate_restart_trans at ffffffffc027be7e [ext4]

#10 [ffff88115b0efad8] ext4_free_branches at ffffffffc02bc9d7 [ext4]

#11 [ffff88115b0efb38] ext4_free_branches at ffffffffc02bc887 [ext4]

#12 [ffff88115b0efb98] ext4_free_branches at ffffffffc02bc887 [ext4]

#13 [ffff88115b0efbf8] ext4_ind_truncate at ffffffffc02bda5e [ext4]

#14 [ffff88115b0efcb8] ext4_truncate at ffffffffc0280ca8 [ext4]

#15 [ffff88115b0efcf0] ext4_evict_inode at ffffffffc02818e0 [ext4]

#16 [ffff88115b0efd10] evict at ffffffff8121f879

#17 [ffff88115b0efd38] iput at ffffffff81220189

#18 [ffff88115b0efd68] rfs_d_iput at ffffffffc03f7d10 [redirfs]

#19 [ffff88115b0efe00] dentry_kill at ffffffff8121a90c

#20 [ffff88115b0efe30] dput at ffffffff8121a9ce

#21 [ffff88115b0efe50] path_put at ffffffff8120d576

#22 [ffff88115b0efe68] gsch_nd_release at ffffffffc043733e [gsch]

#23 [ffff88115b0efe78] gsch_unlink_hook_fn at ffffffffc043

 

For our troubleshooting it was useful to identify the file being deleted in each of these crashes. dentry_kill() takes a dentry pointer as its only argument:

 

struct dentry *dentry_kill(struct dentry *);

 

Finding the dentry pointer (using the ‘fregs’ command in PyKdump, or digging it off the stack):

 

#19 dentry_kill called from 0xffffffff8121a9ce <dput+94>

+R12: 0xffff880328399858

+R13: 0x0

+R14: 0xffff880a0fb975b8

+RBP: 0xffff88115b0efe48

+RBX: 0xffff880328399800

1 RDI: 0xffff880328399800 arg0 struct dentry *

 

‘files -d’ on this dentry doesn’t return the path:

 

crash64> files -d 0xffff880328399800

     DENTRY           INODE           SUPERBLK     TYPE PATH

ffff880328399800                0        0         N/A

 

This is because dentry.d_inode is null; at this point in the removal process, dentry_iput() has cleared it.

 

crash64> dentry.d_inode 0xffff880328399800

  d_inode = 0x0

 

And display_dentry_info() in crash gives up if this is the case:

 

        if (!inode || !superblock)
                goto nopath;

 

But the dentry still contains all the information needed to find the path:

 

crash64> dentry.d_sb,d_name ffff880328399800

  d_sb = 0xffff8817daaf6800

  d_name = {

    {

      {

        hash = 3169988838,

        len = 30

      },

      hash_len = 132019007718

    },

    name = 0xffff880328399838 "MRAQ0431_1_10357_982129979.arc"
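As a quick sanity check on the d_name output above, the three numbers are consistent with each other: hash_len is the 64-bit view of the same storage as hash and len. The sketch below assumes the little-endian x86_64 layout, where hash sits in the low 32 bits and len in the high 32 bits. The helper names are hypothetical and are not part of the kernel or of crash.

```c
#include <stdint.h>

/*
 * Hypothetical helpers modelling the hash/len/hash_len union in struct qstr.
 * Assumption: little-endian 64-bit layout, hash in the low 32 bits and
 * len in the high 32 bits.
 */
uint64_t qstr_pack_hash_len(uint32_t hash, uint32_t len)
{
    return ((uint64_t)len << 32) | hash;
}

/* Extract the low 32 bits (the hash). */
uint32_t qstr_unpack_hash(uint64_t hash_len)
{
    return (uint32_t)hash_len;
}

/* Extract the high 32 bits (the name length in bytes). */
uint32_t qstr_unpack_len(uint64_t hash_len)
{
    return (uint32_t)(hash_len >> 32);
}
```

With hash = 3169988838 and len = 30 this gives 30 * 2^32 + 3169988838 = 132019007718, which matches the hash_len shown, and 30 is also the length of the name string.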

 

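So I modified the following (in the diffs below, the lines marked ‘<’ are the patched code and the lines marked ‘>’ are the original):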

 

(defs.h)

2069d2068

<         long dentry_d_sb;

 

(filesys.c)

1698,1702c1698,1702

<         } else {

<                 inode_buf = NULL;

<         }

< 

<         superblock = ULONG(dentry_buf + OFFSET(dentry_d_sb));

---

>                 superblock = ULONG(inode_buf + OFFSET(inode_i_sb));

>       } else {

>               inode_buf = NULL;

>               superblock = 0;

>       }

1704c1704

<       if (!superblock)

---

>       if (!inode || !superblock)

2018d2017

<       MEMBER_OFFSET_INIT(dentry_d_sb, "dentry", "d_sb");
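To make the effect of the change easier to see, here is a minimal stand-alone sketch of the decision before and after the patch. It is only a model: the struct and function names are hypothetical, and the real code in filesys.c reads these fields from the dump through dentry_buf and inode_buf instead of dereferencing pointers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the fields crash reads from the dump. */
struct mock_inode {
    uint64_t i_sb;              /* models inode.i_sb */
};

struct mock_dentry_sb {
    struct mock_inode *d_inode; /* models dentry.d_inode, NULL once cleared */
    uint64_t d_sb;              /* models dentry.d_sb */
};

/*
 * Original logic: the superblock is taken from the inode, so a NULL d_inode
 * leaves it at 0 and the "!inode || !superblock" test skips the path lookup.
 */
int path_lookup_attempted_before(const struct mock_dentry_sb *d)
{
    uint64_t superblock = d->d_inode ? d->d_inode->i_sb : 0;
    return d->d_inode != NULL && superblock != 0;
}

/*
 * Patched logic: the superblock is taken from the dentry itself, and only a
 * zero superblock skips the path lookup.
 */
int path_lookup_attempted_after(const struct mock_dentry_sb *d)
{
    return d->d_sb != 0;
}
```

For the dentry in this dump (d_inode = 0x0, d_sb = 0xffff8817daaf6800), the first function returns 0 and the second returns 1.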

 

With this patch, ‘files -d’ correctly returns the path:

 

crash> files -d ffff880328399800

     DENTRY           INODE           SUPERBLK     TYPE PATH

ffff880328399800                0 ffff8817daaf6800 N/A /u02/oraarch/MRAQ0431/MRAQ0431_1_10357_982129979.arc
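For readers less familiar with why the superblock is enough, the following is a rough user-space sketch of the general idea: the name components come from walking the dentry's parent chain, and the superblock identifies the mount whose mount point is prepended. This is not crash's actual implementation. The struct, the function name, and the convention that the filesystem root dentry is its own parent in this model are all assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the kernel's struct dentry. */
struct mock_dentry {
    const char *name;            /* models d_name.name */
    struct mock_dentry *parent;  /* models d_parent; the root is its own parent */
};

/*
 * Build "<mount_prefix>/<dir>/.../<name>" in buf by walking parent pointers
 * from the leaf up to the filesystem root, filling the buffer from the end.
 * mount_prefix is assumed to have no trailing slash.
 * Returns 0 on success, -1 if buf is too small.
 */
int mock_dentry_path(const struct mock_dentry *d, const char *mount_prefix,
                     char *buf, size_t buflen)
{
    char *end = buf + buflen;
    char *p = end;
    size_t n;

    if (buflen == 0)
        return -1;
    *--p = '\0';

    /* Prepend "/name" for each component until the root is reached. */
    while (d->parent != d) {
        n = strlen(d->name);
        if ((size_t)(p - buf) < n + 1)
            return -1;
        p -= n;
        memcpy(p, d->name, n);
        *--p = '/';
        d = d->parent;
    }

    /* Prepend the mount point that the superblock maps to. */
    n = strlen(mount_prefix);
    if ((size_t)(p - buf) < n)
        return -1;
    p -= n;
    memcpy(p, mount_prefix, n);

    /* Slide the finished string, including its NUL, to the buffer start. */
    memmove(buf, p, (size_t)(end - p));
    return 0;
}
```

Where the mount point ends and the directory components begin in the path above is not visible from the output shown here, so any particular split passed to this function would be a guess.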

 

Can this be included as a patch to crash?

 

Thanks,

Martin

 

Martin Moore
Linux/Tru64 RTCC Engineer

CSC Americas

HPE Technology Services
Hewlett Packard Enterprise