RHEL6.7.z: NFS client with kernels 2.6.32-573.10.2.el6 or above hangs with 'not responding, still trying' messages and running processes in _spin_lock

Solution Unverified - Updated

Environment

  • Red Hat Enterprise Linux 6.7 (NFS client)
    • kernels 2.6.32-573.10.2.el6 or above and earlier than 2.6.32-573.27.1.el6
    • kernel earlier than 2.6.32-642.el6
  • NFS

Issue

Seeing BUG: soft lockup messages with a process stuck in _spin_lock called from nfs_clear_request_commit
NFS client with flush thread stuck in _spin_lock called from nfs_scan_commit
NFS client with many running processes stuck in _spin_lock, and nfs: server XYZ not responding, still trying seen

Several processes are kept in D state (disk sleep) and never returns. A "df" command never retuns, and hangs. Even a "sudo" command does not work. Load average keep increasing with no process consuming high CPU. Ultimately we had to reboot the server.

When the problem strikes, we see some of these messages:

Feb 25 10:27:28 nfs-client kernel: [64627.406889] lockd: server nfs-server.example.com not responding, still trying

Resolution

Red Hat Enterprise Linux 6

  • This issue has been resolved with the errata RHSA-2016-0855 package(s) kernel-2.6.32-642.el6 or higher for the channel(s) RHEL Desktop (v.6), RHEL HPC Node (v.6), RHEL Server (v.6), RHEL Workstation (v.6).
    • Originally tracked inside private Red Hat Bug 1322823 - "RHEL6.7.z: nfs client kernel deadlock due to inode->i_lock == nfs_commit_info.lock in code path nfs_wb_page_cancel -> nfs_clear_request_commit due to patch to fix bug 1135601."

Red Hat Enterprise Linux 6.7.z (EUS)

Workaround

  • Downgrade the kernel to a release prior to 2.6.32-573.10.2.el6, for example
# yum downgrade kernel-2.6.32-573.8.1.el6

Root Cause

  • Regression in certain RHEL6.7.z kernels caused by the fix to Bug 1135601 - System panics with "kernel BUG at fs/nfs/inode.c:113!
  • The technical explanation of the problem is as follows. First, note that inside the NFS client code, the address of inode->i_lock spinlock is copied into another NFS client structure, nfs_commit_info.lock (see function nfs_init_cinfo_from_inode). To fix bug 1135601, the inode->i_lock spinlock is now taken inside nfs_wb_page_cancel and before calling nfs_clear_request_commit. Unfortunately, inside nfs_clear_request_commit, the nfs_commit_info.lock spinlock is taken, which results in the same spinlock being taken twice and a deadlock results.
  • The simple fix is to drop the inode->i_lock before calling nfs_clear_request_commit:
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 7d8c4d3..5ae82c2 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -1784,8 +1784,8 @@ int nfs_wb_page_cancel(struct inode *inode, struct page *page)
                        break;
                }
                if (nfs_lock_request(req)) {
-                       nfs_clear_request_commit(req);
                        spin_unlock(&inode->i_lock);
+                       nfs_clear_request_commit(req);
                        nfs_inode_remove_request(req);
                        /*
                         * In case nfs_inode_remove_request has marked the

Diagnostic Steps

General steps

  • Configure kdump and gather a vmcore at the time of the hang

Analysis of vmcore

  • Look inside the vmcore for an active process similar to the following, running and stuck inside _spin_lock, which is called from nfs_clear_request_commit:
crash> foreach active bt
....
PID: 11305  TASK: ffff881c7715aab0  CPU: 8   COMMAND: "sas"
 #0 [ffff881078806e90] crash_nmi_callback at ffffffff81033cf6
 #1 [ffff881078806ea0] notifier_call_chain at ffffffff8153f545
 #2 [ffff881078806ee0] atomic_notifier_call_chain at ffffffff8153f5aa
 #3 [ffff881078806ef0] notify_die at ffffffff810a783e
 #4 [ffff881078806f20] do_nmi at ffffffff8153d203
 #5 [ffff881078806f50] nmi at ffffffff8153cac0
    [exception RIP: _spin_lock+0x1c]
    RIP: ffffffff8153c32c  RSP: ffff88104032fb58  RFLAGS: 00000297
    RAX: 00000000000009c1  RBX: ffff880a82b84bc0  RCX: 0000000000000000
    RDX: 00000000000009c0  RSI: ffff8809ff681820  RDI: ffff8809ff6818d0
    RBP: ffff88104032fb58   R8: ffff88104032fc80   R9: 0000000000000002
    R10: ffffea0022eec790  R11: 0000000000000003  R12: ffff88104032fb68
    R13: 0000000000000000  R14: ffff8809ff6818d0  R15: 0000000000000020
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---
 #6 [ffff88104032fb58] _spin_lock at ffffffff8153c32c
 #7 [ffff88104032fb60] nfs_clear_request_commit at ffffffffa03e9dc1 [nfs]
 #8 [ffff88104032fbb0] nfs_wb_page_cancel at ffffffffa03e9ec8 [nfs]
 #9 [ffff88104032fbe0] nfs_invalidate_page at ffffffffa03d74f7 [nfs]
#10 [ffff88104032fc00] do_invalidatepage at ffffffff8113edb5
#11 [ffff88104032fc10] truncate_inode_page at ffffffff8113f0d2
#12 [ffff88104032fc30] truncate_inode_pages_range at ffffffff8113f47f
#13 [ffff88104032fd20] truncate_inode_pages at ffffffff8113f8a5
#14 [ffff88104032fd30] truncate_pagecache at ffffffff8113f8ff
#15 [ffff88104032fd60] nfs_setattr_update_inode at ffffffffa03d9eb9 [nfs]
#16 [ffff88104032fd90] nfs3_proc_setattr at ffffffffa03eca6b [nfs]
#17 [ffff88104032fe20] nfs_setattr at ffffffffa03db610 [nfs]
#18 [ffff88104032fe60] notify_change at ffffffff811b1338
#19 [ffff88104032fed0] do_truncate at ffffffff8118fee4
#20 [ffff88104032ff40] sys_ftruncate at ffffffff81190290
#21 [ffff88104032ff80] system_call_fastpath at ffffffff8100b0d2
Components
Category
Tags

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.