RHEL6.7.z: NFS client with kernels 2.6.32-573.10.2.el6 or above hangs with 'not responding, still trying' messages and running processes in _spin_lock
Environment
- Red Hat Enterprise Linux 6.7 (NFS client)
- kernels 2.6.32-573.10.2.el6 or above and earlier than 2.6.32-573.27.1.el6
- kernel earlier than 2.6.32-642.el6
- NFS
Issue
Seeing BUG: soft lockup messages with a process stuck in _spin_lock called from nfs_clear_request_commit
NFS client with flush thread stuck in _spin_lock called from nfs_scan_commit
NFS client with many running processes stuck in _spin_lock, and nfs: server XYZ not responding, still trying seen
Several processes are kept in D state (disk sleep) and never returns. A "df" command never retuns, and hangs. Even a "sudo" command does not work. Load average keep increasing with no process consuming high CPU. Ultimately we had to reboot the server.
When the problem strikes, we see some of these messages:
Feb 25 10:27:28 nfs-client kernel: [64627.406889] lockd: server nfs-server.example.com not responding, still trying
Resolution
Red Hat Enterprise Linux 6
- This issue has been resolved with the errata RHSA-2016-0855 package(s)
kernel-2.6.32-642.el6or higher for the channel(s) RHEL Desktop (v.6), RHEL HPC Node (v.6), RHEL Server (v.6), RHEL Workstation (v.6).- Originally tracked inside private Red Hat Bug 1322823 - "RHEL6.7.z: nfs client kernel deadlock due to inode->i_lock == nfs_commit_info.lock in code path nfs_wb_page_cancel -> nfs_clear_request_commit due to patch to fix bug 1135601."
Red Hat Enterprise Linux 6.7.z (EUS)
Workaround
- Downgrade the kernel to a release prior to 2.6.32-573.10.2.el6, for example
# yum downgrade kernel-2.6.32-573.8.1.el6
Root Cause
- Regression in certain RHEL6.7.z kernels caused by the fix to Bug 1135601 - System panics with "kernel BUG at fs/nfs/inode.c:113!
- The technical explanation of the problem is as follows. First, note that inside the NFS client code, the address of
inode->i_lockspinlock is copied into another NFS client structure,nfs_commit_info.lock(see function nfs_init_cinfo_from_inode). To fix bug 1135601, theinode->i_lockspinlock is now taken insidenfs_wb_page_canceland before callingnfs_clear_request_commit. Unfortunately, insidenfs_clear_request_commit, thenfs_commit_info.lockspinlock is taken, which results in the same spinlock being taken twice and a deadlock results. - The simple fix is to drop the inode->i_lock before calling
nfs_clear_request_commit:
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 7d8c4d3..5ae82c2 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -1784,8 +1784,8 @@ int nfs_wb_page_cancel(struct inode *inode, struct page *page)
break;
}
if (nfs_lock_request(req)) {
- nfs_clear_request_commit(req);
spin_unlock(&inode->i_lock);
+ nfs_clear_request_commit(req);
nfs_inode_remove_request(req);
/*
* In case nfs_inode_remove_request has marked the
Diagnostic Steps
General steps
- Configure kdump and gather a vmcore at the time of the hang
Analysis of vmcore
- Look inside the vmcore for an active process similar to the following, running and stuck inside
_spin_lock, which is called fromnfs_clear_request_commit:
crash> foreach active bt
....
PID: 11305 TASK: ffff881c7715aab0 CPU: 8 COMMAND: "sas"
#0 [ffff881078806e90] crash_nmi_callback at ffffffff81033cf6
#1 [ffff881078806ea0] notifier_call_chain at ffffffff8153f545
#2 [ffff881078806ee0] atomic_notifier_call_chain at ffffffff8153f5aa
#3 [ffff881078806ef0] notify_die at ffffffff810a783e
#4 [ffff881078806f20] do_nmi at ffffffff8153d203
#5 [ffff881078806f50] nmi at ffffffff8153cac0
[exception RIP: _spin_lock+0x1c]
RIP: ffffffff8153c32c RSP: ffff88104032fb58 RFLAGS: 00000297
RAX: 00000000000009c1 RBX: ffff880a82b84bc0 RCX: 0000000000000000
RDX: 00000000000009c0 RSI: ffff8809ff681820 RDI: ffff8809ff6818d0
RBP: ffff88104032fb58 R8: ffff88104032fc80 R9: 0000000000000002
R10: ffffea0022eec790 R11: 0000000000000003 R12: ffff88104032fb68
R13: 0000000000000000 R14: ffff8809ff6818d0 R15: 0000000000000020
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
--- <NMI exception stack> ---
#6 [ffff88104032fb58] _spin_lock at ffffffff8153c32c
#7 [ffff88104032fb60] nfs_clear_request_commit at ffffffffa03e9dc1 [nfs]
#8 [ffff88104032fbb0] nfs_wb_page_cancel at ffffffffa03e9ec8 [nfs]
#9 [ffff88104032fbe0] nfs_invalidate_page at ffffffffa03d74f7 [nfs]
#10 [ffff88104032fc00] do_invalidatepage at ffffffff8113edb5
#11 [ffff88104032fc10] truncate_inode_page at ffffffff8113f0d2
#12 [ffff88104032fc30] truncate_inode_pages_range at ffffffff8113f47f
#13 [ffff88104032fd20] truncate_inode_pages at ffffffff8113f8a5
#14 [ffff88104032fd30] truncate_pagecache at ffffffff8113f8ff
#15 [ffff88104032fd60] nfs_setattr_update_inode at ffffffffa03d9eb9 [nfs]
#16 [ffff88104032fd90] nfs3_proc_setattr at ffffffffa03eca6b [nfs]
#17 [ffff88104032fe20] nfs_setattr at ffffffffa03db610 [nfs]
#18 [ffff88104032fe60] notify_change at ffffffff811b1338
#19 [ffff88104032fed0] do_truncate at ffffffff8118fee4
#20 [ffff88104032ff40] sys_ftruncate at ffffffff81190290
#21 [ffff88104032ff80] system_call_fastpath at ffffffff8100b0d2
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.