Reboot hangs when an NFS share is mounted and the connection to the NFS server is lost during an I/O operation in Red Hat Enterprise Linux
Environment
- Red Hat Enterprise Linux 9
  - Kernels up to RHEL 9.4 GA are believed to be affected
- Red Hat Enterprise Linux 8
  - All kernels believed to be affected
- Red Hat Enterprise Linux 7
  - All kernels believed to be affected
- Red Hat Enterprise Linux 6
  - All kernels believed to be affected
- NFS mount point
- 'hard' mount option
- NFS IO in progress
- Network partition (can be simulated with firewalld port block on NFS server)
- 'reboot'
Issue
- Reboot hangs when an NFS share is mounted and the connection to the NFS server is lost during an I/O operation.
- An NFS disconnection message is usually logged, followed by failed unmount attempts:
nfs: server 1.1.1.1 not responding, still trying
Unmounting NFS filesystems: umount.nfs: <mount point>: device is busy
umount.nfs: <mount point>: device is busy
[FAILED]
Unmounting NFS filesystems (retry): [ OK ]
- The issue occurs when the share is mounted with the 'hard' mount option.
Resolution
Red Hat Enterprise Linux 6
- There is no fix available for this issue.
Red Hat Enterprise Linux 7
- There is no fix available for this issue.
Red Hat Enterprise Linux 8
- There is no fix available for this issue.
Red Hat Enterprise Linux 9
- A fix was implemented in kernel-5.14.0-427.13.1.el9_4. Update to at least this version of the kernel.
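
Whether a RHEL 9 client already runs a kernel at or above the fixed version can be checked by comparing the running kernel release against 5.14.0-427.13.1.el9_4. The following is a minimal sketch rather than an official tool: the function name `kernel_has_fix` is hypothetical, and it assumes that GNU `sort -V` orders these version-release strings the same way RPM does, which is reasonable for plain numeric releases but not guaranteed in general. The `rpm` and `dnf` tools remain the authoritative way to compare package versions.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the given RHEL 9 kernel release is at or
# above the fixed version named in this article (5.14.0-427.13.1.el9_4).
# Assumption: GNU "sort -V" ordering matches RPM ordering for these strings.
FIXED_KERNEL="5.14.0-427.13.1"

kernel_has_fix() {
    # Default to the running kernel when no release string is given.
    local release="${1:-$(uname -r)}"

    # The fix is only documented for RHEL 9 (el9) kernels.
    case "$release" in
        *.el9*) ;;
        *) return 1 ;;
    esac

    # Drop the ".el9_4.x86_64" style suffix, keeping e.g. "5.14.0-427.13.1".
    local version="${release%%.el*}"

    # If the fixed version sorts first (or is equal), the kernel is new enough.
    local lowest
    lowest="$(printf '%s\n' "$FIXED_KERNEL" "$version" | sort -V | head -n 1)"
    [ "$lowest" = "$FIXED_KERNEL" ]
}
```

After sourcing the function, it could be used as `kernel_has_fix && echo "fix present" || echo "update the kernel"`.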
Workarounds
- The 'soft' mount option can be used and should bound the time spent on RPCs that involve IO operations, at the cost of possible data loss. For more information, see the descriptions of the 'soft' and 'hard' mount options in the nfs man page.
- NOTE: although RPC requests eventually time out according to the 'timeo=' and 'retrans=' mount options, queued requests are not all attempted simultaneously, but in batches. As a result, if many RPC requests are already queued, it may still take a long time for all of them to time out.
- Resolve any network connectivity issue to NFS servers before rebooting an NFS client.
- Ensure all NFS IO operations are stopped before rebooting any NFS client. Do not rely on the shutdown process for killing processes that generate NFS IO operations.
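
As part of a pre-reboot check, it can help to list which NFS mounts are hard mounts, since those are the ones that can block shutdown. The sketch below is an illustration only: the function name `list_hard_nfs_mounts` is hypothetical, and it assumes the standard `/proc/mounts` layout (device, mount point, filesystem type, options). Any NFS mount without `soft` (or `softerr`, available in newer releases) is treated as hard, because 'hard' is the default.

```shell
#!/bin/bash
# Hypothetical pre-reboot helper: print the mount point of every NFS mount
# that is not mounted with "soft" or "softerr" (that is, hard mounts, since
# "hard" is the default). Reads a file in /proc/mounts format.
list_hard_nfs_mounts() {
    local mounts_file="${1:-/proc/mounts}"
    # Field 2 = mount point, field 3 = filesystem type, field 4 = options.
    awk '($3 == "nfs" || $3 == "nfs4") && $4 !~ /(^|,)(soft|softerr)(,|$)/ { print $2 }' "$mounts_file"
}
```

Each mount point printed by `list_hard_nfs_mounts` is a candidate for stopping IO and unmounting cleanly before the reboot is issued.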
Root Cause
To maintain file system stability, the Linux kernel does not allow a file system to be unmounted until all of its pending IO has been written back to storage, and the system cannot shut down until all file systems are unmounted. Therefore, if the system has NFS IO pending and is disconnected from the server (NFSD becomes unavailable, the server is shut down, the network is partitioned, and so on), the file system will not unmount and the system will not shut down. This happens when the 'hard' mount option is used, which is the default.
Usually, the client can recover once the condition causing the failure is corrected and the connection is reestablished. If the condition cannot be corrected, the client needs to be forcefully rebooted.
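
The amount of data still waiting to be written back can be inspected before a reboot is attempted. The following is a rough sketch under stated assumptions: the function name `pending_writeback_kb` is hypothetical, and it relies on the `Dirty:` and `Writeback:` counters in `/proc/meminfo`, which are system-wide and therefore also include data for non-NFS file systems.

```shell
#!/bin/bash
# Hypothetical helper: print the total of the Dirty and Writeback counters
# (in kB) from a file in /proc/meminfo format. These counters are
# system-wide, so they include non-NFS file systems as well.
pending_writeback_kb() {
    local meminfo_file="${1:-/proc/meminfo}"
    # Match the field names exactly so that "WritebackTmp:" is not counted.
    awk '$1 == "Dirty:" || $1 == "Writeback:" { total += $2 }
         END { print total + 0 }' "$meminfo_file"
}
```

A value that stays high while the NFS server is unreachable is consistent with the condition described above, although it does not prove that NFS is the file system holding the data.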
Diagnostic Steps
- Steps to reproduce:
[root@rhel69 ~]# uname -a
Linux rhel69 2.6.32-696.20.1.el6.x86_64 #1 SMP Fri Jan 12 15:07:59 EST 2018 x86_64 x86_64 x86_64 GNU/Linux
# mount -t nfs <nfs-server>:/ /mnt
# dd if=/dev/zero of=/mnt/a &
# ifdown eth0    # key point: take the network down while dd is still writing
# reboot
- The reboot then hangs with the following output:
Unmounting NFS filesystems: umount.nfs: /mnt: device is busy
umount.nfs: /mnt: device is busy
[FAILED]
- vmcore from RHEL 6.
The 'reboot' task has been blocked in uninterruptible (UN) state for 16 minutes and 55 seconds.
crash> ps -m | grep UN
[ 0 00:16:55.201] [UN] PID: 26104 TASK: ffff8809e769e040 CPU: 8 COMMAND: "reboot"
[ 0 00:16:55.595] [UN] PID: 10922 TASK: ffff880c1c896040 CPU: 7 COMMAND: "java"
crash> ps -p 26104
PID: 0 TASK: ffffffff81a95020 CPU: 0 COMMAND: "swapper"
PID: 1 TASK: ffff88061e48b520 CPU: 15 COMMAND: "init"
PID: 26104 TASK: ffff8809e769e040 CPU: 8 COMMAND: "reboot"
crash> bt
PID: 26104 TASK: ffff8809e769e040 CPU: 8 COMMAND: "reboot"
#0 [ffff880c1b8dbbc8] schedule at ffffffff8154a8e0
#1 [ffff880c1b8dbca0] io_schedule at ffffffff8154b123
#2 [ffff880c1b8dbcc0] sync_page at ffffffff8112e63d
#3 [ffff880c1b8dbcd0] __wait_on_bit at ffffffff8154bc0f
#4 [ffff880c1b8dbd20] wait_on_page_bit at ffffffff8112e873
#5 [ffff880c1b8dbd80] wait_on_page_writeback_range at ffffffff8112ec9b
#6 [ffff880c1b8dbe80] filemap_fdatawait at ffffffff8112ed5f
#7 [ffff880c1b8dbe90] sync_inodes_sb at ffffffff811c65e4
#8 [ffff880c1b8dbf20] __sync_filesystem at ffffffff811cce52
#9 [ffff880c1b8dbf40] sync_filesystems at ffffffff811ccf76
#10 [ffff880c1b8dbf70] sys_sync at ffffffff811cd001
#11 [ffff880c1b8dbf80] system_call_fastpath at ffffffff8100b0d2
RIP: 00007ffa6fe98937 RSP: 00007ffc49111868 RFLAGS: 00010202
RAX: 00000000000000a2 RBX: ffffffff8100b0d2 RCX: 00007ffa7260e200
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 00007ffc49111bd8 R8: 00000000ffffffff R9: 0000000000100000
R10: 00007ffc49111620 R11: 0000000000000246 R12: ffffffff811cd001
R13: ffff880c1b8dbf78 R14: 0000000000000000 R15: 00007ffc49111bd0
ORIG_RAX: 00000000000000a2 CS: 0033 SS: 002b
- NFSv3 environment in RHEL 7
PID: 1 TASK: ffff88012f488000 CPU: 4 COMMAND: "systemd-shutdow"
#0 [ffff88012f487998] __schedule at ffffffff8168ba15
#1 [ffff88012f487a00] schedule at ffffffff8168c069
#2 [ffff88012f487a10] rpc_wait_bit_killable at ffffffffa0242e24 [sunrpc]
#3 [ffff88012f487a30] __wait_on_bit at ffffffff81689c15
#4 [ffff88012f487a70] out_of_line_wait_on_bit at ffffffff81689cc1
#5 [ffff88012f487ae8] __rpc_execute at ffffffffa0244364 [sunrpc]
#6 [ffff88012f487b50] rpc_execute at ffffffffa024739e [sunrpc]
#7 [ffff88012f487b80] rpc_run_task at ffffffffa023a310 [sunrpc]
#8 [ffff88012f487ba0] rpc_call_sync at ffffffffa023a380 [sunrpc]
#9 [ffff88012f487bf8] nfs3_rpc_wrapper.constprop.11 at ffffffffa061e5bb [nfsv3]
#10 [ffff88012f487c28] nfs3_proc_getattr at ffffffffa061f296 [nfsv3]
#11 [ffff88012f487c70] __nfs_revalidate_inode at ffffffffa068110f [nfs]
#12 [ffff88012f487ca8] nfs_revalidate_inode at ffffffffa06818f2 [nfs]
#13 [ffff88012f487cc8] nfs_weak_revalidate at ffffffffa0678beb [nfs]
#14 [ffff88012f487ce8] complete_walk at ffffffff81208f57
#15 [ffff88012f487d08] path_lookupat at ffffffff8120c203
#16 [ffff88012f487da0] filename_lookup at ffffffff8120c94b
#17 [ffff88012f487dd8] kern_path at ffffffff81210125
#18 [ffff88012f487ea8] do_mount at ffffffff81220a71
#19 [ffff88012f487f28] sys_mount at ffffffff81221516
#20 [ffff88012f487f80] system_call_fastpath at ffffffff81697089
RIP: 00007f5668734e9a RSP: 00007ffda98d3d18 RFLAGS: 00000246
RAX: 00000000000000a5 RBX: ffffffff81697089 RCX: ffffffffffffffff
RDX: 0000000000000000 RSI: 00007f5669f7ee60 RDI: 0000000000000000
RBP: 00007f5669f7eb50 R8: 0000000000000000 R9: 00007f5669264840
R10: 0000000000000021 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffda98d3df3 R14: 00007ffda98d3e08 R15: 00007ffda98d3e08
ORIG_RAX: 00000000000000a5 CS: 0033 SS: 002b
crash> log
...
[ 8373.499406] systemd-shutdown[1]: Unmounting file systems.
[ 8553.589929] nfs: server 1.1.1.1 not responding, still trying
- NFSv4 environment in RHEL 7
PID: 1 TASK: ffff88017ce58000 CPU: 5 COMMAND: "systemd-shutdow"
#0 [ffff88017ce4fa98] __schedule at ffffffff816ab2ac
#1 [ffff88017ce4fb20] schedule at ffffffff816ab8a9
#2 [ffff88017ce4fb30] rpc_wait_bit_killable at ffffffffc036d8e4 [sunrpc]
#3 [ffff88017ce4fb50] __wait_on_bit at ffffffff816a9405
#4 [ffff88017ce4fb90] out_of_line_wait_on_bit at ffffffff816a94b1
#5 [ffff88017ce4fc08] __rpc_wait_for_completion_task at ffffffffc036d8bd [sunrpc]
#6 [ffff88017ce4fc18] _nfs4_proc_delegreturn at ffffffffc053021a [nfsv4]
#7 [ffff88017ce4fcc8] nfs4_proc_delegreturn at ffffffffc0536a31 [nfsv4]
#8 [ffff88017ce4fd40] nfs_do_return_delegation at ffffffffc054a3c9 [nfsv4]
#9 [ffff88017ce4fd60] nfs_inode_return_delegation_noreclaim at ffffffffc054b067 [nfsv4]
#10 [ffff88017ce4fd78] nfs4_evict_inode at ffffffffc0549759 [nfsv4]
#11 [ffff88017ce4fd90] evict at ffffffff8121f879
#12 [ffff88017ce4fdb8] dispose_list at ffffffff8121f98e
#13 [ffff88017ce4fde0] evict_inodes at ffffffff81220646
#14 [ffff88017ce4fe30] generic_shutdown_super at ffffffff812056a8
#15 [ffff88017ce4fe58] kill_anon_super at ffffffff81205aa2
#16 [ffff88017ce4fe70] nfs_kill_super at ffffffffc04f3e3b [nfs]
#17 [ffff88017ce4fe90] deactivate_locked_super at ffffffff81205e59
#18 [ffff88017ce4feb0] deactivate_super at ffffffff812065c6
#19 [ffff88017ce4fec8] cleanup_mnt at ffffffff8122395f
#20 [ffff88017ce4fee0] __cleanup_mnt at ffffffff812239f2
#21 [ffff88017ce4fef0] task_work_run at ffffffff810aefc7
#22 [ffff88017ce4ff30] do_notify_resume at ffffffff8102ab52
#23 [ffff88017ce4ff50] int_signal at ffffffff816b8d37
RIP: 00007fbdac4953e7 RSP: 00007ffd08e106a8 RFLAGS: 00000246
RAX: 0000000000000000 RBX: 0000561c42aa7700 RCX: ffffffffffffffff
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000561c42aa8970
RBP: 0000561c42aa7850 R8: 00007fbdac3e5988 R9: 0000000000000002
R10: 0000000000000000 R11: 0000000000000246 R12: 0000561c42aa8970
R13: 00007ffd08e10750 R14: 00007ffd08e10738 R15: 0000000000000000
ORIG_RAX: 00000000000000a6 CS: 0033 SS: 002b
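
On a system that is still responsive, tasks blocked in uninterruptible sleep can be listed with `ps` instead of a vmcore, which is roughly the live equivalent of `crash> ps -m | grep UN`. This is a minimal sketch: the function name `filter_uninterruptible` is hypothetical, and it assumes the input comes from `ps -eo pid,stat,wchan:32,comm`, where a state beginning with `D` marks uninterruptible sleep.

```shell
#!/bin/bash
# Hypothetical helper: read "ps -eo pid,stat,wchan:32,comm" output on stdin
# and print only tasks whose state starts with "D" (uninterruptible sleep).
# The first line (the ps header) is skipped.
filter_uninterruptible() {
    awk 'NR > 1 && $2 ~ /^D/ { print }'
}

# Example invocation on a live system:
#   ps -eo pid,stat,wchan:32,comm | filter_uninterruptible
```

Tasks shown waiting in functions such as `rpc_wait_bit_killable` match the RPC waits seen in the backtraces above.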