NotReady node with high load and many D state processes in OpenShift 4

Solution Verified

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4
  • Node

Issue

  • There is a node in NotReady status with very high load and many processes in D state.
  • The load average shown for the node is very high, while there are no processes using excessive CPU.

Resolution

Follow the Diagnostic Steps section to find the cause of the many processes in D state. Additional information about processes in D state is available in the solution What is "D" state (or dstate, d-state)?

If the cause is rpc_wait_bit_killable, the processes are typically waiting on a response from an NFS server, so the NFS server needs to be checked for issues.
For processes in D state caused by NFS, refer to the workaround below. For other causes shown in the WCHAN column of ps -elfL, please open a new Support Case for troubleshooting.

Workaround for processes in D state caused by NFS

To clear NFS-blocked processes that have not recovered automatically, the system needs to be rebooted, as explained in Is there a way to kill a process in the 'Z' or 'D' state without a reboot?:
> If operations do not complete due to a hardware or software fault (for example network connectivity problems) it may not be possible to eliminate all processes in D state without rebooting.

Processes stuck in D state due to NFS can result from the recovery behavior of the NFS client after an NFS request times out. From man nfs:

soft / hard

Determines the recovery behavior of the NFS client after an NFS request times out. If neither option is specified (or if the hard option is specified), NFS requests are retried indefinitely. If the soft option is specified, then the NFS client fails an NFS request after retrans retransmissions have been sent, causing the NFS client to return an error to the calling application.

NB: A so-called "soft" timeout **can cause silent data corruption** in certain cases. As such, use the soft option only when client responsiveness is more important than data integrity. Using NFS over TCP or increasing the value of the retrans option may mitigate some of the risks of using the soft option. 

For configuring NFS mount options, refer to how to edit mount options for pods that are backed by NFS PersistentVolumes in OpenShift 4.
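Before changing any mount options, it can help to see which recovery mode the existing NFS mounts on the node are using. The following is a minimal sketch, not part of the original solution: the function name nfs_recovery_mode is hypothetical, and it assumes the mount table is in the usual /proc/mounts format, where the fourth field lists the mount options and NFS mounts show either hard or soft there.

```shell
# Hypothetical helper: print "<mountpoint> <hard|soft|unknown>" for each
# NFS mount in a mount table read from stdin (/proc/mounts format).
nfs_recovery_mode() {
    awk '$3 ~ /^nfs4?$/ {
        mode = "unknown"
        # Field 4 is the comma-separated list of mount options.
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++) {
            if (opts[i] == "hard" || opts[i] == "soft") mode = opts[i]
        }
        print $2, mode
    }'
}

# On the node (after chroot /host), the function could be used as:
#   nfs_recovery_mode < /proc/mounts
```

Mounts reported as hard retry indefinitely, which matches the behavior described in the man page excerpt above.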

Root Cause

If there are many processes in D state due to rpc_wait_bit_killable, it typically means the processes are waiting on a response from an NFS server. When an NFS client is waiting on the server and does not receive a response, the processes go into D state (uninterruptible sleep). The load average calculation includes blocked processes, so the load average will be very high and close to the number of blocked processes.
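The relationship between the load average and the number of blocked threads can be checked directly. The sketch below is an illustration only: the helper names count_d_state and load_1min are hypothetical, and it assumes a procps ps that accepts -eLo state= (one state letter per thread, no header) and the standard /proc/loadavg layout with the 1-minute average in the first field.

```shell
# Hypothetical helper: count lines whose state starts with "D".
# Expects one thread state per line, as printed by "ps -eLo state=".
count_d_state() {
    awk '$1 ~ /^D/ { n++ } END { print n + 0 }'
}

# Hypothetical helper: print the 1-minute load average from input in
# /proc/loadavg format.
load_1min() {
    awk '{ print $1; exit }'
}

# On the node, the two values could be compared with:
#   ps -eLo state= | count_d_state
#   load_1min < /proc/loadavg
```

If the two numbers are close to each other, the high load is explained by blocked threads rather than by CPU usage.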

Diagnostic Steps

  • Access the node via oc debug node/[node_name] or ssh. If accessed via oc debug node, then run chroot /host bash.

  • Check if the uptime command shows a very high load average:

    # uptime
     11:25:02 up 25 days,  8:45,  4 users,  load average: 3524.92, 3524.04, 3523.24
    
  • Check the WCHAN (waiting channel) column for the processes in D state, which shows the name of the kernel function in which the process is sleeping:

    # ps -elfL | awk '{if($2~"D"){print $13}}' | sort | uniq -c
       3519 rpc_wa
    
  • If the output shows rpc_wa, it is the truncated name of rpc_wait_bit_killable, and it typically means the processes are waiting on a response from an NFS server.

  • Check if there are messages related to NFS in the dmesg output (the example below uses the dmesg file collected in a sosreport):

    $ grep -i nfs sos_commands/kernel/dmesg | wc -l
    10750

    $ grep -i nfs sos_commands/kernel/dmesg | less
    [...]
    [204462.604701] nfs: server 10.0.0.1 not responding, timed out
    [...]
    [9585786.855380] nfs: server 10.0.0.1 not responding, still trying
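
The checks above can be condensed into two small filters. This is a sketch under stated assumptions rather than part of the original procedure: the function names wchan_summary and nfs_unresponsive_servers are hypothetical, the first expects the output of ps -eLo state,wchan:32 (the :32 width avoids truncated names such as rpc_wa), and the second expects kernel messages in the "nfs: server <address> not responding" form shown above.

```shell
# Hypothetical helper: count D-state threads per wait channel.
# Input: "ps -eLo state,wchan:32" output; the header line is skipped
# because its first field is not a D state.
wchan_summary() {
    awk '$1 ~ /^D/ { c[$2]++ } END { for (w in c) print c[w], w }' | sort -rn
}

# Hypothetical helper: count "not responding" messages per NFS server.
# Input: dmesg output (live or from a sosreport file).
nfs_unresponsive_servers() {
    awk '/nfs: server .* not responding/ {
        # The server address is the field right after the word "server".
        for (i = 1; i < NF; i++) {
            if ($i == "server") { c[$(i + 1)]++; break }
        }
    } END { for (s in c) print c[s], s }' | sort -rn
}

# Possible usage:
#   ps -eLo state,wchan:32 | wchan_summary
#   nfs_unresponsive_servers < sos_commands/kernel/dmesg
```

A large count next to rpc_wait_bit_killable together with a single server address dominating the second list points at that NFS server as the place to investigate.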
    