Pod disruption budget slows down cluster update
Environment
- Red Hat OpenShift Service on AWS (ROSA)
- Red Hat OpenShift Dedicated (OSD)
Issue
- Cluster updates to OSD or ROSA compute nodes are slow or the update appears stalled
- Pod disruption budget impacts time taken for cluster update to complete
Resolution
- Consider relaxing the minAvailable for any PodDisruptionBudget resources specified for workloads for the duration of the update, to reduce the amount of time that a PodDisruptionBudget blocks draining of pods from compute nodes.
- Specify a shorter Node draining Grace period in Cluster Settings, appropriate to a reasonable application pod termination time.
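As a minimal sketch of the first option, assuming a PodDisruptionBudget that sets an integer minAvailable (not a percentage, and not maxUnavailable, which cannot be combined with minAvailable), the value could be lowered with oc patch. The function name relax_pdb and its arguments are hypothetical. It prints the previous value so that it can be restored once the update completes.

```shell
# relax_pdb NAME NAMESPACE NEW_MIN  (hypothetical helper, not part of oc)
# Prints the current minAvailable, then merge-patches it to NEW_MIN.
# Assumes the PDB uses an integer minAvailable.
relax_pdb() {
  name="$1"
  namespace="$2"
  new_min="$3"

  # Read the current value so it can be restored after the update.
  current=$(oc get poddisruptionbudget "$name" -n "$namespace" \
    -o jsonpath='{.spec.minAvailable}') || return 1
  echo "previous minAvailable: ${current}"

  # Change only spec.minAvailable; all other fields are left as they are.
  oc patch poddisruptionbudget "$name" -n "$namespace" --type=merge \
    -p "{\"spec\":{\"minAvailable\":${new_min}}}"
}
```

For example, `relax_pdb web my-app 1` would lower minAvailable to 1 for a PodDisruptionBudget named web in the namespace my-app (both names are placeholders). Calling the function again with the printed previous value restores the original setting after the update.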
Root Cause
- A pod disruption budget that is not defined correctly for a workload may block draining of pods from compute nodes during the Machine Config Operator part of the cluster update process.
- Long grace periods will effectively pause the update of compute nodes that run workloads with a restrictive PodDisruptionBudget specification, since the update process will wait for the specified grace period before forcibly evicting pods from nodes.
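To illustrate why a restrictive budget stalls a drain: for a budget that sets an integer minAvailable, the number of voluntary evictions permitted at a given moment is the number of currently healthy pods minus minAvailable, and never less than zero. The sketch below only does that arithmetic. The function name allowed_disruptions is hypothetical and it does not query the cluster.

```shell
# allowed_disruptions HEALTHY_PODS MIN_AVAILABLE  (hypothetical helper)
# Prints how many pods could be evicted right now under the budget.
allowed_disruptions() {
  healthy="$1"
  min_available="$2"

  allowed=$(( healthy - min_available ))
  # A budget can never allow a negative number of disruptions.
  if [ "$allowed" -lt 0 ]; then
    allowed=0
  fi
  echo "$allowed"
}

allowed_disruptions 3 3   # prints 0: every eviction is refused, so the drain waits
allowed_disruptions 3 2   # prints 1: one pod at a time can be evicted
```

With three replicas and minAvailable set to 3, no pod can be evicted voluntarily, so the node drain waits until the grace period expires.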
Diagnostic Steps
Check pod disruption budget for workload namespaces as follows:
oc get poddisruptionbudget -n <namespace>
For all namespaces:
oc get poddisruptionbudget --all-namespaces
Note: some pod disruption budget specifications already exist in control plane namespaces which are required for normal operation of a managed OpenShift cluster and must not be modified.
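To narrow the output to budgets that currently permit no evictions, the listing can be filtered on the disruptionsAllowed status field. This is a sketch only: the function name blocking_pdbs is hypothetical, and it assumes awk is available on the workstation. Entries from control plane namespaces may appear in the result and must still be left unmodified.

```shell
# blocking_pdbs  (hypothetical helper)
# Prints NAMESPACE/NAME for every PodDisruptionBudget whose status
# currently reports zero allowed disruptions.
blocking_pdbs() {
  oc get poddisruptionbudget --all-namespaces --no-headers \
    -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,ALLOWED:.status.disruptionsAllowed \
  | awk '$3 == 0 { print $1 "/" $2 }'
}
```

Running `blocking_pdbs` during a slow update gives a short list of candidate budgets to review against the nodes that are waiting to drain.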
Verify the Node draining Grace period under Cluster Settings in the Red Hat Hybrid Cloud Console.
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.