ROSA HCP cluster machine pool upgrade stays pending for a long time
Environment
- Red Hat OpenShift Service on AWS (ROSA) HCP
- 4
Issue
ROSA HCP cluster machine pool upgrades remain pending for a long time.
Resolution
- Review the KCS article referenced below, which can help resolve the PodDisruptionBudget (PDB) issue that prevents the ROSA HCP cluster machine pool upgrade.
Root Cause
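- A pod disruption budget (PDB) is one of the factors that can leave an upgrade pending: when a PDB allows no disruptions, the pods it protects cannot be evicted, so the node cannot finish draining.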
- REF: [OpenShift 4 cluster upgrade pre-checks requirements](https://access.redhat.com/solutions/7004992)
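The arithmetic behind a blocking PDB can be sketched as follows. For a PDB that sets an integer minAvailable, Kubernetes allows as many disruptions as there are healthy pods above that minimum, and never fewer than zero. The function name `allowed_disruptions` and the pod counts in the usage comments are assumptions for illustration; this article does not state how many replicas each workload runs.

```shell
#!/bin/sh
# Hypothetical helper: allowed disruptions for a PDB with an integer minAvailable.
# $1 = number of currently healthy pods selected by the PDB (assumed input)
# $2 = the PDB's minAvailable value
allowed_disruptions() {
  n=$(( $1 - $2 ))
  # The value is floored at zero.
  if [ "$n" -lt 0 ]; then
    n=0
  fi
  echo "$n"
}

# Usage:
#   allowed_disruptions 1 1   # one healthy pod, minAvailable 1: no eviction possible
#   allowed_disruptions 2 1   # two healthy pods, minAvailable 1: one eviction possible
```

Under this rule, a workload with a single healthy pod and minAvailable of 1 can never be evicted voluntarily, which is consistent with an ALLOWED DISRUPTIONS value of 0.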
Diagnostic Steps
- Check the node status. Some nodes are in the SchedulingDisabled state:
$ oc get nodes
NAME                                               STATUS                     ROLES    AGE    VERSION
......
ip-xx-xxx-xx-xxx.ap-northeast-1.compute.internal   Ready,SchedulingDisabled   worker   130d   v1.28.9+416ecaf
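On a cluster with many nodes, the cordoned ones can be picked out of the listing with a small filter. This is a minimal sketch; the function name `cordoned_nodes` is hypothetical, and it assumes the default `oc get nodes` column layout shown above, with STATUS as the second column.

```shell
#!/bin/sh
# Hypothetical helper: reads `oc get nodes` output on stdin and prints the
# names of nodes whose STATUS column contains SchedulingDisabled.
cordoned_nodes() {
  # NR > 1 skips the header row; $2 is the STATUS column.
  awk 'NR > 1 && $2 ~ /SchedulingDisabled/ { print $1 }'
}

# Usage (requires a logged-in oc session):
#   oc get nodes | cordoned_nodes
```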
- Check the machine API logs. They show that some pods cannot be evicted:
E1007 10:31:31.477147 1 machine_controller.go:648] "error when evicting pods/\"logging-loki-index-gateway-0\" -n \"openshift-logging\" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.\n" controller="machine" .........Node="ip-xx-xxx-xx-xxx.ap-northeast-1.compute.internal"
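Because the controller retries every few seconds, the same eviction error repeats many times. The sketch below reduces such a log to a unique list of blocked pods in `namespace/pod` form. The function name `blocked_pods` and the file name in the usage comment are hypothetical, and the pattern assumes the message format shown in the log line above, including its backslash-escaped quotes.

```shell
#!/bin/sh
# Hypothetical helper: reads machine controller log lines on stdin and prints
# each pod that failed eviction as namespace/pod, one line per unique pod.
blocked_pods() {
  # [\\"]* skips the escaped quotes (\") around the pod name and namespace.
  sed -n 's/.*error when evicting pods\/[\\"]*\([^\\"]*\)[\\"]* -n [\\"]*\([^\\"]*\)[\\"]*.*/\2\/\1/p' |
    sort -u
}

# Usage, with the log saved to a file of your choosing:
#   blocked_pods < machine-controller.log
```

The namespaces in the result indicate where to look for the PDBs in the next step.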
- Check the PDBs in the affected namespace. Some of them allow zero disruptions, which prevents pods from being evicted from the node:
$ oc get pdb -n openshift-logging
NAME                          MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
logging-loki-distributor      1               N/A               0                     19d
logging-loki-gateway          1               N/A               0                     19d
logging-loki-index-gateway    1               N/A               0                     19d
logging-loki-ingester         1               N/A               0                     19d
logging-loki-querier          1               N/A               0                     19d
logging-loki-query-frontend   1               N/A               0                     19d
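The PDBs that currently block eviction are the ones with 0 in the ALLOWED DISRUPTIONS column. A minimal filter for that is sketched below. The function name `blocking_pdbs` is hypothetical, and it assumes the single-namespace output layout shown above, where each data row has five fields and ALLOWED DISRUPTIONS is the fourth; output from `--all-namespaces` adds a leading NAMESPACE column and would need the field numbers shifted.

```shell
#!/bin/sh
# Hypothetical helper: reads `oc get pdb -n <namespace>` output on stdin and
# prints the names of PDBs whose ALLOWED DISRUPTIONS value is 0.
blocking_pdbs() {
  # NR > 1 skips the header; NF >= 5 ignores blank or truncated lines;
  # $4 is ALLOWED DISRUPTIONS in each data row.
  awk 'NR > 1 && NF >= 5 && $4 == 0 { print $1 }'
}

# Usage (requires a logged-in oc session):
#   oc get pdb -n openshift-logging | blocking_pdbs
```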
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.