ROSA HCP cluster machine pool upgrade remains pending for a long time

Solution Verified - Updated 29 Oct 2024

Environment

Red Hat OpenShift Service on AWS (ROSA) HCP
- 4

Issue

A ROSA HCP cluster machine pool upgrade remains pending for a long time.

Resolution

The following KCS articles can help resolve the PodDisruptionBudget (PDB) issues that prevent a ROSA HCP cluster machine pool upgrade from completing:
- Not able to drain a node when running LokiStack with size 1x.demo in RHOCP 4
- PodDisruptionBudget (PDB) could cause Machine-Config-Operator (MCO) to be degraded during OCP4 upgrade

Root Cause

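A PodDisruptionBudget (PDB) is one of the factors that can leave an update pending: a node cannot be drained while evicting one of its pods would violate that pod's PDB.
- Reference: [OpenShift 4 cluster upgrade pre-checks requirements](https://access.redhat.com/solutions/7004992)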

Diagnostic Steps

Check the node information and look for nodes in the SchedulingDisabled state:

$ oc get nodes
NAME                                         STATUS                     ROLES    AGE     VERSION
......
ip-xx-xxx-xx-xxx.ap-northeast-1.compute.internal Ready,SchedulingDisabled   worker   130d    v1.28.9+416ecaf
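On a cluster with many nodes, the cordoned ones can be picked out of that listing with a small filter. The following is a minimal sketch, not part of the original solution: the function name `list_cordoned_nodes` is hypothetical, and it assumes the default column layout shown above, where STATUS is the second column.

```shell
# Hypothetical helper: print the names of nodes whose STATUS column
# contains "SchedulingDisabled". Reads `oc get nodes` output on stdin.
list_cordoned_nodes() {
  # NR > 1 skips the header row; $2 is the STATUS column.
  awk 'NR > 1 && $2 ~ /SchedulingDisabled/ { print $1 }'
}

# Usage against a live cluster (requires a logged-in oc session):
#   oc get nodes | list_cordoned_nodes
```

A node that stays in this list long after the upgrade started is a candidate for the drain problem described in the next steps.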

Check the machine API logs and look for pods that cannot be evicted:

E1007 10:31:31.477147       1 machine_controller.go:648] "error when evicting pods/\"logging-loki-index-gateway-0\" -n \"openshift-logging\" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.\n" controller="machine" .........Node="ip-xx-xxx-xx-xxx.ap-northeast-1.compute.internal"
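When the log contains many such entries, it can help to reduce them to a unique list of blocked pods. The sketch below assumes the message format shown above ("error when evicting pods/... -n ..."); the function name `list_blocked_pods` is hypothetical, and the log text is expected on stdin however it was obtained.

```shell
# Hypothetical helper: print "namespace/pod" once for every pod named in an
# "error when evicting pods" log entry. Reads log lines on stdin.
list_blocked_pods() {
  # Drop the backslashes that escape the quotes inside the log message,
  # then capture the pod name and the namespace from the eviction error.
  tr -d '\\' |
    sed -n 's|.*error when evicting pods/"\([^"]*\)" -n "\([^"]*\)".*|\2/\1|p' |
    sort -u
}
```

The namespaces printed here indicate where to look for PDBs in the next step.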

Check the PDB information and look for PDBs whose ALLOWED DISRUPTIONS value is 0, because they prevent pods from being evicted from the node:

$ oc get pdb -n openshift-logging
NAME                          MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
logging-loki-distributor      1               N/A               0                     19d
logging-loki-gateway          1               N/A               0                     19d
logging-loki-index-gateway    1               N/A               0                     19d
logging-loki-ingester         1               N/A               0                     19d
logging-loki-querier          1               N/A               0                     19d
logging-loki-query-frontend   1               N/A               0                     19d
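The PDBs that currently allow no disruptions can be filtered from that output as follows. This is a minimal sketch under the assumption that the listing comes from `oc get pdb -n <namespace>` with the default columns shown above (with `-A`, an extra NAMESPACE column shifts the fields); the function name `list_blocking_pdbs` is hypothetical.

```shell
# Hypothetical helper: print the name of every PDB whose ALLOWED DISRUPTIONS
# value is 0. Reads `oc get pdb -n <namespace>` output on stdin.
list_blocking_pdbs() {
  # Data rows have five fields: NAME, MIN AVAILABLE, MAX UNAVAILABLE,
  # ALLOWED DISRUPTIONS, AGE. The string comparison avoids matching
  # short or empty rows.
  awk 'NR > 1 && NF >= 5 && $4 == "0" { print $1 }'
}

# Usage against a live cluster (requires a logged-in oc session):
#   oc get pdb -n openshift-logging | list_blocking_pdbs
```

In the example output above, all six LokiStack PDBs would be listed, which matches the eviction error seen in the machine API logs.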

SBR

Shift Hosted

Product(s)

Red Hat OpenShift Service on AWS

Components

upgrade

Category

Upgrade

Tags

upgrade
