The revision-pruner pods are not created when the lastFailedRevision of any node is less than 6

Environment

  • Red Hat OpenShift Container Platform 4

Issue

The revision-pruner pods in the openshift-kube-apiserver, openshift-controller-manager or openshift-kube-scheduler namespace are not created when the lastFailedRevision of any node is less than 6.
As a result, the old revision files under static-pod-resources are not removed and the number of revision files keeps increasing.

Resolution

If the lastFailedRevision is higher than 6, please review Solution 7105585.

When the lastFailedRevision is less than 6 and this happens after a cluster install, reset the lastFailedRevision, lastFailedReason and lastFailedRevisionErrors fields of the affected node status:

$ oc patch kubeapiservers cluster --type='json' --subresource status -p='[{"op": "replace", "path": "/status/nodeStatuses/1/lastFailedRevision", "value": 0}]'
$ oc patch kubeapiservers cluster --type='json' --subresource status -p='[{"op": "replace", "path": "/status/nodeStatuses/1/lastFailedReason", "value": null}]'
$ oc patch kubeapiservers cluster --type='json' --subresource status -p='[{"op": "replace", "path": "/status/nodeStatuses/1/lastFailedRevisionErrors", "value": null}]'
  • NOTE: The oc patch commands must use the correct zero-based index N in the /status/nodeStatuses/N path. Find the index of the affected node with oc get kubeapiservers cluster -o yaml
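
The index lookup and the patch bodies above can be scripted. The following is a minimal sketch under stated assumptions, not a supported tool: the helper names number_node_statuses, list_node_statuses and reset_patch are hypothetical, and the sketch assumes that oc prints nothing for a missing field in jsonpath output, so nodes without a lastFailedRevision are shown as "-".

```shell
# Prefix each input line with its zero-based array index.
# Input: "nodeName<TAB>lastFailedRevision" lines, in nodeStatuses order.
# Output: "index<TAB>nodeName<TAB>lastFailedRevision" ("-" when unset).
number_node_statuses() {
  awk -F'\t' '{ printf "%d\t%s\t%s\n", NR - 1, $1, ($2 == "" ? "-" : $2) }'
}

# List the nodeStatuses entries of the kubeapiservers resource with their index.
# Requires a logged-in oc session; not called when this file is only sourced.
list_node_statuses() {
  oc get kubeapiservers cluster \
    -o jsonpath='{range .status.nodeStatuses[*]}{.nodeName}{"\t"}{.lastFailedRevision}{"\n"}{end}' \
    | number_node_statuses
}

# Print a single-operation JSON patch in the same form as the commands above.
# Usage: reset_patch INDEX FIELD VALUE
reset_patch() {
  case "$1" in
    ''|*[!0-9]*)
      echo "reset_patch: index must be a non-negative integer" >&2
      return 1
      ;;
  esac
  printf '[{"op": "replace", "path": "/status/nodeStatuses/%s/%s", "value": %s}]\n' \
    "$1" "$2" "$3"
}

# Example (assumption: index 1 is the affected node, as in the commands above):
#   list_node_statuses
#   oc patch kubeapiservers cluster --type='json' --subresource status \
#     -p "$(reset_patch 1 lastFailedRevision 0)"
```

Keeping one operation per patch mirrors the three separate commands above, so a failure on one field does not block the reset of the others.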

If this happens during cluster bootstrap, open a new support case to determine which revision last failed and what the cause was.

Diagnostic Steps

  • The kube-apiserver-operator shows logs like this:
2024-09-18T04:48:02.239413318Z I0918 04:48:02.233577       1 prune_controller.go:269] Nothing to prune
  • The kubeapiservers resource shows a node status like this:
$ oc get kubeapiservers cluster -o yaml
~~ snip ~~
  nodeStatuses:
  - currentRevision: 34
    nodeName: XXXXXXXXXXXXXX
  - currentRevision: 34
    lastFailedCount: 1
    lastFailedReason: InstallerFailed
    lastFailedRevision: 4   # less than 6
    lastFailedRevisionErrors:
  • Check /etc/kubernetes/static-pod-resources/ on the node
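
The log and directory checks above can be sketched as two small helpers. This is an illustrative sketch only: the function names are hypothetical, the operator namespace and deployment name in the usage comments are assumptions not stated in this article, and the `*-pod-<number>` directory naming is an assumption about how revision directories are laid out on the node.

```shell
# Count operator log lines in which the prune controller reports nothing to prune.
# Reads log text on stdin and prints the number of matching lines (0 if none).
count_nothing_to_prune() {
  grep -c 'prune_controller.go.*Nothing to prune' || true
}

# Count per-revision directories directly under the static pod resources path.
# Assumption: revision directories are named like <component>-pod-<revision>.
count_revision_dirs() {
  dir="${1:-/etc/kubernetes/static-pod-resources}"
  find "$dir" -mindepth 1 -maxdepth 1 -type d -name '*-pod-[0-9]*' | wc -l | tr -d ' '
}

# Example usage (assumed namespace and deployment name for the operator):
#   oc logs -n openshift-kube-apiserver-operator deployment/kube-apiserver-operator \
#     | count_nothing_to_prune
#
# On the node itself, for example after "oc debug node/<node-name>" and
# "chroot /host", define count_revision_dirs in that shell and run:
#   count_revision_dirs /etc/kubernetes/static-pod-resources
```

A count that keeps growing across checks while the operator only logs "Nothing to prune" is consistent with the issue described here.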

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.