ControlPlaneMachineSet cannot be created with empty failureDomains for OpenShift 4 in vSphere

Solution Verified - Updated

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4
  • Control Plane Machine Set
  • VMware vSphere

Issue

  • The ControlPlaneMachineSet resource cannot be created successfully when specifying an empty failureDomains parameter in OpenShift 4.

  • Both of the following patterns return an error:

        machines_v1beta1_machine_openshift_io:
          failureDomains:
            platform: VSphere
            failureDomains: {}
          metadata:
            labels:
    
        machines_v1beta1_machine_openshift_io:
          failureDomains: {}
          metadata:
            labels:
    
  • The following error message appears when trying to create the ControlPlaneMachineSet resource with either of the above configurations:

    error getting failure domains from control plane machine set machine template: unsupported platform type: VSphere
    

Resolution

Disclaimer: Links contained herein to external website(s) are provided for convenience only. Red Hat has not reviewed the links and is not responsible for the content or its availability. The inclusion of any link to an external website does not imply endorsement by Red Hat of the website or their entities, products or services. You agree that Red Hat is not responsible or liable for any loss or expenses that may result due to your use of (or reliance on) the external site or content.

Defining a failure domain for a ControlPlaneMachineSet is supported starting with OpenShift 4.16. Refer to the sample VMware vSphere failure domain configuration documentation for additional information.

Workaround


For OpenShift 4 clusters installed on vSphere, omit the `failureDomains` parameter entirely from the `ControlPlaneMachineSet` resource.
The resource should look similar to the following example, which has no `failureDomains` parameter:
[...]
    machines_v1beta1_machine_openshift_io:
      metadata:
        labels:
[...]
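
As an illustration of the workaround, the following is a minimal Python sketch that drops the `failureDomains` key from a manifest before it is applied. It assumes the manifest has already been loaded into a Python dictionary (for example with a YAML parser of your choice). The helper name `strip_failure_domains` is hypothetical, and the field path is the one reported in the admission webhook error under Diagnostic Steps.

```python
import copy


def strip_failure_domains(cpms: dict) -> dict:
    """Return a copy of a ControlPlaneMachineSet manifest without failureDomains.

    Hypothetical helper for illustration only. The input dictionary is not
    modified; manifests that lack the expected path are returned unchanged.
    """
    result = copy.deepcopy(cpms)
    node = result
    # Walk spec.template.machines_v1beta1_machine_openshift_io, tolerating gaps.
    for key in ("spec", "template", "machines_v1beta1_machine_openshift_io"):
        node = node.get(key) if isinstance(node, dict) else None
    if isinstance(node, dict):
        node.pop("failureDomains", None)
    return result


if __name__ == "__main__":
    manifest = {
        "spec": {
            "template": {
                "machines_v1beta1_machine_openshift_io": {
                    "failureDomains": {},
                    "metadata": {"labels": {}},
                }
            }
        }
    }
    print(strip_failure_domains(manifest))
```

The cleaned dictionary can then be written back to a file and applied as usual. Editing the YAML by hand achieves the same result.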

IMPORTANT NOTE: if no failureDomains are configured, the name of each machine must end with -[index] (where [index] is a number). Otherwise, the functions that generate the machine name, getMachineIndex and getMachineNameIndex, fail with the error could not determine machine index: could not determine Machine index from name or failure domain.
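
The naming rule can be checked ahead of time with a short script. The following Python sketch is not the operator's own implementation (that is written in Go); it only illustrates the "-[index]" suffix rule described in the note. The helper name `machine_index_from_name` and the machine names in the example are hypothetical.

```python
import re
from typing import Optional

# Matches a trailing "-<number>" suffix, for example "-0" or "-12".
_INDEX_SUFFIX = re.compile(r"-(\d+)$")


def machine_index_from_name(name: str) -> Optional[int]:
    """Return the trailing numeric index of a machine name, or None.

    Hypothetical helper for illustration only; it mirrors the rule that a
    machine name must end with "-[index]" where [index] is a number.
    """
    match = _INDEX_SUFFIX.search(name)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Example names are made up for this sketch.
    for name in ("mycluster-abc12-master-0", "mycluster-abc12-master-a"):
        index = machine_index_from_name(name)
        status = f"index {index}" if index is not None else "no numeric index"
        print(f"{name}: {status}")
```

A name for which the helper returns None would not satisfy the rule above, so it is worth checking the control plane machine names before creating the ControlPlaneMachineSet without failureDomains.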

Known bug in OpenShift 4.15

A known bug caused upgrades to OpenShift 4.15 to fail on clusters installed on vSphere with a ControlPlaneMachineSet configured, depending on the failureDomains configuration. Refer to the solution "control-plane-machine-set-operator pod is in CrashLoopBackOff upgrading to OpenShift 4.15" for additional information.

Root Cause

The failureDomains parameter in the ControlPlaneMachineSet resource was a Technology Preview feature in OpenShift 4.15, as explained in "defining a VMware vSphere failure domain for a control plane machine set (Technology Preview)", and was promoted to a supported feature in OpenShift 4.16, as explained in "defining a vSphere failure domain for a control plane machine set".

Diagnostic Steps

Trying to apply a ControlPlaneMachineSet from a cpms.yaml file with the unsupported failureDomains parameter configured will return the following error message:

$ oc apply -f cpms.yaml
Error from server (spec.template.machines_v1beta1_machine_openshift_io.failureDomains: Invalid value: v1.FailureDomains{Platform:"VSphere", AWS:(*[]v1.AWSFailureDomain)(nil), Azure:(*[]v1.AzureFailureDomain)(nil), GCP:(*[]v1.GCPFailureDomain)(nil)}: error getting failure domains from control plane machine set machine template: unsupported platform type: VSphere): error when creating "cpms.yaml": admission webhook "controlplanemachineset.machine.openshift.io" denied the request: spec.template.machines_v1beta1_machine_openshift_io.failureDomains: Invalid value: v1.FailureDomains{Platform:"VSphere", AWS:(*[]v1.AWSFailureDomain)(nil), Azure:(*[]v1.AzureFailureDomain)(nil), GCP:(*[]v1.GCPFailureDomain)(nil)}: error getting failure domains from control plane machine set machine template: unsupported platform type: VSphere