Deploying OpenShift sandboxed containers on AWS

OpenShift sandboxed containers 1.12

Enhanced security and isolation for container workloads

Red Hat Customer Content Services

Abstract

Red Hat OpenShift sandboxed containers provide enhanced security and isolation by running containerized applications in lightweight virtual machines. You install the OpenShift sandboxed containers Operator on an OpenShift Container Platform cluster. Then, you configure your workload pods to use the optional "kata" runtime.

Preface

Providing feedback on Red Hat documentation

You can provide feedback or report an error by submitting the Create Issue form in Jira.

Procedure

  1. Ensure that you are logged in to Jira. If you do not have a Jira account, you must create a Red Hat Jira account.
  2. Launch the Create Issue form.
  3. Enter a descriptive title in the Summary field.
  4. In the Description field, include the documentation URL, chapter or section number, and a detailed description of the issue.
  5. Enter your Jira user ID in the Reporter field.
  6. Click Create.

Chapter 1. Discover

You can deploy Red Hat OpenShift sandboxed containers workloads on a Red Hat OpenShift Container Platform cluster running on Amazon Web Services (AWS). OpenShift sandboxed containers integrates Kata containers as an optional runtime, providing enhanced security and isolation by running containerized applications in lightweight virtual machines.

This integration provides a more secure runtime environment for sensitive workloads without significant changes to existing OpenShift Container Platform workflows. This runtime supports containers in dedicated virtual machines (VMs), providing improved workload isolation.

Important

Red Hat OpenShift sandboxed containers on Amazon Web Services (AWS) is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

1.1. Features

OpenShift sandboxed containers provides the following features:

Run privileged or untrusted workloads

You can safely run workloads that require specific privileges, without the risk of compromising cluster nodes by running privileged containers. Workloads that require special privileges include the following:

  • Workloads that require special capabilities from the kernel, beyond the default ones granted by standard container runtimes such as CRI-O, for example to access low-level networking features.
  • Workloads that need elevated root privileges, for example to access a specific physical device. With OpenShift sandboxed containers, it is possible to pass only a specific device through to the virtual machine (VM), ensuring that the workload cannot access or misconfigure the rest of the system.
  • Workloads for installing or using set-uid root binaries. These binaries grant special privileges and, as such, can present a security risk. With OpenShift sandboxed containers, additional privileges are restricted to the virtual machines, and grant no special access to the cluster nodes.

    Some workloads require privileges specifically for configuring the cluster nodes. Such workloads should still use privileged containers, because running on a virtual machine would prevent them from functioning.

Ensure isolation for sensitive workloads
OpenShift sandboxed containers integrates Kata containers as an optional runtime, providing enhanced security and isolation by running containerized applications in lightweight virtual machines. This integration provides a more secure runtime environment for sensitive workloads without significant changes to existing OpenShift workflows. This runtime supports containers in dedicated virtual machines (VMs), providing improved workload isolation.
Ensure kernel isolation for each workload
You can run workloads that require custom kernel tuning (such as sysctl settings, scheduler changes, or cache tuning) or custom kernel modules (such as out-of-tree modules or modules loaded with special arguments).
Share the same workload across tenants
You can run workloads that support many users (tenants) from different organizations sharing the same OpenShift Container Platform cluster. The system also supports running third-party workloads from multiple vendors, such as container network functions (CNFs) and enterprise applications. Third-party CNFs, for example, may not want their custom settings interfering with packet tuning or with sysctl variables set by other applications. Running inside a completely isolated kernel is helpful in preventing "noisy neighbor" configuration problems.
Ensure proper isolation and sandboxing for testing software
You can run containerized workloads with known vulnerabilities or handle issues in an existing application. This isolation enables administrators to give developers administrative control over pods, which is useful when the developer wants to test or validate configurations beyond those an administrator would typically grant. Administrators can, for example, safely and securely delegate kernel packet filtering (eBPF) to developers. eBPF requires CAP_SYS_ADMIN or CAP_BPF privileges, and is therefore not allowed under a standard CRI-O configuration, as this would grant access to every process on the Container Host worker node. Similarly, administrators can grant access to intrusive tools such as SystemTap, or support the loading of custom kernel modules during their development.
Ensure default resource containment through VM boundaries
By default, OpenShift sandboxed containers manages resources such as CPU, memory, storage, and networking in a robust and secure way. Because OpenShift sandboxed containers runs workloads in VMs, the additional layers of isolation and security provide finer-grained access control to resources. For example, an errant container cannot allocate more memory than is available to the VM. Conversely, a container that needs dedicated access to a network card or to a disk can take complete control over that device without getting any access to other devices.

1.2. Compatibility with OpenShift Container Platform

You must ensure that your Red Hat OpenShift Container Platform version supports the features you require.

The required functionality for OpenShift Container Platform is supported by two main components:

Kata runtime
The Kata runtime is included with Red Hat Enterprise Linux CoreOS (RHCOS) and receives updates with every OpenShift Container Platform release. When enabling peer pods with the Kata runtime, the OpenShift sandboxed containers Operator requires external network connectivity to pull the necessary image components and helper utilities to create the pod virtual machine (VM) image.
OpenShift sandboxed containers Operator
The OpenShift sandboxed containers Operator is a Rolling Stream Operator, which means the latest version is the only supported version. It works with all currently supported versions of OpenShift Container Platform.

The Operator depends on the features that come with the RHCOS host and the environment it runs in.

Note

You must install RHCOS on the worker nodes. Red Hat Enterprise Linux (RHEL) nodes are not supported.

The following compatibility matrix for OpenShift sandboxed containers and OpenShift Container Platform releases identifies compatible features and environments.

Table 1.1. Supported architectures

Architecture   OpenShift Container Platform version (without GPU)   OpenShift Container Platform version (with GPU)
x86_64         4.18.38+                                             4.21.9+
s390x          4.18.38+                                             -

There are two ways to deploy the Kata containers runtime:

  • Bare metal
  • Peer pods

You can deploy OpenShift sandboxed containers by using peer pods on Microsoft Azure, Amazon Web Services (AWS), or Google Cloud. With the release of OpenShift sandboxed containers 1.12.0, the OpenShift sandboxed containers Operator requires OpenShift Container Platform version 4.18.38 or later for deployments without GPU support.

The following table describes OpenShift Container Platform versions and features with the following support levels:

  • GA: General Availability
  • TP: Technology Preview
  • DP: Developer Preview

Note

The version numbers in the table represent the minimum supported version. For example, "4.21.9+" means version 4.21.9 or any later version.

Table 1.2. Feature availability by OpenShift Container Platform version

Platform           GPU           4.18.38+   4.19.28+   4.20.18+   4.21.9+
Bare metal         No            GA         GA         GA         GA
Bare metal         NVIDIA H100   -          -          -          TP
IBM Z bare metal   No            GA         GA         GA         GA
IBM Z peer pods    No            GA         GA         GA         GA
Azure              No            GA         GA         GA         GA
Azure              NVIDIA H100   -          -          -          DP
AWS                No            GA         GA         GA         GA
AWS                NVIDIA H100   -          -          -          -
Google Cloud       No            GA         GA         GA         GA
Google Cloud       NVIDIA H100   -          -          -          -
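
The minimum-version notation can be checked mechanically. The following is a minimal sketch, not part of the product tooling: version_at_least is a hypothetical helper name, and it assumes plain MAJOR.MINOR.PATCH version strings without pre-release suffixes. It compares one version against one minimum from the table. Choosing which column applies to your release stream remains a manual step.

```shell
# Hypothetical helper: succeed (exit 0) when version $1 >= version $2.
# Assumes both arguments are plain MAJOR.MINOR.PATCH strings, such as 4.18.38.
version_at_least() {
  # Split both versions on "." into six positional parameters:
  # $1 $2 $3 = current version, $4 $5 $6 = minimum version.
  old_ifs=$IFS
  IFS=.
  set -- $1 $2
  IFS=$old_ifs

  [ "$1" -gt "$4" ] && return 0
  [ "$1" -lt "$4" ] && return 1
  [ "$2" -gt "$5" ] && return 0
  [ "$2" -lt "$5" ] && return 1
  [ "$3" -ge "$6" ]
}

# Example with literal values rather than values read from a cluster.
if version_at_least "4.18.40" "4.18.38"; then
  echo "meets minimum"
else
  echo "below minimum"
fi
```

On a live cluster, the first argument could be taken from the output of oc get clusterversion version -o jsonpath='{.status.desired.version}', assuming your account is allowed to read that resource.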

1.3. Common terms

The following terms are used throughout the documentation.

Attestation
The process of verifying the integrity and trustworthiness of a Trusted Execution Environment (TEE) and the confidential containers workloads running within it, ensuring that only trusted code and data are executed. Red Hat build of Trustee performs this function.
Confidential containers
A technology that provides a confidential computing environment to protect containers and data by leveraging Trusted Execution Environments.
Initdata
A specification used to securely initialize a pod with workload-specific data (such as certificates, cryptographic keys, or an optional Kata Agent policy) at runtime, preventing the need to embed this data directly in the virtual machine (VM) image.
Kata Agent
A component within the pod Virtual Machine (VM) that enforces runtime policies and manages the lifecycle of the containers running inside the VM. Its policy controls API requests for peer pods.
Kata containers
Kata containers is a core upstream project that is used to build OpenShift sandboxed containers. OpenShift sandboxed containers integrates Kata containers with OpenShift Container Platform.
kata runtime
The optional runtime installed by the OpenShift sandboxed containers Operator when configuring bare metal deployments.
kata-cc runtime
The runtime class used specifically for confidential containers deployments on bare-metal servers.
kata-remote runtime
The runtime class used for peer pod deployments on cloud platforms or remote hypervisors.
KataConfig
A custom resource used to configure and launch OpenShift sandboxed containers.
TrusteeConfig
A custom resource used to configure and launch Red Hat build of Trustee.
OpenShift sandboxed containers
OpenShift sandboxed containers integrates Kata containers as an optional runtime to provide enhanced security and isolation for container workloads by running applications in lightweight virtual machines.
OpenShift sandboxed containers Operator
The OpenShift sandboxed containers Operator manages the lifecycle of OpenShift sandboxed containers and confidential containers on a cluster.
Peer pod

A peer pod in OpenShift sandboxed containers extends the concept of a standard pod. Unlike a standard sandboxed container, where the virtual machine is created on the worker node itself, in a peer pod, the virtual machine is created through a remote hypervisor using any supported hypervisor or cloud provider API.

The peer pod acts as a regular pod on the worker node, with its corresponding VM running elsewhere. The remote location of the VM is transparent to the user and is specified by the runtime class in the pod specification. The peer pod design circumvents the need for nested virtualization.

Pod

A pod is a construct that is inherited from Kubernetes and OpenShift Container Platform. It represents resources where containers can be deployed. Containers run inside pods, and pods are used to specify resources that can be shared between multiple containers.

In the context of OpenShift sandboxed containers, a pod is implemented as a virtual machine. Several containers can run in the same pod on the same virtual machine.

Red Hat build of Trustee
Red Hat build of Trustee is an attestation service that verifies the trustworthiness of the location where you plan to run your workload or where you plan to send confidential information. Red Hat build of Trustee includes components deployed on a trusted side and used to verify whether the remote workload is running in a Trusted Execution Environment (TEE).
Red Hat build of Trustee Operator
The Red Hat build of Trustee Operator manages the installation, lifecycle, and configuration of Red Hat build of Trustee.
Runtime class
An object that describes the specific runtime configuration used to execute a workload.
Sandbox

A sandbox is an isolated environment where programs can run. In a sandbox, you can run untested or untrusted programs without risking harm to the host machine or the operating system.

In the context of OpenShift sandboxed containers, sandboxing is achieved by running workloads in a different kernel using virtualization, providing enhanced control over the interactions between multiple workloads that run on the same host.

Trusted Execution Environment (TEE)
Hardware-based security technology leveraged by confidential containers to protect containers and data. Examples: Intel® TDX, AMD SEV-SNP.

1.4. OpenShift sandboxed containers Operator

The OpenShift sandboxed containers Operator encapsulates all of the components from Kata containers. It manages installation, lifecycle, and configuration tasks.

The OpenShift sandboxed containers Operator is packaged in the Operator bundle format as two container images:

  • The bundle image contains metadata and is required to make the operator OLM-ready.
  • The second container image contains the actual controller that monitors and manages the KataConfig resource.

The OpenShift sandboxed containers Operator is based on the Red Hat Enterprise Linux CoreOS (RHCOS) extensions concept. RHCOS extensions are a mechanism to install optional OpenShift Container Platform software. The OpenShift sandboxed containers Operator uses this mechanism to deploy sandboxed containers on a cluster.

The sandboxed containers RHCOS extension contains RPMs for Kata, QEMU, and its dependencies. You can enable them by using the MachineConfig resources that the Machine Config Operator provides.

1.5. OpenShift Virtualization

You can deploy OpenShift sandboxed containers on clusters with OpenShift Virtualization.

To run OpenShift Virtualization and OpenShift sandboxed containers at the same time, your virtual machines must be live migratable so that they do not block node reboots.

1.6. FIPS compliance

OpenShift Container Platform is designed for Federal Information Processing Standards (FIPS) 140-2 and 140-3. When running Red Hat Enterprise Linux (RHEL) or Red Hat Enterprise Linux CoreOS (RHCOS) booted in FIPS mode, OpenShift Container Platform core components use the RHEL cryptographic libraries that have been submitted to NIST for FIPS 140-2/140-3 Validation on only the x86_64, ppc64le, and s390x architectures.

For more information about the NIST validation program, see Cryptographic Module Validation Program. For the latest NIST status for the individual versions of RHEL cryptographic libraries that have been submitted for validation, see Compliance Activities and Government Standards.

OpenShift sandboxed containers can be used on FIPS enabled clusters.

When running in FIPS mode, OpenShift sandboxed containers components, VMs, and VM images are adapted to comply with FIPS.

Note

FIPS compliance for OpenShift sandboxed containers only applies to the kata runtime class. The peer pod runtime class, kata-remote, is not yet fully supported and has not been tested for FIPS compliance.

FIPS compliance is one of the most critical components required in highly secure environments, to ensure that only supported cryptographic technologies are allowed on nodes.

Important

The use of FIPS Validated / Modules in Process cryptographic libraries is only supported on OpenShift Container Platform deployments on the x86_64 architecture.

To understand Red Hat’s view of OpenShift Container Platform compliance frameworks, refer to the Risk Management and Regulatory Readiness chapter of the OpenShift Security Guide Book.

Chapter 2. Install

You install OpenShift sandboxed containers on Amazon Web Services (AWS) by installing the OpenShift sandboxed containers Operator.

Perform the following steps:

  1. Install the OpenShift sandboxed containers Operator.

2.1. Prerequisites

  • You have installed the latest version of Red Hat OpenShift Container Platform.
  • Your OpenShift Container Platform cluster has at least one worker node.

2.2. Installing the OpenShift sandboxed containers Operator

You can install the OpenShift sandboxed containers Operator by using the command line interface (CLI).

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.

Procedure

  1. Create an osc-namespace.yaml manifest file:

    apiVersion: v1
    kind: Namespace
    metadata:
      name: openshift-sandboxed-containers-operator
  2. Create the namespace by running the following command:

    $ oc create -f osc-namespace.yaml
  3. Create an osc-operatorgroup.yaml manifest file:

    apiVersion: operators.coreos.com/v1
    kind: OperatorGroup
    metadata:
      name: sandboxed-containers-operator-group
      namespace: openshift-sandboxed-containers-operator
    spec:
      targetNamespaces:
      - openshift-sandboxed-containers-operator
  4. Create the operator group by running the following command:

    $ oc create -f osc-operatorgroup.yaml
  5. Create an osc-subscription.yaml manifest file:

    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: sandboxed-containers-operator
      namespace: openshift-sandboxed-containers-operator
    spec:
      channel: stable
      installPlanApproval: Automatic
      name: sandboxed-containers-operator
      source: redhat-operators
      sourceNamespace: openshift-marketplace
      startingCSV: sandboxed-containers-operator.v1.12.0
  6. Create the subscription by running the following command:

    $ oc create -f osc-subscription.yaml
  7. Verify that the Operator is correctly installed by running the following command:

    $ oc get csv -n openshift-sandboxed-containers-operator

    This command can take several minutes to complete.

  8. Watch the process by running the following command:

    $ watch oc get csv -n openshift-sandboxed-containers-operator

    Example output

    NAME                             DISPLAY                                  VERSION         PHASE
    openshift-sandboxed-containers   openshift-sandboxed-containers-operator  1.12.0          Succeeded
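
The verification in steps 7 and 8 can be scripted. The following sketch is one way to do it, under stated assumptions: all_csvs_succeeded is a hypothetical helper name, and it only parses the tabular text that oc get csv prints, treating the last column as the phase. It reads the table from standard input so that it can be tried without a cluster.

```shell
# Hypothetical helper: read `oc get csv` table output on standard input and
# succeed only if at least one CSV is listed and every one has PHASE Succeeded.
all_csvs_succeeded() {
  awk '
    # Skip the header row and any blank lines.
    NR > 1 && NF > 0 {
      seen = 1
      if ($NF != "Succeeded") bad = 1
    }
    END {
      if (seen && !bad) exit 0
      exit 1
    }
  '
}

# Sample input copied from the example output in the procedure.
sample_output='NAME                             DISPLAY                                  VERSION         PHASE
openshift-sandboxed-containers   openshift-sandboxed-containers-operator  1.12.0          Succeeded'

if printf '%s\n' "$sample_output" | all_csvs_succeeded; then
  echo "Operator installation succeeded"
fi
```

On a cluster, the same helper could be fed directly: oc get csv -n openshift-sandboxed-containers-operator | all_csvs_succeeded.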

Chapter 3. Configure

You can configure OpenShift sandboxed containers for Amazon Web Services (AWS).

Perform the following steps:

  1. Enable ports to allow internal communication with peer pods.
  2. Create the peer pods config map.
  3. Create the KataConfig custom resource.
  4. Optional: Modify the number of peer pod VMs running on each worker node.
  5. Verify the pod VM image.
  6. Disable insecure options by customizing the Kata Agent policy.
  7. Optional: If you select a custom peer pod VM image from an authenticated registry, configure a pull secret.
  8. Optional: Select a custom peer pod VM image.
  9. Configure your workload for OpenShift sandboxed containers.

3.1. Enabling ports

You must enable ports 15150 and 9000 to allow internal communication with peer pods running on AWS.

Prerequisites

  • You have installed the OpenShift sandboxed containers Operator.
  • You have installed the AWS command line tool.
  • You have access to the cluster as a user with the cluster-admin role.

Procedure

  1. Log in to your OpenShift Container Platform cluster and retrieve the instance ID:

    $ INSTANCE_ID=$(oc get nodes -l 'node-role.kubernetes.io/worker' \
      -o jsonpath='{.items[0].spec.providerID}' | sed 's#[^ ]*/##g')
  2. Retrieve the AWS region:

    $ AWS_REGION=$(oc get infrastructure/cluster -o jsonpath='{.status.platformStatus.aws.region}')
  3. Retrieve the security group IDs and store them in an array:

    $ AWS_SG_IDS=($(aws ec2 describe-instances --instance-ids ${INSTANCE_ID} \
      --query 'Reservations[*].Instances[*].SecurityGroups[*].GroupId' \
      --output text --region $AWS_REGION))
  4. For each security group ID, authorize the peer pods shim to access kata-agent communication, and set up the peer pods tunnel:

    $ for AWS_SG_ID in "${AWS_SG_IDS[@]}"; do \
      aws ec2 authorize-security-group-ingress --group-id $AWS_SG_ID --protocol tcp --port 15150 --source-group $AWS_SG_ID --region $AWS_REGION; \
      aws ec2 authorize-security-group-ingress --group-id $AWS_SG_ID --protocol tcp --port 9000 --source-group $AWS_SG_ID --region $AWS_REGION; \
    done

The ports are now enabled.
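
To review the commands in this procedure before they change any security group, the following sketch shows what the sed expression extracts and prints the ingress commands without running them. The sample provider ID and security group IDs are hypothetical placeholders, print_ingress_commands is an illustrative helper name, and the provider ID layout (aws:///<zone>/<instance_id>) is assumed from typical AWS nodes.

```shell
# Hypothetical sample of a node .spec.providerID value on AWS.
provider_id='aws:///us-east-1a/i-0123456789abcdef0'

# The sed expression deletes everything up to and including the last "/",
# leaving only the instance ID.
instance_id=$(printf '%s\n' "$provider_id" | sed 's#[^ ]*/##g')
echo "INSTANCE_ID: $instance_id"

# Dry run: print, but do not execute, one command per security group and port.
# Arguments: the AWS region, followed by one or more security group IDs.
print_ingress_commands() {
  region=$1
  shift
  for sg_id in "$@"; do
    for port in 15150 9000; do
      echo "aws ec2 authorize-security-group-ingress --group-id $sg_id" \
        "--protocol tcp --port $port --source-group $sg_id --region $region"
    done
  done
}

print_ingress_commands us-east-1 sg-0aaa1111 sg-0bbb2222
```

Each security group produces two lines, one for port 15150 and one for port 9000, which matches the loop in step 4.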

3.2. Creating the peer pods config map

You must create the peer pods config map.

Prerequisites

  • You have an Amazon Machine Image (AMI) ID if you are not using the default AMI ID based on your cluster credentials.

Procedure

  1. Obtain the following values from your AWS instance:

    1. Retrieve and record the instance ID:

      $ INSTANCE_ID=$(oc get nodes -l 'node-role.kubernetes.io/worker' \
        -o jsonpath='{.items[0].spec.providerID}' | sed 's#[^ ]*/##g')

      This value is used to retrieve the other values for the config map.

    2. Retrieve and record the AWS region:

      $ AWS_REGION=$(oc get infrastructure/cluster \
        -o jsonpath='{.status.platformStatus.aws.region}') \
        && echo "AWS_REGION: \"$AWS_REGION\""
    3. Retrieve and record the AWS subnet ID:

      $ AWS_SUBNET_ID=$(aws ec2 describe-instances --instance-ids ${INSTANCE_ID} \
        --query 'Reservations[*].Instances[*].SubnetId' --region ${AWS_REGION} \
          --output text) && echo "AWS_SUBNET_ID: \"$AWS_SUBNET_ID\""
    4. Retrieve and record the AWS VPC ID:

      $ AWS_VPC_ID=$(aws ec2 describe-instances --instance-ids ${INSTANCE_ID} \
        --query 'Reservations[*].Instances[*].VpcId' --region ${AWS_REGION} \
          --output text) && echo "AWS_VPC_ID: \"$AWS_VPC_ID\""
    5. Retrieve and record the AWS security group IDs:

      $ AWS_SG_IDS=$(aws ec2 describe-instances --instance-ids ${INSTANCE_ID} \
        --query 'Reservations[*].Instances[*].SecurityGroups[*].GroupId' \
        --region  $AWS_REGION --output json | jq -r '.[][][]' | paste -sd ",") \
          && echo "AWS_SG_IDS: \"$AWS_SG_IDS\""
  2. Create a peer-pods-cm.yaml manifest file according to the following example:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: peer-pods-cm
      namespace: openshift-sandboxed-containers-operator
    data:
      CLOUD_PROVIDER: "aws"
      VXLAN_PORT: "9000"
      PROXY_TIMEOUT: "5m"
      PODVM_INSTANCE_TYPE: "t3.medium"
      PODVM_INSTANCE_TYPES: "t2.small,t2.medium,t3.large"
      PODVM_AMI_ID: "<podvm_ami_id>"
      AWS_REGION: "<aws_region>"
      AWS_SUBNET_ID: "<aws_subnet_id>"
      AWS_VPC_ID: "<aws_vpc_id>"
      AWS_SG_IDS: "<aws_sg_ids>"
      TAGS: "key1=value1,key2=value2"
      PEERPODS_LIMIT_PER_NODE: "10"
      ROOT_VOLUME_SIZE: "6"
      DISABLECVM: "true"

    where:

    PODVM_INSTANCE_TYPE
    Defines the default instance type that is used if the instance type is not defined in the workload object.
    PODVM_INSTANCE_TYPES
    Specify the allowed instance types, without spaces, for creating the pod. You can define smaller instance types for workloads that need less memory and fewer CPUs or larger instance types for larger workloads.
    PODVM_AMI_ID
    This value is populated when you create the KataConfig CR, using an AMI ID based on your cluster credentials. If you create your own AMI, specify its AMI ID.
    TAGS
    You can configure custom tags as comma-separated key=value pairs for pod VM instances to track peer pod costs or to identify peer pods in different clusters.
    PEERPODS_LIMIT_PER_NODE
    You can increase this value to run more peer pods on a node. The default value is 10.
    ROOT_VOLUME_SIZE
    You can increase this value for pods with larger container images. Specify the root volume size in gigabytes for the pod VM. The default and minimum size is 6 GB.
  3. Create the config map by running the following command:

    $ oc create -f peer-pods-cm.yaml
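
Instead of copying each recorded value into the manifest by hand, the values can be substituted from the shell variables set in step 1. The following is a minimal sketch under stated assumptions: render_peer_pods_cm is a hypothetical helper name, the sample IDs are placeholders, the fixed values are copied from the example manifest, and leaving PODVM_AMI_ID empty when you do not supply your own AMI is an assumption based on the parameter description.

```shell
# Hypothetical helper: print a peer pods config map manifest built from the
# AWS_REGION, AWS_SUBNET_ID, AWS_VPC_ID, and AWS_SG_IDS shell variables.
render_peer_pods_cm() {
  # Fail early if any required variable is unset or empty.
  if [ -z "${AWS_REGION:-}" ] || [ -z "${AWS_SUBNET_ID:-}" ] ||
     [ -z "${AWS_VPC_ID:-}" ] || [ -z "${AWS_SG_IDS:-}" ]; then
    echo "error: set AWS_REGION, AWS_SUBNET_ID, AWS_VPC_ID, and AWS_SG_IDS" >&2
    return 1
  fi
  # PODVM_AMI_ID is optional here (assumption): empty unless you set it.
  cat <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: peer-pods-cm
  namespace: openshift-sandboxed-containers-operator
data:
  CLOUD_PROVIDER: "aws"
  VXLAN_PORT: "9000"
  PROXY_TIMEOUT: "5m"
  PODVM_INSTANCE_TYPE: "t3.medium"
  PODVM_INSTANCE_TYPES: "t2.small,t2.medium,t3.large"
  PODVM_AMI_ID: "${PODVM_AMI_ID:-}"
  AWS_REGION: "${AWS_REGION}"
  AWS_SUBNET_ID: "${AWS_SUBNET_ID}"
  AWS_VPC_ID: "${AWS_VPC_ID}"
  AWS_SG_IDS: "${AWS_SG_IDS}"
  TAGS: "key1=value1,key2=value2"
  PEERPODS_LIMIT_PER_NODE: "10"
  ROOT_VOLUME_SIZE: "6"
  DISABLECVM: "true"
EOF
}

# Placeholder values for illustration; real values come from step 1.
AWS_REGION=us-east-1
AWS_SUBNET_ID=subnet-0123456789abcdef0
AWS_VPC_ID=vpc-0123456789abcdef0
AWS_SG_IDS=sg-0aaa1111,sg-0bbb2222

render_peer_pods_cm
```

Redirecting the output, for example render_peer_pods_cm > peer-pods-cm.yaml, produces a file that can be reviewed before it is created on the cluster.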

3.3. Creating the KataConfig custom resource

You must create the KataConfig custom resource (CR) to install kata-remote as a runtime class on your worker nodes.

OpenShift sandboxed containers installs kata-remote as a secondary, optional runtime on the cluster and not as the primary runtime.

Creating the KataConfig CR automatically reboots the worker nodes. The reboot can take from 10 to more than 60 minutes. The following factors can increase the reboot time:

  • A large OpenShift Container Platform deployment with a greater number of worker nodes.
  • Activation of the BIOS and Diagnostics utility.
  • Deployment on a hard disk drive rather than an SSD.
  • Deployment on physical nodes such as bare metal, rather than on virtual nodes.
  • A slow CPU and network.

Procedure

  1. Create an example-kataconfig.yaml manifest file according to the following example:

    apiVersion: kataconfiguration.openshift.io/v1
    kind: KataConfig
    metadata:
      name: example-kataconfig
    spec:
      logLevel: info
    #  kataConfigPoolSelector:
    #    matchLabels:
    #      <label_key>: '<label_value>'

    where:

    matchLabels
    Optional: If you have applied node labels to install kata-remote on specific nodes, specify the key and value, for example, kata-remote: 'true'.
  2. Create the KataConfig CR by running the following command:

    $ oc create -f example-kataconfig.yaml

    The new KataConfig CR is created and installs kata-remote as a runtime class on the worker nodes.

    Wait for the kata-remote installation to complete and the worker nodes to reboot before verifying the installation.

  3. Monitor the installation progress by running the following command:

    $ watch "oc describe kataconfig | sed -n /^Status:/,/^Events/p"

    When the status of all workers under kataNodes is installed and the condition InProgress is False without specifying a reason, the kata-remote runtime is installed on the cluster.

  4. Verify the daemon set by running the following command:

    $ oc get -n openshift-sandboxed-containers-operator ds/osc-caa-ds
  5. Verify the runtime classes by running the following command:

    $ oc get runtimeclass

    Example output

    NAME          HANDLER       AGE
    kata          kata          34m
    kata-remote   kata-remote   152m

    You can also see the default kata runtime class in addition to kata-remote.
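
Because the worker node reboot can take a long time, polling is often more practical than checking by hand. The sketch below shows a generic polling loop that could wrap any of the verification commands in this procedure. retry_until is a hypothetical helper name, and the attempt count and delay are arbitrary illustration values.

```shell
# Hypothetical helper: run a command repeatedly until it succeeds or the
# attempts are exhausted.
# Arguments: number of attempts, delay in seconds between attempts, command.
retry_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Stop as soon as the command succeeds.
    "$@" && return 0
    i=$((i + 1))
    # Sleep only if another attempt follows.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Trivial demonstration with a command that always succeeds.
retry_until 3 0 true && echo "condition met"
```

For example, retry_until 120 30 oc get runtimeclass kata-remote would poll every 30 seconds for up to an hour until the runtime class exists.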

3.3.1. Modifying the number of peer pod VMs per node

You can modify the limit of peer pod virtual machines (VMs) per node by editing the peerpodConfig custom resource (CR).

Procedure

  1. Check the current limit by running the following command:

    $ oc get peerpodconfig peerpodconfig-openshift -n openshift-sandboxed-containers-operator \
      -o jsonpath='{.spec.limit}{"\n"}'
  2. Specify a new value for the limit key by running the following command:

    $ oc patch peerpodconfig peerpodconfig-openshift -n openshift-sandboxed-containers-operator \
      --type merge --patch '{"spec":{"limit":"<value>"}}'
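
Because the limit is passed as a quoted string inside a JSON merge patch, a typo in the value is easy to miss. The following sketch, which is illustrative only, builds the patch document from a number and rejects anything that is not a positive integer. limit_patch is a hypothetical helper name.

```shell
# Hypothetical helper: print the merge patch for a given per-node limit.
# Rejects empty values, non-digits, zero, and numbers with leading zeros.
limit_patch() {
  case $1 in
    ''|*[!0-9]*|0*)
      echo "error: limit must be a positive integer without leading zeros" >&2
      return 1
      ;;
  esac
  printf '{"spec":{"limit":"%s"}}\n' "$1"
}

limit_patch 20
```

The output can then be passed to the command in step 2, for example with --patch "$(limit_patch 20)".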

3.4. Verifying the pod VM image

After kata-remote is installed on your cluster, the OpenShift sandboxed containers Operator creates a pod VM image, which is used to create peer pods. This process can take a long time because the image is created on the cloud instance. You can verify that the pod VM image was created successfully by checking the config map that you created for the cloud provider.

Procedure

  1. Obtain the config map you created for the peer pods:

    $ oc get configmap peer-pods-cm -n openshift-sandboxed-containers-operator -o yaml
  2. Check the data stanza of the YAML output.

    If the PODVM_AMI_ID parameter is populated, the pod VM image was created successfully.
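
This check can be automated. The following sketch, offered as an illustration rather than a supported tool, reads the config map YAML on standard input and prints the PODVM_AMI_ID value if it is non-empty. podvm_ami_id is a hypothetical helper name, the sample AMI ID is a placeholder, and the parsing assumes the simple one-line KEY: "value" layout shown in the example manifest.

```shell
# Hypothetical helper: print the PODVM_AMI_ID value found in config map YAML
# read from standard input; fail if the key is missing or its value is empty.
podvm_ami_id() {
  # Keep only the PODVM_AMI_ID line, drop the key, then strip quotes and
  # whitespace from what remains.
  value=$(sed -n 's/^[[:space:]]*PODVM_AMI_ID:[[:space:]]*//p' \
    | head -n 1 | tr -d '"[:space:]')
  [ -n "$value" ] && printf '%s\n' "$value"
}

# Trimmed sample with a placeholder AMI ID.
sample_yaml='data:
  CLOUD_PROVIDER: "aws"
  PODVM_AMI_ID: "ami-0123456789abcdef0"'

if ami_id=$(printf '%s\n' "$sample_yaml" | podvm_ami_id); then
  echo "pod VM image created: $ami_id"
else
  echo "PODVM_AMI_ID is not populated yet"
fi
```

On a cluster, the helper could read the output of the oc get configmap command in step 1 through a pipe.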

Troubleshooting

  1. Retrieve the events log by running the following command:

    $ oc get events -n openshift-sandboxed-containers-operator --field-selector involvedObject.name=osc-podvm-image-creation
  2. Retrieve the job log by running the following command:

    $ oc logs -n openshift-sandboxed-containers-operator jobs/osc-podvm-image-creation

If you cannot resolve the issue, submit a Red Hat Support case and attach the output of both logs.

3.5. Customizing the Kata Agent policy

You can customize the Kata Agent policy to override the permissive default policy. The Kata Agent policy is a security mechanism that controls API requests for peer pods.

Important

You must override the default policy in a production environment.

As a minimum requirement, you must disable ExecProcessRequest to prevent a cluster administrator from accessing sensitive data by running the oc exec command on a peer pod.

You can use the default policy in development and test environments where security is not a concern, for example, in an environment where the control plane can be trusted.

A custom policy replaces the default policy entirely. To modify specific APIs, include the full policy and adjust the relevant rules.

Procedure

  1. Create a custom policy.rego file by modifying the default policy:

    package agent_policy
    
    default AddARPNeighborsRequest := true
    default AddSwapRequest := true
    default CloseStdinRequest := true
    default CopyFileRequest := true
    default CreateContainerRequest := true
    default CreateSandboxRequest := true
    default DestroySandboxRequest := true
    default GetMetricsRequest := true
    default GetOOMEventRequest := true
    default GuestDetailsRequest := true
    default ListInterfacesRequest := true
    default ListRoutesRequest := true
    default MemHotplugByProbeRequest := true
    default OnlineCPUMemRequest := true
    default PauseContainerRequest := true
    default PullImageRequest := true
    default ReadStreamRequest := false
    default RemoveContainerRequest := true
    default RemoveStaleVirtiofsShareMountsRequest := true
    default ReseedRandomDevRequest := true
    default ResumeContainerRequest := true
    default SetGuestDateTimeRequest := true
    default SignalProcessRequest := true
    default StartContainerRequest := true
    default StartTracingRequest := true
    default StatsContainerRequest := true
    default StopTracingRequest := true
    default TtyWinResizeRequest := true
    default UpdateContainerRequest := true
    default UpdateEphemeralMountsRequest := true
    default UpdateInterfaceRequest := true
    default UpdateRoutesRequest := true
    default WaitProcessRequest := true
    default ExecProcessRequest := false
    default SetPolicyRequest := false
    default WriteStreamRequest := false
    
    ExecProcessRequest if {
        input_command = concat(" ", input.process.Args)
        some allowed_command in policy_data.allowed_commands
        input_command == allowed_command
    }
    
    policy_data := {
      "allowed_commands": [
            "curl http://127.0.0.1:8006/cdh/resource/default/attestation-status/status"
      ]
    }

    The default policy allows all API calls. Adjust the true or false values to customize the policy further based on your needs.

  2. Convert the policy.rego file to a Base64-encoded string by running the following command:

    $ base64 -w0 policy.rego

    Record the output.

  3. Add the Base64-encoded policy string to the my-pod.yaml manifest:

    apiVersion: v1
    kind: Pod
    metadata:
      name: my-pod
      annotations:
        io.katacontainers.config.agent.policy: <base64_encoded_policy>
    spec:
      runtimeClassName: kata-remote
      containers:
      - name: <container_name>
        image: registry.access.redhat.com/ubi9/ubi:latest
        command:
        - sleep
        - "36000"
        securityContext:
          privileged: false
          seccompProfile:
            type: RuntimeDefault
  4. Create the pod by running the following command:

    $ oc create -f my-pod.yaml
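
The ExecProcessRequest rule and the Base64 encoding step in this procedure can be sketched in Python. This is a minimal illustration, not the Kata agent policy engine: the function names exec_allowed and encode_policy are hypothetical, and the sketch assumes that the rule compares the space-joined process arguments with each allowed command exactly, as the example policy does.

```python
import base64

# Allowed commands, mirroring policy_data.allowed_commands in the example policy.
ALLOWED_COMMANDS = [
    "curl http://127.0.0.1:8006/cdh/resource/default/attestation-status/status",
]


def exec_allowed(args, allowed=ALLOWED_COMMANDS):
    """Mimic the ExecProcessRequest rule (hypothetical helper).

    The arguments are joined with single spaces and the result must be
    exactly equal to one of the allowed commands.
    """
    return " ".join(args) in allowed


def encode_policy(policy_text):
    """Return the policy as one unwrapped Base64 line, like `base64 -w0`."""
    return base64.b64encode(policy_text.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    status_check = ["curl", ALLOWED_COMMANDS[0].split(" ", 1)[1]]
    print(exec_allowed(status_check))              # exact match with the allow list
    print(exec_allowed(status_check + ["-v"]))     # an extra argument breaks the match
    print(encode_policy("default ExecProcessRequest := false\n"))
```

Because the comparison is an exact string match, any extra flag or different spacing in the exec command causes the request to be denied.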

3.6. Configuring a pull secret for peer pods

If you select a custom peer pod VM image from a private registry such as registry.access.redhat.com, you must configure a pull secret for peer pods.

Then, you can link the pull secret to the default service account or you can specify the pull secret in the peer pod manifest.

Procedure

  1. Set the NS variable to the namespace where you deploy your peer pods:

    $ NS=<namespace>
  2. Copy the pull secret to the peer pod namespace:

    $ oc get secret pull-secret -n openshift-config -o yaml \
      | sed "s/namespace: openshift-config/namespace: ${NS}/" \
      | oc apply -n "${NS}" -f -

    You can use the cluster pull secret, as in this example, or a custom pull secret.

  3. Optional: Link the pull secret to the default service account:

    $ oc secrets link default pull-secret --for=pull -n ${NS}
  4. Alternatively, add the pull secret to the peer pod manifest:

    apiVersion: v1
    kind: Pod
    spec:
      containers:
      - name: <container_name>
        image: <image_name>
      imagePullSecrets:
      - name: pull-secret
    # ...
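
The sed expression in step 2 rewrites the namespace of the exported secret before it is applied. The following Python sketch shows the same substitution on a manifest string. The function name retarget_secret and the sample manifest are hypothetical and only illustrate the text transformation.

```python
def retarget_secret(manifest_yaml, target_namespace, source_namespace="openshift-config"):
    """Replace 'namespace: <source>' with 'namespace: <target>'.

    Like the sed expression without the g flag, only the first match on
    each line is replaced.
    """
    old = f"namespace: {source_namespace}"
    new = f"namespace: {target_namespace}"
    return "\n".join(line.replace(old, new, 1) for line in manifest_yaml.split("\n"))


if __name__ == "__main__":
    # Shortened stand-in for the output of `oc get secret ... -o yaml`.
    sample = "\n".join([
        "apiVersion: v1",
        "kind: Secret",
        "metadata:",
        "  name: pull-secret",
        "  namespace: openshift-config",
        "type: kubernetes.io/dockerconfigjson",
    ])
    print(retarget_secret(sample, "my-peer-pods"))
```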

3.7. Selecting a custom peer pod VM image

You can select a custom peer pod virtual machine (VM) image, tailored to your workload requirements, by adding an annotation to the pod manifest. The custom image overrides the default image specified in the peer pods config map.

Prerequisites

  • If the custom peer pod VM image is in a private registry, you have created a pull secret.
  • You have the ID of a custom pod VM image, which is compatible with your cloud provider or hypervisor.

Procedure

  1. Create a my-pod-manifest.yaml file according to the following example:

    apiVersion: v1
    kind: Pod
    metadata:
      name: my-pod-manifest
      annotations:
        io.katacontainers.config.hypervisor.image: "<custom_image_id>"
    spec:
      runtimeClassName: kata-remote
      containers:
      - name: <example_container>
        image: registry.access.redhat.com/ubi9/ubi:9.3
        command: ["sleep", "36000"]
  2. Create the pod by running the following command:

    $ oc create -f my-pod-manifest.yaml

3.8. Configuring your workload

You configure your workload for OpenShift sandboxed containers by setting kata-remote as the runtime class for the following pod-templated objects:

  • Pod objects
  • ReplicaSet objects
  • ReplicationController objects
  • StatefulSet objects
  • Deployment objects
  • DeploymentConfig objects

Important

Do not deploy workloads in an Operator namespace. Create a dedicated namespace for these resources.

By default, the workload is deployed using the default instance type that you defined in the peer pods config map. You can override the default instance type by adding an annotation to the YAML file.

If you do not want to define the instance type manually, you can add annotations to use an automatic instance type, based on the memory available.

Prerequisites

  • You have created the KataConfig custom resource (CR).

Procedure

  1. Add spec.runtimeClassName: kata-remote to the manifest of each pod-templated workload object as in the following example:

    apiVersion: v1
    kind: <object>
    # ...
    spec:
      runtimeClassName: kata-remote
    # ...
  2. Optional: To override the default instance type, add the following annotation with an instance type that is defined in the peer pods config map:

    apiVersion: v1
    kind: <object>
    metadata:
      annotations:
        io.katacontainers.config.hypervisor.machine_type: <instance>
    # ...
  3. Optional: To use an automatic instance type, add the following annotations:

    apiVersion: v1
    kind: <Pod>
    metadata:
      annotations:
        io.katacontainers.config.hypervisor.default_vcpus: <vcpus>
        io.katacontainers.config.hypervisor.default_memory: <memory>
    # ...

    The workload will run on an automatic instance type based on the amount of memory available.

  4. Apply the changes to the workload object by running the following command:

    $ oc apply -f <object.yaml>

    OpenShift Container Platform creates the workload object and begins scheduling it.

Verification

  • Inspect the spec.runtimeClassName field of a pod-templated object. If the value is kata-remote, then the workload is running on OpenShift sandboxed containers.
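
The verification can be sketched in Python against the JSON form of a workload object, for example the output of oc get <object> -o json. This is a minimal sketch with hypothetical function names. It assumes the standard Kubernetes layout, in which a Pod carries the field at spec.runtimeClassName and objects with a pod template carry it at spec.template.spec.runtimeClassName.

```python
def runtime_class_name(obj):
    """Return the runtime class of a Pod or of an object with a pod template.

    Returns None when the field is not set.
    """
    spec = obj.get("spec", {})
    if obj.get("kind") == "Pod":
        return spec.get("runtimeClassName")
    # Deployment, StatefulSet, ReplicaSet, and similar objects embed a pod template.
    return spec.get("template", {}).get("spec", {}).get("runtimeClassName")


def uses_kata_remote(obj):
    """True when the workload object requests the kata-remote runtime class."""
    return runtime_class_name(obj) == "kata-remote"


if __name__ == "__main__":
    deployment = {
        "kind": "Deployment",
        "spec": {"template": {"spec": {"runtimeClassName": "kata-remote"}}},
    }
    print(uses_kata_remote(deployment))
```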

Chapter 4. Update

You update OpenShift sandboxed containers by updating the OpenShift Container Platform cluster and the OpenShift sandboxed containers Operator.

Then, you update the pod virtual machine (VM) image by deleting and re-creating the KataConfig custom resource (CR). Updating the OpenShift sandboxed containers Operator when enablePeerPods: true is set in the KataConfig CR does not update the pod VM image automatically.

You must perform the following steps:

  1. Update your OpenShift Container Platform cluster to update the Kata runtime and its dependencies.

    The sandboxed containers RHCOS extension contains the components required to run OpenShift sandboxed containers, such as the Kata containers runtime, the QEMU hypervisor, and other dependencies. You update the extension by updating the cluster to a new release of OpenShift Container Platform.

  2. Update the OpenShift sandboxed containers Operator.
  3. Delete the KataConfig CR.
  4. Verify that the image ID in the peer pods config map is empty.
  5. Re-create the KataConfig CR.

4.1. Updating the OpenShift sandboxed containers Operator

You can update the OpenShift sandboxed containers Operator by using the command line interface (CLI).

Procedure

  1. Create an osc-subscription.yaml manifest file:

    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: sandboxed-containers-operator
      namespace: openshift-sandboxed-containers-operator
    spec:
      channel: stable
      installPlanApproval: Automatic
      name: sandboxed-containers-operator
      source: redhat-operators
      sourceNamespace: openshift-marketplace
      startingCSV: sandboxed-containers-operator.v1.12.0
  2. Create the subscription by running the following command:

    $ oc create -f osc-subscription.yaml
  3. Verify that the Operator is correctly installed by running the following command:

    $ oc get csv -n openshift-sandboxed-containers-operator

    This command can take several minutes to complete.

  4. Watch the process by running the following command:

    $ watch oc get csv -n openshift-sandboxed-containers-operator

    Example output

    NAME                             DISPLAY                                  VERSION   REPLACES    PHASE
    openshift-sandboxed-containers   openshift-sandboxed-containers-operator  1.12.0    1.11.1      Succeeded
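
If you script this check, you can evaluate the JSON form of the same listing (oc get csv -n openshift-sandboxed-containers-operator -o json) instead of watching the table. The following Python sketch is one possible approach. The function name operator_ready is hypothetical, and the default name prefix is an assumption taken from the startingCSV value in the subscription manifest.

```python
def operator_ready(csv_list, name_prefix="sandboxed-containers-operator"):
    """True when a ClusterServiceVersion with the given name prefix has succeeded.

    csv_list is the parsed JSON list object; each item is expected to carry
    metadata.name and status.phase.
    """
    for item in csv_list.get("items", []):
        name = item.get("metadata", {}).get("name", "")
        phase = item.get("status", {}).get("phase")
        if name.startswith(name_prefix) and phase == "Succeeded":
            return True
    return False


if __name__ == "__main__":
    listing = {
        "items": [
            {
                "metadata": {"name": "sandboxed-containers-operator.v1.12.0"},
                "status": {"phase": "Installing"},
            }
        ]
    }
    print(operator_ready(listing))
```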

4.2. Deleting the KataConfig custom resource

You must delete the KataConfig custom resource (CR).

Deleting the KataConfig CR automatically reboots the worker nodes. The reboot can take from 10 to 60 minutes. The following factors can affect the reboot time:

  • A larger OpenShift Container Platform deployment with a greater number of worker nodes.
  • Activation of the BIOS and Diagnostics utility.
  • Deployment on a hard drive rather than an SSD.
  • Deployment on physical nodes such as bare metal, rather than on virtual nodes.
  • A slow CPU and network.

Prerequisites

  • You have deleted all pods that use the kata-remote runtime class.

Procedure

  1. Delete the KataConfig CR by running the following command:

    $ oc delete kataconfig example-kataconfig

    The OpenShift sandboxed containers Operator removes all resources that were initially created to enable the runtime on your cluster.

    Important

    When you delete the KataConfig CR, the CLI stops responding until all worker nodes reboot. You must wait for the deletion process to complete before performing the verification.

  2. Verify the CR removal by running the following command:

    $ oc get kataconfig example-kataconfig

    Example output

    No example-kataconfig instances exist

4.3. Verifying empty peer pod image ID

You must verify that the image ID in the peer pods config map is empty.

Procedure

  1. Obtain the value of the PODVM_AMI_ID in the peer pods config map by running the following command:

    $ oc get configmap -n openshift-sandboxed-containers-operator peer-pods-cm -o jsonpath="{.data.PODVM_AMI_ID}"
  2. If the value is not empty, clear it by patching the config map with the following command:

    $ oc patch configmap peer-pods-cm -n openshift-sandboxed-containers-operator -p '{"data":{"PODVM_AMI_ID":""}}'
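
The decision in this procedure can be sketched in Python, working on the parsed JSON of the config map (for example, from oc get configmap peer-pods-cm -n openshift-sandboxed-containers-operator -o json). The function name ami_id_patch is hypothetical. The sketch only builds the patch body shown in step 2 and does not contact the cluster.

```python
import json


def ami_id_patch(configmap):
    """Return the patch body that clears PODVM_AMI_ID, or None if it is already empty.

    A missing key is treated the same as an empty value.
    """
    current = configmap.get("data", {}).get("PODVM_AMI_ID", "")
    if current == "":
        return None
    return json.dumps({"data": {"PODVM_AMI_ID": ""}})


if __name__ == "__main__":
    # "ami-0123456789abcdef0" is a made-up placeholder ID.
    stale = {"data": {"PODVM_AMI_ID": "ami-0123456789abcdef0"}}
    print(ami_id_patch(stale))
```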

4.4. Creating the KataConfig custom resource

You must create the KataConfig custom resource (CR) to install kata-remote as a runtime class on your worker nodes.

OpenShift sandboxed containers installs kata-remote as a secondary, optional runtime on the cluster and not as the primary runtime.

Creating the KataConfig CR automatically reboots the worker nodes. The reboot can take from 10 to more than 60 minutes. The following factors can increase the reboot time:

  • A large OpenShift Container Platform deployment with a greater number of worker nodes.
  • Activation of the BIOS and Diagnostics utility.
  • Deployment on a hard disk drive rather than an SSD.
  • Deployment on physical nodes such as bare metal, rather than on virtual nodes.
  • A slow CPU and network.

Procedure

  1. Create an example-kataconfig.yaml manifest file according to the following example:

    apiVersion: kataconfiguration.openshift.io/v1
    kind: KataConfig
    metadata:
      name: example-kataconfig
    spec:
      enablePeerPods: true
      logLevel: info
    #  kataConfigPoolSelector:
    #    matchLabels:
    #      <label_key>: '<label_value>'

    where:

    matchLabels
    Optional: If you have applied node labels to install kata-remote on specific nodes, specify the key and value, for example, kata-remote: 'true'.
  2. Create the KataConfig CR by running the following command:

    $ oc create -f example-kataconfig.yaml

    The new KataConfig CR is created and installs kata-remote as a runtime class on the worker nodes.

    Wait for the kata-remote installation to complete and the worker nodes to reboot before verifying the installation.

  3. Monitor the installation progress by running the following command:

    $ watch "oc describe kataconfig | sed -n /^Status:/,/^Events/p"

    When the status of all workers under kataNodes is installed and the condition InProgress is False without specifying a reason, kata-remote is installed on the cluster.

  4. Verify the daemon set by running the following command:

    $ oc get -n openshift-sandboxed-containers-operator ds/osc-caa-ds
  5. Verify the runtime classes by running the following command:

    $ oc get runtimeclass

    Example output

    NAME          HANDLER       AGE
    kata          kata          34m
    kata-remote   kata-remote   152m
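
The completion criterion in step 3 can be expressed as a small check on the status section of the KataConfig CR. The following Python sketch is illustrative only. The function name installation_complete is hypothetical, the field names are taken from the status examples in the KataConfig status messages reference, and the sketch assumes that conditions is a list of condition entries, following the usual Kubernetes convention.

```python
def installation_complete(status):
    """True when every node is installed and InProgress is False with no reason."""
    kata_nodes = status.get("kataNodes", {})
    installed = kata_nodes.get("installed", [])
    node_count = kata_nodes.get("nodeCount", 0)

    # Find the InProgress condition, if one is present.
    in_progress = next(
        (c for c in status.get("conditions", []) if c.get("type") == "InProgress"),
        None,
    )
    if in_progress is None:
        return False

    finished = in_progress.get("status") == "False" and not in_progress.get("reason")
    return finished and node_count > 0 and len(installed) == node_count


if __name__ == "__main__":
    status = {
        "conditions": [{"type": "InProgress", "status": "True", "reason": "Installing"}],
        "kataNodes": {"nodeCount": 2, "readyNodeCount": 0},
    }
    print(installation_complete(status))
```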

Chapter 5. Uninstall

You uninstall OpenShift sandboxed containers by deleting the workload pods, uninstalling the OpenShift sandboxed containers Operator, and deleting its resources.

You perform the following tasks:

  1. Delete pods that use the kata-remote runtime class.

    Important

    You must delete the workload pods before you delete the KataConfig CR. The pod names usually have the prefix podvm and custom tags, if provided.

  2. Delete the KataConfig custom resource (CR).
  3. Uninstall the OpenShift sandboxed containers Operator.
  4. Delete the KataConfig custom resource definition (CRD).

5.1. Deleting workload pods

You must delete your workload pods. The pod names usually have the prefix podvm and custom tags, if provided.

Prerequisites

  • You have installed the jq utility.

Procedure

  1. Search for the pods by running the following command:

    $ oc get pods -A -o json | jq -r '.items[] |
      select(.spec.runtimeClassName == "kata-remote").metadata.name'
  2. Delete each pod by running the following command:

    $ oc delete pod <pod>
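
Because the search runs across all namespaces, it can help to keep the namespace next to each pod name. The following Python sketch applies the same selection as the jq filter to the parsed output of oc get pods -A -o json and prints one delete command per pod. The function name kata_remote_pods is hypothetical, and the sketch only prints commands; it does not delete anything.

```python
def kata_remote_pods(pod_list, runtime_class="kata-remote"):
    """Return (namespace, name) pairs for pods that use the given runtime class."""
    return [
        (item["metadata"]["namespace"], item["metadata"]["name"])
        for item in pod_list.get("items", [])
        if item.get("spec", {}).get("runtimeClassName") == runtime_class
    ]


if __name__ == "__main__":
    # Made-up sample data standing in for the cluster-wide pod list.
    pods = {
        "items": [
            {"metadata": {"namespace": "demo", "name": "my-pod"},
             "spec": {"runtimeClassName": "kata-remote"}},
            {"metadata": {"namespace": "demo", "name": "plain-pod"}, "spec": {}},
        ]
    }
    for namespace, name in kata_remote_pods(pods):
        print(f"oc delete pod {name} -n {namespace}")
```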

5.2. Deleting the KataConfig custom resource

You must delete the KataConfig custom resource (CR).

Deleting the KataConfig CR automatically reboots the worker nodes. The reboot can take from 10 to 60 minutes. The following factors can affect the reboot time:

  • A larger OpenShift Container Platform deployment with a greater number of worker nodes.
  • Activation of the BIOS and Diagnostics utility.
  • Deployment on a hard drive rather than an SSD.
  • Deployment on physical nodes such as bare metal, rather than on virtual nodes.
  • A slow CPU and network.

Prerequisites

  • You have deleted all pods that use the kata-remote runtime class.

Procedure

  1. Delete the KataConfig CR by running the following command:

    $ oc delete kataconfig example-kataconfig

    The OpenShift sandboxed containers Operator removes all resources that were initially created to enable the runtime on your cluster.

    Important

    When you delete the KataConfig CR, the CLI stops responding until all worker nodes reboot. You must wait for the deletion process to complete before performing the verification.

  2. Verify the CR removal by running the following command:

    $ oc get kataconfig example-kataconfig

    Example output

    No example-kataconfig instances exist

5.3. Uninstalling the OpenShift sandboxed containers Operator

You uninstall the OpenShift sandboxed containers Operator by using the command line.

Prerequisites

  • You have deleted all pods with the kata-remote runtime class.
  • You have deleted the KataConfig custom resource.

Procedure

  1. Delete the subscription by running the following command:

    $ oc delete subscription sandboxed-containers-operator -n openshift-sandboxed-containers-operator
  2. Delete the namespace by running the following command:

    $ oc delete namespace openshift-sandboxed-containers-operator

5.4. Deleting the KataConfig CRD

You must delete the KataConfig custom resource definition (CRD).

Prerequisites

  • You have deleted the KataConfig custom resource.
  • You have uninstalled the OpenShift sandboxed containers Operator.

Procedure

  1. Delete the KataConfig CRD by running the following command:

    $ oc delete crd kataconfigs.kataconfiguration.openshift.io
  2. Verify that the CRD was deleted by running the following command:

    $ oc get crd kataconfigs.kataconfiguration.openshift.io

    Example output

    Unknown CRD kataconfigs.kataconfiguration.openshift.io

Chapter 6. Observe

You can monitor the health of your OpenShift sandboxed containers environment.

The following tools are available:

  • OpenShift Container Platform web console. Administrators can access and query raw metrics through Prometheus.
  • Logging

6.1. Metrics

You can monitor system health by querying metrics displayed in the OpenShift Container Platform web console.

You can access the following metrics:

Kata agent metrics
Kata agent metrics display information about the kata agent process running in the VM embedded in your sandboxed containers. These metrics include data from /proc/<pid>/[io, stat, status].
Kata guest operating system metrics
Kata guest operating system metrics display data from the guest operating system running in your sandboxed containers. These metrics include data from /proc/[stat, diskstats, meminfo, vmstat] and /proc/net/dev.
Hypervisor metrics
Hypervisor metrics display data regarding the hypervisor running the VM embedded in your sandboxed containers. These metrics mainly include data from /proc/<pid>/[io, stat, status].
Kata monitor metrics
Kata monitor is the process that gathers metric data and makes it available to Prometheus. The kata monitor metrics display detailed information about the resource usage of the kata-monitor process itself. These metrics also include counters from Prometheus data collection.
Kata containerd shim v2 metrics
Kata containerd shim v2 metrics display detailed information about the kata shim process. These metrics include data from /proc/<pid>/[io, stat, status] and detailed resource usage metrics.

6.2. Viewing metrics

You can access the metrics for OpenShift sandboxed containers on the Metrics page in the OpenShift Container Platform web console.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role or with view permissions for all projects.

Procedure

  1. In the OpenShift Container Platform web console, navigate to Observe → Metrics.
  2. In the input field, enter the query for the metric you want to observe.

    All kata-related metrics begin with kata. Typing kata displays a list of all available kata metrics.

The metrics from your query are visualized on the page.

6.3. Enabling debug logs for CRI-O runtime

You can enable debug logs by updating the logLevel field in the KataConfig CR. This changes the log level in the CRI-O runtime for the worker nodes running OpenShift sandboxed containers.

Prerequisites

  • You have installed the OpenShift CLI (oc).
  • You have access to the cluster as a user with the cluster-admin role.

Procedure

  1. Change the logLevel field in your existing KataConfig CR to debug:

    $ oc patch kataconfig <kataconfig> --type merge --patch '{"spec":{"logLevel":"debug"}}'
  2. Monitor the kata-oc machine config pool until the value of UPDATED is True, indicating that all worker nodes are updated:

    $ oc get mcp kata-oc

    Example output

    NAME     CONFIG                 UPDATED  UPDATING  DEGRADED  MACHINECOUNT  READYMACHINECOUNT  UPDATEDMACHINECOUNT  DEGRADEDMACHINECOUNT  AGE
    kata-oc  rendered-kata-oc-169   False    True      False     3             1                  1                    0                     9h

Verification

  1. Start a debug session with a node in the machine config pool:

    $ oc debug node/<node_name>
  2. Change the root directory to /host:

    # chroot /host
  3. Verify the changes in the crio.conf file:

    # crio config | egrep 'log_level'

    Example output

    log_level = "debug"

6.4. Viewing debug logs for components

Cluster administrators can use the debug logs to troubleshoot issues. The logs for each node are printed to the node journal.

You can review the logs for the following OpenShift sandboxed containers components:

  • Kata agent
  • Kata runtime (containerd-shim-kata-v2)
  • virtiofsd

QEMU only generates warning and error logs. These warnings and errors print to the node journal in both the Kata runtime logs and the CRI-O logs with an extra qemuPid field.

Example of QEMU logs

Mar 11 11:57:28 openshift-worker-0 kata[2241647]: time="2023-03-11T11:57:28.587116986Z" level=info msg="Start logging QEMU (qemuPid=2241693)" name=containerd-shim-v2 pid=2241647 sandbox=d1d4d68efc35e5ccb4331af73da459c13f46269b512774aa6bde7da34db48987 source=virtcontainers/hypervisor subsystem=qemu

Mar 11 11:57:28 openshift-worker-0 kata[2241647]: time="2023-03-11T11:57:28.607339014Z" level=error msg="qemu-kvm: -machine q35,accel=kvm,kernel_irqchip=split,foo: Expected '=' after parameter 'foo'" name=containerd-shim-v2 pid=2241647 qemuPid=2241693 sandbox=d1d4d68efc35e5ccb4331af73da459c13f46269b512774aa6bde7da34db48987 source=virtcontainers/hypervisor subsystem=qemu

Mar 11 11:57:28 openshift-worker-0 kata[2241647]: time="2023-03-11T11:57:28.60890737Z" level=info msg="Stop logging QEMU (qemuPid=2241693)" name=containerd-shim-v2 pid=2241647 sandbox=d1d4d68efc35e5ccb4331af73da459c13f46269b512774aa6bde7da34db48987 source=virtcontainers/hypervisor subsystem=qemu

The Kata runtime prints Start logging QEMU when QEMU starts, and Stop logging QEMU when QEMU stops. The error appears between these two log messages and includes the qemuPid field. The actual error message from QEMU is the msg value of the entry logged with level=error.

The console of the QEMU guest is printed to the node journal as well. You can view the guest console logs together with the Kata agent logs.

Prerequisites

  • You have installed the OpenShift CLI (oc).
  • You have access to the cluster as a user with the cluster-admin role.

Procedure

  • To review the Kata agent logs and guest console logs, run the following command:

    $ oc debug node/<nodename> -- journalctl -D /host/var/log/journal -t kata -g "reading guest console"
  • To review the Kata runtime logs, run the following command:

    $ oc debug node/<nodename> -- journalctl -D /host/var/log/journal -t kata
  • To review the virtiofsd logs, run the following command:

    $ oc debug node/<nodename> -- journalctl -D /host/var/log/journal -t virtiofsd
  • To review the QEMU logs, run the following command:

    $ oc debug node/<nodename> -- journalctl -D /host/var/log/journal -t kata -g "qemuPid=\d+"
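
To pull the QEMU error text out of journal output that you have saved, you can parse the key=value fields of each Kata runtime entry. The following Python sketch is one possible approach under the assumption that the entries follow the format of the QEMU log example in this section. The function names parse_kata_entry and qemu_errors are hypothetical.

```python
import shlex


def parse_kata_entry(line):
    """Split a journal line into the key=value fields that start at time=."""
    start = line.find("time=")
    if start == -1:
        return {}
    fields = {}
    # shlex keeps quoted values such as msg="..." together as one token.
    for token in shlex.split(line[start:]):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def qemu_errors(lines):
    """Return (qemuPid, message) pairs for error entries that carry a qemuPid field."""
    errors = []
    for line in lines:
        fields = parse_kata_entry(line)
        if fields.get("level") == "error" and "qemuPid" in fields:
            errors.append((fields["qemuPid"], fields.get("msg", "")))
    return errors


if __name__ == "__main__":
    import sys

    # Reads saved journal output from standard input.
    for pid, message in qemu_errors(sys.stdin):
        print(pid, message)
```

The Start logging QEMU and Stop logging QEMU entries are skipped because they are logged at level=info and mention the PID only inside the message text.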

Chapter 7. Troubleshoot

You can open a Red Hat support case and provide debugging information by using must-gather. The must-gather tool collects diagnostic information about your OpenShift Container Platform cluster, including virtual machines and other data.

7.1. Using must-gather

The oc adm must-gather CLI command collects the information from your cluster that is most likely needed for debugging issues, including:

  • Resource definitions
  • Service logs

By default, the oc adm must-gather command uses the default plugin image and writes into ./must-gather.local.

Alternatively, you can collect specific information by running the command with the appropriate arguments as described in the following sections:

  • To collect data related to one or more specific features, use the --image argument with an image, as listed in a following section.

    For example:

    $ oc adm must-gather --image=registry.redhat.io/openshift-sandboxed-containers/osc-must-gather-rhel9:1.12.0
  • To collect the audit logs, use the -- /usr/bin/gather_audit_logs argument, as described in a following section.

    For example:

    $ oc adm must-gather -- /usr/bin/gather_audit_logs

    Note

    Audit logs are not collected as part of the default set of information to reduce the size of the files.

When you run oc adm must-gather, a new pod with a random name is created in a new project on the cluster. The data is collected on that pod and saved in a new directory that starts with must-gather.local. This directory is created in the current working directory.

For example:

NAMESPACE                      NAME                 READY   STATUS      RESTARTS      AGE
...
openshift-must-gather-5drcj    must-gather-bklx4    2/2     Running     0             72s
openshift-must-gather-5drcj    must-gather-s8sdh    2/2     Running     0             72s
...

Optionally, you can run the oc adm must-gather command in a specific namespace by using the --run-namespace option.

For example:

$ oc adm must-gather --run-namespace <namespace> --image=registry.redhat.io/openshift-sandboxed-containers/osc-must-gather-rhel9:1.12.0

Chapter 8. Reference

8.1. KataConfig status messages

The following table displays the status messages for the KataConfig custom resource (CR) for a cluster with two worker nodes.

Table 8.1. KataConfig status messages

Status    Description

Initial installation

When a KataConfig CR is created and starts installing kata-remote on both workers, the following status is displayed for a few seconds.

 conditions:
    message: Performing initial installation of kata-remote on cluster
    reason: Installing
    status: 'True'
    type: InProgress
 kataNodes:
   nodeCount: 0
   readyNodeCount: 0

Installing

Within a few seconds the status changes.

 kataNodes:
   nodeCount: 2
   readyNodeCount: 0
   waitingToInstall:
   - worker-0
   - worker-1

Installing (Worker-1 installation starting)

For a short period of time, the status changes, signifying that one node has initiated the installation of kata-remote, while the other is in a waiting state. This is because only one node can be unavailable at any given time. The nodeCount remains at 2 because both nodes will eventually receive kata-remote, but the readyNodeCount is currently 0 as neither of them has reached that state yet.

 kataNodes:
   installing:
   - worker-1
   nodeCount: 2
   readyNodeCount: 0
   waitingToInstall:
   - worker-0

Installing (Worker-1 installed, worker-0 installation started)

After some time, worker-1 completes its installation, causing a change in the status. The readyNodeCount is updated to 1, indicating that worker-1 is now prepared to execute kata-remote workloads. However, you cannot schedule or run kata-remote workloads until the runtime class is created at the end of the installation process.

 kataNodes:
   installed:
   - worker-1
   installing:
   - worker-0
   nodeCount: 2
   readyNodeCount: 1

Installed

When installed, both workers are listed as installed, and the InProgress condition transitions to False without specifying a reason, indicating the successful installation of kata-remote on the cluster.

 conditions:
    message: ""
    reason: ""
    status: 'False'
    type: InProgress
 kataNodes:
   installed:
   - worker-0
   - worker-1
   nodeCount: 2
   readyNodeCount: 2

Status    Description

Initial uninstall

If kata-remote is installed on both workers, and you delete the KataConfig CR to remove kata-remote from the cluster, both workers enter a waiting state for a few seconds.

 conditions:
    message: Removing kata-remote from cluster
    reason: Uninstalling
    status: 'True'
    type: InProgress
 kataNodes:
   nodeCount: 0
   readyNodeCount: 0
   waitingToUninstall:
   - worker-0
   - worker-1

Uninstalling

After a few seconds, one of the workers starts uninstalling.

 kataNodes:
   nodeCount: 0
   readyNodeCount: 0
   uninstalling:
   - worker-1
   waitingToUninstall:
   - worker-0

Uninstalling

Worker-1 finishes and worker-0 starts uninstalling.

 kataNodes:
   nodeCount: 0
   readyNodeCount: 0
   uninstalling:
   - worker-0

Note

The reason field can also report the following causes:

  • Failed: This is reported if the node cannot finish its transition. The status reports True and the message is Node <node_name> Degraded: <error_message_from_the_node>.
  • BlockedByExistingKataPods: This is reported if there are pods running on a cluster that use the kata-remote runtime while kata-remote is being uninstalled. The status field is False and the message is Existing pods using "kata-remote" RuntimeClass found. Please delete the pods manually for KataConfig deletion to proceed. There could also be a technical error message reported like Failed to list kata pods: <error_message> if communication with the cluster control plane fails.

Legal Notice

Copyright © Red Hat.
Except as otherwise noted below, the text of and illustrations in this documentation are licensed by Red Hat under the Creative Commons Attribution–Share Alike 3.0 Unported license. If you distribute this document or an adaptation of it, you must provide the URL for the original version.
Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law.
Red Hat, the Red Hat logo, JBoss, Hibernate, and RHCE are trademarks or registered trademarks of Red Hat, LLC. or its subsidiaries in the United States and other countries.
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.
XFS is a trademark or registered trademark of Hewlett Packard Enterprise Development LP or its subsidiaries in the United States and other countries.
The OpenStack® Word Mark and OpenStack logo are trademarks or registered trademarks of the Linux Foundation, used under license.
All other trademarks are the property of their respective owners.