Configure

Red Hat OpenShift Lightspeed 1.0

Configuring OpenShift Lightspeed

Red Hat OpenShift Documentation Team

Abstract

This documentation provides information about configuring OpenShift Lightspeed.

Chapter 1. Configuring and deploying OpenShift Lightspeed

After you install the OpenShift Lightspeed Operator, configure and deploy the OpenShift Lightspeed Service to enable AI-powered assistance in your OpenShift Container Platform cluster.

Note

The instructions assume that you are installing OpenShift Lightspeed using the kubeadmin user account. If you are using a regular user account with cluster-admin privileges, read the section of the documentation that discusses Role-Based Access Control (RBAC).

First, create a credential secret using the credentials for your large language model (LLM) provider. Next, create the OLSConfig custom resource (CR) that the Operator uses to deploy the Service. Finally, verify that the OpenShift Lightspeed Service is operating.

Important

Starting with OpenShift Container Platform 4.19, the perspectives in the web console are unified. The Developer perspective is no longer enabled by default.

All users can interact with all OpenShift Container Platform web console features. However, if you are not the cluster owner, you might need to request access to certain features from the cluster owner.

You can still enable the Developer perspective. On the Getting Started pane in the web console, you can take a tour of the console, find information on setting up your cluster, view a quick start for enabling the Developer perspective, and follow links to explore new features and capabilities.

1.1. Creating the credentials secret by using the web console

Use the OpenShift Container Platform web console to store the API token that OpenShift Lightspeed uses to authenticate with the large language model (LLM) provider.

As another option, Microsoft Azure also supports authentication by using Microsoft Entra ID.

Prerequisites

You have logged in to the OpenShift Container Platform web console as a user with the cluster-admin role. As another option, you have logged in to a user account that has permission to create a secret to store the provider tokens.
You have installed the OpenShift Lightspeed Operator.

Procedure

Click the Quick create menu in the upper-right corner of the OpenShift web console and select Import YAML.
Paste the YAML content for your LLM provider into the text area of the web console.
Note
The YAML parameter is always apitoken regardless of what the LLM provider calls the access details.
1. Use the following example to create the Secret to provide OpenShift Lightspeed with the OpenAI API key.
```
apiVersion: v1
kind: Secret
metadata:
  name: openai
  namespace: openshift-lightspeed
type: Opaque
stringData:
  apitoken: <api_token>
```
  - stringData.apitoken represents the API token, and is not base64 encoded.
2. Use the following example to create the Secret to provide OpenShift Lightspeed with the Red Hat Enterprise Linux AI key.
```
apiVersion: v1
stringData:
  apitoken: <api_token>
kind: Secret
metadata:
  name: rhelai-api-keys
  namespace: openshift-lightspeed
type: Opaque
```
  - stringData.apitoken specifies the API token as plain text. Do not base64 encode the value; the cluster encodes it when it stores the Secret.
3. Use the following example to create the Secret to provide OpenShift Lightspeed with the Red Hat OpenShift AI key.
```
apiVersion: v1
kind: Secret
metadata:
  name: rhoai-api-keys
  namespace: openshift-lightspeed
type: Opaque
stringData:
  apitoken: <api_token>
```
  - stringData.apitoken specifies the API token as plain text. Do not base64 encode the value; the cluster encodes it when it stores the Secret.
4. Use the following example to create the Secret to provide OpenShift Lightspeed with the IBM watsonx key.
```
apiVersion: v1
kind: Secret
metadata:
  name: watsonx-api-keys
  namespace: openshift-lightspeed
type: Opaque
stringData:
  apitoken: <api_token>
```
  - stringData.apitoken specifies the API token as plain text. Do not base64 encode the value; the cluster encodes it when it stores the Secret.
5. Use the following example to create the Secret to provide OpenShift Lightspeed with the Microsoft Azure OpenAI key.
```
apiVersion: v1
kind: Secret
metadata:
  name: azure-api-keys
  namespace: openshift-lightspeed
type: Opaque
stringData:
  apitoken: <api_token>
```
  - stringData.apitoken specifies the API token as plain text. Do not base64 encode the value; the cluster encodes it when it stores the Secret.
6. Optional: With Microsoft Azure OpenAI, you can instead use Microsoft Entra ID to authenticate with your LLM provider. Microsoft Entra ID users must configure the required roles for their Microsoft Azure OpenAI resource. For more information, see Cognitive Services OpenAI Contributor (Microsoft Azure OpenAI Service documentation). Use the following example to authenticate by using Microsoft Entra ID.
```
apiVersion: v1
data:
  client_id: <base64_encoded_client_id>
  client_secret: <base64_encoded_client_secret>
  tenant_id: <base64_encoded_tenant_id>
kind: Secret
metadata:
  name: azure-api-keys
  namespace: openshift-lightspeed
type: Opaque
```
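  - data.client_id, data.client_secret, and data.tenant_id specify the Microsoft Entra ID client ID, client secret, and tenant ID. Each value must be base64 encoded because the Secret uses the data field.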
Click Create.
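
The Microsoft Entra ID example above uses the data field with placeholders such as <base64_encoded_client_id>. The following is a minimal sketch of one way to produce such values in a shell. The b64 helper name and the sample input are illustrative assumptions and are not part of OpenShift Lightspeed.

```shell
# Hypothetical helper: base64 encode a string without a trailing newline
# and without line wrapping, so the result fits on one line of a Secret.
b64() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Sample value only; substitute your own client ID, client secret, or tenant ID.
b64 'example-client-id'
echo
```

Using printf instead of echo avoids encoding a trailing newline into the value, which is a common cause of authentication failures with hand-encoded secrets.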

1.2. Creating the OpenShift Lightspeed custom resource file by using the web console

Use the OpenShift Container Platform web console to create the custom resource (CR) file required to deploy OpenShift Lightspeed.

The specific content of the CR file is unique for each large language model (LLM) provider. Choose the configuration file for the LLM provider that you are using.

Prerequisites

You have logged in to the OpenShift Container Platform web console as a user with the cluster-admin role. As another option, you have logged in to a user account that has permission to create a cluster-scoped CR file.
You have an LLM provider available for use with the OpenShift Lightspeed Service.
You have installed the OpenShift Lightspeed Operator.

Procedure

Click the Quick create menu in the upper-right corner of the OpenShift web console and select Import YAML.

Paste the YAML content for your LLM provider into the text area of the web console.

Use the following example to create the OLSConfig CR to configure OpenShift Lightspeed with your OpenAI provider:

```
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  llm:
    providers:
      - name: myOpenai
        type: openai
        credentialsSecretRef:
          name: openai
        url: https://api.openai.com/v1
        models:
          - name: <model_name>
  ols:
    defaultModel: <model_name>
    defaultProvider: myOpenai
```

Use the following example to create the OLSConfig CR to configure OpenShift Lightspeed with your Red Hat Enterprise Linux AI provider:
```
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  llm:
    providers:
    - credentialsSecretRef:
        name: rhelai-api-keys
      models:
      - name: models/<model_name>
      name: rhelai
      type: rhelai_vllm
      url: <url>
  ols:
    defaultProvider: rhelai
    defaultModel: models/<model_name>
```
- spec.llm.providers.credentialsSecretRef.name specifies the name of the Secret that has the API key for the provider. By default, the Red Hat Enterprise Linux AI API key requires a token as part of the request. If your Red Hat Enterprise Linux AI endpoint does not require a token, you must still set the token value to any valid string for the request to authenticate.
- spec.llm.providers.url specifies the URL endpoint for the provider. The URL must end with v1 to be valid. For example, https://3.23.103.8:8000/v1.
Use the following example to create the OLSConfig CR to configure OpenShift Lightspeed with your Red Hat OpenShift AI provider:
```
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  llm:
    providers:
    - credentialsSecretRef:
        name: rhoai-api-keys
      models:
      - name: <model_name>
      name: red_hat_openshift_ai
      type: rhoai_vllm
      url: <url>
  ols:
    defaultProvider: red_hat_openshift_ai
    defaultModel: <model_name>
```
- spec.llm.providers.credentialsSecretRef.name specifies the name of the Secret that has the API key for the provider. If your provider configuration does not require a token, you must still provide a Secret containing a valid string for the request to authenticate.
- spec.llm.providers.url specifies the URL endpoint for the provider. The URL must end with v1 to be valid. For example, https://<model_name>.<domain_name>.com:443/v1.

Use the following example to create the OLSConfig CR to configure OpenShift Lightspeed with your Microsoft Azure OpenAI provider:

```
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  llm:
    providers:
      - credentialsSecretRef:
          name: azure-api-keys
        apiVersion: <api_version_for_azure_model>
        deploymentName: <azure_ai_deployment_name>
        models:
        - name: <model_name>
        name: myAzure
        type: azure_openai
        url: <azure_ai_deployment_url>
  ols:
    defaultModel: <model_name>
    defaultProvider: myAzure
```

Use the following example to create the OLSConfig CR to configure OpenShift Lightspeed with your IBM watsonx provider:

```
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  llm:
    providers:
      - name: myWatsonx
        type: watsonx
        credentialsSecretRef:
          name: watsonx-api-keys
        url: <ibm_watsonx_deployment_name>
        projectID: <ibm_watsonx_project_id>
        models:
          - name: ibm/<model_name>
  ols:
    defaultModel: ibm/<model_name>
    defaultProvider: myWatsonx
```

Click Create.
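
The Red Hat Enterprise Linux AI and Red Hat OpenShift AI examples above require a provider URL that ends with v1. The following is a minimal sketch of a local pre-check that you could run before pasting the URL into the CR. The ends_with_v1 helper is hypothetical, and it only checks the suffix; it does not contact the endpoint.

```shell
# Hypothetical pre-check: succeed only if the URL ends with /v1.
ends_with_v1() {
  case "$1" in
    */v1) return 0 ;;
    *)    return 1 ;;
  esac
}

# Sample URL taken from the example in this section.
url='https://3.23.103.8:8000/v1'
if ends_with_v1 "$url"; then
  echo "URL ends with v1"
else
  echo "URL does not end with v1" >&2
fi
```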

1.2.1. Configuring custom TLS certificates

Use the OpenShift Container Platform web console to configure custom TLS certificates for secure communication between OpenShift Lightspeed and your service endpoints.

Prerequisites

You have logged in to the OpenShift Container Platform web console as a user with the cluster-admin role. As another option, you have logged in to a user account that has permission to create or edit the OLSConfig custom resource (CR).
You have a large language model (LLM) provider.
You have installed the OpenShift Lightspeed Operator.
You have created the credentials secret and the OLSConfig CR.

Procedure

In the OpenShift Container Platform web console, click Operators → Installed Operators.
Select All Projects in the Project dropdown at the top of the screen.
Click OpenShift Lightspeed Operator.
Click OLSConfig, then click the cluster configuration instance in the list.
Click the YAML tab.

Update the OLSConfig CR to reference the Secret that has the TLS certificate and key.

```
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  ols:
    tlsConfig:
      keyCertSecretRef:
        name: <lightspeed_tls>
---
apiVersion: v1
data:
  tls.crt: LS0tLS1CRUd...
  tls.key: LS0tLS1CRUd...
kind: Secret
metadata:
  name: <lightspeed_tls>
  namespace: <openshift_lightspeed>
```

spec.ols.tlsConfig.keyCertSecretRef.name specifies the name of the Secret that has the tls.crt and tls.key files.
data.tls.crt and data.tls.key specify the base64-encoded certificate and key. The certificate must be named tls.crt and the key must be named tls.key.

Click Save.

Verification

Verify that a new pod exists in the lightspeed-app-server deployment by running the following command:
```
$ oc get pod -n openshift-lightspeed
```
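
The Secret in this procedure stores tls.crt and tls.key under the data field, so both values are base64 encoded. The following is a minimal sketch of encoding a file for that purpose. The b64_file helper and the throwaway file are illustrative assumptions; in practice you would pass your own certificate and key files.

```shell
# Hypothetical helper: base64 encode a file onto a single line,
# which is the form that the data field of a Secret expects.
b64_file() {
  base64 < "$1" | tr -d '\n'
}

# Demo with a throwaway file that holds only a PEM header line.
pem=$(mktemp)
printf '%s\n' '-----BEGIN CERTIFICATE-----' > "$pem"
b64_file "$pem"
echo
rm -f "$pem"
```

The encoded form of a PEM file begins with LS0tLS1CRUd, which matches the truncated values shown in the example Secret.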

1.3. Creating the credentials secret by using the CLI

Use the command line interface to store the API token that OpenShift Lightspeed uses to authenticate with the large language model (LLM) provider.

Alternatively, Microsoft Azure also supports authentication using Microsoft Entra ID.

Prerequisites

You have access to the OpenShift CLI (oc) as a user with the cluster-admin role. Alternatively, you are logged in to a user account that has permission to create a secret to store the provider tokens.
You have installed the OpenShift Lightspeed Operator.

Procedure

Create a YAML file that has the credential secret for the LLM provider that you are using.
Note
The YAML parameter is always apitoken regardless of what the LLM provider calls the access details.
1. Use the following example to create the Secret to provide OpenShift Lightspeed with the OpenAI API key.
```
apiVersion: v1
kind: Secret
metadata:
  name: openai
  namespace: openshift-lightspeed
type: Opaque
stringData:
  apitoken: <api_token>
```
  - stringData.apitoken specifies the API token as plain text. Do not base64 encode the value; the cluster encodes it when it stores the Secret.
2. Use the following example to create the Secret to provide OpenShift Lightspeed with the Red Hat OpenShift AI key.
```
apiVersion: v1
kind: Secret
metadata:
  name: rhoai-api-keys
  namespace: openshift-lightspeed
type: Opaque
stringData:
  apitoken: <api_token>
```
  - stringData.apitoken specifies the API token as plain text. Do not base64 encode the value; the cluster encodes it when it stores the Secret.
3. Use the following example to create the Secret to provide OpenShift Lightspeed with the IBM watsonx key.
```
apiVersion: v1
kind: Secret
metadata:
  name: watsonx-api-keys
  namespace: openshift-lightspeed
type: Opaque
stringData:
  apitoken: <api_token>
```
  - stringData.apitoken specifies the API token as plain text. Do not base64 encode the value; the cluster encodes it when it stores the Secret.
4. Use the following example to create the Secret to provide OpenShift Lightspeed with the Microsoft Azure OpenAI key.
```
apiVersion: v1
kind: Secret
metadata:
  name: azure-api-keys
  namespace: openshift-lightspeed
type: Opaque
stringData:
  apitoken: <api_token>
```
  - stringData.apitoken specifies the API token as plain text. Do not base64 encode the value; the cluster encodes it when it stores the Secret.
5. Optional: With Microsoft Azure OpenAI, you can instead use Microsoft Entra ID to authenticate with your LLM provider. Microsoft Entra ID users must configure the required roles for their Microsoft Azure OpenAI resource. For more information, see Cognitive Services OpenAI Contributor (Microsoft Azure OpenAI Service documentation). Use the following example to authenticate by using Microsoft Entra ID.
```
apiVersion: v1
data:
  client_id: <base64_encoded_client_id>
  client_secret: <base64_encoded_client_secret>
  tenant_id: <base64_encoded_tenant_id>
kind: Secret
metadata:
  name: azure-api-keys
  namespace: openshift-lightspeed
type: Opaque
```
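  - data.client_id, data.client_secret, and data.tenant_id specify the Microsoft Entra ID client ID, client secret, and tenant ID. Each value must be base64 encoded because the Secret uses the data field.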
Create the Secret by running the following command:
```
$ oc create -f /path/to/secret.yaml
```
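
The Secret examples in this procedure differ only in the Secret name and the token value. The following is a minimal sketch that renders the manifest from those two inputs, so that the token can come from an environment variable instead of being typed into a file. The render_secret helper and the API_TOKEN variable are hypothetical names introduced for illustration.

```shell
# Hypothetical helper: print a Secret manifest for a Secret name ($1) and token ($2).
# The stringData field takes the token as plain text.
render_secret() {
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: $1
  namespace: openshift-lightspeed
type: Opaque
stringData:
  apitoken: $2
EOF
}

# API_TOKEN is an assumed environment variable; a placeholder is printed if it is unset.
render_secret openai "${API_TOKEN:-<api_token>}"
```

You could redirect the output to a file and pass that file to the oc create -f command shown above. A token that contains characters with special meaning in YAML would need to be quoted, which this sketch does not handle.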

1.4. Creating the OpenShift Lightspeed custom resource file by using the CLI

Use the command line interface to create the custom resource (CR) file required to deploy OpenShift Lightspeed.

The specific content of the CR file is unique for each large language model (LLM) provider. Choose the configuration file for the LLM provider that you are using.

Prerequisites

You have access to the OpenShift CLI (oc) and have logged in as a user with the cluster-admin role. As another option, you have logged in to a user account that has permission to create a cluster-scoped CR file.
You have an LLM provider available for use with the OpenShift Lightspeed Service.
You have installed the OpenShift Lightspeed Operator.

Procedure

Create an OLSConfig file that has the YAML content for the LLM provider you use.

Use the following example to create the OLSConfig CR to configure OpenShift Lightspeed with your OpenAI provider:

```
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  llm:
    providers:
      - name: myOpenai
        type: openai
        credentialsSecretRef:
          name: openai
        url: https://api.openai.com/v1
        models:
          - name: <model_name>
  ols:
    defaultModel: <model_name>
    defaultProvider: myOpenai
```

Use the following example to create the OLSConfig CR to configure OpenShift Lightspeed with your Red Hat Enterprise Linux AI provider:
```
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  llm:
    providers:
    - credentialsSecretRef:
        name: rhelai-api-keys
      models:
      - name: models/<model_name>
      name: rhelai
      type: rhelai_vllm
      url: <url>
  ols:
    defaultProvider: rhelai
    defaultModel: models/<model_name>
```
- spec.llm.providers.credentialsSecretRef.name specifies the name of the Secret that has the API key for the provider. By default, the Red Hat Enterprise Linux AI API key requires a token as part of the request. If your Red Hat Enterprise Linux AI endpoint does not require a token, you must still set the token value to any valid string for the request to authenticate.
- spec.llm.providers.url specifies the URL endpoint for the provider. The URL must end with v1 to be valid. For example, https://3.23.103.8:8000/v1.
Use the following example to create the OLSConfig CR to configure OpenShift Lightspeed with your Red Hat OpenShift AI provider:
```
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  llm:
    providers:
    - credentialsSecretRef:
        name: rhoai-api-keys
      models:
      - name: <model_name>
      name: red_hat_openshift_ai
      type: rhoai_vllm
      url: <url>
  ols:
    defaultProvider: red_hat_openshift_ai
    defaultModel: <model_name>
```
- spec.llm.providers.credentialsSecretRef.name specifies the name of the Secret that has the API key for the provider. By default, the Red Hat OpenShift AI API key requires a token as part of the request. If your Red Hat OpenShift AI endpoint does not require a token, you must still set the token value to any valid string for the request to authenticate.
- spec.llm.providers.url specifies the URL endpoint for the provider. The URL must end with v1 to be valid. For example, https://<model_name>.<domain_name>.com:443/v1.

Use the following example to create the OLSConfig CR to configure OpenShift Lightspeed with your Microsoft Azure OpenAI provider:

```
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  llm:
    providers:
      - credentialsSecretRef:
          name: azure-api-keys
        apiVersion: <api_version_for_azure_model>
        deploymentName: <azure_ai_deployment_name>
        models:
        - name: <model_name>
        name: myAzure
        type: azure_openai
        url: <azure_ai_deployment_url>
  ols:
    defaultModel: <model_name>
    defaultProvider: myAzure
```

Use the following example to create the OLSConfig CR to configure OpenShift Lightspeed with your IBM watsonx provider:

```
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  llm:
    providers:
      - name: myWatsonx
        type: watsonx
        credentialsSecretRef:
          name: watsonx-api-keys
        url: <ibm_watsonx_deployment_name>
        projectID: <ibm_watsonx_project_id>
        models:
          - name: ibm/<model_name>
  ols:
    defaultModel: ibm/<model_name>
    defaultProvider: myWatsonx
```

Run the following command so that the Operator deploys OpenShift Lightspeed using the information in the YAML configuration file.
```
$ oc create -f /path/to/config-cr.yaml
```

1.4.1. Configure support for trusted-ca certificates and LLM providers

Configure custom TLS certificates to secure communication between OpenShift Lightspeed and LLM provider service endpoints.

The OpenShift Lightspeed Service supports custom TLS certificates for secure communication with LLM providers. You can use trusted-ca certificates with these providers:

Red Hat Enterprise Linux AI vLLM
Red Hat OpenShift AI vLLM
OpenAI
Microsoft Azure OpenAI

Note

IBM watsonx does not support custom certificates.

Procedure

Create a ConfigMap object that contains the certificates.
Add the name of the object in the OLSConfig custom resource (CR) file:
```
ols:
  additionalCAConfigMapRef:
    name: <config_map_name>
```
where <config_map_name> is the name of your ConfigMap object.

1.4.2. Configuring OpenShift Lightspeed with a trusted CA certificate for the LLM

Use the OpenShift Container Platform web console to configure a trusted CA certificate for secure communication between OpenShift Lightspeed and your large language model (LLM) provider.

Note

Perform this procedure only if the LLM provider you are using requires a trusted-ca certificate to authenticate the OpenShift Lightspeed Service. Otherwise, skip this procedure.

Procedure

Copy the contents of the certificate file and paste it into a file called caCertFileName.
Create a ConfigMap object called trusted-certs by running the following command:
```
$ oc create configmap trusted-certs --from-file=caCertFileName --namespace openshift-lightspeed
```
The command creates a ConfigMap object similar to the following example:
```
kind: ConfigMap
apiVersion: v1
metadata:
  name: trusted-certs
  namespace: openshift-lightspeed
data:
  caCertFileName: |
    -----BEGIN CERTIFICATE-----
    .
    .
    .
    -----END CERTIFICATE-----
```
- data.caCertFileName specifies the CA certificates required to connect to your LLM provider. You can include one or more certificates within this block to ensure secure communication.
Update the OLSConfig custom resource (CR) file to include the name of the ConfigMap object you just created. The following example uses Red Hat Enterprise Linux AI as the LLM provider.
```
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  ols:
    defaultProvider: rhelai
    defaultModel: models/<model_name>
    additionalCAConfigMapRef:
      name: trusted-certs
```
- spec.ols.additionalCAConfigMapRef.name specifies the name of the ConfigMap object.
Apply the updated CR by running the following command:
```
$ oc apply -f <olsconfig_cr_filename>
```
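
Because the ConfigMap entry can hold one or more certificates, it can help to confirm how many certificates a bundle file contains before you create the ConfigMap object. The following is a minimal sketch that counts PEM header lines. The count_certs helper and the throwaway bundle are illustrative assumptions, and the check does not validate the certificates themselves.

```shell
# Hypothetical helper: count the PEM certificates in a CA bundle file.
count_certs() {
  grep -c -e '-----BEGIN CERTIFICATE-----' "$1"
}

# Demo with a throwaway file that holds two placeholder certificate blocks.
bundle=$(mktemp)
printf '%s\n' \
  '-----BEGIN CERTIFICATE-----' 'placeholder' '-----END CERTIFICATE-----' \
  '-----BEGIN CERTIFICATE-----' 'placeholder' '-----END CERTIFICATE-----' > "$bundle"
count_certs "$bundle"
rm -f "$bundle"
```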

1.5. Verifying the OpenShift Lightspeed deployment

Use the OpenShift Container Platform web console to verify that the OpenShift Lightspeed Service is running and deployed.

Important

Starting with OpenShift Container Platform 4.19, the perspectives in the web console are unified. The Developer perspective is no longer enabled by default.

Prerequisites

You have logged in to the OpenShift Container Platform web console as a user with the cluster-admin role.
You have access to the OpenShift CLI (oc).
You have installed the OpenShift Lightspeed Operator.
You have created the credentials secret and the OLSConfig Custom Resource configuration file.

Procedure

In the OpenShift Container Platform web console, click the Project drop-down list.
Note
For OpenShift Container Platform 4.18 and earlier, select the Developer perspective from the drop-down list at the top of the pane to access the Project drop-down list.
Enable the toggle switch to show default projects.
Select openshift-lightspeed from the list.

Verify that the OpenShift Lightspeed Service is ready by running the following command:
```
$ oc logs deployment/lightspeed-app-server -c lightspeed-service-api -n openshift-lightspeed | grep Uvicorn
```
Example output
```
INFO:     Uvicorn running on https://0.0.0.0:8443 (Press CTRL+C to quit)
```

1.6. About OpenShift Lightspeed and role-based access control (RBAC)

Use role-based access control (RBAC) to manage system security by assigning permissions to specific roles rather than individual users.

OpenShift Lightspeed RBAC is binary. By default, not all cluster users have access to the OpenShift Lightspeed interface. Only users with administrative rights can grant access. All users of an OpenShift cluster with OpenShift Lightspeed installed can see the OpenShift Lightspeed button; however, only users with permissions can submit questions to OpenShift Lightspeed.

If you want to evaluate the RBAC features of OpenShift Lightspeed, your cluster will need users other than the kubeadmin account. The kubeadmin account always has access to OpenShift Lightspeed.

1.7. Expose the OpenShift Lightspeed Service by using a route

The OpenShift Lightspeed Service is an internal-only ClusterIP type by default. To enable external access to the OpenShift Lightspeed REST API, you must create an OpenShift route.

Prerequisites

You have installed the OpenShift Lightspeed Operator and have access to the cluster through the OpenShift CLI (oc).
You have cluster-admin permissions.

Procedure

Create a YAML file named route.yaml with the following content:

```
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: lightspeed-app-server
  namespace: openshift-lightspeed
  labels:
    app: ols
spec:
  port:
    targetPort: 8443-tcp
  tls:
    insecureEdgeTerminationPolicy: Redirect
    termination: reencrypt
  to:
    kind: Service
    name: lightspeed-app-server
    weight: 100
  wildcardPolicy: None
```

Note

Use the reencrypt termination policy to ensure that TLS is maintained end-to-end between the router and the pod.

Apply the configuration to your cluster:
```
$ oc apply -f route.yaml
```

Retrieve the exposed hostname to use as your API base URL:

```
$ OLS_HOST=$(oc get route lightspeed-app-server -n openshift-lightspeed -o jsonpath='{.spec.host}')
$ echo "https://${OLS_HOST}"
```

Verification

To verify the connection, you must ensure your client trusts the cluster’s internal certificate authority (CA):

For production: Retrieve the cluster CA bundle from the kube-root-ca.crt ConfigMap and add it to your trust store.
For testing: You can use the -k or --insecure flag with curl to bypass CA verification.

1.7.1. Granting a user access by using the CLI

Grant OpenShift Lightspeed permissions to an individual user by running a single oc adm command to apply the query-access role immediately.

Prerequisites

You have logged in to the OpenShift Container Platform web console as a user with the cluster-admin role. As another option, you have logged in as a user with the ability to grant permissions.
You have deployed the OpenShift Lightspeed service.
You have access to the OpenShift CLI (oc).

Procedure

Grant the lightspeed-operator-query-access role to a user. Enter the actual user name in place of <user_name> when running the following command:
```
$ oc adm policy add-cluster-role-to-user \
    lightspeed-operator-query-access <user_name>
```

Verification

Verify that the user has been successfully added to the cluster role binding by running the following command:
```
$ oc get clusterrolebinding lightspeed-operator-query-access
```

1.7.2. Granting a user access by using a YAML configuration file

Grant OpenShift Lightspeed permissions to an individual user by creating and applying a ClusterRoleBinding YAML file for reproducible access management.

Prerequisites

You have logged in to the OpenShift Container Platform web console as a user with the cluster-admin role. As another option, you have logged in as a user with the ability to grant permissions.
You have deployed the OpenShift Lightspeed service.
You have access to the OpenShift CLI (oc).

Procedure

Generate a YAML configuration file for the cluster role binding. Enter the actual user name in place of <user_name> when running the following command:
```
$ oc adm policy add-cluster-role-to-user lightspeed-operator-query-access <user_name> -o yaml --dry-run > ols-user-access.yaml
```
Apply the generated configuration file to the cluster:
```
$ oc apply -f ols-user-access.yaml
```

Verification

Verify the creation of the ClusterRoleBinding by running the following command:
```
$ oc get clusterrolebinding lightspeed-operator-query-access
```

Inspect the YAML configuration and ensure that it lists the correct user in the subjects section by running the following command:

```
$ oc get clusterrolebinding lightspeed-operator-query-access -o yaml
```

This command returns output similar to the following example:

```
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  creationTimestamp: null
  name: lightspeed-operator-query-access
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: lightspeed-operator-query-access
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: <user_name>
```

1.7.3. Granting a user group access by using the CLI

Enable a user group to use the OpenShift Lightspeed Service by running a single command to apply cluster permissions immediately.

If your cluster has more advanced identity management configured, including user groups, you can grant all users of a specific group access to the OpenShift Lightspeed Service.

Prerequisites

You have logged in to the OpenShift Container Platform web console as a user with the cluster-admin role. As another option, you have logged in as a user with the ability to grant permissions.
You have deployed the OpenShift Lightspeed Service.
You have access to the OpenShift CLI (oc).

Procedure

Grant the lightspeed-operator-query-access role to your user group by running the following command. Replace <group_name> with the actual name of the user group in your cluster.
```
$ oc adm policy add-cluster-role-to-group \
    lightspeed-operator-query-access <group_name>
```
Optional: Verify that the role binding contains the user group by running the following command:
```
$ oc get clusterrolebinding lightspeed-operator-query-access -o wide
```

1.7.4. Granting a user group access by using a YAML configuration file

Grant multiple users access to the OpenShift Lightspeed Service by applying a YAML configuration file.

If your cluster has more advanced identity management configured, including user groups, you can grant all users of a specific group access to the OpenShift Lightspeed Service.

Prerequisites

You have logged in to the OpenShift Container Platform web console as a user with the cluster-admin role. As another option, you have logged in as a user with the ability to grant permissions.
You have deployed the OpenShift Lightspeed Service.
You have access to the OpenShift CLI (oc).

Procedure

Generate the YAML configuration by running the following command:

```
$ oc adm policy add-cluster-role-to-group lightspeed-operator-query-access <group_name> -o yaml --dry-run > access-policy.yaml
```

Open the access-policy.yaml file and verify the subjects section contains the correct group name:

```
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  creationTimestamp: null
  name: lightspeed-operator-query-access
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: lightspeed-operator-query-access
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: <group_name>
```

subjects.name specifies the name of the user group you are granting access to.

Apply the configuration file to the cluster by running the following command:
```
$ oc create -f access-policy.yaml
```

1.7.5. Obtain OpenShift Lightspeed authentication token

Obtain a Kubernetes bearer token by using the oc command-line interface to authenticate your requests to the OpenShift Lightspeed API.

Procedure

Perform any of the following tasks to retrieve your token:

Using the OpenShift CLI (oc):
Run the following command to retrieve the token for your current session:
```
$ TOKEN=$(oc whoami -t)
```
Using the Kubernetes CLI (kubectl):
Run the following command to create a token for a specific service account:
```
$ TOKEN=$(kubectl create token <service_account_name> -n <namespace>)
```
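
A bearer token is normally sent in the HTTP Authorization header of each request. The following is a minimal sketch of formatting that header from the TOKEN variable. The bearer_header helper and the dummy fallback value are assumptions for illustration, and the sketch does not name any OpenShift Lightspeed API path.

```shell
# Hypothetical helper: format the HTTP Authorization header for a bearer token.
bearer_header() {
  printf 'Authorization: Bearer %s' "$1"
}

# TOKEN normally comes from one of the commands above; a dummy value is used if it is unset.
TOKEN="${TOKEN:-example-token}"
bearer_header "$TOKEN"
echo
```

A client such as curl can send the header with its -H option, for example -H "$(bearer_header "$TOKEN")".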

1.8. Filtering and redacting information

Configure sensitive data filtering in OpenShift Lightspeed to redact private information before sending it to the large language model (LLM) provider.

Note

Test your regular expressions against sample data to confirm that they match the information you want to filter or redact and that they do not match information you want to send to the LLM. Several third-party websites can test regular expressions, but be cautious about sharing private data with them. As another option, you can test the regular expressions locally by using Python. Python allows regular expressions that are very computationally expensive, and using several complex expressions as query filters can degrade the performance of OpenShift Lightspeed.

This example shows how to update the OLSConfig custom resource (CR) file to redact IP addresses, but you can also filter or redact other types of sensitive information.

Note

If you configure filtering or redacting in the OLSConfig CR file, and you configure introspectionEnabled to enable a Model Context Protocol (MCP) server, any content that the tools return is not filtered and is visible to the LLM.

Prerequisites

You have logged in to the OpenShift Container Platform web console as a user with the cluster-admin role.
You have access to the OpenShift CLI (oc).
You have installed the OpenShift Lightspeed Operator and deployed the OpenShift Lightspeed service.

Procedure

Update the OLSConfig CR file and create an entry for each regular expression to filter. The following example redacts IP addresses:
```
spec:
  ols:
    queryFilters:
      - name: ip-address
        pattern: '((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}'
        replaceWith: <IP_ADDRESS>
```
Run the following command to apply the modified OpenShift Lightspeed custom configuration:
```
$ oc apply -f OLSConfig.yaml
```
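To test the expression locally before you apply it, you can use a short Python script such as the following sketch. It copies the pattern and the replacement string from the example above. The `redact` helper and the sample strings are illustrative, and the sketch assumes that Python's `re` module evaluates the pattern in the same way as the Service.

```python
import re

# Pattern and replacement copied from the queryFilters example.
IP_PATTERN = re.compile(r"((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}")
REPLACEMENT = "<IP_ADDRESS>"


def redact(text: str) -> str:
    """Replace every match of the pattern with the placeholder."""
    return IP_PATTERN.sub(REPLACEMENT, text)


# Illustrative sample data: one line to redact, one line to leave alone.
samples = [
    "Node 10.0.0.1 is down",
    "no addresses here",
]
for sample in samples:
    print(redact(sample))
```

Add your own samples to the list, including text that must not be redacted, and confirm that each line prints as you expect.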

1.9. About the BYO Knowledge tool

Enhance OpenShift Lightspeed responses by using the BYO Knowledge tool to create a retrieval-augmented generation (RAG) database that includes documentation specific to your organization.

When you create a RAG database, you customize the OpenShift Lightspeed service for your environment. For example, a network administrator can use a standard operating procedure (SOP) to provision an OpenShift Container Platform cluster. Then, the network administrator can use the BYO Knowledge tool to enhance the knowledge available to the LLM by including information from the SOP.

To bring your own knowledge to an LLM, you complete the following steps:

Create the custom content in Markdown format.
Use the BYO Knowledge tool to package the content as a container image.
Push the container image to an image registry, such as quay.io.
Update the OLSConfig custom resource file to list the image that you pushed to the image registry.
Access the OpenShift Lightspeed virtual assistant and submit a question associated with the custom knowledge that you made available to the LLM.
Note
When you use the BYO Knowledge tool, you provide documents directly to the LLM provider.

OpenShift Lightspeed supports automatic updates of BYO Knowledge images that use floating tags, such as latest. If over time a BYO Knowledge image tag points to different underlying images, OpenShift Lightspeed detects those changes and updates the corresponding BYO Knowledge database accordingly. This feature uses OpenShift ImageStream objects. OpenShift Container Platform clusters check for updates to ImageStream objects every 15 minutes.

1.9.1. About document title and URL

Display the source titles and URLs OpenShift Lightspeed uses to verify the accuracy of generated responses and access the original documentation for additional context.

In the retrieval-augmented generation (RAG) database, titles and URLs accompany documents as metadata. The BYO Knowledge tool obtains the title and URL attributes from metadata if they reside in the Markdown files that the tool processes.

---
title: "Introduction to Layers {#gimp-concepts-layers}"
url: "https://docs.gimp.org/3.0/en/gimp-using-layers.html"
---

# Introduction to Layers
...

If a Markdown file does not have metadata with the title and url attributes, the first top-level Markdown heading, for example # Introduction to Layers, becomes the title and the file path becomes the URL.
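The fallback rules can be sketched in Python as follows. This is an illustration of the behavior described above, not the implementation that the BYO Knowledge tool uses. The `extract_title_and_url` function is hypothetical, and it applies the fallback to each attribute independently, which is an assumption.

```python
from typing import Optional, Tuple


def extract_title_and_url(markdown_text: str, file_path: str) -> Tuple[Optional[str], str]:
    """Return (title, url) from front matter, else from the fallbacks."""
    lines = markdown_text.splitlines()
    metadata = {}
    body_start = 0

    # Front matter is the block between two '---' lines at the top of the file.
    if lines and lines[0].strip() == "---":
        for index, line in enumerate(lines[1:], start=1):
            if line.strip() == "---":
                body_start = index + 1
                break
            key, separator, value = line.partition(":")
            if separator:
                metadata[key.strip()] = value.strip().strip('"')

    title = metadata.get("title")
    if title is None:
        # Fallback: the first top-level heading, a line that starts with '# '.
        for line in lines[body_start:]:
            if line.startswith("# "):
                title = line[2:].strip()
                break

    # Fallback: the file path becomes the URL.
    url = metadata.get("url", file_path)
    return title, url
```

For a file that contains only `# Introduction to Layers` and body text, the function returns the heading text as the title and the file path as the URL.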

1.9.2. Providing custom knowledge to the LLM

Customize the information available to the large language model (LLM) by providing access to a container image that resides in a remote image registry.

The container image that the tool generates contains a custom RAG database. The RAG database provides additional information to the LLM.

The examples in this procedure use quay.io as the remote container image registry, and the path for the custom image is quay.io/<username>/my-byok-image:latest.

Important

The BYO Knowledge tool is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

Prerequisites

You have logged in to the OpenShift Container Platform web console as a user account that has permission to create a cluster-scoped custom resource (CR) file, such as a user with the cluster-admin role.
You have an LLM provider available for use with the OpenShift Lightspeed Service.
You have installed the OpenShift Lightspeed Operator.
Your custom information consists of Markdown files with .md extensions. The tool does not support other file formats.
You have logged in to registry.redhat.io by using Podman.
You have an account for a container image registry, such as quay.io.

Procedure

Specify the location of the directory with the Markdown files for the retrieval-augmented generation (RAG) database and the path for the image that the BYO Knowledge tool generates by running the following command:

$ podman run -it --rm --device=/dev/fuse \
  -v $XDG_RUNTIME_DIR/containers/auth.json:/run/user/0/containers/auth.json:Z \
  -v <dir_tree_with_markdown_files>:/markdown:Z \
  -v <dir_for_image_tar>:/output:Z \
  registry.redhat.io/openshift-lightspeed-tech-preview/lightspeed-rag-tool-rhel9:latest

Load the container image that the BYO Knowledge tool generated by running the following command:
```
$ podman load < <dir_for_image_tar>/<my-byok-image.tar>
```

Display the Podman images that are on your local computer by running the following command:

$ podman images

This command returns an output similar to the following example:

REPOSITORY                            TAG                IMAGE ID      CREATED       SIZE
localhost/my-byok-image               latest             be7d1770bf10  1 minute ago  2.37 GB
...

Tag the local image with a name and destination so that you can push the image to the container image registry by running the following command:
```
$ podman tag localhost/my-byok-image:latest quay.io/<username>/my-byok-image:latest
```
Push the local container image to the container image registry by running the following command:
```
$ podman push quay.io/<username>/my-byok-image:latest
```
Update the OLSConfig CR to deploy the newly created RAG database alongside the existing one:
1. In the OpenShift Container Platform web console, click Operators → Installed Operators.
2. Select All Projects in the Project dropdown at the top of the screen.
3. Click OpenShift Lightspeed Operator.
4. Click OLSConfig, then click the cluster configuration instance in the list.
5. Click the YAML tab.
6. Insert the spec.ols.rag YAML code:
```
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  ols:
    rag:
      - image: quay.io/<username>/my-byok-image:latest
```
  - spec.ols.rag.image specifies the tag for the image that you pushed to the image registry so that the OpenShift Lightspeed Operator can access the custom content. The OpenShift Lightspeed Operator can work with more than one RAG database that you create.
Optional: Specify pull secrets in the OLSSpec section of the OLSConfig CR file. These secrets provide authentication for remote registries. Use this optional field if your RAG BYO Knowledge images reside in a private registry that the standard cluster-wide pull secret cannot access.
```
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
  namespace: openshift-lightspeed
spec:
  llm:
    providers:
...
  ols:
    imagePullSecrets:
      - name: <my_pull_secret_1>
      - name: <my_pull_secret_2>
```
- spec.ols.imagePullSecrets defines the pull secrets that OpenShift Lightspeed uses only when you specify BYO Knowledge RAG images. Instead of linking a specific secret to a specific image, the system maintains a general list of pull secrets. For every BYO Knowledge image, the system tries each pull secret in sequential order until it achieves a successful authentication.
Click Save.

Verification

Access the OpenShift Lightspeed virtual assistant and submit a question associated with the custom content that you provided to the LLM.
The OpenShift Lightspeed virtual assistant generates a response based on the custom content.
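The pull secret behavior described in this procedure, where the system tries each secret in list order until one authenticates, can be pictured with the following Python sketch. The `first_working_secret` function and the `can_pull` callback are hypothetical and stand in for the real registry authentication. They are not part of OpenShift Lightspeed.

```python
from typing import Callable, Iterable, Optional


def first_working_secret(
    secrets: Iterable[str],
    can_pull: Callable[[str], bool],
) -> Optional[str]:
    """Try each secret in list order and return the first one that succeeds."""
    for secret in secrets:
        if can_pull(secret):
            return secret
    # No secret in the list could authenticate.
    return None


# Illustrative run: only the second secret is accepted by the stub check.
chosen = first_working_secret(
    ["my_pull_secret_1", "my_pull_secret_2"],
    lambda name: name == "my_pull_secret_2",
)
print(chosen)
```

Because the secrets are tried in order, listing the secret that matches most of your images first might reduce the number of failed attempts.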

1.9.3. Disabling the OpenShift Container Platform documentation retrieval-augmented generation (RAG) database

Disable the default OpenShift Container Platform documentation in the OLSConfig custom resource (CR) to prevent the service from using the built-in database that has the OpenShift Container Platform documentation.

Then, the only retrieval-augmented generation (RAG) databases OpenShift Lightspeed uses are the ones that you provide to the service by using the BYO Knowledge feature.

Prerequisites

You have logged in to the OpenShift Container Platform web console as a user account with permission to create a cluster-scoped CR file, such as a user with the cluster-admin role.
You have installed the OpenShift Lightspeed Operator.
You have configured the large language model provider.
You have configured the OLSConfig CR file, which automatically deploys the OpenShift Lightspeed Service.
You have created a RAG database that contains the content you want to use, as described in "Providing custom knowledge to the LLM".

Procedure

In the OpenShift Container Platform web console, click Operators → Installed Operators.
Select All Projects in the Project list at the top of the screen.
Click OpenShift Lightspeed Operator.
Click OLSConfig, then click the cluster configuration instance in the list.
Click the YAML tab.
Insert the spec.ols.byokRAGOnly YAML code.
```
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  ols:
    byokRAGOnly: true
```
- spec.ols.byokRAGOnly specifies if the Service limits responses by using only the information found in the local documentation that you provide. Specify true so that OpenShift Lightspeed only uses RAG databases that you create by using the BYO Knowledge feature. When true, OpenShift Lightspeed does not use the default RAG database that contains the OpenShift Container Platform documentation.
Click Save.

Additional resources

Providing custom knowledge to the LLM

1.10. About cluster interaction

Enable the cluster interaction feature to provide the large language model (LLM) with additional context about your OpenShift Container Platform cluster.

OpenShift Lightspeed uses an LLM to answer your questions. When you share details about your cluster objects, the LLM provides specific answers for your environment.

The Model Context Protocol (MCP) is an open protocol that creates a standard way for applications to provide context to an LLM. Using this protocol, an MCP server enables an LLM to increase context by requesting and receiving real-time information from external resources.

The introspectionEnabled field in the OLSConfig custom resource (CR) is true by default. You do not need to specify this field to use the Observability MCP server. To disable this MCP server, you must set this field to false.

Note

If you configure filtering or redacting in the OLSConfig CR file and enable an MCP server, any content that the tools return is not filtered and remains visible to the LLM.

When you enable cluster interaction, the OpenShift Lightspeed Operator installs the Observability MCP server. This MCP server provides the OpenShift Lightspeed Service with access to the OpenShift API. Through this access, the Service performs read operations to gather cluster context for the LLM. This context enables the Service to generate answers to questions about the Kubernetes objects that reside in your OpenShift cluster.

Note

The ability of OpenShift Lightspeed to choose and use a tool effectively depends on the LLM model. In general, a larger model with more parameters performs better. The best performance comes from an extremely large frontier model that represents the latest AI capabilities. When using a small model, you might notice poor performance in tool selection or other aspects of cluster interaction.

You must enable tool calling in the LLM provider to activate the cluster interaction feature in the OpenShift Lightspeed Service.

Note

Enabling tool calling can dramatically increase token usage. When you use a public model provider, an increase in token usage usually increases billing costs.

1.10.1. Disabling cluster interaction

Disable the cluster interaction feature by modifying the OLSConfig custom resource (CR) if you do not want OpenShift Lightspeed to access cluster-specific context.

Important

The cluster interaction feature is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

Prerequisites

You have logged in to the OpenShift Container Platform web console as a user with the cluster-admin role. As another option, you have logged in to a user account that has permission to create a cluster-scoped custom resource.
You have configured the large language model (LLM) provider.
You have installed the OpenShift Lightspeed Operator.

Procedure

In the OpenShift Container Platform web console, click Operators → Installed Operators.
Click OpenShift Lightspeed Operator.
Click OLSConfig, then click the cluster configuration instance in the list.
Click the YAML tab.

Set the spec.ols.introspectionEnabled parameter to false to disable cluster interaction:

apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  ols:
    introspectionEnabled: false

Click Save.

Verification

Access the OpenShift Lightspeed virtual assistant and submit a question about your cluster objects. The OpenShift Lightspeed virtual assistant responds generally without using cluster-specific resource details.

1.10.2. Enabling a custom MCP server

Add an additional Model Context Protocol (MCP) server that interfaces with a tool in your environment so that the large language model (LLM) uses the tool to generate answers to your questions.

Important

The MCP server feature is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

Prerequisites

You have installed the OpenShift Lightspeed Operator.
You have configured a large language model provider.
You have configured the OLSConfig CR file, which automatically deploys the OpenShift Lightspeed Service.

Procedure

Edit the OpenShift Lightspeed OLSConfig custom resource (CR) file by running the following command:
```
$ oc edit olsconfig cluster
```
Add MCPServer to the spec.featureGates list and include the MCP server information in the spec.mcpServers specification.
```
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  featureGates:
    - MCPServer
  mcpServers:
  - name: mcp-server-1
    url: https://mcp.example.com
    timeout: 30
    headers:
      - name: Authorization
        valueFrom:
          type: kubernetes
  - name: mcp-server-2
    url: https://mcp.example.com
    timeout: 30
    headers:
      - name: X-Special
        valueFrom:
          type: secret
          secretRef:
            name: <secret_name>
```
- spec.featureGates specifies the feature gates to enable. The MCPServer value enables the MCP server functionality on the OpenShift Lightspeed pod.
- spec.mcpServers.name specifies the name of the MCP server.
- spec.mcpServers.url specifies the URL path that the MCP server uses to communicate.
- spec.mcpServers.timeout specifies the time that the MCP server has to respond to a query. If the Service does not receive a response within the time specified, the connection times out. In this example, the timeout is 30 seconds.
- spec.mcpServers.headers specifies MCP headers as an array of structured objects, which are required for MCP server authentication.
- spec.mcpServers.headers.name specifies the name of the header that gets sent to the MCP server.
- spec.mcpServers.headers.valueFrom.type specifies the authentication source type. Valid values are secret, kubernetes (to provide the user’s bearer token), or client.
- spec.mcpServers.headers.valueFrom.secretRef.name specifies the name of the secret that contains the header value. Ensure that the secret contains a key named header.
  Note
  To enforce human-in-the-loop (HITL) approvals for sensitive operations that your custom MCP server handles, ensure that these operations do not have read-only annotations.
Save and close the text editor to apply the changes.
The save operation applies the updates and makes the custom MCP server available to the OpenShift Lightspeed service.
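Before you save the CR, you might want to check an mcpServers entry for common mistakes. The following Python sketch is one possible check. It is not the validation that the Operator performs. The `check_mcp_server` function is hypothetical, and the rules that name and url must be present and that timeout must be a positive integer are assumptions inferred from the example. The list of valid source types comes from the field descriptions above.

```python
from typing import Any, Dict, List

# Source types listed in the valueFrom.type field description.
VALID_SOURCE_TYPES = {"secret", "kubernetes", "client"}


def check_mcp_server(server: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one mcpServers entry."""
    problems = []
    # Assumption: every entry needs a name and a URL.
    for field in ("name", "url"):
        if not server.get(field):
            problems.append(f"missing {field}")
    # Assumption: a timeout, when present, is a positive number of seconds.
    timeout = server.get("timeout")
    if timeout is not None and (not isinstance(timeout, int) or timeout <= 0):
        problems.append("timeout must be a positive integer")
    for header in server.get("headers", []):
        source = header.get("valueFrom", {})
        source_type = source.get("type")
        if source_type not in VALID_SOURCE_TYPES:
            problems.append(f"header {header.get('name')}: invalid type {source_type!r}")
        elif source_type == "secret" and not source.get("secretRef", {}).get("name"):
            problems.append(f"header {header.get('name')}: secretRef.name is required")
    return problems
```

An empty list means that the sketch found no problems. It does not guarantee that the Operator accepts the entry.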

1.11. Tokens and token quota limits

Token quotas manage the amount of text that the OpenShift Lightspeed Service exchanges with a large language model (LLM). These limits control costs and ensure all users get fair access to resources.

The Service measures the text it exchanges with the LLM in tokens. A token is a small unit of text, ranging from a single character to a full word. Every chat between the Service and the LLM uses tokens.

Token quotas limit how many tokens you can use within a specific time frame. You can define token quota limits for OpenShift clusters or OpenShift user accounts.

1.11.1. Activating token quota limits

To manage resource consumption, activate token quota limits for OpenShift Lightspeed by adding the quotaHandlersConfig section to the OLSConfig custom resource (CR). When you update the CR, the OpenShift Lightspeed Operator automatically generates the required internal configuration for the service.

Prerequisites

You have installed the OpenShift Lightspeed Operator.
You have configured a large language model (LLM) provider.
The OpenShift Lightspeed service has access to a PostgreSQL database.

Procedure

Edit the OpenShift Lightspeed OLSConfig CR file by running the following command:
```
$ oc edit olsconfig cluster
```
Update the spec.ols.quotaHandlersConfig specification to include the token quota limit configuration.
```
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  ols:
    quotaHandlersConfig:
      enableTokenHistory: true
      limitersConfig:
      - name: user_monthly_limits
        type: user_limiter
        initialQuota: 100000
        quotaIncrease: 1000
        period: 30 days
      - name: cluster_monthly_limits
        type: cluster_limiter
        initialQuota: 1000000
        quotaIncrease: 100000
        period: 30 days
```
- General configuration:
  - spec.ols.quotaHandlersConfig.enableTokenHistory enables the tracking of token usage history in the PostgreSQL database.
- User and cluster quota configuration:
  - spec.ols.quotaHandlersConfig.limitersConfig[].name specifies a unique name for the quota limiter.
  - spec.ols.quotaHandlersConfig.limitersConfig[].initialQuota sets the initial number of tokens available (for example, 100000 for users or 1000000 for clusters).
  - spec.ols.quotaHandlersConfig.limitersConfig[].quotaIncrease specifies the number of tokens added to the quota at the end of each period.
  - spec.ols.quotaHandlersConfig.limitersConfig[].period defines the duration before the quota resets or the quota limit increases.
Save the changes and exit the editor.

Verification

Confirm that the configuration is active by checking the OpenShift Lightspeed Operator logs by running the following command:
```
$ oc logs -l app.kubernetes.io/name=lightspeed-operator -n openshift-lightspeed
```
Alternatively, verify the CR was saved correctly by running the following command:
```
$ oc get olsconfig cluster -o jsonpath='{.spec.ols.quotaHandlersConfig}'
```
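To build intuition for how initialQuota and quotaIncrease interact, consider the following toy model in Python. It is a sketch under stated assumptions, not the Service implementation. The `TokenQuota` class is hypothetical, and it models only the case in which the quota increases at the end of a period, not the case in which it resets.

```python
from dataclasses import dataclass, field


@dataclass
class TokenQuota:
    """Toy model of one limiter entry."""

    initial_quota: int
    quota_increase: int
    available: int = field(init=False)

    def __post_init__(self) -> None:
        # The limiter starts with the initial quota.
        self.available = self.initial_quota

    def consume(self, tokens: int) -> bool:
        """Spend tokens if enough remain; return False when they do not."""
        if tokens > self.available:
            return False
        self.available -= tokens
        return True

    def end_of_period(self) -> None:
        """Add quota_increase tokens when a period ends."""
        self.available += self.quota_increase


# Values taken from the user_monthly_limits entry in the example.
user_quota = TokenQuota(initial_quota=100000, quota_increase=1000)
```

In this model, a user who spends 99500 tokens in the first period has 500 tokens left, and has 1500 tokens available after the period ends.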

1.12. About OpenShift Lightspeed and PostgreSQL persistence

PostgreSQL persistence ensures that OpenShift Lightspeed conversation history and quota usage data remain available across pod restarts and rescheduling events.

By default, the Service disables PostgreSQL persistence.

To enable the functionality, add the spec.ols.storage specification to the OLSConfig custom resource (CR).

1.12.1. Enabling PostgreSQL persistence

Enable PostgreSQL persistence for OpenShift Lightspeed by modifying the OLSConfig custom resource (CR) file.

Important

PostgreSQL persistence is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

Prerequisites

You have logged in to the OpenShift Container Platform web console as a user account with permission to create a cluster-scoped CR file, such as a user with the cluster-admin role.
You have installed the OpenShift Lightspeed Operator.
You have configured the large language model provider.

Procedure

In the OpenShift Container Platform web console, click Operators → Installed Operators.
Select All Projects in the Project list at the top of the screen.
Click OpenShift Lightspeed Operator.
Click OLSConfig, then click the cluster configuration instance in the list.
Click the YAML tab.
Insert the spec.ols.storage YAML code.
```
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
  namespace: openshift-lightspeed
spec:
  llm:
    providers:
...
  ols:
    storage: {}
```
- spec.ols.storage specifies how the assistant stores persistent data, specifically conversation history. The storage classes that you can use depend on the storage classes that exist in the cluster. Specify empty braces for the storage parameter to use the default values. With the default values, the persistent volume allocated for the PostgreSQL database is 1 GiB in size and uses the default storage class of the cluster.
Click Save.

1.12.2. Overriding default Persistent Volume Claim (PVC) specifications

Customize the storage capacity and storage class for the OpenShift Lightspeed database by modifying the OLSConfig custom resource (CR).

Prerequisites

You have logged in to the OpenShift Container Platform web console as a user account with permission to create a cluster-scoped CR, such as a user with the cluster-admin role.
You have installed the OpenShift Lightspeed Operator.
You have configured the large language model provider.
You have access to the OpenShift CLI (oc).

Procedure

In the OpenShift Container Platform web console, click Operators → Installed Operators.
Select All Projects in the Project dropdown at the top of the screen.
Click OpenShift Lightspeed Operator.
Click OLSConfig, then click the cluster configuration instance in the list.
Click the YAML tab.
Update the OLSConfig CR to override the default PVC specifications as shown in the following example.
```
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
  namespace: openshift-lightspeed
spec:
  ols:
    storage:
      size: 768Mi
      class: gp2-csi
```
- spec.ols.storage.size specifies the total storage capacity for the database. If you do not specify this parameter, the Operator uses the default size of 1 GiB.
- spec.ols.storage.class specifies the Storage Class for the database volume. If you do not specify this parameter, the Operator uses the default storage class setting of the cluster.
Click Save.

Verification

Verify that the cluster has successfully provisioned the storage by checking the status of the Persistent Volume Claim.
```
$ oc get pvc -n openshift-lightspeed
```
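The size value uses Kubernetes quantity notation with binary suffixes. As a rough aid, the following Python sketch converts such a value to bytes so that you can compare it with the 1 GiB default. The `quantity_to_bytes` helper is hypothetical and handles only the Ki, Mi, Gi, and Ti suffixes.

```python
# Binary suffixes used by Kubernetes resource quantities.
BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}


def quantity_to_bytes(quantity: str) -> int:
    """Convert a quantity such as '768Mi' to a number of bytes."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    # No recognized suffix: treat the value as a plain number of bytes.
    return int(quantity)


requested = quantity_to_bytes("768Mi")
default = quantity_to_bytes("1Gi")
print(requested, default, requested < default)
```

The example value of 768Mi is smaller than the default, so confirm that it is large enough for your expected conversation history.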

1.13. About query-based tool filtering

Query-based filtering uses a hybrid retrieval-augmented generation (RAG) system to identify the most appropriate set of tools for a user request.

When a large language model (LLM) application has access to hundreds of tools, sending the full list in one prompt slows performance and raises costs. Query-based filtering finds and retrieves the most relevant set of tools for a request in milliseconds. This pre-processing step removes selection interference, and ensures that the LLM focuses its reasoning capabilities on a small, high-quality subset of functions.

Restricting the set of tools available reduces token use, prevents model confusion, and maintains high execution accuracy. This approach transforms a massive tool library into a fast, lean interface.

1.13.1. Enabling query-based tool filtering

Enable query-based tool filtering to automatically select relevant tools and specify the maximum number of tool call iterations for OpenShift Lightspeed.

Prerequisites

You are logged in to the OpenShift Container Platform web console as a user with the cluster-admin role. Alternatively, you are logged in to a user account that has permission to create a cluster-scoped CR file.
You have an LLM provider available for use with the OpenShift Lightspeed Service.
You have installed the OpenShift Lightspeed Operator.

Procedure

In the OpenShift Container Platform web console, click Operators → Installed Operators.
Select All Projects in the Project dropdown at the top of the screen.
Click OpenShift Lightspeed Operator.
Click OLSConfig, then click the cluster configuration instance in the list.
Click the YAML tab.
Modify the OLSConfig custom resource (CR) file to define a feature gate and the tools filtering configuration.
```
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  featureGates:
    - ToolFiltering
  olsConfig:
    maxIterations: 5
    toolFilteringConfig:
      alpha: 0.8
      topK: 10
      threshold: 0.01
```
- spec.featureGates specifies the feature gates to enable. The ToolFiltering value enables query-based tool filtering.
- spec.olsConfig.maxIterations defines the maximum number of rounds that OpenShift Lightspeed executes when invoking the LLM with tools.
- spec.olsConfig.toolFilteringConfig.alpha specifies the weight balance between semantic (RAG-based) and keyword matching. Increasing the value provides more weight to the semantic search. The valid range of values is 0 to 1.
- spec.olsConfig.toolFilteringConfig.topK specifies the maximum number of tools available for the LLM.
- spec.olsConfig.toolFilteringConfig.threshold specifies the minimum score that a tool must have to be considered as a candidate. Tools with a score lower than the threshold value are discarded. Increasing the value discards more tools. The valid range of values is 0.01 to 0.1.
  Note
  This example uses the default values for the maxIterations, alpha, topK, and threshold fields. If you use the default values in your configuration, you do not have to specify them in your file.
Click Save.

Verification

Navigate to the OpenShift Container Platform web console.
Select Workloads → Pods and then click the pod that contains OpenShift Lightspeed.
Click Logs and confirm that the log displays RAG information.
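To see how the alpha, topK, and threshold fields might interact, consider the following Python sketch. It assumes that the hybrid score is a linear blend of a semantic score and a keyword score, which is a common approach but is not confirmed as the method that OpenShift Lightspeed uses. The `select_tools` function and the tool names are hypothetical.

```python
from typing import Dict, List, Tuple


def select_tools(
    scores: Dict[str, Tuple[float, float]],
    alpha: float = 0.8,
    top_k: int = 10,
    threshold: float = 0.01,
) -> List[str]:
    """Rank tools by a blended score and keep at most top_k candidates.

    scores maps a tool name to (semantic_score, keyword_score).
    """
    # Assumption: a higher alpha gives more weight to the semantic score.
    combined = {
        name: alpha * semantic + (1 - alpha) * keyword
        for name, (semantic, keyword) in scores.items()
    }
    # Discard tools with a score lower than the threshold.
    kept = [name for name, score in combined.items() if score >= threshold]
    kept.sort(key=lambda name: combined[name], reverse=True)
    return kept[:top_k]


selected = select_tools(
    {
        "get_pods": (0.9, 0.5),
        "delete_namespace": (0.0, 0.02),
        "get_events": (0.5, 1.0),
    }
)
print(selected)
```

In this run, the second tool falls below the threshold and is discarded, and the remaining tools are ordered from the highest score to the lowest.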

1.13.2. About operation approvals

To manage sensitive cluster changes, you can use the operation approvals feature. This feature requires human validation before OpenShift Lightspeed executes a selected tool.

The Observability MCP server supports both read and write operations. The Service automatically enables Human-In-The-Loop (HITL) support for non-read operations. Any non-read operation triggers an approval request in the user interface (UI).

The toolsApprovalConfig object in the custom resource (CR) defines these validation policies. This object supports configuration options to set the tool approval level.

An administrator must review and clear the pending request before the service can execute the underlying cluster changes. This mechanism prevents unintended or unauthorized alterations to your environment.

1.13.3. Tools approval configuration

The toolsApprovalConfig object controls whether tool calls require user approval before execution.

Field	Type	Default	Description
`approvalType`	string (enum)	`tool_annotations`	Approval strategy for tool execution. Supported values: `never`, `always`, `tool_annotations`.
`approvalTimeout`	int	`600`	Timeout in seconds for waiting for user approval. The minimum value is `1`.

The following table describes the approvalType values:

Value	Description
`never`	All tools execute without requiring user approval.
`always`	All tool calls require user approval before execution.
`tool_annotations`	Per-tool annotations determine approval. Each tool can individually declare whether it needs approval.
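The three approvalType values reduce to a small decision, sketched here in Python. The `requires_approval` function is hypothetical, and the `tool_needs_approval` flag stands in for whatever the per-tool annotation declares. How the Service reads that annotation is not shown here.

```python
def requires_approval(approval_type: str, tool_needs_approval: bool) -> bool:
    """Decide whether a tool call waits for user approval."""
    if approval_type == "never":
        # All tools execute without requiring user approval.
        return False
    if approval_type == "always":
        # All tool calls require user approval before execution.
        return True
    if approval_type == "tool_annotations":
        # Each tool individually declares whether it needs approval.
        return tool_needs_approval
    raise ValueError(f"unsupported approvalType: {approval_type!r}")
```

For example, with the default `tool_annotations` value, a tool that declares that it needs approval waits for the user, and a tool that does not declare this runs immediately.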

Chapter 2. OLSConfig API reference

Use the OLSConfig API reference to identify the parameters and schema requirements for the OpenShift Lightspeed configuration object. This reference defines the technical specifications you must follow to configure your service deployments.

Note

The API parameter information originated in the OLSConfig API reference and is provided here for convenience.

2.1. OLSConfig API specifications

Description

Red Hat OpenShift Lightspeed instance. OLSConfig is the Schema for the olsconfigs API.

Type

object

Required

spec

Property	Type	Description
`apiVersion`	`string`	APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and might reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
`kind`	`string`	Kind is a string value representing the REST resource this object represents. Servers might infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
`metadata`	`object`	Standard object’s metadata. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata
`spec`	`object`	OLSConfigSpec defines the desired state of OLSConfig

2.1.1. .metadata

Description: Standard object’s metadata. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata
Type: object

2.1.2. .spec

Description

OLSConfigSpec defines the desired state of OLSConfig

Type

object

Required

llm
ols

Property	Type	Description
`featureGates`	`array (string)`	FeatureGates holds the list of features to enable explicitly. Features that are not listed are disabled by default. Possible values: MCPServer
`llm`	`object`	LLMSpec defines the desired state of the large language model (LLM).
`mcpServers`	`array`	MCP Server settings
`ols`	`object`	OLSSpec defines the desired state of OLS deployment.
`olsDataCollector`	`object`	OLSDataCollectorSpec defines allowed OLS data collector configuration.
`toolsApprovalConfig`	`object`	ToolsApprovalConfig defines the configuration for tool execution approvals.
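
As a rough illustration of how the required `llm` and `ols` properties fit together, the following sketch shows a minimal OLSConfig CR. The `apiVersion` value, the resource name `cluster`, the provider name, provider type, URL, secret name, and model name are all assumptions chosen for illustration and are not defined by this reference; substitute the values for your environment.

```yaml
apiVersion: ols.openshift.io/v1alpha1   # assumed API group and version
kind: OLSConfig
metadata:
  name: cluster                         # assumed resource name
spec:
  llm:                                  # required
    providers:
      - name: my-provider               # hypothetical provider name
        type: openai                    # assumed provider type value
        url: https://api.openai.com/v1  # assumed provider endpoint
        credentialsSecretRef:
          name: credentials             # hypothetical secret name
        models:
          - name: my-model              # hypothetical model name
  ols:                                  # required
    defaultProvider: my-provider        # matches a providers[].name entry
    defaultModel: my-model              # matches a models[].name entry
```

In this sketch, `defaultProvider` and `defaultModel` point at the single provider and model declared under `spec.llm.providers`.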

2.1.3. .spec.llm

Description

LLMSpec defines the desired state of the large language model (LLM).

Type

object

Required

providers

Property	Type	Description
`providers`	`array`

2.1.4. .spec.llm.providers

Description
Type: array

2.1.5. .spec.llm.providers[]

Description

ProviderSpec defines the desired state of LLM provider.

Type

object

Required

credentialsSecretRef
models
name
type

Property	Type	Description
`apiVersion`	`string`	API Version for Azure OpenAI provider
`credentialsSecretRef`	`object`	The name of the secret object that stores API provider credentials
`deploymentName`	`string`	Azure OpenAI deployment name
`models`	`array`	List of models from the provider
`name`	`string`	Provider name
`projectID`	`string`	Watsonx Project ID
`tlsSecurityProfile`	`object`	TLS Security Profile used by connection to provider
`type`	`string`	Provider type
`url`	`string`	Provider API URL

2.1.6. .spec.llm.providers[].credentialsSecretRef

Description: The name of the secret object that stores API provider credentials
Type: object

Property	Type	Description
`name`	`string`	Name of the referent. This field is effectively required, but due to backwards compatibility is allowed to be empty. Instances of this type with an empty value here are almost certainly wrong. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names

2.1.7. .spec.llm.providers[].models

Description: List of models from the provider
Type: array

2.1.8. .spec.llm.providers[].models[]

Description

ModelSpec defines the LLM model to use and its parameters.

Type

object

Required

name

Property	Type	Description
`contextWindowSize`	`integer`	Defines the model’s context window size, in tokens. The default is 128k tokens.
`name`	`string`	Model name
`parameters`	`object`	Model API parameters
`url`	`string`	Model API URL

2.1.9. .spec.llm.providers[].models[].parameters

Description: Model API parameters
Type: object

Property	Type	Description
`maxTokensForResponse`	`integer`	Max tokens for response. The default is 2048 tokens.
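
The model properties above might be combined as in the following fragment, which overrides both documented defaults for one model. The provider name, model name, and the numeric values are hypothetical; the other required provider properties are omitted for brevity.

```yaml
spec:
  llm:
    providers:
      - name: my-provider               # hypothetical provider name
        models:
          - name: my-model              # hypothetical model name
            contextWindowSize: 16384    # example value, in tokens
            parameters:
              maxTokensForResponse: 1024  # example value; the documented default is 2048
```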

2.1.10. .spec.llm.providers[].tlsSecurityProfile

Description: TLS Security Profile used by connection to provider
Type: object

Property	Type	Description
`custom`	``	custom is a user-defined TLS security profile. Be extremely careful using a custom profile as invalid configurations can be catastrophic. An example custom profile looks like this: ciphers: - ECDHE-ECDSA-CHACHA20-POLY1305 - ECDHE-RSA-CHACHA20-POLY1305 - ECDHE-RSA-AES128-GCM-SHA256 - ECDHE-ECDSA-AES128-GCM-SHA256 minTLSVersion: VersionTLS11
`intermediate`	``	intermediate is a TLS security profile based on: https://wiki.mozilla.org/Security/Server_Side_TLS#Intermediate_compatibility_.28recommended.29 and looks like this (yaml): ciphers: - TLS_AES_128_GCM_SHA256 - TLS_AES_256_GCM_SHA384 - TLS_CHACHA20_POLY1305_SHA256 - ECDHE-ECDSA-AES128-GCM-SHA256 - ECDHE-RSA-AES128-GCM-SHA256 - ECDHE-ECDSA-AES256-GCM-SHA384 - ECDHE-RSA-AES256-GCM-SHA384 - ECDHE-ECDSA-CHACHA20-POLY1305 - ECDHE-RSA-CHACHA20-POLY1305 - DHE-RSA-AES128-GCM-SHA256 - DHE-RSA-AES256-GCM-SHA384 minTLSVersion: VersionTLS12
`modern`	``	modern is a TLS security profile based on: https://wiki.mozilla.org/Security/Server_Side_TLS#Modern_compatibility and looks like this (yaml): ciphers: - TLS_AES_128_GCM_SHA256 - TLS_AES_256_GCM_SHA384 - TLS_CHACHA20_POLY1305_SHA256 minTLSVersion: VersionTLS13
`old`	``	old is a TLS security profile based on: https://wiki.mozilla.org/Security/Server_Side_TLS#Old_backward_compatibility and looks like this (yaml): ciphers: - TLS_AES_128_GCM_SHA256 - TLS_AES_256_GCM_SHA384 - TLS_CHACHA20_POLY1305_SHA256 - ECDHE-ECDSA-AES128-GCM-SHA256 - ECDHE-RSA-AES128-GCM-SHA256 - ECDHE-ECDSA-AES256-GCM-SHA384 - ECDHE-RSA-AES256-GCM-SHA384 - ECDHE-ECDSA-CHACHA20-POLY1305 - ECDHE-RSA-CHACHA20-POLY1305 - DHE-RSA-AES128-GCM-SHA256 - DHE-RSA-AES256-GCM-SHA384 - DHE-RSA-CHACHA20-POLY1305 - ECDHE-ECDSA-AES128-SHA256 - ECDHE-RSA-AES128-SHA256 - ECDHE-ECDSA-AES128-SHA - ECDHE-RSA-AES128-SHA - ECDHE-ECDSA-AES256-SHA384 - ECDHE-RSA-AES256-SHA384 - ECDHE-ECDSA-AES256-SHA - ECDHE-RSA-AES256-SHA - DHE-RSA-AES128-SHA256 - DHE-RSA-AES256-SHA256 - AES128-GCM-SHA256 - AES256-GCM-SHA384 - AES128-SHA256 - AES256-SHA256 - AES128-SHA - AES256-SHA - DES-CBC3-SHA minTLSVersion: VersionTLS10
`type`	`string`	type is one of Old, Intermediate, Modern or Custom. Custom provides the ability to specify individual TLS security profile parameters. Old, Intermediate and Modern are TLS security profiles based on: https://wiki.mozilla.org/Security/Server_Side_TLS#Recommended_configurations The profiles are intent based, so they might change over time as new ciphers are developed and existing ciphers are found to be insecure. Depending on precisely which ciphers are available to a process, the list might be reduced. Note that the Modern profile is currently not supported because it is not yet well adopted by common software libraries.
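
Because the profile examples in the table are flattened onto one line, the following fragment restates the custom example from the table as indented YAML under a provider entry. The provider name is hypothetical and the other required provider properties are omitted; the cipher list and `minTLSVersion` value are copied from the table and are not a recommendation.

```yaml
spec:
  llm:
    providers:
      - name: my-provider               # hypothetical provider name
        tlsSecurityProfile:
          type: Custom                  # one of Old, Intermediate, Modern, or Custom
          custom:
            ciphers:
              - ECDHE-ECDSA-CHACHA20-POLY1305
              - ECDHE-RSA-CHACHA20-POLY1305
              - ECDHE-RSA-AES128-GCM-SHA256
              - ECDHE-ECDSA-AES128-GCM-SHA256
            minTLSVersion: VersionTLS11  # value taken from the table example
```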

2.1.11. .spec.mcpServers

Description: MCP Server settings
Note
The introspectionEnabled field in the OLSConfig custom resource (CR) is true by default. You do not need to specify this field to use the built-in Kubernetes MCP server. To disable the built-in Kubernetes MCP server, you must set introspectionEnabled to false.
Type: array
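
As a sketch of the note above, the following fragment disables the built-in Kubernetes MCP server by setting `introspectionEnabled`, which is a property of `.spec.ols`, to `false`. The required `.spec.llm` and remaining `.spec.ols` properties are omitted for brevity.

```yaml
spec:
  ols:
    introspectionEnabled: false   # true by default; false disables the built-in MCP server
```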

2.1.12. .spec.mcpServers[]

Description

MCPServer defines the settings for a single MCP server.

Type

object

Required

name

Property	Type	Description
`args`	`array (string)`	Custom arguments passed to the MCP server during initialization.
`name`	`string`	Name of the MCP server
`streamableHTTP`	`object`	Streamable HTTP Transport settings

2.1.13. .spec.mcpServers[].args

Description: Custom arguments passed directly to the MCP server process initialization command.
Type: array (string)

2.1.14. .spec.ols

Description

OLSSpec defines the desired state of OLS deployment.

Type

object

Required

defaultModel
defaultProvider

Property	Type	Description
`additionalCAConfigMapRef`	`object`	Additional CA certificates for TLS communication between OLS service and LLM Provider
`byokRAGOnly`	`boolean`	Only use BYOK RAG sources, ignore the Red Hat OpenShift Lightspeed documentation RAG
`conversationCache`	`object`	Conversation cache settings
`defaultModel`	`string`	Default model for usage
`defaultProvider`	`string`	Default provider for usage
`deployment`	`object`	OLS deployment settings
`introspectionEnabled`	`boolean`	Enable introspection features
`logLevel`	`string`	Log level. Valid options are DEBUG, INFO, WARNING, ERROR and CRITICAL. Default: "INFO".
`proxyConfig`	`object`	Proxy settings for connecting to external servers, such as LLM providers.
`queryFilters`	`array`	Query filters
`quotaHandlersConfig`	`object`	LLM Token Quota Configuration
`rag`	`array`	RAG databases
`storage`	`object`	Persistent Storage Configuration
`tlsConfig`	`object`	TLS configuration of the Lightspeed backend’s HTTPS endpoint
`tlsSecurityProfile`	`object`	TLS Security Profile used by API endpoints
`userDataCollection`	`object`	User data collection switches

2.1.15. .spec.ols.additionalCAConfigMapRef

Description: Additional CA certificates for TLS communication between OLS service and LLM Provider
Type: object

Property	Type	Description
`name`	`string`	Name of the referent. This field is effectively required, but due to backwards compatibility is allowed to be empty. Instances of this type with an empty value here are almost certainly wrong. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names

2.1.16. .spec.ols.conversationCache

Description: Conversation cache settings
Type: object

Property	Type	Description
`postgres`	`object`	PostgresSpec defines the desired state of Postgres.
`type`	`string`	Conversation cache type. Default: "postgres"

2.1.17. .spec.ols.conversationCache.postgres

Description: PostgresSpec defines the desired state of Postgres.
Type: object

Property	Type	Description
`credentialsSecret`	`string`	Secret that holds postgres credentials
`dbName`	`string`	Postgres database name
`maxConnections`	`integer`	Postgres maxconnections. Default: "2000"
`sharedBuffers`	`integer-or-string`	Postgres sharedbuffers
`user`	`string`	Postgres user name
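
The conversation cache properties might be set as in the following fragment. The secret name, database name, user name, and the `sharedBuffers` value are hypothetical examples; `type` and `maxConnections` repeat the documented defaults.

```yaml
spec:
  ols:
    conversationCache:
      type: postgres                              # documented default
      postgres:
        credentialsSecret: my-postgres-credentials  # hypothetical secret name
        dbName: my-database                       # hypothetical database name
        user: my-user                             # hypothetical user name
        maxConnections: 2000                      # documented default
        sharedBuffers: 256MB                      # hypothetical value (integer-or-string)
```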

2.1.18. .spec.ols.deployment

Description: OLS deployment settings
Type: object

Property	Type	Description
`api`	`object`	API container settings.
`console`	`object`	Console container settings.
`dataCollector`	`object`	Data Collector container settings.
`database`	`object`	Database container settings.
`mcpServer`	`object`	MCP server container settings.
`replicas`	`integer`	Defines the number of desired OLS pods. Default: "1"

2.1.19. .spec.ols.deployment.api

Description: API container settings.
Type: object

Property	Type	Description
`nodeSelector`	`object (string)`
`resources`	`object`	ResourceRequirements describes the compute resource requirements.
`tolerations`	`array`

2.1.20. .spec.ols.deployment.api.resources

Description: ResourceRequirements describes the compute resource requirements.
Type: object

Property	Type	Description
`claims`	`array`	Claims lists the names of resources, defined in spec.resourceClaims, that are used by this container. This is an alpha field and requires enabling the DynamicResourceAllocation feature gate. This field is immutable. It can only be set for containers.
`limits`	`integer-or-string`	Limits describes the maximum amount of compute resources allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
`requests`	`integer-or-string`	Requests describes the minimum amount of compute resources required. If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, otherwise to an implementation-defined value. Requests cannot exceed Limits. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/

2.1.21. .spec.ols.deployment.api.resources.claims

Description

Claims lists the names of resources, defined in spec.resourceClaims, that are used by this container.

This is an alpha field and requires enabling the DynamicResourceAllocation feature gate.

This field is immutable. It can only be set for containers.

Type

array

2.1.22. .spec.ols.deployment.api.resources.claims[]

Description

ResourceClaim references one entry in PodSpec.ResourceClaims.

Type

object

Required

name

Property	Type	Description
`name`	`string`	Name must match the name of one entry in pod.spec.resourceClaims of the Pod where this field is used. It makes that resource available inside a container.
`request`	`string`	Request is the name chosen for a request in the referenced claim. If empty, everything from the claim is made available, otherwise only the result of this request.

2.1.23. .spec.ols.deployment.api.tolerations

Description
Type: array

2.1.24. .spec.ols.deployment.api.tolerations[]

Description: The pod this Toleration is attached to tolerates any taint that matches the triple <key,value,effect> using the matching operator <operator>.
Type: object

Property	Type	Description
`effect`	`string`	Effect indicates the taint effect to match. Empty means match all taint effects. When specified, allowed values are NoSchedule, PreferNoSchedule and NoExecute.
`key`	`string`	Key is the taint key that the toleration applies to. Empty means match all taint keys. If the key is empty, operator must be Exists; this combination means to match all values and all keys.
`operator`	`string`	Operator represents a key’s relationship to the value. Valid operators are Exists and Equal. Defaults to Equal. Exists is equivalent to wildcard for value, so that a pod can tolerate all taints of a particular category.
`tolerationSeconds`	`integer`	TolerationSeconds represents the period of time the toleration (which must be of effect NoExecute, otherwise this field is ignored) tolerates the taint. By default, it is not set, which means tolerate the taint forever (do not evict). Zero and negative values will be treated as 0 (evict immediately) by the system.
`value`	`string`	Value is the taint value the toleration matches to. If the operator is Exists, the value should be empty, otherwise just a regular string.
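
The API container settings described in the preceding sections could be combined as in the following fragment. It assumes that `limits` and `requests` accept the usual Kubernetes resource-name keys (`cpu`, `memory`); the node label, taint key, taint value, and quantities are hypothetical.

```yaml
spec:
  ols:
    deployment:
      replicas: 1                       # documented default
      api:
        nodeSelector:
          node-role.kubernetes.io/infra: ""   # hypothetical node label
        resources:
          requests:
            cpu: 500m                   # hypothetical quantity
            memory: 1Gi                 # hypothetical quantity
          limits:
            memory: 4Gi                 # hypothetical quantity
        tolerations:
          - key: dedicated              # hypothetical taint key
            operator: Equal             # Exists or Equal; defaults to Equal
            value: lightspeed           # hypothetical taint value
            effect: NoSchedule          # NoSchedule, PreferNoSchedule, or NoExecute
```

With `operator: Equal`, the toleration matches only taints whose key, value, and effect all match; use `Exists` and leave `value` empty to tolerate every taint with the given key.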

2.1.25. .spec.ols.deployment.console

Description: Console container settings.
Type: object

Property	Type	Description
`caCertificate`	`string`	Certificate Authority (CA) certificate used by the console proxy endpoint.
`nodeSelector`	`object (string)`
`replicas`	`integer`	Defines the number of desired Console pods. Default: "1"
`resources`	`object`	ResourceRequirements describes the compute resource requirements.
`tolerations`	`array`

2.1.26. .spec.ols.deployment.console.resources

Description: ResourceRequirements describes the compute resource requirements.
Type: object

Property	Type	Description
`claims`	`array`	Claims lists the names of resources, defined in spec.resourceClaims, that are used by this container. This is an alpha field and requires enabling the DynamicResourceAllocation feature gate. This field is immutable. It can only be set for containers.
`limits`	`integer-or-string`	Limits describes the maximum amount of compute resources allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
`requests`	`integer-or-string`	Requests describes the minimum amount of compute resources required. If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, otherwise to an implementation-defined value. Requests cannot exceed Limits. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/

2.1.27. .spec.ols.deployment.console.resources.claims

Description

Claims lists the names of resources, defined in spec.resourceClaims, that are used by this container.

This is an alpha field and requires enabling the DynamicResourceAllocation feature gate.

This field is immutable. It can only be set for containers.

Type

array

2.1.28. .spec.ols.deployment.console.resources.claims[]

Description

ResourceClaim references one entry in PodSpec.ResourceClaims.

Type

object

Required

name

Property	Type	Description
`name`	`string`	Name must match the name of one entry in pod.spec.resourceClaims of the Pod where this field is used. It makes that resource available inside a container.
`request`	`string`	Request is the name chosen for a request in the referenced claim. If empty, everything from the claim is made available, otherwise only the result of this request.

2.1.29. .spec.ols.deployment.console.tolerations

Description
Type: array

2.1.30. .spec.ols.deployment.console.tolerations[]

Description: The pod this Toleration is attached to tolerates any taint that matches the triple <key,value,effect> using the matching operator <operator>.
Type: object

Property	Type	Description
`effect`	`string`	Effect indicates the taint effect to match. Empty means match all taint effects. When specified, allowed values are NoSchedule, PreferNoSchedule and NoExecute.
`key`	`string`	Key is the taint key that the toleration applies to. Empty means match all taint keys. If the key is empty, operator must be Exists; this combination means to match all values and all keys.
`operator`	`string`	Operator represents a key’s relationship to the value. Valid operators are Exists and Equal. Defaults to Equal. Exists is equivalent to wildcard for value, so that a pod can tolerate all taints of a particular category.
`tolerationSeconds`	`integer`	TolerationSeconds represents the period of time the toleration (which must be of effect NoExecute, otherwise this field is ignored) tolerates the taint. By default, it is not set, which means tolerate the taint forever (do not evict). Zero and negative values will be treated as 0 (evict immediately) by the system.
`value`	`string`	Value is the taint value the toleration matches to. If the operator is Exists, the value should be empty, otherwise just a regular string.

2.1.31. .spec.ols.deployment.dataCollector

Description: Data Collector container settings.
Type: object

Property	Type	Description
`resources`	`object`	ResourceRequirements describes the compute resource requirements.

2.1.32. .spec.ols.deployment.dataCollector.resources

Description: ResourceRequirements describes the compute resource requirements.
Type: object

Property	Type	Description
`claims`	`array`	Claims lists the names of resources, defined in spec.resourceClaims, that are used by this container. This is an alpha field and requires enabling the DynamicResourceAllocation feature gate. This field is immutable. It can only be set for containers.
`limits`	`integer-or-string`	Limits describes the maximum amount of compute resources allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
`requests`	`integer-or-string`	Requests describes the minimum amount of compute resources required. If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, otherwise to an implementation-defined value. Requests cannot exceed Limits. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/

2.1.33. .spec.ols.deployment.dataCollector.resources.claims

Description

Claims lists the names of resources, defined in spec.resourceClaims, that are used by this container.

This is an alpha field and requires enabling the DynamicResourceAllocation feature gate.

This field is immutable. It can only be set for containers.

Type

array

2.1.34. .spec.ols.deployment.dataCollector.resources.claims[]

Description

ResourceClaim references one entry in PodSpec.ResourceClaims.

Type

object

Required

name

Property	Type	Description
`name`	`string`	Name must match the name of one entry in pod.spec.resourceClaims of the Pod where this field is used. It makes that resource available inside a container.
`request`	`string`	Request is the name chosen for a request in the referenced claim. If empty, everything from the claim is made available, otherwise only the result of this request.

2.1.35. .spec.ols.deployment.database

Description: Database container settings.
Type: object

Property	Type	Description
`nodeSelector`	`object (string)`
`resources`	`object`	ResourceRequirements describes the compute resource requirements.
`tolerations`	`array`

2.1.36. .spec.ols.deployment.database.resources

Description: ResourceRequirements describes the compute resource requirements.
Type: object

Property	Type	Description
`claims`	`array`	Claims lists the names of resources, defined in spec.resourceClaims, that are used by this container. This is an alpha field and requires enabling the DynamicResourceAllocation feature gate. This field is immutable. It can only be set for containers.
`limits`	`integer-or-string`	Limits describes the maximum amount of compute resources allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
`requests`	`integer-or-string`	Requests describes the minimum amount of compute resources required. If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, otherwise to an implementation-defined value. Requests cannot exceed Limits. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/

2.1.37. .spec.ols.deployment.database.resources.claims

Description

Claims lists the names of resources, defined in spec.resourceClaims, that are used by this container.

This is an alpha field and requires enabling the DynamicResourceAllocation feature gate.

This field is immutable. It can only be set for containers.

Type

array

2.1.38. .spec.ols.deployment.database.resources.claims[]

Description

ResourceClaim references one entry in PodSpec.ResourceClaims.

Type

object

Required

name

Property	Type	Description
`name`	`string`	Name must match the name of one entry in pod.spec.resourceClaims of the Pod where this field is used. It makes that resource available inside a container.
`request`	`string`	Request is the name chosen for a request in the referenced claim. If empty, everything from the claim is made available, otherwise only the result of this request.

2.1.39. .spec.ols.deployment.database.tolerations

Description
Type: array

2.1.40. .spec.ols.deployment.database.tolerations[]

Description: The pod this Toleration is attached to tolerates any taint that matches the triple <key,value,effect> using the matching operator <operator>.
Type: object

Property	Type	Description
`effect`	`string`	Effect indicates the taint effect to match. Empty means match all taint effects. When specified, allowed values are NoSchedule, PreferNoSchedule and NoExecute.
`key`	`string`	Key is the taint key that the toleration applies to. Empty means match all taint keys. If the key is empty, operator must be Exists; this combination means to match all values and all keys.
`operator`	`string`	Operator represents a key’s relationship to the value. Valid operators are Exists and Equal. Defaults to Equal. Exists is equivalent to wildcard for value, so that a pod can tolerate all taints of a particular category.
`tolerationSeconds`	`integer`	TolerationSeconds represents the period of time the toleration (which must be of effect NoExecute, otherwise this field is ignored) tolerates the taint. By default, it is not set, which means tolerate the taint forever (do not evict). Zero and negative values will be treated as 0 (evict immediately) by the system.
`value`	`string`	Value is the taint value the toleration matches to. If the operator is Exists, the value should be empty, otherwise just a regular string.

2.1.41. .spec.ols.deployment.mcpServer

Description: MCP server container settings.
Type: object

Property	Type	Description
`resources`	`object`	ResourceRequirements describes the compute resource requirements.

2.1.42. .spec.ols.deployment.mcpServer.resources

Description: ResourceRequirements describes the compute resource requirements.
Type: object

Property	Type	Description
`claims`	`array`	Claims lists the names of resources, defined in spec.resourceClaims, that are used by this container. This is an alpha field and requires enabling the DynamicResourceAllocation feature gate. This field is immutable. It can only be set for containers.
`limits`	`integer-or-string`	Limits describes the maximum amount of compute resources allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
`requests`	`integer-or-string`	Requests describes the minimum amount of compute resources required. If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, otherwise to an implementation-defined value. Requests cannot exceed Limits. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/

2.1.43. .spec.ols.deployment.mcpServer.resources.claims

Description

Claims lists the names of resources, defined in spec.resourceClaims, that are used by this container.

This is an alpha field and requires enabling the DynamicResourceAllocation feature gate.

This field is immutable. It can only be set for containers.

Type

array

2.1.44. .spec.ols.deployment.mcpServer.resources.claims[]

Description

ResourceClaim references one entry in PodSpec.ResourceClaims.

Type

object

Required

name

Property	Type	Description
`name`	`string`	Name must match the name of one entry in pod.spec.resourceClaims of the Pod where this field is used. It makes that resource available inside a container.
`request`	`string`	Request is the name chosen for a request in the referenced claim. If empty, everything from the claim is made available, otherwise only the result of this request.

2.1.45. .spec.ols.proxyConfig

Description: Proxy settings for connecting to external servers, such as LLM providers.
Type: object

Property	Type	Description
`proxyCACertificate`	`object`	The configmap holding proxy CA certificate
`proxyURL`	`string`	Proxy URL, for example https://proxy.example.com:8080. If not specified, the cluster-wide proxy is used, through the environment variable "https_proxy".

2.1.46. .spec.ols.proxyConfig.proxyCACertificate

Description: The configmap holding proxy CA certificate
Type: object

Property	Type	Description
`name`	`string`	Name of the referent. This field is effectively required, but due to backwards compatibility is allowed to be empty. Instances of this type with an empty value here are almost certainly wrong. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
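
A proxy configuration using both properties might look like the following fragment. The proxy URL reuses the example host from the property description, and the config map name is hypothetical.

```yaml
spec:
  ols:
    proxyConfig:
      proxyURL: https://proxy.example.com:8080   # example host from the description above
      proxyCACertificate:
        name: my-proxy-ca                        # hypothetical config map name
```

If `proxyURL` is omitted, the description above indicates that the cluster-wide proxy setting is used instead.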

2.1.47. .spec.ols.queryFilters

Description: Query filters
Type: array

2.1.48. .spec.ols.queryFilters[]

Description: QueryFiltersSpec defines filters to manipulate questions/queries.
Type: object

Property	Type	Description
`name`	`string`	Filter name.
`pattern`	`string`	Filter pattern.
`replaceWith`	`string`	Replacement for the matched pattern.
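
The following fragment sketches a single query filter. It assumes that `pattern` is interpreted as a regular expression, which this reference does not state explicitly; the filter name, pattern, and replacement text are hypothetical.

```yaml
spec:
  ols:
    queryFilters:
      - name: ip-address                          # hypothetical filter name
        pattern: '\b(?:\d{1,3}\.){3}\d{1,3}\b'    # hypothetical pattern, assumed to be a regular expression
        replaceWith: <IP-ADDRESS>                 # hypothetical replacement text
```

Under that assumption, text in a question that matches the pattern would be replaced with the `replaceWith` value.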

2.1.49. .spec.ols.quotaHandlersConfig

Description: LLM Token Quota Configuration
Type: object

Property	Type	Description
`enableTokenHistory`	`boolean`	Enable token history
`limitersConfig`	`array`	Token quota limiters

2.1.50. .spec.ols.quotaHandlersConfig.limitersConfig

Description: Token quota limiters
Type: array

2.1.51. .spec.ols.quotaHandlersConfig.limitersConfig[]

Description

LimiterConfig defines settings for a token quota limiter

Type

object

Required

initialQuota
name
period
quotaIncrease
type

Property	Type	Description
`initialQuota`	`integer`	Initial value of the token quota
`name`	`string`	Name of the limiter
`period`	`string`	Period of time the token quota is for
`quotaIncrease`	`integer`	Token quota increase step
`type`	`string`	Type of the limiter
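
A token quota limiter that sets all five required properties might be declared as in the following fragment. The limiter name, the `type` value `user_limiter`, the `period` string format, and the quota numbers are assumptions for illustration; this reference does not list the accepted values.

```yaml
spec:
  ols:
    quotaHandlersConfig:
      enableTokenHistory: true
      limitersConfig:
        - name: my-user-limits       # hypothetical limiter name
          type: user_limiter         # assumed limiter type value
          initialQuota: 100000       # hypothetical starting quota, in tokens
          quotaIncrease: 1000        # hypothetical increase step
          period: 1 day              # assumed period format
```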

2.1.52. .spec.ols.rag

Description: RAG databases
Type: array

2.1.53. .spec.ols.rag[]

Description

RAGSpec defines how to retrieve a RAG database.

Type

object

Required

image

Property	Type	Description
`image`	`string`	The URL of the container image to use as a RAG source
`indexID`	`string`	The Index ID of the RAG database
`indexPath`	`string`	The path to the RAG database inside of the container image
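
A RAG database entry could be written as in the following fragment, shown together with the `byokRAGOnly` property of `.spec.ols`. The image reference, index ID, and index path are hypothetical.

```yaml
spec:
  ols:
    byokRAGOnly: false                              # true ignores the OpenShift Lightspeed documentation RAG
    rag:
      - image: registry.example.com/my-org/my-rag:latest   # hypothetical image reference (required)
        indexID: my_index                           # hypothetical index ID
        indexPath: /rag/vector_db                   # hypothetical path inside the image
```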

2.1.54. .spec.ols.storage

Description: Persistent Storage Configuration
Type: object

Property	Type	Description
`class`	`string`	Storage class of the requested volume
`size`	`integer-or-string`	Size of the requested volume
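
Persistent storage might be requested as in the following fragment. The storage class name and the size are hypothetical; use a storage class that exists in your cluster.

```yaml
spec:
  ols:
    storage:
      class: my-storage-class   # hypothetical storage class name
      size: 1Gi                 # hypothetical size (integer-or-string)
```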

2.1.55. .spec.ols.tlsConfig

Description: TLS configuration of the Lightspeed backend’s HTTPS endpoint
Type: object

Property	Type	Description
`keyCertSecretRef`	`object`	KeySecretRef is the secret that holds the TLS key.

2.1.56. .spec.ols.tlsConfig.keyCertSecretRef

Description: KeySecretRef is the secret that holds the TLS key.
Type: object

Property	Type	Description
`name`	`string`	Name of the referent. This field is effectively required, but due to backwards compatibility is allowed to be empty. Instances of this type with an empty value here are almost certainly wrong. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
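
The TLS configuration for the backend HTTPS endpoint references a secret by name, as in the following fragment. The secret name is hypothetical.

```yaml
spec:
  ols:
    tlsConfig:
      keyCertSecretRef:
        name: my-tls-secret   # hypothetical secret name
```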

2.1.57. .spec.ols.tlsSecurityProfile

Description: TLS Security Profile used by API endpoints
Type: object

Property	Type	Description
`custom`	``	custom is a user-defined TLS security profile. Be extremely careful using a custom profile as invalid configurations can be catastrophic. An example custom profile looks like this: ciphers: - ECDHE-ECDSA-CHACHA20-POLY1305 - ECDHE-RSA-CHACHA20-POLY1305 - ECDHE-RSA-AES128-GCM-SHA256 - ECDHE-ECDSA-AES128-GCM-SHA256 minTLSVersion: VersionTLS11
`intermediate`	``	intermediate is a TLS security profile based on: https://wiki.mozilla.org/Security/Server_Side_TLS#Intermediate_compatibility_.28recommended.29 and looks like this (yaml): ciphers: - TLS_AES_128_GCM_SHA256 - TLS_AES_256_GCM_SHA384 - TLS_CHACHA20_POLY1305_SHA256 - ECDHE-ECDSA-AES128-GCM-SHA256 - ECDHE-RSA-AES128-GCM-SHA256 - ECDHE-ECDSA-AES256-GCM-SHA384 - ECDHE-RSA-AES256-GCM-SHA384 - ECDHE-ECDSA-CHACHA20-POLY1305 - ECDHE-RSA-CHACHA20-POLY1305 - DHE-RSA-AES128-GCM-SHA256 - DHE-RSA-AES256-GCM-SHA384 minTLSVersion: VersionTLS12
`modern`	``	modern is a TLS security profile based on: https://wiki.mozilla.org/Security/Server_Side_TLS#Modern_compatibility and looks like this (yaml): ciphers: - TLS_AES_128_GCM_SHA256 - TLS_AES_256_GCM_SHA384 - TLS_CHACHA20_POLY1305_SHA256 minTLSVersion: VersionTLS13
`old`	``	old is a TLS security profile based on: https://wiki.mozilla.org/Security/Server_Side_TLS#Old_backward_compatibility and looks like this (yaml): ciphers: - TLS_AES_128_GCM_SHA256 - TLS_AES_256_GCM_SHA384 - TLS_CHACHA20_POLY1305_SHA256 - ECDHE-ECDSA-AES128-GCM-SHA256 - ECDHE-RSA-AES128-GCM-SHA256 - ECDHE-ECDSA-AES256-GCM-SHA384 - ECDHE-RSA-AES256-GCM-SHA384 - ECDHE-ECDSA-CHACHA20-POLY1305 - ECDHE-RSA-CHACHA20-POLY1305 - DHE-RSA-AES128-GCM-SHA256 - DHE-RSA-AES256-GCM-SHA384 - DHE-RSA-CHACHA20-POLY1305 - ECDHE-ECDSA-AES128-SHA256 - ECDHE-RSA-AES128-SHA256 - ECDHE-ECDSA-AES128-SHA - ECDHE-RSA-AES128-SHA - ECDHE-ECDSA-AES256-SHA384 - ECDHE-RSA-AES256-SHA384 - ECDHE-ECDSA-AES256-SHA - ECDHE-RSA-AES256-SHA - DHE-RSA-AES128-SHA256 - DHE-RSA-AES256-SHA256 - AES128-GCM-SHA256 - AES256-GCM-SHA384 - AES128-SHA256 - AES256-SHA256 - AES128-SHA - AES256-SHA - DES-CBC3-SHA minTLSVersion: VersionTLS10
`type`	`string`	type is one of Old, Intermediate, Modern or Custom. Custom provides the ability to specify individual TLS security profile parameters. Old, Intermediate and Modern are TLS security profiles based on: https://wiki.mozilla.org/Security/Server_Side_TLS#Recommended_configurations The profiles are intent based, so they might change over time as new ciphers are developed and existing ciphers are found to be insecure. Depending on precisely which ciphers are available to a process, the list might be reduced. Note that the Modern profile is currently not supported because it is not yet well adopted by common software libraries.
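
Because the example profiles in the table are flattened into single lines, a custom profile is easier to read when written out as YAML. The following sketch assumes the `custom` property takes the `ciphers` and `minTLSVersion` keys shown in its description. The cipher list is taken from that description, while `VersionTLS12` is an assumption chosen for this sketch (the description's own example uses `VersionTLS11`). The required `spec.llm` section is omitted.

```yaml
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  ols:
    tlsSecurityProfile:
      type: Custom
      custom:
        ciphers:
          - ECDHE-ECDSA-CHACHA20-POLY1305
          - ECDHE-RSA-CHACHA20-POLY1305
          - ECDHE-RSA-AES128-GCM-SHA256
          - ECDHE-ECDSA-AES128-GCM-SHA256
        # Assumption for this sketch; the table's example uses VersionTLS11.
        minTLSVersion: VersionTLS12
```

As the `custom` description warns, an invalid custom profile can have severe consequences, so one of the predefined profile types might be the safer choice where it meets your requirements.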

2.1.58. .spec.ols.userDataCollection

Description: User data collection switches
Type: object

Property	Type	Description
`feedbackDisabled`	`boolean`
`transcriptsDisabled`	`boolean`
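
The table gives no descriptions for these switches. Judging by the property names alone, setting a switch to `true` presumably turns off the corresponding collection. The following fragment is a sketch under that assumption and omits the required `spec.llm` section.

```yaml
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  ols:
    userDataCollection:
      # Assumed meaning: true turns off collection of user feedback.
      feedbackDisabled: true
      # Assumed meaning: true turns off collection of conversation transcripts.
      transcriptsDisabled: true
```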

2.1.59. .spec.olsDataCollector

Description: OLSDataCollectorSpec defines allowed OLS data collector configuration.
Type: object

Property	Type	Description
`logLevel`	`string`	Log level. Valid options are DEBUG, INFO, WARNING, ERROR and CRITICAL. Default: "INFO".
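
For example, a fragment that raises the data collector log level from the default `INFO` to `DEBUG` might look like the following sketch. It omits the required `spec.llm` section.

```yaml
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  olsDataCollector:
    # One of DEBUG, INFO, WARNING, ERROR, or CRITICAL.
    logLevel: DEBUG
```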

Chapter 3. REST API authentication configurations

Use REST API authentication configurations to secure programmatic interactions with the OpenShift Lightspeed Service.

You must configure the appropriate authentication modules and follow security guidelines to manage service access in production and development environments.

3.1. REST API authentication configurations

Configure REST API authentication for OpenShift Lightspeed by using the authentication_config.module parameter. Selecting the correct module ensures authorized access and maintains environment security.

The OpenShift Lightspeed REST API requires authentication by default. You can define the authentication behavior by using the authentication_config.module parameter in your YAML configuration file.

The following modules are supported for the authentication_config.module parameter:

k8s: The default authentication module. This module uses the Kubernetes TokenReview and SubjectAccessReview APIs to authenticate and authorize requests.
noop: Disables cluster validation and provides no-operation authentication. This module logs insecure-mode warnings at startup.
noop-with-token: Disables cluster validation but still requires a Bearer token in the Authorization header.

Warning

Setting the authentication module to noop or setting dev_config.disable_auth to true bypasses all access control. Do not use these settings in a production environment.
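
The dotted name authentication_config.module suggests a nested key in the YAML configuration file. The following fragment is a minimal sketch under that assumption. It shows only the authentication setting, not a complete configuration file.

```yaml
authentication_config:
  # Default module. The other supported values are noop and noop-with-token.
  module: k8s
```

Keeping the module at k8s, or omitting the setting so that the default applies, avoids the insecure modes described in the warning.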

Chapter 4. Integrating Google Vertex AI with OpenShift Lightspeed

As an administrator, you can integrate Google Vertex AI as a large language model (LLM) provider for OpenShift Lightspeed.

4.1. Google Vertex AI provider types

OpenShift Lightspeed supports Google Vertex AI as an LLM provider. You can deploy Google-native models or Anthropic models hosted on the Google Cloud Platform (GCP) infrastructure.

Both provider types authenticate using a GCP service account JSON key stored within a Kubernetes Secret.

Table 4.1. Supported provider types

Provider type	Use case	Required configuration field
`google_vertex`	Google-native models like Gemini.	`googleVertexConfig`
`google_vertex_anthropic`	Anthropic models like Claude hosted on Vertex AI.	`googleVertexAnthropicConfig`

4.2. Configuring Google Vertex AI

To use Google Vertex AI, create a credentials secret and apply an OLSConfig custom resource (CR).

Prerequisites

The OpenShift Lightspeed Operator must be installed.
You must possess a valid GCP service account JSON key file.
The Vertex AI API must be enabled in your Google Cloud project.
Your GCP service account must have appropriate Vertex AI permissions.

Procedure

Create the credentials Secret in the operator namespace by running the following command:
```
oc create secret generic llmcreds \
  --from-file=gcp-service-account.json=/path/to/service-account-key.json \
  -n openshift-lightspeed
```
Note
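If you omit the credentialKey field in the OLSConfig CR, the Operator looks for a key named apitoken in the Secret. Because this Secret stores the credentials under the key gcp-service-account.json, set credentialKey to that key name, as the following examples do.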

Create an OLSConfig CR file named olsconfig.yaml using one of the following examples:

Example configuration for Gemini (google_vertex):

```
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  llm:
    providers:
      - name: google
        type: google_vertex
        credentialsSecretRef:
          name: llmcreds
        credentialKey: gcp-service-account.json
        googleVertexConfig:
          projectID: my-gcp-project-123
          location: us-central1
        models:
          - name: gemini-2.5-flash-lite
  ols:
    defaultModel: gemini-2.5-flash-lite
    defaultProvider: google
```

Example configuration for Claude (google_vertex_anthropic):

```
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  llm:
    providers:
      - name: google-anthropic
        type: google_vertex_anthropic
        credentialsSecretRef:
          name: llmcreds
        credentialKey: gcp-service-account.json
        googleVertexAnthropicConfig:
          projectID: my-gcp-project-123
          location: us-east4
        models:
          - name: claude-3-sonnet
  ols:
    defaultModel: claude-3-sonnet
    defaultProvider: google-anthropic
```

Apply the configuration file to your cluster:
```
oc apply -f olsconfig.yaml
```

Verification

Verify that the Operator has completed reconciliation:
```
oc get olsconfig cluster -o jsonpath='{.status.overallStatus}'
```
Expected output: Ready

4.3. OLSConfig field reference for Google Vertex AI

The following reference tables describe the configuration schema for Google Vertex AI providers.

Table 4.2. Provider fields (spec.llm.providers[])

Field	Type	Required	Description
`name`	`string`	Yes	Logical name for the provider. Referenced by `spec.ols.defaultProvider`.
`type`	`string`	Yes	Must be set to `google_vertex` or `google_vertex_anthropic`.
`credentialsSecretRef.name`	`string`	Yes	Name of the Secret in the operator namespace that contains provider credentials.
`credentialKey`	`string`	No	Key name inside the Secret to read. Defaults to `apitoken`.
`url`	`string`	No	The provider API endpoint URL. This field is typically not required for Vertex AI.
`models`	`array`	Yes	List of models available from the provider.

Table 4.3. Google Vertex configuration (spec.llm.providers[].googleVertexConfig)

Field	Type	Required	Description
`projectID`	`string`	Yes	The Google Cloud project ID (for example, `my-gcp-project-123`).
`location`	`string`	Yes	The target GCP region for Vertex AI (for example, `us-central1`).

Table 4.4. Google Vertex Anthropic configuration (spec.llm.providers[].googleVertexAnthropicConfig)

Field	Type	Required	Description
`projectID`	`string`	Yes	The Google Cloud project ID.
`location`	`string`	Yes	The target GCP region for Vertex AI (for example, `us-east4`).

Table 4.5. Model fields (spec.llm.providers[].models[])

Field	Type	Required	Description
`name`	`string`	Yes	Model name (such as `gemini-2.5-flash-lite`). Referenced by `spec.ols.defaultModel`.
`url`	`string`	No	The model-specific API endpoint URL.
`contextWindowSize`	`integer`	No	Context window size in tokens. Minimum value: 1024.
`parameters.maxTokensForResponse`	`integer`	No	Maximum tokens allowed for responses. Default value: 2048.
`parameters.toolBudgetRatio`	`float`	No	Ratio of the context window allocated for the tool token budget. Range: 0.1 to 0.5. Default value: 0.5.
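
The optional model fields in Table 4.5 can be combined with the Gemini provider settings used in this chapter, as in the following sketch. The `contextWindowSize` value of `32768` and the `toolBudgetRatio` value of `0.25` are hypothetical values chosen to fall within the documented limits. The nesting of `parameters` is inferred from the dotted field names in the table.

```yaml
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  llm:
    providers:
      - name: google
        type: google_vertex
        credentialsSecretRef:
          name: llmcreds
        credentialKey: gcp-service-account.json
        googleVertexConfig:
          projectID: my-gcp-project-123
          location: us-central1
        models:
          - name: gemini-2.5-flash-lite
            # Hypothetical value; the documented minimum is 1024.
            contextWindowSize: 32768
            parameters:
              # Matches the documented default.
              maxTokensForResponse: 2048
              # Hypothetical value within the documented range of 0.1 to 0.5.
              toolBudgetRatio: 0.25
  ols:
    defaultModel: gemini-2.5-flash-lite
    defaultProvider: google
```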

Legal Notice

Except as otherwise noted below, the text of and illustrations in this documentation are licensed by Red Hat under the Creative Commons Attribution–Share Alike 3.0 Unported license. If you distribute this document or an adaptation of it, you must provide the URL for the original version.

Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law.

Red Hat, the Red Hat logo, JBoss, Hibernate, and RHCE are trademarks or registered trademarks of Red Hat, LLC. or its subsidiaries in the United States and other countries.

Linux® is the registered trademark of Linus Torvalds in the United States and other countries.

XFS is a trademark or registered trademark of Hewlett Packard Enterprise Development LP or its subsidiaries in the United States and other countries.

The OpenStack® Word Mark and OpenStack logo are trademarks or registered trademarks of the Linux Foundation, used under license.

All other trademarks are the property of their respective owners.