Description
CP4D 5.1.2 upgrade to 5.1.3
Deployer-based upgrade. The newest configuration was copied from the deployer page, and the parameters were adjusted in the same way as in the original configuration file.
The step that started with
# Updating ibm-common-service-operator in namesapce cpd-operators...
[INFO] v4.12 is equal to v4.12
[INFO] catalogsource opencloud-operators is the same as opencloud-operators
[INFO] ibm-common-service-operator has already updated channel v4.12 and catalogsource opencloud-operators in the subscription.
subscription.operators.coreos.com/ibm-common-service-operator configured
[✔] Successfully patched subscription ibm-common-service-operator in cpd-operators
[INFO] Waiting for operator ibm-common-service-operator to be upgraded
[DEBUG] oc get subscription.operators.coreos.com -l operators.coreos.com/ibm-common-service-operator.cpd-operators='' -n cpd-operators -o jsonpath='{.items[*].status.conditions}' ->
[{"lastTransitionTime":"2025-04-17T16:34:43Z","message":"all available catalogsources are healthy","reason":"AllCatalogSourcesHealthy","status":"False","type":"CatalogSourcesUnhealthy"},{"lastTransitionTime":"2025-02-28T15:47:16Z","reason":"ReferencedInstallPlanNotFound","status":"True","type":"InstallPlanMissing"},{"message":"error using catalogsource openshift-marketplace/redhat-operators: error encountered while listing bundles: rpc error: code = DeadlineExceeded desc = context deadline exceeded","reason":"ErrorPreventedResolution","status":"True","type":"ResolutionFailed"}]
didn't finish in time, with the error seen in the log:
[✘] Error in /tmp/work/cpfs_scripts/5.1.3/cp3pt0-deployment/common/utils.sh at line 127 in function wait_for_condition: Timeout after 40 minutes waiting for operator ibm-common-service-operator to be upgraded
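For reference, the subscription conditions that the deployer polls can be re-queried directly. A minimal sketch (namespace and subscription name taken from the log above; assumes a logged-in `oc` session and `jq` installed):

```shell
# Re-check the subscription conditions the deployer was waiting on
oc get subscription.operators.coreos.com ibm-common-service-operator \
  -n cpd-operators -o jsonpath='{.status.conditions}' | jq .

# The ResolutionFailed condition points at the redhat-operators catalog;
# check whether its catalog source pod and gRPC connection are healthy
oc get pods -n openshift-marketplace -l olm.catalogSource=redhat-operators
oc get catalogsource redhat-operators -n openshift-marketplace \
  -o jsonpath='{.status.connectionState.lastObservedState}'
```

In my case these were the checks run after the timeout; the catalog source reports READY.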
I checked the status of the ibm-common-service-operator: no errors, and the related pod is in Running status.
1.746467559133527e+09 DEBUG controller-runtime.webhook.webhooks wrote response {"webhook": "/validate-operator-ibm-com-v3-commonservice", "code": 200, "reason": "", "UID": "1c76ab8b-fd1e-4a36-b535-5715f6781fa8", "allowed": true}
All similar log lines end with code 200.
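The "no errors" check above can be reproduced with commands like the following (the label selector and deployment name are assumptions based on the standard OLM layout, not taken from the deployer):

```shell
# Phase of the operator's CSV (expect Succeeded)
oc get csv -n cpd-operators | grep ibm-common-service-operator

# Pod status and the webhook log lines quoted above
oc get pods -n cpd-operators -l name=ibm-common-service-operator
oc logs -n cpd-operators deploy/ibm-common-service-operator --tail=20
```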
The deployer fails with the same error on each of its 3 retry attempts.
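Given the "DeadlineExceeded" error while listing bundles from the redhat-operators catalog, one commonly suggested workaround (not yet verified in this environment) is to restart the catalog source pod so OLM re-resolves the install plan:

```shell
# Delete the redhat-operators catalog pod; OLM recreates it automatically
oc delete pod -n openshift-marketplace -l olm.catalogSource=redhat-operators

# Once it is Running again, re-check subscription resolution
oc get subscription.operators.coreos.com ibm-common-service-operator \
  -n cpd-operators -o jsonpath='{.status.conditions}'
```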
Configmap
global_config:
environment_name: demo
cloud_platform: existing-ocp
confirm_destroy: False
optimize_deploy: True
env_id: cpd-demo
openshift:
- name: "{{ env_id }}"
ocp_version: "4.15"
cluster_name: "{{ env_id }}"
domain_name: example.com
mcg:
install: False
storage_type: storage-class
storage_class: managed-nfs-storage
gpu:
install: auto
openshift_ai:
install: auto
channel: auto
openshift_storage:
- storage_name: ocs-external-storagecluster-cephfs
storage_type: custom
ocp_storage_class_file: ocs-external-storagecluster-cephfs
ocp_storage_class_block: ocs-external-storagecluster-ceph-rbd
cp4d:
- project: cpd
openshift_cluster_name: "{{ env_id }}"
cp4d_version: latest
cp4d_entitlement:
- cpd-enterprise
# - cpd-standard
- cognos-analytics
- data-product-hub
- datastage
# - ikc-premium
# - ikc-standard
# - openpages
# - planning-analytics
# - product-master
# - speech-to-text
# - text-to-speech
# - watson-assistant
# - watson-discovery
# - watsonx-ai
# - watsonx-code-assistant-ansible
# - watsonx-code-assistant-z
# - watsonx-data
# - watsonx-gov-mm
# - watsonx-gov-rc
# - watsonx-orchestrate
db2u_limited_privileges: False
accept_licenses: True
use_fs_iam: True
operators_project: cpd-operators
ibm_cert_manager: False
cartridges:
- name: cp-foundation
scale: level_1
license_service:
threads_per_core: 2
- name: lite