CMP-4478: Fix ProfileBundle reversion on operator restart using Subscription patch #79

Open
yuumasato wants to merge 1 commit into ComplianceAsCode:main from yuumasato:fix-profilebundle-race-subscription

@yuumasato

Member

Problem

ProfileBundles with custom content images are reverted to default images when the compliance-operator restarts (e.g., node reboots, pod crashes).

Root cause: The operator's ensureDefaultProfileBundles() runs at startup and unconditionally patches existing ProfileBundles with the image from the RELATED_IMAGE_PROFILE env var, which defaults to the CSV's default image.

Impact: Tests that use custom content images (for example, images built from PRs) failed intermittently whenever the operator pod restarted during test execution.

Solution

Patch Subscription.spec.config.env with RELATED_IMAGE_PROFILE instead of manually creating ProfileBundles. This ensures the operator uses the custom content image from startup.

Why Subscription patch:

  • Subscription.spec.config.env is the designed OLM mechanism for env var overrides
  • Subscription config persists across OLM reconciliation and operator upgrades
  • Subscription env vars take precedence over CSV defaults (documented OLM behavior)
  • Solves both initial setup AND restart/reboot cases
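As a rough illustration of the patch described above, the sketch below builds a JSON merge patch body that sets RELATED_IMAGE_PROFILE under Subscription.spec.config.env. The helper name buildContentImagePatch and the envVar type are hypothetical and are not taken from this PR; only the field path and the env var name come from the description. Sending the patch to the cluster is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envVar mirrors the name/value pair used in Subscription.spec.config.env.
// Hypothetical local type; real code would likely use the Kubernetes EnvVar type.
type envVar struct {
	Name  string `json:"name"`
	Value string `json:"value"`
}

// buildContentImagePatch returns a JSON merge patch body that sets
// RELATED_IMAGE_PROFILE on a Subscription. Hypothetical helper, not the PR's code.
func buildContentImagePatch(image string) (string, error) {
	patch := map[string]any{
		"spec": map[string]any{
			"config": map[string]any{
				"env": []envVar{{Name: "RELATED_IMAGE_PROFILE", Value: image}},
			},
		},
	}
	body, err := json.Marshal(patch)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	body, err := buildContentImagePatch("quay.io/wsato/ocp4-content:latest")
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```

One thing to keep in mind with this approach: a JSON merge patch replaces a list wholesale, so any env entries already present on the Subscription would need to be included in the patch body to survive.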

Changes

  • Add patchOperatorContentImage() to patch Subscription with custom env var
  • Add waitForOperatorRollout() to wait for Deployment rollout after patch
  • Remove ensureTestProfileBundles() (no longer needed - 74 lines removed)
  • Update Setup() flow: install → patch Subscription → wait for rollout → wait for valid ProfileBundles
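The "wait for rollout" step matters because ProfileBundles checked before the new operator pod is up may still reflect the old env var. The sketch below shows one common way to decide that a Deployment rollout has finished, modeled on the checks that `kubectl rollout status` performs. It is not the PR's waitForOperatorRollout(); the deploymentState struct, rolloutComplete, and pollUntilRolledOut are hypothetical names, and the struct only mirrors the Deployment fields such a check typically reads.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// deploymentState holds the Deployment fields a rollout check usually inspects.
// Hypothetical stand-in for the real Deployment object.
type deploymentState struct {
	Generation         int64 // metadata.generation
	ObservedGeneration int64 // status.observedGeneration
	DesiredReplicas    int32 // spec.replicas
	Replicas           int32 // status.replicas
	UpdatedReplicas    int32 // status.updatedReplicas
	AvailableReplicas  int32 // status.availableReplicas
}

// rolloutComplete reports whether the new pod template is fully rolled out.
func rolloutComplete(d deploymentState) bool {
	if d.ObservedGeneration < d.Generation {
		return false // controller has not observed the patched spec yet
	}
	if d.UpdatedReplicas < d.DesiredReplicas {
		return false // new pods are still being created
	}
	if d.Replicas > d.UpdatedReplicas {
		return false // old pods are still terminating
	}
	return d.AvailableReplicas >= d.UpdatedReplicas
}

// pollUntilRolledOut calls get at the given interval until the rollout is
// complete, get fails, or ctx is done.
func pollUntilRolledOut(ctx context.Context, interval time.Duration, get func() (deploymentState, error)) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		d, err := get()
		if err != nil {
			return err
		}
		if rolloutComplete(d) {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("waiting for rollout: %w", ctx.Err())
		case <-ticker.C:
		}
	}
}

func main() {
	// Fake getter: the old pod disappears and the new one becomes available on the third poll.
	polls := 0
	get := func() (deploymentState, error) {
		polls++
		d := deploymentState{Generation: 2, ObservedGeneration: 2, DesiredReplicas: 1, Replicas: 2, UpdatedReplicas: 1}
		if polls >= 3 {
			d.Replicas, d.AvailableReplicas = 1, 1
		}
		return d, nil
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	err := pollUntilRolledOut(ctx, 10*time.Millisecond, get)
	fmt.Println("error:", err, "polls:", polls)
}
```

Checking observedGeneration first avoids declaring success on status that predates the Subscription patch.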

Testing

Run this script to verify the fix:

#!/bin/bash
# After running: go test -v -timeout 15m . -run='^$' -install-operator=true -content-image=quay.io/wsato/ocp4-content:latest

# Check ProfileBundle has custom image
oc get profilebundle ocp4 -n openshift-compliance -o jsonpath='{.spec.contentImage}'
# Output: quay.io/wsato/ocp4-content:latest

# Restart operator (simulates node reboot)
oc delete pod -l name=compliance-operator -n openshift-compliance
oc wait --for=condition=Ready pod -l name=compliance-operator -n openshift-compliance --timeout=120s

# Check again - should still have custom image
oc get profilebundle ocp4 -n openshift-compliance -o jsonpath='{.spec.contentImage}'
# Output: quay.io/wsato/ocp4-content:latest ✅

# Old approach (manual ProfileBundle patch) gets overwritten:
oc patch profilebundle ocp4 -n openshift-compliance --type=merge -p '{"spec":{"contentImage":"ghcr.io/complianceascode/k8scontent:latest"}}'
oc delete pod -l name=compliance-operator -n openshift-compliance
oc wait --for=condition=Ready pod -l name=compliance-operator -n openshift-compliance --timeout=120s

# The manual patch is overwritten with the Subscription's value
oc get profilebundle ocp4 -n openshift-compliance -o jsonpath='{.spec.contentImage}'
# Output: quay.io/wsato/ocp4-content:latest (manually patched ghcr.io/complianceascode/k8scontent:latest was overwritten)

ProfileBundles with custom content images were reverted to default images
when the compliance-operator restarted (e.g., node reboots). This happened
because the operator's ensureDefaultProfileBundles() runs at startup and
unconditionally patches existing ProfileBundles with the image from the
RELATED_IMAGE_PROFILE env var, which defaults to the CSV's default image.

Solution: Patch Subscription.spec.config.env with RELATED_IMAGE_PROFILE
instead of manually creating ProfileBundles. This ensures the operator
uses the custom content image from startup.

Changes:
- Add patchOperatorContentImage() to patch Subscription with custom env var
- Add waitForOperatorRollout() to wait for Deployment rollout after patch
- Remove ensureTestProfileBundles() (no longer needed)
- Update Setup() flow: install → patch Subscription → wait for rollout → wait for valid ProfileBundles

How it works:
1. patchOperatorContentImage() patches Subscription with RELATED_IMAGE_PROFILE
2. OLM automatically merges Subscription env vars into Deployment
3. Deployment rollout triggers with new env var
4. Operator reads RELATED_IMAGE_PROFILE and creates ProfileBundles with custom image
5. Changes survive operator restarts (node reboots, pod crashes)
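The steps above can be illustrated with a toy model. It is a simplification under stated assumptions, not the operator's or OLM's actual code: mergeEnv and restartOperator are hypothetical names, maps stand in for the CSV defaults, the Subscription env, and the ProfileBundles, and registry.example/default-content:latest is a made-up default image.

```go
package main

import "fmt"

// mergeEnv overlays Subscription env vars on the CSV defaults.
// Subscription values win, matching the precedence described above.
func mergeEnv(csvDefaults, subscription map[string]string) map[string]string {
	merged := make(map[string]string, len(csvDefaults)+len(subscription))
	for k, v := range csvDefaults {
		merged[k] = v
	}
	for k, v := range subscription {
		merged[k] = v
	}
	return merged
}

// restartOperator models the startup behavior from the problem statement:
// every existing bundle's content image is reset from RELATED_IMAGE_PROFILE.
func restartOperator(env map[string]string, bundles map[string]string) {
	for name := range bundles {
		bundles[name] = env["RELATED_IMAGE_PROFILE"]
	}
}

func main() {
	// Made-up default image, used only for this illustration.
	csv := map[string]string{"RELATED_IMAGE_PROFILE": "registry.example/default-content:latest"}

	// Old approach: the ProfileBundle is set by hand and the Subscription is untouched.
	bundles := map[string]string{"ocp4": "quay.io/wsato/ocp4-content:latest"}
	restartOperator(mergeEnv(csv, nil), bundles)
	fmt.Println("without Subscription override:", bundles["ocp4"])

	// New approach: the Subscription carries the custom image, so restarts keep it.
	sub := map[string]string{"RELATED_IMAGE_PROFILE": "quay.io/wsato/ocp4-content:latest"}
	restartOperator(mergeEnv(csv, sub), bundles)
	restartOperator(mergeEnv(csv, sub), bundles) // a second restart changes nothing
	fmt.Println("with Subscription override:", bundles["ocp4"])
}
```

In this model the custom image is lost only when it exists solely on the ProfileBundle; once it is the operator's own input, every restart re-applies it.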

Why Subscription patch:
- Subscription.spec.config.env is the designed OLM mechanism for env overrides
- Subscription config persists across OLM reconciliation and operator upgrades
- Subscription env vars take precedence over CSV defaults (documented)
- Solves both initial setup AND restart/reboot cases

Tested:
- Initial setup with custom image
- Operator restart resilience (simulates node reboot)
- Multiple consecutive restarts
- Old approach (ProfileBundle patch) reverted on restart, new approach persists

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
yuumasato changed the title from "Fix ProfileBundle reversion on operator restart using Subscription patch" to "CMP-4478: Fix ProfileBundle reversion on operator restart using Subscription patch" on Jul 3, 2026
openshift-ci Bot commented Jul 3, 2026

@yuumasato: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-ocp4-cis | fa30229 | link | true | /test e2e-aws-ocp4-cis |
| ci/prow/e2e-aws-openshift-node-compliance | fa30229 | link | true | /test e2e-aws-openshift-node-compliance |
| ci/prow/e2e-aws-ocp4-stig | fa30229 | link | true | /test e2e-aws-ocp4-stig |
| ci/prow/e2e-aws-openshift-platform-compliance | fa30229 | link | true | /test e2e-aws-openshift-platform-compliance |
| ci/prow/e2e-aws-rhcos4-moderate | fa30229 | link | true | /test e2e-aws-rhcos4-moderate |

