@polasudo polasudo commented Aug 20, 2025

Description

This pull request refactors the way image expiry labels are set in the next-build-image.yaml GitHub Actions workflow. Instead of hardcoding expiry durations in multiple places, the expiry value is now computed once based on the branch name and reused throughout the workflow. This improves maintainability and reduces the risk of inconsistencies.

This pull request also extends the Docker image build and publishing workflows to improve image expiry management and lifecycle automation for Quay. It introduces dynamic expiry labeling based on build type, enforces Quay credential checks, and automates the cleanup of per-architecture images using Docker BuildX annotations and, when available, the Quay API.

Key improvements include:

Image Expiry Management and Labeling

  • Expiry duration for images is now determined dynamically from the type of build (e.g., 14 days for "next" builds, 183 days for releases). The value is computed once, stored in the EXPIRES_AFTER environment variable, and reused throughout the workflow. Expiry labels are applied consistently under two keys: quay.expires-after and org.opencontainers.image.expires.
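
As a rough illustration of computing the expiry once and reusing it, here is a minimal Python sketch. The durations (14 and 183 days) and the two label keys come from the description above; the branch-name pattern, the function names, and the "Nd" value format used for both keys are assumptions, not the workflow's actual logic.

```python
import re

# Durations are taken from the PR description; the branch patterns are assumptions.
NEXT_EXPIRY_DAYS = 14
RELEASE_EXPIRY_DAYS = 183


def expires_after(branch: str) -> str:
    """Return a duration string such as "14d" for the given branch name."""
    # Assumption: release branches are named like "release-1.7".
    if re.fullmatch(r"release-\d+\.\d+", branch):
        return f"{RELEASE_EXPIRY_DAYS}d"
    # Assumption: every other branch (for example "main") is a "next" build.
    return f"{NEXT_EXPIRY_DAYS}d"


def expiry_labels(branch: str) -> dict[str, str]:
    """Build both label keys from the single computed value."""
    value = expires_after(branch)
    return {
        "quay.expires-after": value,
        "org.opencontainers.image.expires": value,
    }
```

Because both labels are derived from one call to expires_after, they cannot drift apart, which is the maintainability point the description makes.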

Multi-arch Manifest and Tag Handling

  • Manifest lists are created and annotated for each tag to guarantee Quay applies expiry settings per tag. This improves consistency and reliability of tag expiry.

These changes collectively make the image publishing process more secure, maintainable, and aligned with best practices for automated container lifecycle management.

Manifest creation improvements:

  • Update the manifest creation step to annotate each image tag with the correct expiry value using the ${EXPIRES_AFTER} environment variable, ensuring Quay applies the correct expiry per tag.
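
The per-tag annotation step could look roughly like the sketch below, which only builds the command lines and does not run them. It assumes the manifests are created with docker buildx imagetools create and its --annotation and --tag flags; the index: annotation scope, the helper name, and the image references in the usage note are hypothetical.

```python
def manifest_commands(
    image: str,
    tags: list[str],
    sources: list[str],
    expires_after: str,
) -> list[list[str]]:
    """Build one `docker buildx imagetools create` command per tag.

    Each tag gets its own manifest list so the expiry annotation is
    attached to every tag, not only to the first one published.
    """
    commands = []
    for tag in tags:
        commands.append(
            [
                "docker", "buildx", "imagetools", "create",
                # Assumption: the expiry is carried as an index-level annotation.
                "--annotation", f"index:quay.expires-after={expires_after}",
                "--tag", f"{image}:{tag}",
                *sources,  # the per-architecture images to merge
            ]
        )
    return commands
```

For example, manifest_commands("quay.io/example/app", ["next", "next-abc1234"], ["quay.io/example/app:next-amd64", "quay.io/example/app:next-arm64"], "14d") yields two commands, one per tag, each carrying the same expiry value.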

Which issue(s) does this PR fix

PR acceptance criteria

Please make sure that the following steps are complete:

  • GitHub Actions are completed and successful
  • Unit Tests are updated and passing
  • E2E Tests are updated and passing
  • Documentation is updated if necessary (requirement for new features)
  • Add a screenshot if the change is UX/UI related

How to test changes / Special notes to the reviewer

openshift-ci bot commented Aug 20, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@polasudo polasudo changed the title from "spike: set the same tag expiry for ALL floating tags (not just the first one published RHIDP-8270" to "fix: set the same tag expiry for ALL floating tags (not just the first one published RHIDP-8270" on Aug 20, 2025

Updated the next-build-image workflow to read tags from the metadata JSON instead of directly from the output variables. This change ensures that the manifest list is created using unique tags, enhancing the tagging process for image builds.
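
Reading unique tags from a metadata JSON document might look like the following sketch. It assumes the JSON has a top-level "tags" array of full image references; the function name and the example references are hypothetical.

```python
import json


def unique_tags(metadata_json: str) -> list[str]:
    """Return the image references from the metadata JSON without duplicates.

    Order is preserved so the first occurrence of each reference wins.
    """
    metadata = json.loads(metadata_json)
    # Assumption: the metadata document exposes a top-level "tags" array.
    tags = metadata.get("tags", [])
    # dict.fromkeys keeps insertion order and drops repeated entries.
    return list(dict.fromkeys(tags))
```

Given a document such as {"tags": ["quay.io/example/app:next", "quay.io/example/app:next", "quay.io/example/app:next-abc1234"]}, the function returns the two distinct references in their original order, so each manifest list is created exactly once.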

github-actions bot commented Sep 4, 2025

github-actions bot commented Sep 8, 2025

github-actions bot commented Sep 9, 2025

@polasudo polasudo changed the title from "fix: set the same tag expiry for ALL floating tags (not just the first one published RHIDP-8270, RHIDP-8691" to "fix: set the same tag expiry for ALL floating tags, not just the first one published RHIDP-8270, RHIDP-8691" on Sep 11, 2025

@nickboldt nickboldt left a comment

/lgtm

let's try it and see what borks

openshift-ci bot commented Sep 15, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nickboldt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@nickboldt nickboldt merged commit a43b870 into redhat-developer:main Sep 15, 2025
13 of 14 checks passed
@nickboldt
Member

merged; manually triggered in https://github.com/redhat-developer/rhdh/actions/runs/17744786774

@nickboldt
Member

/cherry-pick release-1.7
/cherry-pick release-1.6

@openshift-cherrypick-robot
Contributor

@nickboldt: new pull request created: #3422

In response to this:

/cherry-pick release-1.7
/cherry-pick release-1.6

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-cherrypick-robot
Contributor

@nickboldt: #3307 failed to apply on top of branch "release-1.7":

Applying: updating build to mkae it easier to tag
Applying: local testing
Applying: refactor: improve tag handling in build workflow
Applying: refactor: enhance tag generation logic in build workflow
Applying: feat(ci): add Quay login and permission checks to build workflow Quay login step and a preflight check to verify push permissions.
Applying: fix(ci): update environment variable usage in tag generation for build workflow
Applying: testing tagging
Applying: feat(ci): enhance Docker build workflow with annotations and improved metadata handling
.git/rebase-apply/patch:116: trailing whitespace.
            
.git/rebase-apply/patch:121: trailing whitespace.
            
warning: 2 lines add whitespace errors.
Using index info to reconstruct a base tree...
M	.github/actions/docker-build/action.yaml
Falling back to patching base and 3-way merge...
Auto-merging .github/actions/docker-build/action.yaml
CONFLICT (content): Merge conflict in .github/actions/docker-build/action.yaml
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config advice.mergeConflict false"
Patch failed at 0008 feat(ci): enhance Docker build workflow with annotations and improved metadata handling

In response to this:

/cherry-pick release-1.7
/cherry-pick release-1.6

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@nickboldt
Member

cherrypick failed so manually pushed to 1.7 as e5e5628

guyoron1 pushed a commit to guyoron1/rhdh that referenced this pull request Sep 29, 2025
…t one published RHIDP-8270, RHIDP-8691 (redhat-developer#3307)

* updating build to mkae it easier to tag

* local testing

* refactor: improve tag handling in build workflow

Updated the next-build-image workflow to read tags from the metadata JSON instead of directly from the output variables. This change ensures that the manifest list is created using unique tags, enhancing the tagging process for image builds.

* refactor: enhance tag generation logic in build workflow

* feat(ci): add Quay login and permission checks to build workflow
Quay login step and a preflight check to verify push permissions.

* fix(ci): update environment variable usage in tag generation for build workflow

* testing tagging

* feat(ci): enhance Docker build workflow with annotations and improved metadata handling

* refactor: improve tag handling in build workflow

* updating tagging

* feat(ci): add Quay API expiry enforcement to build workflow

* test tagging

* ci: rollback next-build-image workflow to previous version

* test of bonus step to make sure it will get tagged

* ci: restore next-build-image workflow to state of d1b0b9f

* deleting local testing inputs, preparing for PR

* ci: add per-arch cleanup and expiry enforcement

- Add step to delete temporary per-arch images after multi-arch manifest creation
- Ensure expiry is set on all multi-arch tags via Quay API
- Clean up intermediate tags: *-amd64, *-arm64, *-{sha}-amd64, *-{sha}-arm64
- Keep only final multi-arch manifests with proper expiration
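
A minimal sketch of how the intermediate tags listed above could be separated from the final multi-arch tags. The suffixes come from the commit message; the function names and the example tags are hypothetical.

```python
# Suffixes named in the commit message: *-amd64, *-arm64 (with or without a sha).
ARCH_SUFFIXES = ("-amd64", "-arm64")


def is_per_arch_tag(tag: str) -> bool:
    """Return True for temporary per-architecture tags such as "next-amd64"."""
    return tag.endswith(ARCH_SUFFIXES)


def partition_tags(tags: list[str]) -> tuple[list[str], list[str]]:
    """Split tags into (multi-arch tags to keep, per-arch tags to delete)."""
    keep = [tag for tag in tags if not is_per_arch_tag(tag)]
    delete = [tag for tag in tags if is_per_arch_tag(tag)]
    return keep, delete
```

Matching on the suffix covers both the plain and the sha-qualified forms, since "next-abc1234-amd64" also ends in "-amd64".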

* ci: configure workflow for testing with polasudo/testing repository

- Set default REGISTRY_IMAGE to polasudo/testing for testing
- Add workflow_dispatch inputs for manual testing:
  - registry_image: specify target repository (default: polasudo/testing)
  - test_cleanup: enable/disable cleanup testing (default: true)
- Restore registry override logic for manual dispatch
- Cleanup step runs when test_cleanup=true or on automatic triggers

* fix: add CSRF token handling for secure Quay API operations

- Fetch CSRF token from /api/v1/user/ endpoint before PUT/DELETE operations
- Include X-CSRFToken header in expiry setting and tag deletion requests
- Add fallback method to get CSRF token from response headers
- Resolves HTTP 403 'CSRF token was invalid or missing' errors
- Maintains security compliance with Quay's authentication requirements

* fix: implement proper Basic Auth for Quay robot accounts

- Replace Bearer token with Basic Auth using username:token base64 encoding
- Robot accounts (polasudo+skuska) require Basic Auth, not Bearer tokens
- Add QUAY_USERNAME environment variable to cleanup step
- Improve authentication logging for robot accounts
- Should resolve HTTP 403 CSRF errors for robot account API calls
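
For reference, the Basic Auth header described above (username:token, base64-encoded) can be built as in this sketch. The function name is hypothetical, and the credentials in the usage note are placeholders.

```python
import base64


def basic_auth_header(username: str, token: str) -> dict[str, str]:
    """Build an HTTP Basic Authorization header from a username and token."""
    # Basic Auth is "username:token" encoded as base64.
    raw = f"{username}:{token}".encode("utf-8")
    encoded = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}
```

A robot account name such as "polasudo+skuska" is passed as the username unchanged, for example basic_auth_header("polasudo+skuska", "<robot token>").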

* debug: add comprehensive CSRF token debugging for robot accounts

- Add DEBUG output for API responses from /api/v1/user/
- Try multiple CSRF token field names (csrf_token, csrfToken)
- Try extracting CSRF token from response headers
- Try alternative repository API endpoint for CSRF token
- Should help identify where/how CSRF tokens are provided for robot accounts

* fix: use Docker BuildX annotations to bypass CSRF token issues

- Replace Quay API calls with docker buildx imagetools create for expiry
- Use re-annotation approach to set expiry on multi-arch tags
- Robot accounts can authenticate via Docker but not Quay CSRF APIs
- Per-arch images will expire naturally since they already have expiry set
- Avoids CSRF token authentication issues with robot accounts

* test: workaround for CSRF token issue

* feat: integrate OAuth-based Quay lifecycle management

- Add OAuth token support for API operations
- Implement Python-based expiry management for multi-arch tags
- Add per-arch tag deletion after multi-arch merge
- Maintain backward compatibility with robot accounts
- Solves 403 authentication errors for API operations
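
The Python-based expiry management is not shown in this conversation. The sketch below only builds (and does not send) a request of the kind it might issue. The endpoint path, the "expiration" body field holding a Unix timestamp, the Bearer scheme for the OAuth token, and all names here are assumptions based on a general understanding of Quay's tag API, not the workflow's actual code.

```python
import json
import time
import urllib.request

# Assumption: the public Quay.io API base URL.
QUAY_API = "https://quay.io/api/v1"


def build_expiry_request(repository, tag, days, oauth_token, now=None):
    """Build a PUT request that sets a tag's expiration `days` from now."""
    now = time.time() if now is None else now
    # Assumption: the API expects the expiration as a Unix timestamp in seconds.
    expiration = int(now) + days * 24 * 60 * 60
    body = json.dumps({"expiration": expiration}).encode("utf-8")
    return urllib.request.Request(
        url=f"{QUAY_API}/repository/{repository}/tag/{tag}",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {oauth_token}",
            "Content-Type": "application/json",
        },
    )
```

Keeping request construction separate from sending makes the logic easy to inspect in a dry run; passing the result to urllib.request.urlopen would perform the call.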

* fix: correct environment variable order for tag cleanup

- Move export statements before Python script execution
- Add debug output to show which tags are being processed
- Improve error handling for expiry API responses
- Should resolve empty tag arrays issue

* feat: add timing fixes for Quay manifest propagation

- Add 30-second wait step before cleanup to allow manifest propagation
- Add retry logic to handle race conditions in tag operations
- Should resolve empty tag arrays caused by timing issues
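
The wait-then-retry idea can be sketched as a small helper. The 30-second figure comes from the commit message; the helper name, the default attempt count, and the injectable sleep parameter are assumptions made for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(
    operation: Callable[[], T],
    attempts: int = 3,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation` until it succeeds or `attempts` is exhausted.

    Waits `delay_seconds` between attempts to give the registry time to
    propagate a freshly pushed manifest.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:
            # Re-raise on the final attempt; otherwise wait and try again.
            if attempt == attempts:
                raise
            sleep(delay_seconds)
    raise AssertionError("unreachable")
```

Passing sleep as a parameter lets the helper be exercised without real delays, for example retry(list_tags, sleep=lambda seconds: None), where list_tags is a hypothetical callable.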

* cleanup wf for push to main

* remove test inputs from workflow_dispatch

* feat: add workflow_dispatch testing inputs and simplify build logic

* refactor: clean up workflow and finalize improvements

* refactor: remove redundant Quay credentials check in merge job