@rhopp
Contributor

@rhopp rhopp commented Oct 7, 2025

No description provided.

@openshift-ci

openshift-ci bot commented Oct 7, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci

openshift-ci bot commented Oct 7, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rhopp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Oct 7, 2025
@coderabbitai

coderabbitai bot commented Oct 7, 2025

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

@rhopp
Contributor Author

rhopp commented Oct 7, 2025

/retest

@abraverm

abraverm commented Oct 7, 2025

@rhopp continue to lurk on your great work 👍

@rhopp
Contributor Author

rhopp commented Oct 9, 2025

/retest

@rhopp rhopp force-pushed the hive-try3 branch 2 times, most recently from 65c5e42 to 0f4afc8 on October 9, 2025 14:17
@rhopp
Contributor Author

rhopp commented Oct 10, 2025

/retest

@rhopp
Contributor Author

rhopp commented Oct 22, 2025

/retest

@rhopp
Contributor Author

rhopp commented Oct 23, 2025

/retest

@rhopp rhopp force-pushed the hive-try3 branch 3 times, most recently from 8a41531 to 8b6a645 on October 31, 2025 14:06
@rhopp
Contributor Author

rhopp commented Nov 4, 2025

/retest

@rhopp rhopp force-pushed the hive-try3 branch 2 times, most recently from 260dc09 to d07c279 on November 6, 2025 16:27
@rhopp
Contributor Author

rhopp commented Nov 7, 2025

/retest

@rhopp rhopp force-pushed the hive-try3 branch 2 times, most recently from a36fdcf to 70ad5ed on November 11, 2025 09:51
@rhopp
Contributor Author

rhopp commented Nov 11, 2025

/retest

rhopp and others added 13 commits November 13, 2025 12:25
Add a 5-minute retry loop (30 attempts with 10-second intervals) to ensure
successful login to the provisioned cluster using kubeadmin credentials. This
handles cases where the cluster API is accessible but authentication may not
be immediately ready.

The retry loop includes proper validation via 'oc whoami' and integrates with
the existing provisioning retry logic.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
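
The retry behaviour described in the commit message above can be sketched roughly as follows. The PR's actual script is not shown in this conversation, so this is only an illustration in Python: `login_with_retry`, `run_command`, and the injectable `run`/`sleep` parameters are hypothetical names, and the login command line is left to the caller.

```python
import subprocess
import time
from typing import Callable, Sequence

# A runner takes an argv list and returns (exit_code, combined_output).
Runner = Callable[[Sequence[str]], tuple[int, str]]


def run_command(argv: Sequence[str]) -> tuple[int, str]:
    """Run a command and return its exit code and combined stdout/stderr."""
    proc = subprocess.run(list(argv), capture_output=True, text=True)
    return proc.returncode, (proc.stdout + proc.stderr).strip()


def login_with_retry(
    login_argv: Sequence[str],
    run: Runner = run_command,
    attempts: int = 30,          # 30 attempts, as in the commit message
    interval_s: float = 10.0,    # 10-second interval, roughly 5 minutes total
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry the login until `oc whoami` confirms the session or attempts run out."""
    for attempt in range(1, attempts + 1):
        code, _ = run(login_argv)
        if code == 0:
            # Validate the session instead of trusting the login exit code alone.
            who_code, who_out = run(["oc", "whoami"])
            if who_code == 0 and who_out:
                print(f"Logged in as {who_out} on attempt {attempt}")
                return True
        if attempt < attempts:
            sleep(interval_s)
    return False
```

Passing `run` and `sleep` as parameters is a choice made for this sketch only; it lets the loop be exercised without a real cluster.
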
The previous implementation had a race condition where 'oc whoami' would
succeed immediately after login but fail moments later when called again.
This caused intermittent authentication failures even though login was
reported as successful.

Changes:
- Add 2-second wait after successful login to allow auth to propagate
- Capture 'oc whoami' output once instead of calling it multiple times
- Add additional verification step with 'oc version' to ensure cluster commands work
- Improve error logging to show exit codes and output for debugging

This should resolve the "Unauthorized" errors that occurred right after
successful login (as seen in lines 399-405 of the previous run logs).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
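
One way to picture the changes listed in the commit message above (settle wait, a single captured `oc whoami`, an extra verification command, and exit codes in the error output) is the sketch below. It is an assumption-laden illustration in Python rather than the PR's script: `login_and_verify` and its parameters are hypothetical, and the verification command defaults to `oc get namespaces`, which a later commit in this PR adopts in place of `oc version`.

```python
import subprocess
import time
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], tuple[int, str]]


def run_command(argv: Sequence[str]) -> tuple[int, str]:
    """Run a command and return its exit code and combined stdout/stderr."""
    proc = subprocess.run(list(argv), capture_output=True, text=True)
    return proc.returncode, (proc.stdout + proc.stderr).strip()


def login_and_verify(
    login_argv: Sequence[str],
    run: Runner = run_command,
    settle_s: float = 2.0,  # 2-second wait after login, as in the commit message
    verify_argv: Sequence[str] = ("oc", "get", "namespaces"),
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[bool, str]:
    """Log in once, wait briefly, then verify; returns (ok, user or error detail)."""
    code, out = run(login_argv)
    if code != 0:
        return False, f"login failed (exit {code}): {out}"

    # Give authentication a moment to propagate before checking it.
    sleep(settle_s)

    # Call `oc whoami` once and reuse its output instead of calling it repeatedly.
    who_code, who_out = run(["oc", "whoami"])
    if who_code != 0 or not who_out:
        return False, f"whoami failed (exit {who_code}): {who_out}"

    # Extra check that cluster commands work with this session.
    ver_code, ver_out = run(list(verify_argv))
    if ver_code != 0:
        return False, f"verification failed (exit {ver_code}): {ver_out}"

    return True, who_out
```

Returning the exit code and output in the error detail mirrors the commit's goal of making failures easier to debug from the logs.
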
The --short flag is not supported by the oc version command (unlike kubectl).
Using 'oc get namespaces' instead provides better verification because:
- It actually requires authentication and cluster access to succeed
- oc version can show client version even without being logged in
- This ensures we're truly authenticated and can access cluster resources

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Signed-off-by: Radim Hopp <[email protected]>
Signed-off-by: Radim Hopp <[email protected]>
Add comprehensive stability monitoring to diagnose intermittent authorization
failures that occur after successful cluster provisioning. This will help
identify if the cluster becomes unstable over time or if there are specific
patterns to the failures.

The observation loop runs for 10 minutes (120 iterations at 5-second intervals)
and tests three critical components:
1. Cluster Operators (oc get co) - validates cluster operator availability
2. Console URL accessibility - ensures the web console remains reachable
3. API Server (oc get namespaces) - verifies authentication and API access

For each test, the script tracks:
- Success/failure counts
- Pattern string showing timeline (e.g., "SSSSSFFFSSSS" where S=success, F=failure)
- Timestamped logs for any failures
- Progress updates every ~100 seconds

This diagnostic data will help determine:
- If failures are sporadic or follow a pattern
- Which component(s) are unstable
- How long it takes for the cluster to stabilize
- Whether the issue is authentication-specific or broader

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
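
The bookkeeping described in the commit message above (per-check success and failure counts, an S/F pattern string, timestamped failure logs, and periodic progress output) might look roughly like this. It is a sketch under assumptions, not the PR's script: `observe`, `CheckStats`, and the idea of passing each check as a Python callable are hypothetical.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CheckStats:
    successes: int = 0
    failures: int = 0
    pattern: str = ""  # timeline, e.g. "SSSSSFFFSSSS"
    failure_log: List[str] = field(default_factory=list)


def observe(
    checks: Dict[str, Callable[[], bool]],
    iterations: int = 120,     # 120 iterations at 5 s is about 10 minutes
    interval_s: float = 5.0,
    progress_every: int = 20,  # 20 iterations at 5 s is about 100 seconds
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, CheckStats]:
    """Run every check once per iteration and record the outcome timeline."""
    stats = {name: CheckStats() for name in checks}
    for i in range(1, iterations + 1):
        for name, check in checks.items():
            try:
                ok = bool(check())
            except Exception:
                ok = False  # a check that raises counts as a failure
            s = stats[name]
            if ok:
                s.successes += 1
                s.pattern += "S"
            else:
                s.failures += 1
                s.pattern += "F"
                stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
                s.failure_log.append(f"{stamp} iteration {i}: {name} failed")
        if i % progress_every == 0:
            summary = ", ".join(
                f"{n}: {st.successes} ok / {st.failures} failed"
                for n, st in stats.items()
            )
            print(f"[{i}/{iterations}] {summary}")
        if i < iterations:
            sleep(interval_s)
    return stats
```

In this sketch, the three components named in the commit would each be supplied as a callable that returns `True` on success, for example one that runs `oc get co` and checks the exit code, one that requests the console URL, and one that runs `oc get namespaces`.
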
Add a new task to collect cluster artifacts when the pipeline fails. This task:
- Runs in the finally section to execute even when other tasks fail
- Only executes when pipeline status is not "Succeeded"
- Logs into the provisioned cluster using the ocp-login-command
- Runs gather-extra.sh script to collect diagnostic information
- Pushes collected artifacts to OCI storage for later analysis

The collected artifacts will help diagnose issues that occur during
test execution, particularly the intermittent authorization failures.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@openshift-merge-robot
Collaborator

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@sonarqubecloud

@rhopp
Contributor Author

rhopp commented Nov 18, 2025

/retest

@konflux-ci-qe-bot

@rhopp: The following test has failed. Say /retest to rerun failed tests.

PipelineRun Name | Status | Rerun command | Build Log | Test Log
e2e-4.19-dp2jk | Failed | /retest | View Pipeline Log | View Test Logs

Inspecting Test Artifacts

To inspect your test artifacts, follow these steps:

  1. Install ORAS (see the ORAS installation guide).
  2. Download artifacts with the following commands:
     mkdir -p oras-artifacts
     cd oras-artifacts
     oras pull quay.io/konflux-test-storage/rhtap-team/rhtap-cli:e2e-4.19-dp2jk

Test results analysis

<not enabled>

OCI Artifact Browser URL

<not enabled>

4 participants