LilyLinh
Contributor

Issue link

Jira

What changes have been made

This PR implements Ray version validation to prevent compatibility issues between the CodeFlare SDK and user-specified runtime images.

Solution: Added automatic Ray version detection and validation during cluster configuration:

  • Version Detection: New utility function extract_ray_version_from_image() parses Ray versions from various image name formats (e.g., ray:2.47.1, quay.io/modh/ray:2.47.1-py311-cu121)
  • Validation Logic: validate_ray_version_compatibility() compares runtime image Ray version against SDK version (2.47.1)
  • Integration: Validation automatically runs during ClusterConfiguration.__post_init__()
  • User Experience: Clear error messages guide users to fix version mismatches; warnings for undetectable versions
  • Documentation: Updated cluster configuration docs with Ray version requirements and best practices
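A minimal sketch of how the detection and validation steps listed above could work. The function names come from this PR, but the regular expression, the `(is_valid, message)` return shape, and the `SDK_RAY_VERSION` constant are assumptions made for illustration; the actual code in `src/codeflare_sdk/common/utils/validation.py` may differ.

```python
import re
from typing import Optional, Tuple

# Assumed constant name; the PR states the SDK's Ray version is 2.47.1.
SDK_RAY_VERSION = "2.47.1"

# Assumed pattern: a three-part version directly after the tag separator ":",
# followed by the end of the string or a suffix such as "-py311-cu121".
_TAG_VERSION = re.compile(r":(\d+\.\d+\.\d+)(?:[-_.]|$)")


def extract_ray_version_from_image(image: Optional[str]) -> Optional[str]:
    """Return the Ray version found in the image tag, or None if undetectable."""
    if not image:
        return None
    match = _TAG_VERSION.search(image)
    return match.group(1) if match else None


def validate_ray_version_compatibility(
    image: Optional[str], sdk_version: str = SDK_RAY_VERSION
) -> Tuple[bool, str]:
    """Compare the image's Ray version with the SDK's (return shape is assumed)."""
    version = extract_ray_version_from_image(image)
    if version is None:
        # Undetectable versions are not treated as errors, only flagged.
        return True, f"Could not detect a Ray version in image '{image}'."
    if version != sdk_version:
        return False, (
            f"Image '{image}' uses Ray {version}, but the SDK expects Ray "
            f"{sdk_version}. Use an image built with Ray {sdk_version}."
        )
    return True, f"Ray versions match ({version})."


print(extract_ray_version_from_image("quay.io/modh/ray:2.47.1-py311-cu121"))
print(validate_ray_version_compatibility("quay.io/modh/ray:2.46.1-py311-cu121"))
```

In this sketch, untagged images and tags such as `latest` fall into the "undetectable" branch, which matches the warning-only behaviour described above.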

Verification steps

Build the SDK and test it in a notebook.

  • In the cluster create-and-configure step, use a runtime image with the matching Ray version (image="quay.io/modh/ray:2.47.1-py311-cu121"): the cluster should be created and come up successfully. With a different Ray version (e.g., image="quay.io/modh/ray:2.46.1-py311-cu121"), an error message should be shown.
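The two cases in this step can be tried without a cluster using a stand-in class. The sketch below is hypothetical: `DemoClusterConfig` is not the SDK's `ClusterConfiguration`, and the `ValueError` exception type and the message wording are assumptions; it only illustrates a check that runs in `__post_init__`.

```python
import re
import warnings
from dataclasses import dataclass

SDK_RAY_VERSION = "2.47.1"  # Ray version stated in this PR


@dataclass
class DemoClusterConfig:
    """Hypothetical stand-in for ClusterConfiguration with only two fields."""

    name: str
    image: str = ""

    def __post_init__(self) -> None:
        if not self.image:
            return  # no custom image given; nothing to check in this sketch
        match = re.search(r":(\d+\.\d+\.\d+)(?:[-_.]|$)", self.image)
        if match is None:
            # Undetectable version: warn instead of failing.
            warnings.warn(f"Cannot detect a Ray version in image '{self.image}'.")
            return
        if match.group(1) != SDK_RAY_VERSION:
            # Assumed exception type; the SDK may raise something else.
            raise ValueError(
                f"Image '{self.image}' uses Ray {match.group(1)}, "
                f"but the SDK expects Ray {SDK_RAY_VERSION}."
            )


# Matching version: the object is constructed without error.
DemoClusterConfig(name="ok", image="quay.io/modh/ray:2.47.1-py311-cu121")

# Mismatched version: construction fails with an explanatory message.
try:
    DemoClusterConfig(name="bad", image="quay.io/modh/ray:2.46.1-py311-cu121")
except ValueError as err:
    print(err)
```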

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • Testing is not required for this change

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 13, 2025
@LilyLinh LilyLinh requested a review from kryanbeane August 13, 2025 09:48

codecov bot commented Aug 13, 2025

Codecov Report

❌ Patch coverage is 91.22807% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.57%. Comparing base (5a77f7b) to head (8761510).
⚠️ Report is 3 commits behind head on ray-jobs-feature.

Files with missing lines                        Patch %   Lines
src/codeflare_sdk/common/utils/validation.py    85.71%    5 Missing ⚠️
Additional details and impacted files
@@                 Coverage Diff                  @@
##           ray-jobs-feature     #881      +/-   ##
====================================================
- Coverage             93.65%   93.57%   -0.08%     
====================================================
  Files                    20       21       +1     
  Lines                  1717     1774      +57     
====================================================
+ Hits                   1608     1660      +52     
- Misses                  109      114       +5     

@LilyLinh LilyLinh changed the base branch from main to ray-jobs-feature August 13, 2025 10:01
Contributor

@kryanbeane kryanbeane left a comment

Only small changes but this is great work @LilyLinh well done :)

@LilyLinh LilyLinh force-pushed the RHOAIENG-29330 branch 2 times, most recently from 6d8b730 to 7327753 Compare August 25, 2025 10:18
@LilyLinh LilyLinh marked this pull request as ready for review August 25, 2025 11:38
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 25, 2025
@kryanbeane
Contributor

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 26, 2025
Contributor

@kryanbeane kryanbeane left a comment

Great thanks for making those changes @LilyLinh ! Fantastic work :) lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 26, 2025
Contributor

openshift-ci bot commented Aug 26, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kryanbeane

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 26, 2025
@kryanbeane
Contributor

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 26, 2025
@openshift-merge-bot openshift-merge-bot bot merged commit 3489a6b into project-codeflare:ray-jobs-feature Aug 26, 2025
12 of 13 checks passed