
Updating Daft links in Ray documentation #54328


Closed
wants to merge 87 commits into from

Conversation


@ccmao1130 ccmao1130 commented Jul 3, 2025

Why are these changes needed?

Want to update Daft links, messaging, and logo across Ray documentation

Related issue number

n/a

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@ccmao1130 ccmao1130 force-pushed the patch-1 branch 4 times, most recently from 7b2424c to 72c6bd2 Compare July 3, 2025 22:43
@ccmao1130 ccmao1130 marked this pull request as ready for review July 3, 2025 23:34
@ccmao1130 ccmao1130 requested review from a team as code owners July 3, 2025 23:34
@ccmao1130 ccmao1130 force-pushed the patch-1 branch 2 times, most recently from 7d67544 to 39a2b33 Compare July 4, 2025 00:45
Contributor

@dstrodtman dstrodtman left a comment


lgtm from the docs side.

@cszhu cszhu added docs An issue or change related to documentation data Ray Data-related issues community-contribution Contributed by the community labels Jul 8, 2025
@omatthew98 omatthew98 self-requested a review July 17, 2025 20:33
Contributor

@omatthew98 omatthew98 left a comment


Looks like the microcheck failed because of the updated pip dependency? https://buildkite.com/ray-project/microcheck/builds/19843#0197f08b-3255-4fd3-b8b1-0c756dd4c467. Probably just need to recompile / update requirements_compiled.txt: requirements_compiled.txt is not up to date. Please download it from Artifacts tab and git push the changes.

@omatthew98 omatthew98 added the @external-author-action-required Alternate tag for PRs where the author doesn't have labeling permission. label Jul 17, 2025
aslonnie and others added 12 commits July 28, 2025 21:25
"Dask on Ray" (DoR) is broken in dask==2024.11.0 or later, as reported in
ray-project#48689, because Dask removed a
private function in dask/dask#11378 that DoR had
been relying on. Beyond dask/dask#11378, Dask
has also migrated its task data structure to a new format (the high-level
motivation is described in dask/dask#9969).
Since this migration spans a series of changes between 2024.11.0 and
2025.1.0, it's not realistic to copy what was removed and paste it
into Ray.

This change adapts Dask on Ray to the new task format so that it keeps working.
The change is compatible only with
`dask>=2024.11.0,<2025.1.0` because Dask made another major change in
2025.1.0, breaking the shuffle optimization introduced in
ray-project#13951.
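
As a rough illustration of the supported range stated above, here is a minimal sketch of a version gate. The helper names `parse_version` and `dask_version_supported` are hypothetical and are not part of Ray or Dask; the parsing is deliberately simplified and only reads the leading digits of each component.

```python
def parse_version(text):
    """Turn a string such as '2024.11.2' into a tuple such as (2024, 11, 2).

    Simplified on purpose: only the leading digits of the first three
    dot-separated components are used; missing components count as 0.
    """
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def dask_version_supported(version):
    """Return True when version falls in the range dask>=2024.11.0,<2025.1.0."""
    return (2024, 11, 0) <= parse_version(version) < (2025, 1, 0)


if __name__ == "__main__":
    for candidate in ["2024.10.0", "2024.11.0", "2024.12.1", "2025.1.0"]:
        print(candidate, dask_version_supported(candidate))
```

A real check would more likely rely on a proper version parser; tuple comparison is used here only to keep the sketch dependency-free.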

Signed-off-by: Lonnie Liu <[email protected]>
Co-authored-by: Hiromu Hota <[email protected]>
Signed-off-by: ChanChan Mao <[email protected]>
…ct#54312)

The semaphore is clever, but using signal instead for consistency with
other tests.

---------

Signed-off-by: Edward Oakes <[email protected]>
Signed-off-by: ChanChan Mao <[email protected]>
ray-project#53999 resulted in this test
being flaky on mac. This test's purpose seems to be similar to
https://github.com/ray-project/ray/blob/986115ce566fda437c5e3fcca3705c225b06f3b8/python/ray/tests/test_streaming_generator_4.py#L73
and was kind of trying to test a feature that didn't exist. But since it
wasn't actually pausing the generator for backpressure, the generator
would usually finish before the node removal actually happens. Now
sometimes when the node removal happens before the generator finishes,
we'll lose objects and go through the new path. We could also go through
the new resubmission path multiple times for one node death because
multiple objects from the same generator may be marked lost. Therefore,
sometimes we run out of retries before getting to the third retry in the
test and it fails with
`ray.exceptions.RayTaskError(ObjectReconstructionFailedMaxAttemptsExceededError)`

The fix to make this not flaky would be to do the follow-up listed in
the previous PR.
> Currently, if multiple objects from the same generator are queued up
to be recovered when the recovery periodical runner runs, we could
resubmit for the first object and then once again queue up a resubmit
for the second if argument resolution and sequence numbering lines up.
Since this doesn't actually affect correctness and requires a bit of
refactoring, it'll be in a follow-up PR.

---------

Signed-off-by: dayshah <[email protected]>
Signed-off-by: ChanChan Mao <[email protected]>
…ay 1.4.0 (ray-project#53943)

but that if autoscaling is used, the autoscaler image must have Ray
2.45.0 or later.

closes ray-project/kuberay#3580

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [x] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

Signed-off-by: David Xia <[email protected]>
Co-authored-by: angelinalg <[email protected]>
Co-authored-by: Dhyey Shah <[email protected]>
Co-authored-by: Kai-Hsun Chen <[email protected]>
Signed-off-by: ChanChan Mao <[email protected]>
## Problem

This check assumes that if the actor is not in `registered_actors_` it
must be in `actor_to_register_callbacks_`.
https://github.com/ray-project/ray/blob/c6d7d7eaa1e4dd0fd42ba45891d8501ab14ceb44/src/ray/gcs/gcs_server/gcs_actor_manager.cc#L871-L872

This isn't true because in `DestroyActor`, the actor is always removed
from `actor_to_register_callbacks_`,

https://github.com/ray-project/ray/blob/c6d7d7eaa1e4dd0fd42ba45891d8501ab14ceb44/src/ray/gcs/gcs_server/gcs_actor_manager.cc#L1073
but only removed from `registered_actors_` if the actor is restartable.

https://github.com/ray-project/ray/blob/c6d7d7eaa1e4dd0fd42ba45891d8501ab14ceb44/src/ray/gcs/gcs_server/gcs_actor_manager.cc#L1143-L1146

It's also erasing from `actor_to_register_callbacks_` without actually
calling the callbacks, so RPCs from the core worker side could be left
hanging forever.

## Fix
The fix is to never erase from `actor_to_register_callbacks_` in
`DestroyActor`, and to always respond to all the queued callbacks
with the appropriate status in the Put callback. The logic for which
status to respond with is the same. If the actor is in `registered_actors_`
after the table put is done, the RPC should be completed with the OK
status. If it's not there by the time the table put is done, the RPC
should be answered with SchedulingCancelled.
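
The rule described above can be sketched as a small Python model. This is not the C++ code in `gcs_actor_manager.cc`; the class `ActorRegistryModel`, its method names, and the `Status` enum are hypothetical stand-ins, and restartability is ignored for brevity.

```python
from enum import Enum


class Status(Enum):
    OK = "ok"
    SCHEDULING_CANCELLED = "scheduling_cancelled"


class ActorRegistryModel:
    """Toy model of the registration bookkeeping described in the fix."""

    def __init__(self):
        self.registered_actors = set()
        self.register_callbacks = {}  # actor_id -> list of queued callbacks

    def register_actor(self, actor_id, callback):
        # Registration is recorded and the reply is queued until the
        # (simulated) table put completes.
        self.registered_actors.add(actor_id)
        self.register_callbacks.setdefault(actor_id, []).append(callback)

    def destroy_actor(self, actor_id):
        # Key point of the fix: queued callbacks are NOT erased here,
        # so nobody is left waiting without a reply.
        self.registered_actors.discard(actor_id)

    def on_table_put_done(self, actor_id):
        # Every queued callback gets a reply; the status depends on whether
        # the actor is still registered at this point.
        if actor_id in self.registered_actors:
            status = Status.OK
        else:
            status = Status.SCHEDULING_CANCELLED
        for callback in self.register_callbacks.pop(actor_id, []):
            callback(status)


if __name__ == "__main__":
    model = ActorRegistryModel()
    replies = []
    model.register_actor("a", replies.append)
    model.register_actor("b", replies.append)
    model.destroy_actor("b")  # destroyed before its table put finishes
    model.on_table_put_done("a")
    model.on_table_put_done("b")
    print(replies)
```

In this model a destroyed actor still produces exactly one reply per queued callback, which is the property the fix is after.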

Additional changes
- removed the need to pass the actor ptr into the RegisterActor callback
because it was unused
- removed an accessor that was only used for tests, added it to the test
fixture and turned the test fixture into a friend
- add logic in the accessor to read both the gRPC status and the
GcsStatus; see the comment here:
ray-project#53634 (comment)

### Follow ups
- Fix the mess noted here
ray-project#53634 (comment)
- There are almost surely more issues lurking here due to actor management
split brain + kv operation ordering assumptions. Needs to be
investigated.
- The actor state transition machine here needs to be clearer, e.g.
actors shouldn't be put into registered_actors_ if they're not
registered yet.

---------

Signed-off-by: dayshah <[email protected]>
Signed-off-by: ChanChan Mao <[email protected]>
Move `httpx` out of `test_utils` because for some reason it is not
available in the image used for `test_runtime_env_container`.

Signed-off-by: Cindy Zhang <[email protected]>
Signed-off-by: ChanChan Mao <[email protected]>
Signed-off-by: ccmao1130 <[email protected]>
Signed-off-by: ChanChan Mao <[email protected]>
Signed-off-by: ChanChan Mao <[email protected]>
Signed-off-by: ChanChan Mao <[email protected]>
Signed-off-by: ChanChan Mao <[email protected]>
Signed-off-by: ChanChan Mao <[email protected]>
…t#54317)

This PR is smaller than it looks.

The `TaskManager` class currently exposes two interfaces: `TaskFinisher`
and `TaskResubmission`. While these interfaces are well-intentioned,
they are only implemented by `TaskManager` itself, and the methods they
define are not fully independent. As a result, it’s unlikely that these
interfaces could be meaningfully separated or implemented in isolation.

This change consolidates them into a single `TaskManager` interface,
which can be reused where needed. The goal is to reduce the number of
concepts and components required to reason about the Ray core, and to
simplify the overall design.

Test:
- CI

Signed-off-by: Cuong Nguyen <[email protected]>
Signed-off-by: ChanChan Mao <[email protected]>
elliot-barn and others added 12 commits July 28, 2025 21:25
Adding uv binary to be used in CI

---------

Signed-off-by: elliot-barn <[email protected]>
Signed-off-by: ChanChan Mao <[email protected]>
Signed-off-by: Linkun <[email protected]>
Signed-off-by: Linkun Chen <[email protected]>
Signed-off-by: ChanChan Mao <[email protected]>
…and task detail pages (ray-project#54292)

Follow-up to ray-project#53423

Missed a few places in the UI.
Also updates placement group tables to use the same code preview
component as the actor and tasks tables.

Placement group table
![Screenshot 2025-07-02 at 4 00 53 PM](https://github.com/user-attachments/assets/8de97470-abda-4680-b2fb-a4f90add0063)
![Screenshot 2025-07-02 at 4 00 56 PM](https://github.com/user-attachments/assets/a3c37e6f-c9db-4b37-b873-a5fbbd3012d7)

Actor detail
![Screenshot 2025-07-02 at 4 01 05 PM](https://github.com/user-attachments/assets/839cdaea-b441-4380-9c77-4ccb4ebfe563)

Task detail
![Screenshot 2025-07-02 at 4 01 19 PM](https://github.com/user-attachments/assets/aa12461c-a192-4114-a7fd-824613b9c6e6)

---------

Signed-off-by: Alan Guo <[email protected]>
Signed-off-by: ChanChan Mao <[email protected]>
See inline comments for each.

---------

Signed-off-by: Edward Oakes <[email protected]>
Signed-off-by: ChanChan Mao <[email protected]>
…#54413)

Found that we pass by value in the cluster task manager constructor; use
move to avoid an unnecessary copy.

Signed-off-by: You-Cheng Lin (Owen) <[email protected]>
Signed-off-by: ChanChan Mao <[email protected]>
Fixes: ray-project#53478

- migrating check_library_usage_telemetry from `_private` to `_common`
- migrating TelemetryCallsite from `_private` to `_common`.

Signed-off-by: ChanChan Mao <[email protected]>
…Y_enable_autoscaler_v2 for ray up (ray-project#54456)

Use a different env var for ray up to enable autoscaler v2 to avoid
accidentally enabling v2 due to env inheritance.
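
To illustrate the inheritance problem, here is a minimal sketch. The variable names `EXAMPLE_ENABLE_V2` and `EXAMPLE_LAUNCHER_ENABLE_V2` and both helper functions are hypothetical; they are not the variables Ray reads. The sketch only shows that a child process sees whatever its parent exported, and that a launcher reading a dedicated variable is not affected by the shared one.

```python
import os
import subprocess
import sys

# Hypothetical names, for illustration only.
SHARED_FLAG = "EXAMPLE_ENABLE_V2"             # meant for cluster processes
LAUNCHER_FLAG = "EXAMPLE_LAUNCHER_ENABLE_V2"  # read only by the launcher CLI


def read_flag_in_child(name, parent_env):
    """Spawn a child Python process and return the value it sees for name."""
    code = f"import os; print(os.environ.get({name!r}, 'unset'))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=parent_env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def launcher_uses_v2(env):
    """The launcher consults only its dedicated flag, never the shared one."""
    return env.get(LAUNCHER_FLAG) == "1"


if __name__ == "__main__":
    env = dict(os.environ)
    env.pop(LAUNCHER_FLAG, None)
    env[SHARED_FLAG] = "1"  # exported earlier for an unrelated reason
    print("child sees shared flag:", read_flag_in_child(SHARED_FLAG, env))
    print("launcher enables v2:", launcher_uses_v2(env))
```

With a single shared variable, any launcher started from that shell would pick the flag up by inheritance; a separate variable makes enabling v2 an explicit choice.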

Signed-off-by: Rueian <[email protected]>
Signed-off-by: ChanChan Mao <[email protected]>
…ation (ray-project#53647)

## Why are these changes needed?

- Currently, Serve cannot catch multiple FastAPI deployments in a
single application if the user sets the docs path to None in their FastAPI
app.
- We can check for multiple ASGIAppReplicaWrapper in a single application to
avoid this issue.

## Related issue number

Closes ray-project#53024

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [x] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [x] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Ziy1-Tan <[email protected]>
Signed-off-by: ChanChan Mao <[email protected]>
…t#54458)

Deflake
`test_autoscaling_policy_with_metr_disab.py::TestAutoscalingMetrics::test_basic`

When `RAY_SERVE_COLLECT_AUTOSCALING_METRICS_ON_HANDLE=0`, we collect
ongoing request metrics at the replica and queued request metrics at the
handle. Ongoing request metrics are updated very quickly, while queued
metrics are sent only every 10s. Because of this delay, almost every
request is double counted until the queued request metrics are flushed,
so the total number of ongoing requests climbs to almost 100.

Example:
https://buildkite.com/ray-project/postmerge/builds/11322#0197eaca-62e1-457d-947b-a981210e98b9/177-852
Note that we are sending exactly 50 requests and expect the number of
replicas to scale to exactly 5. However the metrics grow above 50 here,
almost to 100, which causes the test to be flaky / fail.

This PR sets the env var `RAY_SERVE_HANDLE_METRIC_PUSH_INTERVAL_S=0.1`
and pairs with other stabilizing changes.
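
The double counting can be reproduced with a toy model. This is only a sketch under stated assumptions, not Serve's metrics code: the function `peak_reported_total` is hypothetical, requests are assumed to move from the handle queue to replicas at a steady rate over one second, ongoing counts are assumed to be visible immediately, and the queued count is assumed to be refreshed only at each push.

```python
def peak_reported_total(num_requests, dispatch_ms, push_interval_ms):
    """Return the largest (ongoing + last pushed queued) value seen.

    Time advances in whole milliseconds. Requests leave the queue at a
    steady rate and are all dispatched after dispatch_ms.
    """
    peak = 0
    last_pushed_queued = num_requests  # the handle pushed once at t=0
    for t in range(dispatch_ms + 1):
        ongoing = num_requests * t // dispatch_ms
        queued = num_requests - ongoing
        if t > 0 and t % push_interval_ms == 0:
            last_pushed_queued = queued  # the handle refreshes its report
        peak = max(peak, ongoing + last_pushed_queued)
    return peak


if __name__ == "__main__":
    # 50 requests, dispatched over an assumed 1 second.
    print("10s push interval: ", peak_reported_total(50, 1000, 10_000))
    print("0.1s push interval:", peak_reported_total(50, 1000, 100))
```

In this model the stale queued count is added to the fresh ongoing count, so a long push interval lets the reported total approach twice the real number of requests, while a short interval keeps it close to the real number.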

Signed-off-by: Cindy Zhang <[email protected]>
Signed-off-by: ChanChan Mao <[email protected]>
This PR replaces some of the manual string literals of urls within
`test_api`, `test_deploy`, `test_deploy_2`, `test_deploy_app`,
`test_failure` with `get_application_urls` and splits some of the tests
into separate files.

---------

Signed-off-by: doyoung <[email protected]>
Signed-off-by: Alexey Kudinkin <[email protected]>
Signed-off-by: Linkun <[email protected]>
Signed-off-by: Kourosh Hakhamaneshi <[email protected]>
Signed-off-by: Kevin H. Luu <[email protected]>
Signed-off-by: kevin <[email protected]>
Signed-off-by: elliot-barn <[email protected]>
Signed-off-by: Lonnie Liu <[email protected]>
Signed-off-by: dayshah <[email protected]>
Signed-off-by: You-Cheng Lin (Owen) <[email protected]>
Signed-off-by: Edward Oakes <[email protected]>
Signed-off-by: Srinath Krishnamachari <[email protected]>
Signed-off-by: srinathk10 <[email protected]>
Signed-off-by: noemotiovon <[email protected]>
Signed-off-by: Ryan O'Leary <[email protected]>
Signed-off-by: Cindy Zhang <[email protected]>
Signed-off-by: Hao Chen <[email protected]>
Signed-off-by: Timothy Seah <[email protected]>
Signed-off-by: Timothy Seah <[email protected]>
Signed-off-by: Vignesh Hirudayakanth <[email protected]>
Signed-off-by: Balaji Veeramani <[email protected]>
Co-authored-by: Alexey Kudinkin <[email protected]>
Co-authored-by: Linkun <[email protected]>
Co-authored-by: kourosh hakhamaneshi <[email protected]>
Co-authored-by: Qiaolin Yu <[email protected]>
Co-authored-by: Kevin H. Luu <[email protected]>
Co-authored-by: Elliot Barnwell <[email protected]>
Co-authored-by: Lonnie Liu <[email protected]>
Co-authored-by: harshit-anyscale <[email protected]>
Co-authored-by: Dhyey Shah <[email protected]>
Co-authored-by: Owen Lin (You-Cheng Lin) <[email protected]>
Co-authored-by: Edward Oakes <[email protected]>
Co-authored-by: srinathk10 <[email protected]>
Co-authored-by: Chenguang Li <[email protected]>
Co-authored-by: Ryan O'Leary <[email protected]>
Co-authored-by: Kai-Hsun Chen <[email protected]>
Co-authored-by: Cindy Zhang <[email protected]>
Co-authored-by: Hao Chen <[email protected]>
Co-authored-by: Timothy Seah <[email protected]>
Co-authored-by: Timothy Seah <[email protected]>
Co-authored-by: Justin Yu <[email protected]>
Co-authored-by: Vignesh Hirudayakanth <[email protected]>
Co-authored-by: Balaji Veeramani <[email protected]>
Signed-off-by: ChanChan Mao <[email protected]>
Signed-off-by: ChanChan Mao <[email protected]>
@richardliaw
Contributor

hm looks like the commit history got messed up; open a new PR?

@ccmao1130
Author

omg i know.. sorry i'm not a developer let me redo this 😭

@ccmao1130 ccmao1130 closed this Jul 29, 2025
@ccmao1130 ccmao1130 deleted the patch-1 branch July 29, 2025 04:35
Labels
community-contribution Contributed by the community data Ray Data-related issues docs An issue or change related to documentation @external-author-action-required Alternate tag for PRs where the author doesn't have labeling permission.