kenchung285
Contributor

@kenchung285 kenchung285 commented Jul 15, 2025

Why are these changes needed?

Add retry logic and a timeout to apiserver v2:

  1. The timeout is set to a long value (30 seconds) so that no request runs unbounded.
  2. The retry logic checks whether an HTTP status code is retryable.

Wrap errors with more information.
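
As an illustration of the three points above, here is a minimal sketch, not the code from this PR. The names `isRetryableStatus`, `doWithRetry`, and `runDemo`, the exact set of retryable status codes, the fixed backoff, and the attempt count are all assumptions made for the example; only the 30-second timeout comes from the description. It assumes `maxAttempts` is at least 1.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"time"
)

// isRetryableStatus is a hypothetical helper. The set of codes below is an
// assumption for this sketch, not the list used in the PR.
func isRetryableStatus(code int) bool {
	switch code {
	case http.StatusRequestTimeout,
		http.StatusTooManyRequests,
		http.StatusBadGateway,
		http.StatusServiceUnavailable,
		http.StatusGatewayTimeout:
		return true
	}
	return false
}

// doWithRetry calls fn until it returns a non-retryable status, the attempts
// run out, or ctx is done. Errors are wrapped with the attempt number.
func doWithRetry(ctx context.Context, maxAttempts int, backoff time.Duration,
	fn func(ctx context.Context) (int, error)) error {
	var lastErr error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if attempt > 0 {
			// Wait before retrying, but stop early if the context ends.
			select {
			case <-ctx.Done():
				return fmt.Errorf("cancelled during backoff before attempt %d: %w", attempt+1, ctx.Err())
			case <-time.After(backoff):
			}
		}
		code, err := fn(ctx)
		switch {
		case err != nil:
			lastErr = fmt.Errorf("attempt %d: %w", attempt+1, err)
		case isRetryableStatus(code):
			lastErr = fmt.Errorf("attempt %d: retryable status %d", attempt+1, code)
		case code >= 400:
			return fmt.Errorf("attempt %d: non-retryable status %d", attempt+1, code)
		default:
			return nil
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, lastErr)
}

// runDemo simulates a backend that returns 503 twice and then 200.
func runDemo() (int, error) {
	// Overall bound of 30 seconds, matching the timeout in the description.
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	calls := 0
	err := doWithRetry(ctx, 3, 10*time.Millisecond, func(context.Context) (int, error) {
		calls++
		if calls < 3 {
			return http.StatusServiceUnavailable, nil
		}
		return http.StatusOK, nil
	})
	return calls, err
}

func main() {
	calls, err := runDemo()
	fmt.Println(calls, err) // 3 <nil>
}
```

Wrapping with `%w` keeps the underlying error available to `errors.Is` and `errors.As` while adding the attempt number to the message.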

Related issue number

Closes #3606

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

@kenchung285 kenchung285 marked this pull request as ready for review July 17, 2025 16:57
@kenchung285
Contributor Author

@dentiny PTAL, thanks!

@kenchung285 kenchung285 requested a review from dentiny July 18, 2025 10:04
Contributor

@dentiny dentiny left a comment

I don't see any unit tests. How did you test it?

@kenchung285
Contributor Author

I don't see any unit tests. How did you test it?

Currently my implementation logs the error message to stdout, so I tested it manually by deploying an apiserver.

I tested with the following steps:

cd apiserver
make start-local-apiserver

# An invalid path
curl http://localhost:31888/apis/ray.io/v1/namespaces/ray-system/invalid/path

# See stdout logs of the apiserver pod
kubectl logs -n ray-system pod/kuberay-apiserver-85744c5487-8csl4

And what I got is:
[Screenshot 2025-07-18 at 8:32:43 PM]

@kenchung285
Contributor Author

However, this may not work in cases where we wrap the error message into an error.

@kenchung285 kenchung285 changed the title from "[apiserver]: Add retry and timeout to apiserver V2" to "[apiserver] Add retry and timeout to apiserver V2" Jul 18, 2025
@kenchung285 kenchung285 requested a review from dentiny July 18, 2025 14:33
@kenchung285
Contributor Author

@dentiny PTAL, thanks!

@kenchung285 kenchung285 requested review from rueian and dentiny July 23, 2025 16:50
@kenchung285
Contributor Author

@dentiny @rueian, PTAL, thanks!

Contributor

@dentiny dentiny left a comment

Thanks!

Contributor

@fscnick fscnick left a comment

LGTM

@kenchung285 kenchung285 requested a review from rueian July 26, 2025 04:12

if attempt > 0 && req.GetBody != nil {
	var bodyCopy io.ReadCloser
	bodyCopy, err = req.GetBody()
Contributor

Do we still need to use req.GetBody()?

Contributor Author

@kenchung285 kenchung285 Jul 27, 2025

Because req.Body in Go is a one-shot stream that can only be read once, we need to regenerate a new body for each retry with req.GetBody().
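
The one-shot behaviour described above can be sketched as follows. This is an illustrative example rather than code from the PR: `readTwice` is a hypothetical helper and the URL is a placeholder that is never contacted. It relies on the documented behaviour that `http.NewRequest` populates `GetBody` when the body is a `*bytes.Reader`, `*bytes.Buffer`, or `*strings.Reader`.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

// readTwice is a hypothetical helper. It reads the request body, reads it
// again to show that the stream is drained, then restores it via GetBody.
func readTwice(payload string) (first, drained, restored string, err error) {
	// No request is sent; the URL is a placeholder.
	req, err := http.NewRequest(http.MethodPost, "http://example.invalid/", bytes.NewReader([]byte(payload)))
	if err != nil {
		return "", "", "", fmt.Errorf("build request: %w", err)
	}

	b1, err := io.ReadAll(req.Body) // consumes the stream
	if err != nil {
		return "", "", "", fmt.Errorf("first read: %w", err)
	}
	b2, err := io.ReadAll(req.Body) // nothing left to read
	if err != nil {
		return "", "", "", fmt.Errorf("second read: %w", err)
	}

	// GetBody returns a fresh copy of the original body.
	body, err := req.GetBody()
	if err != nil {
		return "", "", "", fmt.Errorf("get body: %w", err)
	}
	req.Body = body
	b3, err := io.ReadAll(req.Body)
	if err != nil {
		return "", "", "", fmt.Errorf("read restored body: %w", err)
	}
	return string(b1), string(b2), string(b3), nil
}

func main() {
	first, drained, restored, err := readTwice("hello")
	fmt.Printf("%q %q %q %v\n", first, drained, restored, err)
}
```

Note that `GetBody` is nil when the request was built from an opaque `io.Reader`, which is why retry code usually checks it before calling.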

Contributor

I think you can just use req.Body = io.NopCloser(bytes.NewReader(bodyBytes)) once you move bodyBytes, err = io.ReadAll(req.Body) out of the loop.

…ng cancelled during backoff

Signed-off-by: Cheng-Yeh Chung <[email protected]>
@kenchung285 kenchung285 requested a review from rueian July 27, 2025 15:39
Comment on lines 131 to 133
if attempt > 0 && req.GetBody != nil {
	req.Body = io.NopCloser(bytes.NewReader(bodyBytes))
}
Contributor

Suggested change
- if attempt > 0 && req.GetBody != nil {
- 	req.Body = io.NopCloser(bytes.NewReader(bodyBytes))
- }
+ if bodyBytes != nil {
+ 	req.Body = io.NopCloser(bytes.NewReader(bodyBytes))
+ }

Could you add a test that uses req.Body as well?
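
A test along the lines requested above could be sketched like this. It is an assumption-laden illustration, not the test added in the PR: `sendWithBufferedBody` and `runDemo` are hypothetical names, retrying only on 503 is a simplification, and the request body is deliberately an opaque reader so that `req.GetBody` is nil and only `req.Body` is available.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// sendWithBufferedBody is a hypothetical helper. It reads the body once,
// outside the loop, and gives every attempt a fresh reader over those bytes.
func sendWithBufferedBody(client *http.Client, req *http.Request, maxAttempts int) (*http.Response, error) {
	var bodyBytes []byte
	if req.Body != nil {
		var err error
		bodyBytes, err = io.ReadAll(req.Body)
		if err != nil {
			return nil, fmt.Errorf("read request body: %w", err)
		}
		req.Body.Close()
	}
	lastStatus := 0
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if bodyBytes != nil {
			req.Body = io.NopCloser(bytes.NewReader(bodyBytes))
		}
		resp, err := client.Do(req)
		if err != nil {
			return nil, fmt.Errorf("attempt %d: %w", attempt+1, err)
		}
		// Simplification for this sketch: only 503 is treated as retryable.
		if resp.StatusCode != http.StatusServiceUnavailable {
			return resp, nil
		}
		lastStatus = resp.StatusCode
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
	}
	return nil, fmt.Errorf("giving up after %d attempts, last status %d", maxAttempts, lastStatus)
}

// runDemo starts a local test server that fails the first request with 503
// and records the body it received on every request.
func runDemo() ([]string, int, error) {
	var mu sync.Mutex
	var seen []string
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		b, _ := io.ReadAll(r.Body)
		mu.Lock()
		seen = append(seen, string(b))
		first := len(seen) == 1
		mu.Unlock()
		if first {
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	// An opaque reader leaves req.GetBody nil, so only req.Body is usable.
	req, err := http.NewRequest(http.MethodPost, srv.URL, io.NopCloser(strings.NewReader("payload")))
	if err != nil {
		return nil, 0, fmt.Errorf("build request: %w", err)
	}
	resp, err := sendWithBufferedBody(srv.Client(), req, 3)
	if err != nil {
		return nil, 0, err
	}
	defer resp.Body.Close()

	mu.Lock()
	defer mu.Unlock()
	return append([]string(nil), seen...), resp.StatusCode, nil
}

func main() {
	seen, status, err := runDemo()
	fmt.Println(seen, status, err)
}
```

Checking `bodyBytes != nil` rather than `req.GetBody != nil` means the body is restored on every attempt, including for requests whose `GetBody` was never set.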

kenchung285 and others added 3 commits July 28, 2025 00:30
Co-authored-by: Rueian <[email protected]>
Signed-off-by: Cheng-Yeh Chung <[email protected]>
Co-authored-by: Rueian <[email protected]>
Signed-off-by: Cheng-Yeh Chung <[email protected]>
@rueian
Contributor

rueian commented Jul 27, 2025

Thanks @kenchung285!

@rueian rueian merged commit 4c46d56 into ray-project:master Jul 27, 2025
25 checks passed
Development

Successfully merging this pull request may close these issues.

[Feature] [apiserver] Add timeout and retry for apiserver v2
7 participants