[ML] Flag updates from Inference #131725


Status: Open. Wants to merge 6 commits into `main`.
Conversation

@prwhelan (Member) commented Jul 22, 2025

Flag updates from Inference so Serverless can detect them.
Swap tests to set adaptive allocations rather than num allocations to pass in serverless.

@prwhelan added labels on Jul 22, 2025: >test (Issues or PRs that are addressing/adding tests), :ml (Machine learning), Team:ML (Meta label for the ML team), v9.2.0
@elasticsearchmachine added label on Jul 22, 2025: serverless-linked (Added by automation, don't add manually)
@prwhelan marked this pull request as ready for review July 22, 2025 18:55
@elasticsearchmachine (Collaborator) commented:

Pinging @elastic/ml-core (Team:ML)

@jan-elastic (Contributor) left a comment:

LGTM

@prwhelan changed the title from "[ML] Use adaptive allocations in test" to "[ML] Flag updates from Inference" on Jul 23, 2025
@prwhelan prwhelan requested a review from jan-elastic July 23, 2025 20:54

```java
public void setFromInference(boolean fromInference) {
    this.fromInference = fromInference;
    this.isInternal = fromInference;
```

(Contributor) commented:

It looks confusing that setFromInference also sets isInternal.

```diff
@@ -27,6 +27,7 @@
 import java.io.IOException;
 import java.util.Objects;
```

(Contributor) commented:

I'm missing a bit of context: why do we need to distinguish between these cases?

(Contributor) commented:

Is there a corresponding Serverless PR?

@prwhelan (Member, Author) replied:

Yeah, let me ping you with the internal documentation.

@prwhelan (Member, Author) replied:

> I'm missing a bit of context: why do we need to distinguish between these cases?

We need to allow updates to num_allocations in serverless that originate from the AdaptiveAllocationsScalerService (ADAPTIVE_ALLOCATIONS), but we want to disallow updates from users (API and INFERENCE). The only alternative I thought of was refactoring AdaptiveAllocationsScalerService to update directly rather than through the API, but that felt more intrusive.
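The source split described above can be sketched as a small enum plus two predicates. This is an illustrative sketch only: the names `SourceDemo`, `allowedInServerless`, and `isInternal` are assumptions for demonstration, not the actual Elasticsearch code.

```java
// Hypothetical sketch of the update-source distinction discussed in this PR.
// An allocations update can originate from the REST API, the inference update
// API, or the adaptive allocations autoscaler.
public class SourceDemo {
    public enum Source { API, INFERENCE, ADAPTIVE_ALLOCATIONS }

    // Serverless should only accept num_allocations updates that come from
    // the adaptive allocations scaler, never from users (API or INFERENCE).
    public static boolean allowedInServerless(Source source) {
        return source == Source.ADAPTIVE_ALLOCATIONS;
    }

    // Preserves the old boolean semantics: both the inference update API and
    // the autoscaler were previously flagged as "internal" (boolean true).
    public static boolean isInternal(Source source) {
        return source == Source.INFERENCE || source == Source.ADAPTIVE_ALLOCATIONS;
    }

    public static void main(String[] args) {
        System.out.println(allowedInServerless(Source.ADAPTIVE_ALLOCATIONS)); // true
        System.out.println(allowedInServerless(Source.INFERENCE));            // false
        System.out.println(isInternal(Source.API));                          // false
    }
}
```

The key point is that the two predicates partition the enum differently: serverless gating needs the three-way distinction, while the legacy "internal" flag only ever saw two buckets.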

```java
// we changed over from a boolean to an enum
// when it was a boolean, true came from adaptive allocations and false came from the rest api
// treat "inference" as if it came from the api
out.writeBoolean(isInternal());
```

(Contributor) commented:

Do we need to determine if source == Source.ADAPTIVE_ALLOCATIONS here, since this will return true for Source.INFERENCE as well?

@prwhelan (Member, Author) replied:

Previously, we set the boolean to true if the source was either the inference update API or the adaptive allocations autoscaler. out.writeBoolean(isInternal()) preserves this logic (I think). It means the stream reader will think an inference API call is an adaptive allocations call, but that only affects serverless, which is only a mixed cluster during a rolling update.
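The backward-compatibility trade-off in that reply can be made concrete with a round-trip sketch. This is a hedged illustration, not the real serialization code: `toLegacyBoolean` and `fromLegacyBoolean` are hypothetical names standing in for the old-wire-format write and read paths.

```java
// Sketch of the legacy-boolean flattening discussed above. Writing to an
// older node collapses the enum to the old boolean; reading it back cannot
// distinguish INFERENCE from ADAPTIVE_ALLOCATIONS, so the distinction is
// lost for the duration of a mixed-version (rolling upgrade) cluster.
public class BwcDemo {
    public enum Source { API, INFERENCE, ADAPTIVE_ALLOCATIONS }

    // old wire format: true meant "internal" (inference API or autoscaler)
    public static boolean toLegacyBoolean(Source source) {
        return source == Source.INFERENCE || source == Source.ADAPTIVE_ALLOCATIONS;
    }

    // lossy inverse: true always maps back to ADAPTIVE_ALLOCATIONS
    public static Source fromLegacyBoolean(boolean internal) {
        return internal ? Source.ADAPTIVE_ALLOCATIONS : Source.API;
    }

    public static void main(String[] args) {
        // an INFERENCE update round-tripped through an old node comes back
        // looking like an autoscaler update
        System.out.println(fromLegacyBoolean(toLegacyBoolean(Source.INFERENCE)));
        // prints ADAPTIVE_ALLOCATIONS
    }
}
```

The lossy mapping is acceptable here precisely because, as the author notes, it only matters while a serverless cluster is mixed-version during a rolling update.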

```diff
@@ -119,11 +131,15 @@ public void setAdaptiveAllocationsSettings(AdaptiveAllocationsSettings adaptiveA
 }

 public boolean isInternal() {
-    return isInternal;
+    return source == Source.INFERENCE || source == Source.ADAPTIVE_ALLOCATIONS;
```

(Contributor) commented:

Can you confirm that we do want Source.INFERENCE here for all the usages of isInternal() below?

@prwhelan (Member, Author) replied:

Confirmed! Yeah, the inference update code previously set isInternal to true (back when the boolean existed).

Labels: :ml (Machine learning), serverless-linked (Added by automation, don't add manually), Team:ML (Meta label for the ML team), >test (Issues or PRs that are addressing/adding tests), v9.2.0