-
Notifications
You must be signed in to change notification settings - Fork 25.4k
[ML] Flag updates from Inference #131725
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[ML] Flag updates from Inference #131725
Conversation
Pinging @elastic/ml-core (Team:ML) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
||
public void setFromInference(boolean fromInference) { | ||
this.fromInference = fromInference; | ||
this.isInternal = fromInference; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks confusing the setFromInference
also sets isInternal
.
...src/main/java/org/elasticsearch/xpack/core/ml/action/UpdateTrainedModelDeploymentAction.java
Outdated
Show resolved
Hide resolved
@@ -27,6 +27,7 @@ | |||
import java.io.IOException; | |||
import java.util.Objects; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm missing a bit of context: why do we need to distinguish between these cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a corresponding Serverless PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, let me ping you with the internal documentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm missing a bit of context: why do we need to distinguish between these cases?
We need to allow updates to num_allocations
in serverless that originate from the AdaptiveAllocationsScalerService
(ADAPTIVE_ALLOCATIONS
), but we want to disallow updates from users (API
and INFERENCE
). The only alternative I thought of was refactoring AdaptiveAllocationsScalerService
to update directly rather than through the API, but that felt more intrusive.
// we changed over from a boolean to an enum | ||
// when it was a boolean, true came from adaptive allocations and false came from the rest api | ||
// treat "inference" as if it came from the api | ||
out.writeBoolean(isInternal()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to determine if source == Source.ADAPTIVE_ALLOCATIONS
here? Since this will return true for Source.INFERENCE
as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, we set the boolean to true if the source was either from the inference update api or the adaptive allocations autoscaler. out.writeBoolean(isInternal())
preserves this logic (i think). It means the stream reader will think an inference api call is an adaptive allocations api call, but that only affects serverless which is only mixed cluster during a rolling update.
@@ -119,11 +131,15 @@ public void setAdaptiveAllocationsSettings(AdaptiveAllocationsSettings adaptiveA | |||
} | |||
|
|||
public boolean isInternal() { | |||
return isInternal; | |||
return source == Source.INFERENCE || source == Source.ADAPTIVE_ALLOCATIONS; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you confirm that we do want Source.INFERENCE
here for all the usage of isInternal()
below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed! Yeah inference update code previously set isInternal
to true (back when the boolean existed)
Flag updates from Inference so Serverless can detect them.
Swap tests to set adaptive allocations rather than num allocations to pass in serverless.