Configurable Inference timeout during Query time #131551

Samiul-TheSoccerFan · 2025-07-18T17:07:57Z

This PR introduces a user-configurable inference timeout setting and uses it as the timeout for inference calls at query time. The timeout is currently hardcoded to 10s; the goal is to make it configurable.

Setup

PUT _inference/sparse_embedding/my-elser-model
{
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  },
  "task_settings": {}
}

PUT my-semantic-index-5
{
  "mappings": {
    "properties": {
      "writer": {
        "type": "semantic_text",
        "inference_id": "my-elser-model"
      },
      "reader": {
        "type": "semantic_text"
      }
    }
  }
}

PUT my-semantic-index-6
{
  "mappings": {
    "properties": {
      "writer": {
        "type": "semantic_text",
        "inference_id": "my-elser-model"
      },
      "reader": {
        "type": "semantic_text"
      }
    }
  }
}

POST my-semantic-index-5/_doc/1
{
  "writer": "Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

POST my-semantic-index-5/_doc/2
{
  "writer": "Another Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

POST my-semantic-index-5/_doc/3
{
  "writer": "Another Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

POST my-semantic-index-6/_doc/1
{
  "writer": "Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

POST my-semantic-index-6/_doc/2
{
  "writer": "Another Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

POST my-semantic-index-6/_doc/3
{
  "writer": "Another Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

GET the default settings:

GET /my-semantic-index-5/_settings

GET /my-semantic-index-5/_settings?include_defaults=true

GET /my-semantic-index-6/_settings

GET /my-semantic-index-6/_settings?include_defaults=true

Update the inference timeout value:

PUT /my-semantic-index-6/_settings
{
  "index": {
    "semantic_text": {
      "inference_timeout": "1s"
    }
  }
}

GET the updated settings:

GET /my-semantic-index-5/_settings

GET /my-semantic-index-5/_settings?include_defaults=true

GET /my-semantic-index-6/_settings

GET /my-semantic-index-6/_settings?include_defaults=true

elasticsearchmachine · 2025-07-18T17:08:24Z

Pinging @elastic/search-eng (Team:SearchOrg)

elasticsearchmachine · 2025-07-18T17:08:24Z

Pinging @elastic/search-relevance (Team:Search - Relevance)

elasticsearchmachine · 2025-07-18T17:08:25Z

Hi @Samiul-TheSoccerFan, I've created a changelog YAML for you.

Samiul-TheSoccerFan · 2025-07-18T17:19:56Z

@Mikep86 Do we need to ping the ML team on this PR too?

Mikep86 · 2025-07-18T17:46:12Z

@Samiul-TheSoccerFan Yes, we should ping the ML team since it touches code they own.

elasticsearchmachine · 2025-07-18T17:53:26Z

Pinging @elastic/ml-core (Team:ML)

jonathan-buttner

I left a comment, if you could take a look that'd be great!

...plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/SenderService.java

kderusso

Change looks good to me, but some more tests need to be updated

...ugin/core/src/main/java/org/elasticsearch/xpack/core/ml/search/SparseVectorQueryBuilder.java

jonathan-buttner

I left a few suggestions.

x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferencePlugin.java

...plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/SenderService.java

Mikep86

Partial review. I think we have unhandled edge cases and potentially divergent default values to manage.

x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferencePlugin.java

Mikep86 · 2025-07-21T13:26:34Z

...plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/SenderService.java

+        if (timeout == null) {
+            timeout = clusterService.getClusterSettings().get(InferencePlugin.INFERENCE_QUERY_TIMEOUT);
+        }


We only want to apply this timeout if the input type is SEARCH or INTERNAL_SEARCH. Which brings up another edge case: If we allow timeout to be null now, we need to set default timeouts for the other input types as well.
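The rule described in this comment could be sketched roughly as follows. This is a standalone illustration, not the PR's code: `InputType` here is a simplified stand-in enum, and `resolveTimeout`, `queryTimeout`, and `DEFAULT_TIMEOUT` are hypothetical names. The 10s fallback is an assumption that mirrors the hardcoded value mentioned in the PR description.

```java
import java.time.Duration;

// Standalone sketch; all names below are hypothetical and not taken from the Elasticsearch codebase.
class TimeoutResolutionSketch {

    // Simplified stand-in for the input types named in the review comment.
    enum InputType { SEARCH, INTERNAL_SEARCH, INGEST, INTERNAL_INGEST }

    // Assumed fallback for non-search input types (mirrors the previously hardcoded 10s).
    static final Duration DEFAULT_TIMEOUT = Duration.ofSeconds(10);

    // An explicitly requested timeout always wins. Otherwise the configurable query
    // timeout applies only to search-type inputs, and everything else gets the default.
    static Duration resolveTimeout(Duration requested, InputType inputType, Duration queryTimeout) {
        if (requested != null) {
            return requested;
        }
        if (inputType == InputType.SEARCH || inputType == InputType.INTERNAL_SEARCH) {
            return queryTimeout;
        }
        return DEFAULT_TIMEOUT;
    }

    public static void main(String[] args) {
        Duration configured = Duration.ofSeconds(1);
        System.out.println(resolveTimeout(null, InputType.SEARCH, configured));
        System.out.println(resolveTimeout(null, InputType.INGEST, configured));
        System.out.println(resolveTimeout(Duration.ofSeconds(30), InputType.SEARCH, configured));
    }
}
```

Keeping the null check first means callers that already pass an explicit timeout are unaffected, which is the edge case the comment raises.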

Samiul-TheSoccerFan · 2025-07-23T19:05:11Z

@elasticmachine update branch

Mikep86

Got to everything except SageMakerServiceTests. I will take a look at those in a follow-up review.

.../plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/ServiceUtils.java

x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferencePlugin.java

x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/Utils.java

...in/inference/src/test/java/org/elasticsearch/xpack/inference/services/ServiceUtilsTests.java

.../elasticsearch/xpack/inference/services/elasticsearch/ElasticsearchInternalServiceTests.java

Mikep86

And a review of SageMakerServiceTests :)

...rc/test/java/org/elasticsearch/xpack/inference/services/sagemaker/SageMakerServiceTests.java

Samiul-TheSoccerFan · 2025-07-24T21:45:15Z

@elasticmachine update branch

davidkyle

I have a refactor PR that will probably cause a merge conflict with this PR

In #131759 the clusterService member is removed from all the Inference Service implementations, as IntelliJ was reporting it as unused. @Samiul-TheSoccerFan do you need the clusterService here? I'm happy to close my PR without merging; otherwise, if you just need it for the ElasticsearchInternalService, I can work around that.

Samiul-TheSoccerFan · 2025-07-25T12:54:20Z

@davidkyle Yes, we do need to pass the clusterService to all services so it becomes available to their superclasses (SageMaker and SenderService). The unused warning should hopefully go away once we merge this PR.
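As a rough illustration of why each service has to accept the dependency, here is a minimal standalone sketch. Every class in it (`FakeClusterService`, `BaseService`, `ConcreteService`) is a hypothetical stand-in rather than an Elasticsearch type; it only shows a subclass forwarding the dependency to a superclass that reads the current timeout from it.

```java
import java.time.Duration;
import java.util.Objects;

// Standalone sketch; every class here is a hypothetical stand-in, not an Elasticsearch type.
class ClusterServiceWiringSketch {

    // Minimal stand-in for a cluster service that exposes the current value of a dynamic setting.
    static final class FakeClusterService {
        private volatile Duration queryTimeout;

        FakeClusterService(Duration initial) {
            this.queryTimeout = Objects.requireNonNull(initial);
        }

        Duration queryTimeout() {
            return queryTimeout;
        }

        void setQueryTimeout(Duration updated) {
            this.queryTimeout = Objects.requireNonNull(updated);
        }
    }

    // The base class owns the lookup, so subclasses must hand the dependency up via super(...).
    abstract static class BaseService {
        private final FakeClusterService clusterService;

        protected BaseService(FakeClusterService clusterService) {
            this.clusterService = Objects.requireNonNull(clusterService);
        }

        // Reads the value on every call so later setting updates are picked up.
        Duration effectiveQueryTimeout() {
            return clusterService.queryTimeout();
        }
    }

    // The subclass never touches the field itself, which is why an IDE may flag
    // a subclass-level copy of the dependency as unused.
    static final class ConcreteService extends BaseService {
        ConcreteService(FakeClusterService clusterService) {
            super(clusterService);
        }
    }

    public static void main(String[] args) {
        FakeClusterService clusterService = new FakeClusterService(Duration.ofSeconds(10));
        BaseService service = new ConcreteService(clusterService);
        System.out.println(service.effectiveQueryTimeout());
        clusterService.setQueryTimeout(Duration.ofSeconds(1));
        System.out.println(service.effectiveQueryTimeout());
    }
}
```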

Mikep86

One little thing to fix up, then we're good to go 👍

x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferencePlugin.java

Samiul-TheSoccerFan · 2025-07-25T20:43:46Z

@elasticmachine update branch

Samiul-TheSoccerFan · 2025-07-28T11:58:05Z

@elasticmachine update branch

Mikep86

LGTM!

…-tracking

* upstream/main: (26 commits)
  Add release notes for v9.1.0 release (elastic#131953)
  Unmute multi_node generative tests (elastic#132021)
  Avoid re-enqueueing merge tasks (elastic#132020)
  Fix file entitlements for shared data dir (elastic#131748)
  ES|QL brute force l2_norm vector function (elastic#132025)
  Make ES|QL SAMPLE not a pipeline breaker (elastic#132014)
  Speed up tail computation in MemorySegmentES91OSQVectorsScorer (elastic#132001)
  Remove deprecated usages in `TransportPutFollowAction` (elastic#132038)
  Simulate impact of shard movement using shard-level write load (elastic#131406)
  Remove RemoteClusterService.getConnections() method (elastic#131948)
  Fix off by one in ValuesBytesRefAggregator (elastic#132032)
  Use unicode strings in data generation by default (elastic#132028)
  Adding index.refresh_interval as a data stream setting (elastic#131482)
  [ES|QL] Add more Min/MaxOverTime CSV tests (elastic#131070)
  Restrict remote ENRICH after FORK (elastic#131945)
  Fix decoding of non-ascii field names in ignored source (elastic#132018)
  [docs] Use centrally maintained version variables (elastic#131939)
  Configurable Inference timeout during Query time (elastic#131551)
  ESQL: Allow pruning columns added by InlineJoin (elastic#131204)
  ESQL: Fail `profile` on text response formats (elastic#128627)
  ...

Samiul-TheSoccerFan added 6 commits July 18, 2025 08:55

introducing timeout as cluster settings

7b1f1da

forcing null to be send instead of default value

95066c7

applying timeout in infer level

e60c409

removing unused variable

846f6c2

adding unit tests for cluster timeout values

74dcc03

fix linting issues

5be1e11

Samiul-TheSoccerFan added the >enhancement, :SearchOrg/Relevance, Team:Search - Relevance, and v9.2.0 labels Jul 18, 2025

elasticsearchmachine added the Team:SearchOrg label Jul 18, 2025

Update docs/changelog/131551.yaml

29d5b7c

Samiul-TheSoccerFan mentioned this pull request Jul 18, 2025

Configurable Inference Timeout #129880

Closed

update changelog

d7b8116

Samiul-TheSoccerFan added the Team:ML and :ml labels Jul 18, 2025

Samiul-TheSoccerFan requested review from a team July 18, 2025 17:53

jonathan-buttner requested changes Jul 18, 2025

...plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/SenderService.java (outdated, resolved)

kderusso reviewed Jul 18, 2025

...ugin/core/src/main/java/org/elasticsearch/xpack/core/ml/search/SparseVectorQueryBuilder.java (resolved)

fix ml core SparseVectorQueryBuilder unit test

bc67010

jonathan-buttner requested changes Jul 18, 2025

x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferencePlugin.java (outdated, resolved)

...plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/SenderService.java (outdated, resolved)

Mikep86 reviewed Jul 21, 2025

adding comment and Nullable annotation

2fe3f60

Samiul-TheSoccerFan added 4 commits July 23, 2025 12:04

removed duplicate setting

62daced

update infernece plugin and utils streamline settings registration

f11f52a

using mockClusterService in all services

9b030ac

adding min value

154aff6

elasticmachine and others added 3 commits July 23, 2025 21:05

Merge branch 'main' into inference-timeout-as-cluster-settings

2275b99

Adding tests for provided timeout to work as expected

b9a907b

simplify inference timeout settings

43eaf0d

Samiul-TheSoccerFan requested a review from Mikep86 July 23, 2025 19:29

[CI] Auto commit changes from spotless

0c80477

Mikep86 reviewed Jul 24, 2025

Samiul-TheSoccerFan added 3 commits July 24, 2025 16:02

added better async handling in the test and simplify response

739b4fa

revert back ingest timeout and simplify unit tests

e3d029a

remove redundant code

6c7b1fa

Merge branch 'main' into inference-timeout-as-cluster-settings

e54c7f6

Samiul-TheSoccerFan requested a review from Mikep86 July 24, 2025 21:49

davidkyle reviewed Jul 25, 2025

Mikep86 reviewed Jul 25, 2025

x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferencePlugin.java (outdated, resolved)

fix unnecessary instance creation

1bb407c

Merge branch 'main' into inference-timeout-as-cluster-settings

4f3d3ae

Samiul-TheSoccerFan requested a review from Mikep86 July 25, 2025 20:43

Merge branch 'main' into inference-timeout-as-cluster-settings

aa1240e

Mikep86 approved these changes Jul 28, 2025

Samiul-TheSoccerFan merged commit e28de98 into elastic:main Jul 28, 2025
33 checks passed
