@shuangli-z shuangli-z commented Aug 11, 2025

Description

Adds an index insight feature that gives other LLM-based, index-related features a better understanding of indices.
It currently supports three types of insight: statistical data, field descriptions, and a log-related index check.

API examples

  1. Enable index insight:
PUT /_plugins/_ml/index_insight_config
{
    "is_enable": true
}

The API stores the index insight config in a system index, then creates an index with the fixed name ml_index_insight to store the generated insights.

You can disable the feature with the same API:

PUT /_plugins/_ml/index_insight_config
{
    "is_enable": false
}

Only an admin user can change the index insight config.

You can also check the current config with:

GET /_plugins/_ml/index_insight_config
  2. Set the agent ID for the index insight feature. The agent is used to call the LLM to generate descriptive content such as field descriptions.
POST /_plugins/_ml/agents/_register
{
  "name": "GENERAL_TOOL",
  "type": "flow",
  "tools": [
    {
      "type": "MLModelTool",
      "description": "A general tool to answer any question",
      "parameters": {
        "model_id": "<model_id>"
      }
    }
  ]
}

PUT /.plugins-ml-config/_doc/os_index_insight_agent
{
    "type": "os_index_insight_agent",
    "configuration": {
        "agent_id": "<agent_id>"
    }
}
  3. Invoke the index insight feature:
GET /_plugins/_ml/insights/<target_index>/<insight_type>
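
The three steps above can be sketched as a small Python helper that only builds the requests (method, path, body) and does not send them. The paths and bodies are copied from the examples in this description. The function names, the `Request` tuple, and the assumption that `<insight_type>` accepts the task type names listed in the IndexInsightTool description are illustrative and not confirmed by this PR.

```python
import json
from typing import NamedTuple, Optional

# Assumed insight type names, taken from the IndexInsightTool description in this PR.
INSIGHT_TYPES = ("STATISTICAL_DATA", "FIELD_DESCRIPTION", "LOG_RELATED_INDEX_CHECK")


class Request(NamedTuple):
    """Hypothetical container for one REST call; not part of ml-commons."""
    method: str
    path: str
    body: Optional[str]  # JSON string, or None when the request has no body


def set_config_request(enabled: bool) -> Request:
    """Step 1: enable or disable the index insight feature."""
    return Request(
        "PUT",
        "/_plugins/_ml/index_insight_config",
        json.dumps({"is_enable": enabled}),
    )


def set_agent_request(agent_id: str) -> Request:
    """Step 2: point the feature at an already registered agent."""
    body = {
        "type": "os_index_insight_agent",
        "configuration": {"agent_id": agent_id},
    }
    return Request(
        "PUT",
        "/.plugins-ml-config/_doc/os_index_insight_agent",
        json.dumps(body),
    )


def get_insight_request(target_index: str, insight_type: str) -> Request:
    """Step 3: request one insight type for one index."""
    if insight_type not in INSIGHT_TYPES:
        raise ValueError(f"unknown insight type: {insight_type!r}")
    return Request("GET", f"/_plugins/_ml/insights/{target_index}/{insight_type}", None)
```

Keeping request construction separate from transport means the same builders can be used with any HTTP client and checked without a running cluster.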

We also provide a tool that wraps the index insight action so that agents can use it.

POST /_plugins/_ml/agents/_register
{
  "name": "flow agent with only index insight tool",
  "type": "flow",
  "description": "this is a flow agent which only has an index insight tool",
  "tools": [
    {
      "type": "IndexInsightTool",
      "name": "IndexInsightTool",
      "description": "Use this tool to get details of one index according to different task type, including STATISTICAL_DATA: the data distribution and index mapping of the index, FIELD_DESCRIPTION: The description of each column, LOG_RELATED_INDEX_CHECK: Whether the index is related to log/trace and whether it contains trace/log fields",
      "attributes": {
        "input_schema": "{\"type\":\"object\",\"properties\":{\"indexName\":{\"description\":\"The index name you want to query with\",\"type\":\"string\"},\"taskType\":{\"description\":\"The task type of index insight, including STATISTICAL_DATA: the data distribution and index mapping of the index, FIELD_DESCRIPTION: The description of each column, LOG_RELATED_INDEX_CHECK: Whether the index is related to log/trace and whether it contains trace/log fields \",\"type\":\"string\"}}}",
        "strict": "false"
      }
    }
  ],
  "app_type": "my_app"
}

Then execute the agent with:

POST /_plugins/_ml/agents/<agent_id>/_execute
{
  "parameters": {
    "index_name": "<index_name>",
    "task_type": "STATISTICAL_DATA"
  }
}
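
As a rough sketch, the execute call can be assembled in Python as shown here. The parameter names `index_name` and `task_type` follow the execute example above; note that the tool's `input_schema` declares `indexName` and `taskType` instead, and this description does not say how the two are mapped. The helper names and the task type validation are hypothetical additions for illustration.

```python
import json

# Task types listed in the IndexInsightTool description.
TASK_TYPES = ("STATISTICAL_DATA", "FIELD_DESCRIPTION", "LOG_RELATED_INDEX_CHECK")


def execute_path(agent_id: str) -> str:
    """Path of the agent execute API for the given agent."""
    return f"/_plugins/_ml/agents/{agent_id}/_execute"


def execute_body(index_name: str, task_type: str = "STATISTICAL_DATA") -> str:
    """JSON body for the execute call, mirroring the example above."""
    if task_type not in TASK_TYPES:
        raise ValueError(f"unknown task type: {task_type!r}")
    return json.dumps(
        {"parameters": {"index_name": index_name, "task_type": task_type}},
        indent=2,
    )
```

Rejecting unknown task types on the client side is a design choice of this sketch; the server may apply its own validation.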

Related Issues

Resolves #3993

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

shuangli-z and others added 29 commits July 22, 2025 14:57
Signed-off-by: xinyual <[email protected]>
…exInsightTask is now processed one at a time

Signed-off-by: shuangli-z <[email protected]>
Signed-off-by: xinyual <[email protected]>
- TODO: Add prerequisite chain logic

Signed-off-by: shuangli-z <[email protected]>
@shuangli-z shuangli-z requested a review from b4sjoo as a code owner August 11, 2025 08:27
@shuangli-z shuangli-z requested a deployment to ml-commons-cicd-env-require-approval August 25, 2025 06:39 — with GitHub Actions Waiting
@shuangli-z shuangli-z requested a deployment to ml-commons-cicd-env-require-approval August 25, 2025 08:07 — with GitHub Actions Waiting
@shuangli-z shuangli-z had a problem deploying to ml-commons-cicd-env-require-approval August 28, 2025 06:11 — with GitHub Actions Error
@shuangli-z shuangli-z had a problem deploying to ml-commons-cicd-env-require-approval August 28, 2025 06:11 — with GitHub Actions Failure
@zane-neo
Collaborator

There are failing integration tests (ITs):

REPRODUCE WITH: ./gradlew ':opensearch-ml-plugin:integTest' --tests 'org.opensearch.ml.rest.RestMLDeleteIndexInsightConfigIT.testDeleteIndexInsightContainer_SuccessDelete' -Dtests.seed=B5D4FBC80ED75458 -Dtests.security.manager=false -Dtests.locale=kab-Latn-DZ -Dtests.timezone=EST -Druntime.java=24

    [2025-08-28T21:44:27,371][INFO ][o.o.m.r.RestMLDeleteIndexInsightConfigIT] [testDeleteIndexInsightContainer_FailSinceNotSet] after test
REPRODUCE WITH: ./gradlew ':opensearch-ml-plugin:integTest' --tests 'org.opensearch.ml.rest.RestMLDeleteIndexInsightConfigIT.testDeleteIndexInsightContainer_FailSinceNotSet' -Dtests.seed=B5D4FBC80ED75458 -Dtests.security.manager=false -Dtests.locale=kab-Latn-DZ -Dtests.timezone=EST -Druntime.java=24


Suite: Test class org.opensearch.ml.rest.RestMLDeleteIndexInsightConfigIT
  2> REPRODUCE WITH: ./gradlew ':opensearch-ml-plugin:integTest' --tests 'org.opensearch.ml.rest.RestMLDeleteIndexInsightConfigIT.testDeleteIndexInsightContainer_SuccessDelete' -Dtests.seed=B5D4FBC80ED75458 -Dtests.security.manager=false -Dtests.locale=kab-Latn-DZ -Dtests.timezone=EST -Druntime.java=24
  2> org.opensearch.client.ResponseException: method [PUT], host [http://[::1]:37893], URI [/_plugins/_ml/index_insight_container], status line [HTTP/1.1 400 Bad Request]
    {"error":"no handler found for uri [/_plugins/_ml/index_insight_container] and method [PUT]"}
        at __randomizedtesting.SeedInfo.seed([B5D4FBC80ED75458:BD2DE5E14B2425F6]:0)
        at app//org.opensearch.client.RestClient.convertResponse(RestClient.java:501)
        at app//org.opensearch.client.RestClient.performRequest(RestClient.java:384)
        at app//org.opensearch.client.RestClient.performRequest(RestClient.java:359)
        at app//org.opensearch.ml.utils.TestHelper.makeRequest(TestHelper.java:199)
        at app//org.opensearch.ml.utils.TestHelper.makeRequest(TestHelper.java:172)
        at app//org.opensearch.ml.rest.RestMLDeleteIndexInsightConfigIT.testDeleteIndexInsightContainer_SuccessDelete(RestMLDeleteIndexInsightConfigIT.java:37)
  2> REPRODUCE WITH: ./gradlew ':opensearch-ml-plugin:integTest' --tests 'org.opensearch.ml.rest.RestMLDeleteIndexInsightConfigIT.testDeleteIndexInsightContainer_FailSinceNotSet' -Dtests.seed=B5D4FBC80ED75458 -Dtests.security.manager=false -Dtests.locale=kab-Latn-DZ -Dtests.timezone=EST -Druntime.java=24
  2> java.lang.ClassCastException: class java.lang.String cannot be cast to class java.util.Map (java.lang.String and java.util.Map are in module java.base of loader 'bootstrap')
        at __randomizedtesting.SeedInfo.seed([B5D4FBC80ED75458:A2733CDEF5DE3EB3]:0)
        at org.opensearch.ml.rest.RestMLDeleteIndexInsightConfigIT.testDeleteIndexInsightContainer_FailSinceNotSet(RestMLDeleteIndexInsightConfigIT.java:95)
  2> NOTE: leaving temporary files on disk at: /__w/ml-commons/ml-commons/plugin/build/testrun/integTest/temp/org.opensearch.ml.rest.RestMLDeleteIndexInsightConfigIT_B5D4FBC80ED75458-001
  2> NOTE: test params are: codec=Lucene101, sim=Asserting(RandomSimilarity(queryNorm=false): {}), locale=kab-Latn-DZ, timezone=EST
  2> NOTE: Linux 6.11.0-1018-azure amd64/Azul Systems, Inc. 24.0.2 (64-bit)/cpus=4,threads=1,free=250646992,total=536870912
  2> NOTE: All tests run in this JVM: [HFAnalyzerIT, MLModelAutoReDeployerIT, RestBedRockInferenceIT, RestCohereInferenceIT, RestConnectorToolIT, RestMLCustomModelActionIT, RestMLCustomModelChunkActionIT, RestMLDeleteIndexInsightConfigIT]
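
One possible reading of the ClassCastException in the log above is that a test casts the `error` field of a response to a map while the response carries a plain string, as in the 400 body shown for the missing handler. The test source is not shown here, so this is an assumption. Under that assumption, a tolerant way to read the field can be sketched in Python; the function name is hypothetical.

```python
import json


def error_reason(payload: str) -> str:
    """Return a readable error message from a JSON error response body."""
    error = json.loads(payload).get("error")
    if isinstance(error, dict):
        # Structured form: {"error": {"type": ..., "reason": ...}}
        return str(error.get("reason", ""))
    # Plain-string form, as in the 400 response in the log above,
    # or an empty string when no error field is present.
    return "" if error is None else str(error)
```

Checking the type before reading the field avoids the cast failure regardless of which error shape the server returns.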

xinyual and others added 2 commits August 29, 2025 11:32
@shuangli-z shuangli-z temporarily deployed to ml-commons-cicd-env-require-approval August 29, 2025 03:36 — with GitHub Actions Inactive
@shuangli-z shuangli-z requested a deployment to ml-commons-cicd-env-require-approval August 29, 2025 07:16 — with GitHub Actions Waiting
Development

Successfully merging this pull request may close these issues.

[FEATURE] RFC: Index insight: A feature to enhance indices related AI features
5 participants