Conversation

q-andy
Contributor

@q-andy commented Aug 13, 2025

Description

Adds a predefined post-process function to unwrap model output when making a remote call to a second ml-commons cluster. When calling a second ml-commons cluster's Predict API through a remote connector, the connector output "double wraps" the second cluster's response. Tested compatibility with the neural-search SparseEncodingProcessor and TextEmbeddingProcessor.

Predict call without the post-process function (double-wrapped):

{
	"inference_results": [
		{
			"output": [
				{
					"name": "output",
					"dataAsMap": {
						"inference_results": [
							{
								"output": [
									{
										"name": "output",
										"dataAsMap": {
											"response": [
												{
													"this": 0.6228184700012207,
													"harrison": 0.5467907786369324,
													...
												},
												{
													"?": 0.1258760392665863,
													"day": 0.46702781319618225,
													...
												}
											]
										}
									}
								],
								"status_code": 200.0
							}
						]
					}
				}
			],
			"status_code": 200
		}
	]
}

After using the post-process function:

{
	"inference_results": [
		{
			"output": [
				{
					"name": "output",
					"dataAsMap": {
						"response": [
							{
								"increasingly": 0.028670792,
								"achievements": 0.4906937,
								...
							}
						]
					}
				},
				{
					"name": "output",
					"dataAsMap": {
						"response": [
							{
								"hesse": 0.07568395,
								"greeting": 0.296827,
								...
							}
						]
					}
				}
			],
			"status_code": 200
		}
	]
}
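For reference, the new function is selected by name in the connector definition. A minimal sketch of such a connector action (the URL, request body, and model ID placeholder below are illustrative assumptions, not values from this PR):

```json
{
  "name": "remote ml-commons connector",
  "protocol": "http",
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://second-cluster.example.com/_plugins/_ml/models/<remote_model_id>/_predict",
      "request_body": "{ \"text_docs\": ${parameters.text_docs} }",
      "post_process_function": "connector.post_process.mlcommons.passthrough"
    }
  ]
}
```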

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@dhrubo-os
Collaborator

@q-andy fix spotless :)

@q-andy q-andy force-pushed the sparse-passthrough branch from 1d97d7b to a89b235 Compare August 14, 2025 19:11
Zhangxunmt
Zhangxunmt previously approved these changes Aug 15, 2025
codecov bot commented Aug 19, 2025

Codecov Report

❌ Patch coverage is 88.67925% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.82%. Comparing base (d8c6208) to head (f2f6a78).
⚠️ Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
...RemoteMlCommonsPassthroughPostProcessFunction.java 88.00% 0 Missing and 6 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #4111      +/-   ##
============================================
+ Coverage     81.80%   81.82%   +0.01%     
- Complexity     8847     8866      +19     
============================================
  Files           761      762       +1     
  Lines         38099    38152      +53     
  Branches       4250     4263      +13     
============================================
+ Hits          31168    31217      +49     
+ Misses         5110     5109       -1     
- Partials       1821     1826       +5     
Flag Coverage Δ
ml-commons 81.82% <88.67%> (+0.01%) ⬆️


@q-andy
Contributor Author

q-andy commented Aug 21, 2025

Most CI checks passed; the two flaky integration test failures are unrelated to this change. Could you take another look? @dhrubo-os @Zhangxunmt

Zhangxunmt
Zhangxunmt previously approved these changes Aug 21, 2025
}

@Override
public List<ModelTensor> process(Map<String, Object> input, MLResultDataType dataType) {
Collaborator

Instead of Map<String, Object> input: since this parameter represents ML Commons inference results or response data, some better name suggestions would be:

mlCommonsResponse
inferenceResponse
responseData
modelResponse

Contributor Author

Sure, I didn't realize you could rename the parameters of an overridden function. Changed to mlCommonsResponse.

/**
* A post-processing function for calling a remote ml commons instance that preserves the original neural sparse response structure
* to avoid double-wrapping when receiving responses from another ML-Commons instance.
*/
Collaborator

Can we add an example in the doc for better understanding?

@@ -35,6 +36,7 @@ public class MLPostProcessFunction {
public static final String BEDROCK_RERANK = "connector.post_process.bedrock.rerank";
public static final String DEFAULT_EMBEDDING = "connector.post_process.default.embedding";
public static final String DEFAULT_RERANK = "connector.post_process.default.rerank";
public static final String ML_COMMONS_PASSTHROUGH = "connector.post_process.mlcommons.passthrough";
Collaborator

As this is a special kind, can we please add a comment here?

if (dataTypeObj instanceof String) {
try {
dataType = MLResultDataType.valueOf((String) dataTypeObj);
} catch (IllegalArgumentException e) {
Collaborator

IllegalArgumentException is a 500-level error. Should we treat this as a 5xx error?

Collaborator

Also, what is the reason for leaving it as null?

Contributor Author

@q-andy commented Aug 25, 2025

I left it as null instead of treating it like a 5xx because even if the model data type is invalid, there may be use cases where we can still parse and use the model data or dataAsMap. E.g. some model types like neural sparse or NER don't include a data type as part of the model response, so null is still valid.

Right now, since we're focused on neural sparse and dense, my thought process is that it's better to leave it flexible so we can possibly handle different model response formats. For example, in the future we may add a new data type for dense models and run inference against an older version of ml-commons: perhaps we can still use the data by casting it at the processor level. Updated the comment to explain this.

long[] shape = null;
if (map.containsKey(ModelTensor.SHAPE_FIELD)) {
Object shapeObj = map.get(ModelTensor.SHAPE_FIELD);
if (shapeObj instanceof List<?> shapeList) {
Collaborator

In the else branch we are sending null for shape; is that expected?

Contributor Author

Yes; e.g. neural sparse and dense models populate different fields. Sparse leaves data_type, data, and shape as null.

Dense

{
	"name": "sentence_embedding",
	"data_type": "FLOAT32",
	"shape": [
		768
	],
	"data": [...]
}

Sparse

{
	"name": "output",
	"dataAsMap": {
		"response": [
			{ ... }
		]
	}
}

Number[] data = null;
if (map.containsKey(ModelTensor.DATA_FIELD)) {
Object dataObj = map.get(ModelTensor.DATA_FIELD);
if (dataObj instanceof List<?> dataList) {
Collaborator

What happens if we send data as null to the ModelTensor?

Contributor Author

See the above comment; this is valid for neural sparse and other remote model formats. The data field is primarily used for vector/numerical info; if a model output doesn't include that, it uses dataAsMap instead, and the data field being null is valid.


// Handle data array
Number[] data = null;
if (map.containsKey(ModelTensor.DATA_FIELD)) {
Collaborator

Seems like both pieces of the underlying logic could be put in a common method like:

private <T> T[] processNumericArray(Object obj, Class<T> type) {
    if (obj instanceof List<?> list) {
        T[] result = (T[]) Array.newInstance(type, list.size());
        // ... process the list
        return result;
    }
    return null;
}
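For reference, one self-contained way to complete that sketch (class and method names here are illustrative, not from the PR), keeping the null-tolerant behavior discussed above for absent or malformed fields:

```java
import java.lang.reflect.Array;
import java.util.List;

public class NumericArrayHelper {

    // Converts a parsed JSON list into a typed Number array; returns null for
    // a missing or non-list value, mirroring the sparse-model case where
    // data/shape fields are legitimately absent.
    @SuppressWarnings("unchecked")
    static <T extends Number> T[] toTypedArray(Object obj, Class<T> type) {
        if (obj instanceof List<?> list) {
            T[] result = (T[]) Array.newInstance(type, list.size());
            for (int i = 0; i < list.size(); i++) {
                Object element = list.get(i);
                result[i] = (element instanceof Number n) ? narrow(n, type) : null;
            }
            return result;
        }
        return null; // absent/malformed field stays null
    }

    // JSON parsers typically hand back Double/Integer; narrow to the requested type.
    @SuppressWarnings("unchecked")
    static <T extends Number> T narrow(Number n, Class<T> type) {
        if (type == Float.class) return (T) Float.valueOf(n.floatValue());
        if (type == Double.class) return (T) Double.valueOf(n.doubleValue());
        if (type == Long.class) return (T) Long.valueOf(n.longValue());
        if (type == Integer.class) return (T) Integer.valueOf(n.intValue());
        throw new IllegalArgumentException("Unsupported type: " + type);
    }

    public static void main(String[] args) {
        Float[] data = toTypedArray(List.of(0.1d, 0.2d), Float.class);
        System.out.println(data.length);                      // prints 2
        System.out.println(toTypedArray("oops", Long.class)); // prints null
    }
}
```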

@mingshl
Collaborator

mingshl commented Aug 25, 2025

After using the post-processing function, the name seems changed: it used to be "response" but after post-processing becomes "output". Is this change intended? I'm not sure if the name is used somewhere, but better not to change it unintentionally.

{
	"inference_results": [
		{
			"output": [
				{
					"name": "output",
					"dataAsMap": {
						"response": [

@q-andy
Contributor Author

q-andy commented Aug 25, 2025

The name seems changed: it used to be "response" but after post-processing becomes "output". Is this change intended? I'm not sure if the name is used somewhere, but better not to change it unintentionally.

Fixed; the name is relevant for different model types. I changed it so the name is passed through as well. This was just a typo in the PR description.

@Zhangxunmt Zhangxunmt merged commit a9151f4 into opensearch-project:main Aug 29, 2025
13 checks passed