[FEA] Support Triton ensemble runtime for SageMaker multi-model deployment

# 🚀 Feature request
 
Would it be possible to support SageMaker multi-model deployment using a Triton ensemble of Merlin models?
 
SageMaker already supports [multiple hosting modes for Model deployment with Triton Inference Server](https://docs.aws.amazon.com/sagemaker/latest/dg/deploy-models-frameworks-triton.html), including the _Multi-model endpoints with ensemble_ hosting mode. I tried to use that hosting mode with the Triton ensembles of Merlin models, but the latest update to the Merlin SageMaker example implementation (#1040) removed the `--model-control-mode=explicit` option, which multi-model hosting requires for dynamic model loading.

I suspect the incompatibility arises because the generated Merlin `executor_model` is not a proper [Triton ensemble](https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/user_guide/ensemble_models.html): its `config.pbtxt` file has neither the `platform: "ensemble"` setting nor the required `ensemble_scheduling: {...}` section. Instead, it is just another Triton model that executes the `0_transformworkflowtriton` and `1_predictpytorchtriton` steps internally. As a result, `executor_model` is not automatically recognized as an ensemble of the `0_transformworkflowtriton` and `1_predictpytorchtriton` models.
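To make the distinction above concrete, here is a minimal Python sketch that checks whether the text of a `config.pbtxt` declares a Triton ensemble. It is only a regex heuristic, not a real protobuf text-format parser, and it assumes no `#` characters appear inside quoted strings. The function name `looks_like_triton_ensemble`, both sample configs, and the tensor names in them are hypothetical illustrations; in particular, the second sample is an assumed stand-in for a non-ensemble model, not the actual file Merlin generates.

```python
import re


def looks_like_triton_ensemble(config_text: str) -> bool:
    """Heuristically check a config.pbtxt for the two ensemble markers."""
    # Drop '#' comments so commented-out settings are not counted
    # (assumes '#' never occurs inside a quoted string).
    text = "\n".join(line.split("#", 1)[0] for line in config_text.splitlines())
    has_platform = re.search(r'\bplatform\s*:\s*"ensemble"', text) is not None
    has_scheduling = re.search(r"\bensemble_scheduling\s*:?\s*\{", text) is not None
    return has_platform and has_scheduling


# Hypothetical ensemble config chaining the two step models named in this issue.
# Tensor names ("raw_input", "features", "scores") are made up for illustration.
ENSEMBLE_CONFIG = '''
name: "ensemble_model"
platform: "ensemble"
ensemble_scheduling {
  step [
    {
      model_name: "0_transformworkflowtriton"
      model_version: -1
      input_map { key: "raw_input" value: "raw_input" }
      output_map { key: "features" value: "features" }
    },
    {
      model_name: "1_predictpytorchtriton"
      model_version: -1
      input_map { key: "features" value: "features" }
      output_map { key: "scores" value: "scores" }
    }
  ]
}
'''

# Hypothetical non-ensemble config (assumed shape, not Merlin's real output).
NON_ENSEMBLE_CONFIG = '''
name: "executor_model"
backend: "python"
# platform: "ensemble"
'''

print(looks_like_triton_ensemble(ENSEMBLE_CONFIG))
print(looks_like_triton_ensemble(NON_ENSEMBLE_CONFIG))
```

Running a check like this against the exported `executor_model/config.pbtxt` would be a quick way to confirm or refute the hypothesis before digging further.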

**EDIT:** I realized that [merlin-systems PR#255](https://github.com/NVIDIA-Merlin/systems/pull/255) deprecated the Triton ensemble runtime and replaced it with the current executor model. Would it be possible to support exporting the recommender system artifacts as a Triton ensemble, at least for [Transformers4Rec](https://github.com/NVIDIA-Merlin/Transformers4Rec) system deployments?

[FEA] Support Triton ensemble runtime for SageMaker multi-model deployment #1106
