Add support for multiple Google Model Garden providers for completion and chat_completion tasks #5532
Merged
11 commits
90de26a Add support for multiple Google Model Garden providers for completion… (Jan-Kazlouski-elastic)
548c275 Merge remote-tracking branch 'origin/main' into google-model-garden-u… (Jan-Kazlouski-elastic)
5c67d07 Merge remote-tracking branch 'origin/main' into google-model-garden-u… (Jan-Kazlouski-elastic)
8c9c4ef Add chat_completion and completion task examples for various Google M… (Jan-Kazlouski-elastic)
918321e Update CommonTypes.ts to clarify URL requirements for various providers (Jan-Kazlouski-elastic)
a955833 Merge remote-tracking branch 'origin/main' into google-model-garden-u… (Jan-Kazlouski-elastic)
226e656 Merge remote-tracking branch 'origin/main' into google-model-garden-u… (Jan-Kazlouski-elastic)
2c4bcb5 Add examples for chat_completion and completion tasks using various G… (Jan-Kazlouski-elastic)
16c520d Merge remote-tracking branch 'origin/main' into google-model-garden-u… (Jan-Kazlouski-elastic)
547fd3c Merge remote-tracking branch 'origin/main' into google-model-garden-u… (Jan-Kazlouski-elastic)
8a122c8 Merge branch 'main' into google-model-garden-update (Jan-Kazlouski-elastic)
13 changes: 13 additions & 0 deletions
...tion/inference/put_googlevertexai/examples/request/PutGoogleVertexAiRequestExample10.yaml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| summary: A chat_completion task for Google Model Garden Meta shared endpoint with single streaming URL provided | ||
| description: Run `PUT _inference/chat_completion/google_model_garden_meta_chat_completion` to create an inference endpoint to perform a `chat_completion` task using Meta's model hosted on Google Model Garden shared endpoint with single streaming URL provided. See the endpoint's `Sample request` page for the variable values used in the URL. | ||
| method_request: 'PUT _inference/chat_completion/google_model_garden_meta_chat_completion' | ||
| # type: "request" | ||
| value: |- | ||
| { | ||
| "service": "googlevertexai", | ||
| "service_settings": { | ||
| "provider": "meta", | ||
| "service_account_json": "service-account-json", | ||
| "streaming_url": "https://%LOCATION_ID%-aiplatform.googleapis.com/v1/projects/%PROJECT_ID%/locations/%LOCATION_ID%/endpoints/%ENDPOINT_ID%/chat/completions" | ||
| } | ||
| } |
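
The `%LOCATION_ID%`, `%PROJECT_ID%`, and `%ENDPOINT_ID%` tokens in the `streaming_url` above are placeholders that have to be replaced with real values before the request is sent. A minimal Python sketch of that substitution follows. The helper name `fill_placeholders` and the sample values (`us-central1`, `my-project`, `1234567890`) are assumptions for illustration only and are not part of this PR.

```python
import re

# Shared-endpoint URL template, copied from the example above.
SHARED_ENDPOINT_TEMPLATE = (
    "https://%LOCATION_ID%-aiplatform.googleapis.com/v1/projects/%PROJECT_ID%"
    "/locations/%LOCATION_ID%/endpoints/%ENDPOINT_ID%/chat/completions"
)


def fill_placeholders(template: str, values: dict) -> str:
    """Replace every %NAME% token in `template` with values["NAME"].

    Hypothetical helper: raises KeyError if a token has no supplied value,
    so a half-filled URL is never returned silently.
    """

    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder %{name}%")
        return values[name]

    return re.sub(r"%([A-Z_]+)%", substitute, template)


# Sample values are made up; take the real ones from the endpoint's
# `Sample request` page, as the example description says.
streaming_url = fill_placeholders(
    SHARED_ENDPOINT_TEMPLATE,
    {
        "LOCATION_ID": "us-central1",
        "PROJECT_ID": "my-project",
        "ENDPOINT_ID": "1234567890",
    },
)
print(streaming_url)
```

Failing on a missing value is a deliberate choice here: a URL that still contains a `%...%` token would otherwise only surface as an error when the inference endpoint is first used.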
13 changes: 13 additions & 0 deletions
...tion/inference/put_googlevertexai/examples/request/PutGoogleVertexAiRequestExample11.yaml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| summary: A completion task for Google Model Garden Hugging Face dedicated endpoint with single URL provided for both streaming and non-streaming tasks | ||
| description: Run `PUT _inference/completion/google_model_garden_hugging_face_completion` to create an inference endpoint to perform a `completion` task using Hugging Face's model hosted on Google Model Garden dedicated endpoint with single URL provided for both streaming and non-streaming tasks. See the endpoint's `Sample request` page for the variable values used in the URL. | ||
| method_request: 'PUT _inference/completion/google_model_garden_hugging_face_completion' | ||
| # type: "request" | ||
| value: |- | ||
| { | ||
| "service": "googlevertexai", | ||
| "service_settings": { | ||
| "provider": "hugging_face", | ||
| "service_account_json": "service-account-json", | ||
| "url": "https://%ENDPOINT_ID%.%LOCATION_ID%-%PROJECT_ID%.prediction.vertexai.goog/v1/projects/%PROJECT_ID%/locations/%LOCATION_ID%/endpoints/%ENDPOINT_ID%/chat/completions" | ||
| } | ||
| } |
13 changes: 13 additions & 0 deletions
...tion/inference/put_googlevertexai/examples/request/PutGoogleVertexAiRequestExample12.yaml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| summary: A chat_completion task for Google Model Garden Hugging Face dedicated endpoint with single streaming URL provided | ||
| description: Run `PUT _inference/chat_completion/google_model_garden_hugging_face_chat_completion` to create an inference endpoint to perform a `chat_completion` task using Hugging Face's model hosted on Google Model Garden dedicated endpoint with single streaming URL provided. See the endpoint's `Sample request` page for the variable values used in the URL. | ||
| method_request: 'PUT _inference/chat_completion/google_model_garden_hugging_face_chat_completion' | ||
| # type: "request" | ||
| value: |- | ||
| { | ||
| "service": "googlevertexai", | ||
| "service_settings": { | ||
| "provider": "hugging_face", | ||
| "service_account_json": "service-account-json", | ||
| "streaming_url": "https://%ENDPOINT_ID%.%LOCATION_ID%-%PROJECT_ID%.prediction.vertexai.goog/v1/projects/%PROJECT_ID%/locations/%LOCATION_ID%/endpoints/%ENDPOINT_ID%/chat/completions" | ||
| } | ||
| } |
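
For readers who want to try the request above outside of the Elasticsearch Dev Tools console, the following Python sketch builds the same `PUT` call with the standard library. The Elasticsearch address `http://localhost:9200`, the absence of authentication, and the concrete IDs in the URL are assumptions; the request is only constructed here, not sent.

```python
import json
import urllib.request

# Assumption: a local, unsecured Elasticsearch node. Real clusters will
# need a different address plus an Authorization header.
ES_URL = "http://localhost:9200"
INFERENCE_ID = "google_model_garden_hugging_face_chat_completion"

# Request body from the example above, with made-up values substituted
# for %ENDPOINT_ID%, %LOCATION_ID% and %PROJECT_ID%.
body = {
    "service": "googlevertexai",
    "service_settings": {
        "provider": "hugging_face",
        "service_account_json": "service-account-json",  # placeholder, as in the example
        "streaming_url": (
            "https://1234567890.us-central1-my-project.prediction.vertexai.goog"
            "/v1/projects/my-project/locations/us-central1"
            "/endpoints/1234567890/chat/completions"
        ),
    },
}

request = urllib.request.Request(
    url=f"{ES_URL}/_inference/chat_completion/{INFERENCE_ID}",
    data=json.dumps(body).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="PUT",
)

# Inspect the request without sending it.
print(request.get_method(), request.full_url)

# To actually send it, uncomment the following lines:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```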
13 changes: 13 additions & 0 deletions
...tion/inference/put_googlevertexai/examples/request/PutGoogleVertexAiRequestExample13.yaml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| summary: A completion task for Google Model Garden Hugging Face shared endpoint with single URL provided for both streaming and non-streaming tasks | ||
| description: Run `PUT _inference/completion/google_model_garden_hugging_face_completion` to create an inference endpoint to perform a `completion` task using Hugging Face's model hosted on Google Model Garden shared endpoint with single URL provided for both streaming and non-streaming tasks. See the endpoint's `Sample request` page for the variable values used in the URL. | ||
| method_request: 'PUT _inference/completion/google_model_garden_hugging_face_completion' | ||
| # type: "request" | ||
| value: |- | ||
| { | ||
| "service": "googlevertexai", | ||
| "service_settings": { | ||
| "provider": "hugging_face", | ||
| "service_account_json": "service-account-json", | ||
| "url": "https://%LOCATION_ID%-aiplatform.googleapis.com/v1/projects/%PROJECT_ID%/locations/%LOCATION_ID%/endpoints/%ENDPOINT_ID%/chat/completions" | ||
| } | ||
| } |
13 changes: 13 additions & 0 deletions
...tion/inference/put_googlevertexai/examples/request/PutGoogleVertexAiRequestExample14.yaml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| summary: A chat_completion task for Google Model Garden Hugging Face shared endpoint with single streaming URL provided | ||
| description: Run `PUT _inference/chat_completion/google_model_garden_hugging_face_chat_completion` to create an inference endpoint to perform a `chat_completion` task using Hugging Face's model hosted on Google Model Garden shared endpoint with single streaming URL provided. See the endpoint's `Sample request` page for the variable values used in the URL. | ||
| method_request: 'PUT _inference/chat_completion/google_model_garden_hugging_face_chat_completion' | ||
| # type: "request" | ||
| value: |- | ||
| { | ||
| "service": "googlevertexai", | ||
| "service_settings": { | ||
| "provider": "hugging_face", | ||
| "service_account_json": "service-account-json", | ||
| "streaming_url": "https://%LOCATION_ID%-aiplatform.googleapis.com/v1/projects/%PROJECT_ID%/locations/%LOCATION_ID%/endpoints/%ENDPOINT_ID%/chat/completions" | ||
| } | ||
| } |
15 changes: 15 additions & 0 deletions
...tion/inference/put_googlevertexai/examples/request/PutGoogleVertexAiRequestExample15.yaml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| summary: A completion task for Google Model Garden Mistral serverless endpoint with separate URLs for streaming and non-streaming tasks | ||
| description: Run `PUT _inference/completion/google_model_garden_mistral_completion` to create an inference endpoint to perform a `completion` task using Mistral's serverless model hosted on Google Model Garden with separate URLs for streaming and non-streaming tasks. See the Mistral model documentation for instructions on how to construct URLs. | ||
| method_request: 'PUT _inference/completion/google_model_garden_mistral_completion' | ||
| # type: "request" | ||
| value: |- | ||
| { | ||
| "service": "googlevertexai", | ||
| "service_settings": { | ||
| "provider": "mistral", | ||
| "model_id": "mistral-small-2503", | ||
| "service_account_json": "service-account-json", | ||
| "url": "https://%LOCATION_ID%-aiplatform.googleapis.com/v1/projects/%PROJECT_ID%/locations/%LOCATION_ID%/publishers/mistralai/models/%MODEL_ID%:rawPredict", | ||
| "streaming_url": "https://%LOCATION_ID%-aiplatform.googleapis.com/v1/projects/%PROJECT_ID%/locations/%LOCATION_ID%/publishers/mistralai/models/%MODEL_ID%:streamRawPredict" | ||
| } | ||
| } |
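
In the Mistral serverless example above, `url` and `streaming_url` share the same base and differ only in the trailing action (`:rawPredict` versus `:streamRawPredict`). A small Python sketch that derives both from one base may make that relationship easier to see. The function name `mistral_serverless_urls` and the location and project values are assumptions for illustration; the URL layout is copied from the example.

```python
import json


def mistral_serverless_urls(location_id: str, project_id: str, model_id: str) -> dict:
    """Build the non-streaming and streaming URLs from one shared base.

    Hypothetical helper that mirrors the URL layout used in the example.
    """
    base = (
        f"https://{location_id}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{location_id}/publishers/mistralai/models/{model_id}"
    )
    return {
        "url": f"{base}:rawPredict",
        "streaming_url": f"{base}:streamRawPredict",
    }


# Made-up location and project; the model ID is the one from the example.
urls = mistral_serverless_urls("us-central1", "my-project", "mistral-small-2503")

request_body = {
    "service": "googlevertexai",
    "service_settings": {
        "provider": "mistral",
        "model_id": "mistral-small-2503",
        "service_account_json": "service-account-json",  # placeholder, as in the example
        **urls,
    },
}
print(json.dumps(request_body, indent=2))
```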
14 changes: 14 additions & 0 deletions
...tion/inference/put_googlevertexai/examples/request/PutGoogleVertexAiRequestExample16.yaml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| summary: A chat_completion task for Google Model Garden Mistral serverless endpoint with single streaming URL provided | ||
| description: Run `PUT _inference/chat_completion/google_model_garden_mistral_chat_completion` to create an inference endpoint to perform a `chat_completion` task using Mistral's serverless model hosted on Google Model Garden with single streaming URL provided. See the Mistral model documentation for instructions on how to construct the URL. | ||
| method_request: 'PUT _inference/chat_completion/google_model_garden_mistral_chat_completion' | ||
| # type: "request" | ||
| value: |- | ||
| { | ||
| "service": "googlevertexai", | ||
| "service_settings": { | ||
| "provider": "mistral", | ||
| "model_id": "mistral-small-2503", | ||
| "service_account_json": "service-account-json", | ||
| "streaming_url": "https://%LOCATION_ID%-aiplatform.googleapis.com/v1/projects/%PROJECT_ID%/locations/%LOCATION_ID%/publishers/mistralai/models/%MODEL_ID%:streamRawPredict" | ||
| } | ||
| } |
13 changes: 13 additions & 0 deletions
...tion/inference/put_googlevertexai/examples/request/PutGoogleVertexAiRequestExample17.yaml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| summary: A completion task for Google Model Garden Mistral dedicated endpoint with single URL provided for both streaming and non-streaming tasks | ||
| description: Run `PUT _inference/completion/google_model_garden_mistral_completion` to create an inference endpoint to perform a `completion` task using Mistral's model hosted on Google Model Garden dedicated endpoint with single URL provided for both streaming and non-streaming tasks. See the endpoint's `Sample request` page for the variable values used in the URL. | ||
| method_request: 'PUT _inference/completion/google_model_garden_mistral_completion' | ||
| # type: "request" | ||
| value: |- | ||
| { | ||
| "service": "googlevertexai", | ||
| "service_settings": { | ||
| "provider": "mistral", | ||
| "service_account_json": "service-account-json", | ||
| "url": "https://%ENDPOINT_ID%.%LOCATION_ID%-%PROJECT_ID%.prediction.vertexai.goog/v1/projects/%PROJECT_ID%/locations/%LOCATION_ID%/endpoints/%ENDPOINT_ID%/chat/completions" | ||
| } | ||
| } |
13 changes: 13 additions & 0 deletions
...tion/inference/put_googlevertexai/examples/request/PutGoogleVertexAiRequestExample18.yaml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| summary: A chat_completion task for Google Model Garden Mistral dedicated endpoint with single streaming URL provided | ||
| description: Run `PUT _inference/chat_completion/google_model_garden_mistral_chat_completion` to create an inference endpoint to perform a `chat_completion` task using Mistral's model hosted on Google Model Garden dedicated endpoint with single streaming URL provided. See the endpoint's `Sample request` page for the variable values used in the URL. | ||
| method_request: 'PUT _inference/chat_completion/google_model_garden_mistral_chat_completion' | ||
| # type: "request" | ||
| value: |- | ||
| { | ||
| "service": "googlevertexai", | ||
| "service_settings": { | ||
| "provider": "mistral", | ||
| "service_account_json": "service-account-json", | ||
| "streaming_url": "https://%ENDPOINT_ID%.%LOCATION_ID%-%PROJECT_ID%.prediction.vertexai.goog/v1/projects/%PROJECT_ID%/locations/%LOCATION_ID%/endpoints/%ENDPOINT_ID%/chat/completions" | ||
| } | ||
| } |
13 changes: 13 additions & 0 deletions
...tion/inference/put_googlevertexai/examples/request/PutGoogleVertexAiRequestExample19.yaml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| summary: A completion task for Google Model Garden Mistral shared endpoint with single URL provided for both streaming and non-streaming tasks | ||
| description: Run `PUT _inference/completion/google_model_garden_mistral_completion` to create an inference endpoint to perform a `completion` task using Mistral's model hosted on Google Model Garden shared endpoint with single URL provided for both streaming and non-streaming tasks. See the endpoint's `Sample request` page for the variable values used in the URL. | ||
| method_request: 'PUT _inference/completion/google_model_garden_mistral_completion' | ||
| # type: "request" | ||
| value: |- | ||
| { | ||
| "service": "googlevertexai", | ||
| "service_settings": { | ||
| "provider": "mistral", | ||
| "service_account_json": "service-account-json", | ||
| "url": "https://%LOCATION_ID%-aiplatform.googleapis.com/v1/projects/%PROJECT_ID%/locations/%LOCATION_ID%/endpoints/%ENDPOINT_ID%/chat/completions" | ||
| } | ||
| } |
13 changes: 13 additions & 0 deletions
...tion/inference/put_googlevertexai/examples/request/PutGoogleVertexAiRequestExample20.yaml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| summary: A chat_completion task for Google Model Garden Mistral shared endpoint with single streaming URL provided | ||
| description: Run `PUT _inference/chat_completion/google_model_garden_mistral_chat_completion` to create an inference endpoint to perform a `chat_completion` task using Mistral's model hosted on Google Model Garden shared endpoint with single streaming URL provided. See the endpoint's `Sample request` page for the variable values used in the URL. | ||
| method_request: 'PUT _inference/chat_completion/google_model_garden_mistral_chat_completion' | ||
| # type: "request" | ||
| value: |- | ||
| { | ||
| "service": "googlevertexai", | ||
| "service_settings": { | ||
| "provider": "mistral", | ||
| "service_account_json": "service-account-json", | ||
| "streaming_url": "https://%LOCATION_ID%-aiplatform.googleapis.com/v1/projects/%PROJECT_ID%/locations/%LOCATION_ID%/endpoints/%ENDPOINT_ID%/chat/completions" | ||
| } | ||
| } |
14 changes: 14 additions & 0 deletions
...tion/inference/put_googlevertexai/examples/request/PutGoogleVertexAiRequestExample21.yaml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| summary: A completion task for Google Model Garden AI21 serverless endpoint with separate URLs for streaming and non-streaming tasks | ||
| description: Run `PUT _inference/completion/google_model_garden_ai21_completion` to create an inference endpoint to perform a `completion` task using AI21's model hosted on Google Model Garden serverless endpoint with separate URLs for streaming and non-streaming tasks. See the AI21 model documentation for instructions on how to construct URLs. | ||
| method_request: 'PUT _inference/completion/google_model_garden_ai21_completion' | ||
| # type: "request" | ||
| value: |- | ||
| { | ||
| "service": "googlevertexai", | ||
| "service_settings": { | ||
| "provider": "ai21", | ||
| "service_account_json": "service-account-json", | ||
| "url": "https://%LOCATION_ID%-aiplatform.googleapis.com/v1/projects/%PROJECT_ID%/locations/%LOCATION_ID%/publishers/ai21/models/%MODEL_ID%:rawPredict", | ||
| "streaming_url": "https://%LOCATION_ID%-aiplatform.googleapis.com/v1/projects/%PROJECT_ID%/locations/%LOCATION_ID%/publishers/ai21/models/%MODEL_ID%:streamRawPredict" | ||
| } | ||
| } |
13 changes: 13 additions & 0 deletions
...tion/inference/put_googlevertexai/examples/request/PutGoogleVertexAiRequestExample22.yaml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| summary: A chat_completion task for Google Model Garden AI21 serverless endpoint with single streaming URL provided | ||
| description: Run `PUT _inference/chat_completion/google_model_garden_ai21_chat_completion` to create an inference endpoint to perform a `chat_completion` task using AI21's model hosted on Google Model Garden serverless endpoint with single streaming URL provided. See the AI21 model documentation for instructions on how to construct URLs. | ||
| method_request: 'PUT _inference/chat_completion/google_model_garden_ai21_chat_completion' | ||
| # type: "request" | ||
| value: |- | ||
| { | ||
| "service": "googlevertexai", | ||
| "service_settings": { | ||
| "provider": "ai21", | ||
| "service_account_json": "service-account-json", | ||
| "streaming_url": "https://%LOCATION_ID%-aiplatform.googleapis.com/v1/projects/%PROJECT_ID%/locations/%LOCATION_ID%/publishers/ai21/models/%MODEL_ID%:streamRawPredict" | ||
| } | ||
| } |
14 changes: 14 additions & 0 deletions
...ation/inference/put_googlevertexai/examples/request/PutGoogleVertexAiRequestExample5.yaml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| summary: A completion task for Google Model Garden Meta serverless endpoint with single URL provided for both streaming and non-streaming tasks | ||
| description: Run `PUT _inference/completion/google_model_garden_meta_completion` to create an inference endpoint to perform a `completion` task using Meta's serverless model hosted on Google Model Garden with single URL provided for both streaming and non-streaming tasks. See the Meta model documentation for instructions on how to construct the URL. | ||
| method_request: 'PUT _inference/completion/google_model_garden_meta_completion' | ||
| # type: "request" | ||
| value: |- | ||
| { | ||
| "service": "googlevertexai", | ||
| "service_settings": { | ||
| "provider": "meta", | ||
| "model_id": "meta/llama-3.3-70b-instruct-maas", | ||
| "service_account_json": "service-account-json", | ||
| "url": "https://%LOCATION_ID%-aiplatform.googleapis.com/v1/projects/%PROJECT_ID%/locations/%LOCATION_ID%/endpoints/openapi/chat/completions" | ||
| } | ||
| } |
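
The examples in this diff use three recognizably different URL shapes: dedicated endpoints (host ending in `.prediction.vertexai.goog`), shared endpoints (`...-aiplatform.googleapis.com` with `/endpoints/<id>/chat/completions`), and serverless endpoints (`/publishers/...` or `/endpoints/openapi/...`). The Python sketch below classifies a URL by those shapes. It is a heuristic inferred only from the examples shown here, not a rule taken from Google or Elasticsearch documentation, and the function name `classify_model_garden_url` is hypothetical.

```python
from urllib.parse import urlparse


def classify_model_garden_url(url: str) -> str:
    """Guess the endpoint type from the URL shape.

    Heuristic based solely on the example URLs in this diff; returns
    "dedicated", "serverless", "shared", or "unknown".
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    path = parsed.path

    if host.endswith(".prediction.vertexai.goog"):
        return "dedicated"
    if "/publishers/" in path or "/endpoints/openapi/" in path:
        return "serverless"
    if host.endswith("-aiplatform.googleapis.com") and "/endpoints/" in path:
        return "shared"
    return "unknown"


# URLs follow the templates in the examples, with made-up IDs filled in.
meta_serverless = (
    "https://us-central1-aiplatform.googleapis.com/v1/projects/my-project"
    "/locations/us-central1/endpoints/openapi/chat/completions"
)
shared = (
    "https://us-central1-aiplatform.googleapis.com/v1/projects/my-project"
    "/locations/us-central1/endpoints/1234567890/chat/completions"
)
dedicated = (
    "https://1234567890.us-central1-my-project.prediction.vertexai.goog"
    "/v1/projects/my-project/locations/us-central1"
    "/endpoints/1234567890/chat/completions"
)

for candidate in (meta_serverless, shared, dedicated):
    print(classify_model_garden_url(candidate))
```

The serverless check runs before the shared check on purpose, because the Meta serverless URL also lives on the `-aiplatform.googleapis.com` host and contains `/endpoints/`.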
Could you also add in the `url` and `streaming_url` field comments which providers require which ones? (Basically the same thing we did in the comment in the Java code.)
Added.