
Commit 84ed5f1

Merge pull request #117 from huggingface/updates-header
Minor updates to index and faq
2 parents 95d4d72 + f465b6c commit 84ed5f1

File tree: 2 files changed, +10 -4 lines changed

docs/source/faq.mdx

Lines changed: 4 additions & 1 deletion
@@ -110,7 +110,10 @@ A: This usually means that the port mapping is incorrect. Ensure your app is lis

 ### Q: I'm getting a 500 response in the beginning of my endpoint deployment or when scaling is happening

-A: Confirm that you have a health route implemented in your app that returns a status code 200 when your application is ready to serve requests. Otherwise your app is considered ready as soon as the container has started, potentially resulting in 500s. You can configure the health route in the custom settings of your endpoint.
+A: Confirm that you have a health route implemented in your app that returns a status code 200 when your application is ready to serve requests. Otherwise your app is considered ready as soon as the container has started, potentially resulting in 500s. You can configure the health route in the Container Configuration of your Endpoint.
+
+You can also add the `X-Scale-Up-Timeout` header to your requests. This means that when the endpoint is scaling, the proxy will hold requests until a replica is ready, or time out after the specified number of seconds.
+For example: `X-Scale-Up-Timeout: 600`

 ### Q: I see there's an option to select a Download Pattern under Instance Configuration. What does this mean?
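
The health-route answer above is straightforward to put into practice. As a minimal sketch, assuming a FastAPI app (the framework, route path, and readiness flag are illustrative placeholders, not part of this commit):

```python
from fastapi import FastAPI, Response

app = FastAPI()
ready = False  # flipped to True once the model has finished loading


@app.on_event("startup")
def load_model():
    global ready
    # ... load model weights and warm up here ...
    ready = True


@app.get("/health")
def health(response: Response):
    # A 200 from the configured health route marks the replica as
    # ready to serve; anything else keeps traffic away from it.
    if not ready:
        response.status_code = 503
        return {"status": "loading"}
    return {"status": "ok"}
```

Whatever path you choose here is the one to set as the health route in the Endpoint's Container Configuration.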

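Similarly, a minimal sketch of a client sending the new `X-Scale-Up-Timeout` header, assuming Python's `requests` library; the endpoint URL and token below are placeholders:

```python
import requests

# Placeholder endpoint URL and token -- substitute your own.
API_URL = "https://my-endpoint.endpoints.huggingface.cloud"
headers = {
    "Authorization": "Bearer hf_xxx",
    # Ask the proxy to hold this request for up to 600 seconds
    # while the endpoint scales up, instead of returning a 500.
    "X-Scale-Up-Timeout": "600",
}

response = requests.post(API_URL, headers=headers, json={"inputs": "Hello!"})
print(response.status_code, response.text)
```
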
docs/source/index.mdx

Lines changed: 6 additions & 3 deletions
@@ -1,12 +1,14 @@
 # Inference Endpoints

-Inference Endpoints offers a secure production solution to easily deploy any Transformers, Sentence-Transformers and Diffusers models from the Hub on dedicated and autoscaling infrastructure managed by Hugging Face.
+Inference Endpoints offers a secure production solution to easily deploy any model from the Hub on dedicated and autoscaling infrastructure managed by Hugging Face.

 A Hugging Face Endpoint is built from a [Hugging Face Model Repository](https://huggingface.co/models). When an Endpoint is created, the service creates image artifacts that are either built from the model you select or a custom-provided container image. The image artifacts are completely decoupled from the Hugging Face Hub source repositories to ensure the highest security and reliability levels.

-Inference Endpoints support all of the [Transformers, Sentence-Transformers and Diffusers tasks](/docs/inference-endpoints/supported_tasks) as well as [custom tasks](/docs/inference-endpoints/guides/custom_handler) not supported by Transformers yet like speaker diarization and diffusion.
+Inference Endpoints support all of the [Transformers, Sentence-Transformers and Diffusers tasks](/docs/inference-endpoints/supported_tasks) as well as [custom tasks](/docs/inference-endpoints/guides/custom_handler) not supported by Transformers yet.

-In addition, Inference Endpoints gives you the option to use a custom container image managed on an external service, for instance, [Docker Hub](https://hub.docker.com/), [AWS ECR](https://aws.amazon.com/ecr/?nc1=h_ls), [Azure ACR](https://azure.microsoft.com/de-de/services/container-registry/), or [Google GCR](https://cloud.google.com/container-registry?hl=de).
+In addition, Inference Endpoints gives you the option to use a custom container image managed on an external service, for instance, [Docker Hub](https://hub.docker.com/), [AWS ECR](https://aws.amazon.com/ecr/?nc1=h_ls), [Azure ACR](https://azure.microsoft.com/en-gb/products/container-registry/), or [Google GCR](https://cloud.google.com/artifact-registry/docs?h%3A=en).
+
+Inference Endpoints support all container types, for example: vLLM, TGI (text-generation-inference), TEI (text-embeddings-inference), llama.cpp, and more.

 ![creation-flow](https://raw.githubusercontent.com/huggingface/hf-endpoints-documentation/main/assets/creation_flow.png)
@@ -34,6 +36,7 @@ In addition, Inference Endpoints gives you the option to use a custom container
 * [Access and view Metrics](/docs/inference-endpoints/guides/metrics)
 * [Change Organization or Account](/docs/inference-endpoints/guides/change_organization)
 * [Deploying a llama.cpp Container](/docs/inference-endpoints/guides/llamacpp_container)
+* [Connect Endpoints Metrics with your Internal Tool](/docs/inference-endpoints/guides/openmetrics)

 ### Others

0 commit comments
