## Known Issues
- The PyTorch backend on SBSA is incompatible with bare metal environments like Ubuntu 24.04. Please use the [PyTorch NGC Container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch) for optimal support on SBSA platforms.
## Prototype Features
- [AutoDeploy: Seamless Model Deployment from PyTorch to TensorRT-LLM](./torch/auto_deploy/auto-deploy.md)

AutoDeploy is integrated with the `trtllm-bench` performance benchmarking utility, enabling you to measure comprehensive performance metrics such as token throughput, request throughput, and latency for your AutoDeploy-optimized models.
## Getting Started
Before benchmarking with AutoDeploy, review the [TensorRT-LLM benchmarking guide](../../performance/perf-benchmarking.md#running-with-the-pytorch-workflow) to familiarize yourself with the standard `trtllm-bench` workflow and best practices.
## Basic Usage
Invoke the AutoDeploy backend by specifying `--backend _autodeploy` in your `trtllm-bench` command:
```bash
trtllm-bench \
    --model meta-llama/Llama-3.1-8B \
    throughput \
    --dataset /tmp/synthetic_128_128.txt \
    --backend _autodeploy
```
```{note}
As in the PyTorch workflow, AutoDeploy does not require a separate `trtllm-bench build` step. The model is automatically optimized during benchmark initialization.
```
## Advanced Configuration
For more granular control over AutoDeploy's behavior during benchmarking, use the `--extra_llm_api_options` flag with a YAML configuration file:
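As a rough illustration, such a YAML file might tune runtime settings like the KV cache memory fraction or CUDA graph batch sizes. The keys and values below are a hypothetical sketch, not an authoritative list; consult the AutoDeploy documentation for the options it actually supports.

```yaml
# autodeploy_config.yaml -- illustrative only; key names and values
# are assumptions, check the AutoDeploy docs for supported options.
kv_cache_config:
  free_gpu_memory_fraction: 0.9   # fraction of free GPU memory for the KV cache
cuda_graph_batch_sizes: [1, 2, 4, 8, 16, 32]  # batch sizes to capture as CUDA graphs
```

You would then pass the file to the benchmark, e.g. `trtllm-bench --model meta-llama/Llama-3.1-8B throughput --backend _autodeploy --extra_llm_api_options autodeploy_config.yaml` (the file name here is hypothetical).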