`docs/sample_blueprints/offline-inference-infra/README.md`
---
## Running the Benchmark
Things needed to run the benchmark:
- Model checkpoints pre-downloaded and stored in an Object Storage bucket.
- A pre-authenticated request (PAR) for the object storage where the models are saved, with listing, write, and read permissions.
- A bucket to save the outputs. This does not take a PAR, so it should be a bucket in the same tenancy as your OCI AI Blueprints stack.
- A config `.yaml` file with all the parameters required to run the benchmark, including `input_len`, `output_len`, `gpu_utilization`, etc. (a sketch is shown after this list).
- A deployment `.json` file to deploy your blueprint.
- Sample deployment and config files are provided below along with links.
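
As an illustration only, a minimal sketch of such a benchmark config might look like the following. It is built around the parameters named above (`input_len`, `output_len`, `gpu_utilization`); all other keys (for example `model_path` and `num_prompts`) are assumptions, not the authoritative schema.

```yaml
# Illustrative sketch of a benchmark config (assumed keys, not the authoritative schema).
model_path: /models/Llama-3.1-8B-Instruct   # hypothetical mount path of the pre-downloaded checkpoint
input_len: 1024                             # prompt length in tokens
output_len: 512                             # tokens generated per prompt
gpu_utilization: 0.90                       # fraction of GPU memory the inference engine may use
num_prompts: 1000                           # hypothetical: number of prompts to benchmark
output_dir: /results                        # hypothetical: local path mounted to the output bucket
```

Refer to the sample config file linked below for the actual parameter names and values.
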
This blueprint supports benchmark execution via a job-mode recipe using a YAML config file. The recipe mounts a model and config file from Object Storage, runs offline inference, and logs metrics.
Note: Make sure your output object storage is in the same tenancy as your stack.
---
### [Sample Blueprint (Job Mode for Offline SGLang Inference)](https://github.com/oracle-quickstart/oci-ai-blueprints/blob/offline-inference-benchmark/docs/sample_blueprints/offline-inference-infra/offline_deployment_sglang.json)
```json
{
  ...
}
```
---
### [Sample Blueprint (Job Mode for Offline vLLM Inference)](https://github.com/oracle-quickstart/oci-ai-blueprints/blob/offline-inference-benchmark/docs/sample_blueprints/offline-inference-infra/offline_deployment_sglang.json)