
Commit 13e490c: addressed PR comments
1 parent da66e95

1 file changed: docs/sample_blueprints/offline-inference-infra/README.md (+52, -3)
---

## Running the Benchmark

To run the benchmark, you need:

- Model checkpoints pre-downloaded and stored in object storage.
- A pre-authenticated request (PAR) for the object storage bucket where the models are saved, with list, read, and write permissions.
- A bucket to save the outputs. Output storage does not take a PAR, so it must be a bucket in the same tenancy as your OCI Blueprints stack.
- A config `.yaml` file with all the parameters required to run the benchmark, such as `input_len`, `output_len`, and the GPU utilization value.
- A deployment `.json` file to deploy your blueprint.

Sample deployment and config files are provided below, along with links.
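
Before uploading the config to object storage, it can help to sanity-check that it contains the parameters the benchmark expects. This is a minimal sketch, not part of the blueprint: the required key names (`input_len`, `output_len`, `gpu_memory_utilization`) are assumptions based on the parameters mentioned above, so adjust them to match your actual config file.

```python
# Sketch: sanity-check a flat "key: value" benchmark config before uploading.
# REQUIRED_KEYS are assumed names, not an official schema - adjust as needed.
REQUIRED_KEYS = {"benchmark_type", "model", "input_len",
                 "output_len", "gpu_memory_utilization", "save_metrics_path"}

def parse_flat_yaml(text):
    """Parse 'key: value' lines of a flat YAML file (no nesting)."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        config[key.strip()] = value.strip().strip('"')
    return config

def missing_keys(config):
    """Return the required keys that the config does not define."""
    return sorted(REQUIRED_KEYS - config.keys())

sample = """\
benchmark_type: offline
model: /models/NousResearch/Meta-Llama-3.1-8B
input_len: 128
output_len: 128
gpu_memory_utilization: 0.9
save_metrics_path: /mlcommons_output/out.json
"""
print(missing_keys(parse_flat_yaml(sample)))  # []
```

A failed check before deployment is much cheaper than a job that mounts the bucket, starts, and then crashes on a missing parameter.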

This blueprint supports benchmark execution via a job-mode recipe using a YAML config file. The recipe mounts a model and config file from Object Storage, runs offline inference, and logs metrics.

Note: Make sure your output object storage is in the same tenancy as your stack.

---

### [Sample Blueprint (Job Mode for Offline SGLang Inference)](https://github.com/oracle-quickstart/oci-ai-blueprints/blob/offline-inference-benchmark/docs/sample_blueprints/offline-inference-infra/offline_deployment_sglang.json)

```json
{
...
```

---
### [Sample Blueprint (Job Mode for Offline vLLM Inference)](https://github.com/oracle-quickstart/oci-ai-blueprints/blob/offline-inference-benchmark/docs/sample_blueprints/offline-inference-infra/offline_deployment_sglang.json)

```json
{
  "recipe_id": "offline_inference_vllm",
  "recipe_mode": "job",
  "deployment_name": "Offline Inference Benchmark vllm",
  "recipe_image_uri": "iad.ocir.io/iduyx1qnmway/corrino-devops-repository:llm-benchmark-0409-v4",
  "recipe_node_shape": "VM.GPU.A10.2",
  "input_object_storage": [
    {
      "par": "https://objectstorage.ap-melbourne-1.oraclecloud.com/p/0T99iRADcM08aVpumM6smqMIcnIJTFtV2D8ZIIWidUP9eL8GSRyDMxOb9Va9rmRc/n/iduyx1qnmway/b/mymodels/o/",
      "mount_location": "/models",
      "volume_size_in_gbs": 500,
      "include": [
        "offline_vllm_example.yaml",
        "NousResearch/Meta-Llama-3.1-8B"
      ]
    }
  ],
  "output_object_storage": [
    {
      "bucket_name": "inference_output",
      "mount_location": "/mlcommons_output",
      "volume_size_in_gbs": 200
    }
  ],
  "recipe_container_command_args": [
    "/models/offline_vllm_example.yaml"
  ],
  "recipe_replica_count": 1,
  "recipe_container_port": "8000",
  "recipe_nvidia_gpu_count": 2,
  "recipe_node_pool_size": 1,
  "recipe_node_boot_volume_size_in_gbs": 200,
  "recipe_ephemeral_storage_size": 100,
  "recipe_shared_memory_volume_size_limit_in_mb": 200
}
```
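
A common failure mode with deployments like the sample above is a mismatch between the config path passed in `recipe_container_command_args` and what is actually mounted from `input_object_storage`. The sketch below cross-checks those fields; the consistency rules are inferred from the sample, not an official validator, and the trimmed JSON is illustrative only.

```python
import json

# Sketch: consistency checks on a deployment JSON like the sample above.
# The rules are inferred from the sample, not an official validation step.
deployment = json.loads("""
{
  "recipe_id": "offline_inference_vllm",
  "recipe_node_shape": "VM.GPU.A10.2",
  "recipe_nvidia_gpu_count": 2,
  "input_object_storage": [
    {"mount_location": "/models",
     "include": ["offline_vllm_example.yaml", "NousResearch/Meta-Llama-3.1-8B"]}
  ],
  "recipe_container_command_args": ["/models/offline_vllm_example.yaml"]
}
""")

def check(dep):
    problems = []
    mounts = {m["mount_location"]: m.get("include", [])
              for m in dep["input_object_storage"]}
    # Every path handed to the container must come from an input mount,
    # and the file must be in that mount's include list.
    for arg in dep["recipe_container_command_args"]:
        mount = next((m for m in mounts if arg.startswith(m + "/")), None)
        if mount is None:
            problems.append(f"{arg} is not under any input mount")
        elif arg[len(mount) + 1:] not in mounts[mount]:
            problems.append(f"{arg} is not in the include list for {mount}")
    # The GPU count suffix of the shape (e.g. VM.GPU.A10.2) should match
    # the requested recipe_nvidia_gpu_count.
    shape_gpus = int(dep["recipe_node_shape"].rsplit(".", 1)[1])
    if shape_gpus != dep["recipe_nvidia_gpu_count"]:
        problems.append("recipe_nvidia_gpu_count does not match node shape")
    return problems

print(check(deployment))  # []
```

Note that in the sample, `recipe_nvidia_gpu_count: 2` matches the two A10 GPUs implied by the `VM.GPU.A10.2` shape; the check above encodes that relationship.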

---

## [Sample Config File SGLang - 1 (`new_example_sglang.yaml`)](https://github.com/oracle-quickstart/oci-ai-blueprints/blob/offline-inference-benchmark/docs/sample_blueprints/offline-inference-infra/new_example_sglang.yaml)

```yaml
benchmark_type: offline
...
run_name: "llama3-8b-sglang-test"
save_metrics_path: /mlcommons_output/benchmark_output_llama3_sglang.json
```
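
After the job completes, the metrics land at `save_metrics_path` inside the output mount (`/mlcommons_output` in the samples). A minimal sketch for picking them up follows; the metric name used in the demo is a placeholder, since the actual metrics schema is not shown here.

```python
import json
import pathlib
import tempfile

def load_metrics(path):
    """Load the benchmark metrics JSON written to save_metrics_path."""
    return json.loads(pathlib.Path(path).read_text())

# Demo with a throwaway file; "throughput_tokens_per_s" is a placeholder
# key, not the real schema of the benchmark output.
with tempfile.TemporaryDirectory() as d:
    path = pathlib.Path(d) / "benchmark_output_llama3_sglang.json"
    path.write_text(json.dumps({"throughput_tokens_per_s": 1234.5}))
    metrics = load_metrics(path)
    print(metrics["throughput_tokens_per_s"])  # 1234.5
```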

## [Sample Config File - 2 vLLM (`offline_vllm_example.yaml`)](https://github.com/oracle-quickstart/oci-ai-blueprints/blob/offline-inference-benchmark/docs/sample_blueprints/offline-inference-infra/offline_vllm_example.yaml)

```yaml
benchmark_type: offline
model: /models/NousResearch/Meta-Llama-3.1-8B
...
```
