Labels: internal (filed by core contributor or associate)
Description
Describe the bug
It takes several minutes to generate random requests.
guidellm benchmark --target=$URL --model=$MODEL --rate-type=concurrent --rate=200 --max-requests=200 --output-path=~/llama-70b.json --processor=$MODEL --data='{"prompt_tokens":20000, "output_tokens":5000}'
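As a rough illustration of the scale involved (this is a hypothetical sketch, not guidellm's actual implementation), 200 requests of 20,000 prompt tokens each means synthesizing roughly four million tokens up front. A naive single-threaded sampling loop like the one below shows why that preprocessing step can take noticeable time:

```python
import random
import time

# Toy vocabulary standing in for a real tokenizer's vocab (assumption:
# guidellm builds synthetic prompts by sampling tokens; the real code
# may differ, e.g. it may use the model's tokenizer directly).
VOCAB = [f"tok{i}" for i in range(1000)]

def make_prompt(n_tokens: int, rng: random.Random) -> str:
    # Naively sample n_tokens tokens and join them into one prompt string.
    return " ".join(rng.choices(VOCAB, k=n_tokens))

start = time.perf_counter()
rng = random.Random(0)
# 200 requests x 20,000 prompt tokens, matching the command above.
prompts = [make_prompt(20_000, rng) for _ in range(200)]
elapsed = time.perf_counter() - start
print(f"Generated {len(prompts)} prompts in {elapsed:.1f}s")
```

Even this toy loop touches four million samples; a real implementation that tokenizes or detokenizes with a full-size vocabulary would be correspondingly slower, which may explain the multi-minute delay at "Creating request loader...".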
This takes multiple minutes at:
Creating backend...
Backend openai_http connected to http://10.16.1.185:8000 for model meta-llama/Llama-3.3-70B-Instruct.
Creating request loader...
Expected behavior
- I would expect request generation to complete much more quickly (seconds rather than minutes)
Environment
Include all relevant environment information:
- OS [e.g. Ubuntu 20.04]:
- Python version [e.g. 3.12.2]:
To Reproduce
Exact steps to reproduce the behavior:
Errors
If applicable, add a full print-out of any errors or exceptions that are raised or include screenshots to help explain your problem.
Additional context
Add any other context about the problem here. Also include any relevant files.
Reported by: ivanbaldo