```bash
git clone https://github.com/chigkim/prompt-test
cd prompt-test
pip install openai
```
- Modify `test.py` for your setup.
- Run `python test.py`.
To test with the default prompt setup, launch your server with a 36k context length. Otherwise, modify `prompt.txt` to fit your needs.
Metrics are calculated as follows:
- Time to First Token (TTFT): Measured from the start of the streaming request to the first streaming event received.
- Prompt Processing Speed (PP): Number of prompt tokens divided by TTFT.
- Token Generation Speed (TG): Number of generated tokens divided by (total duration - TTFT).
The displayed results were truncated to two decimal places, but the calculations used full precision.
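As a rough illustration, here is a minimal sketch of how these metrics can be derived from a streaming request with the `openai` client. The base URL, model name, and prompt token count are placeholders, not the exact values used in `test.py`:

```python
import time
from openai import OpenAI

# Placeholder endpoint and model; point these at your own server.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

prompt = "Your benchmark prompt here."
start = time.perf_counter()
ttft = None
completion_tokens = 0

stream = client.chat.completions.create(
    model="local-model",
    messages=[{"role": "user", "content": prompt}],
    stream=True,
)
for event in stream:
    if ttft is None:
        # TTFT: start of the request to the first streaming event
        ttft = time.perf_counter() - start
    if event.choices and event.choices[0].delta.content:
        completion_tokens += 1  # crude: counts one token per content chunk

total = time.perf_counter() - start
prompt_tokens = 100  # assumed known, e.g. from a tokenizer or server stats
pp = prompt_tokens / ttft                 # prompt processing speed (tokens/s)
tg = completion_tokens / (total - ttft)   # token generation speed (tokens/s)
print(f"TTFT: {ttft:.2f}s  PP: {pp:.2f} tok/s  TG: {tg:.2f} tok/s")
```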
Some servers don't let you disable prompt caching. To work around this, the script prepends 40% new material to the beginning of each successively longer prompt, so a cached prefix from an earlier run can't be reused. It also alternates between two system prompts with the same number of tokens on every test.
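A sketch of the idea, assuming random filler text; the function name and the way the 40% ratio is applied are illustrative, not the exact code in `test.py`:

```python
import random
import string

def cache_busting_prompt(base_text: str, new_fraction: float = 0.4) -> str:
    """Prepend fresh random material so a cached prefix from an earlier,
    shorter prompt can no longer match."""
    n_chars = int(len(base_text) * new_fraction)
    filler = "".join(random.choices(string.ascii_lowercase + " ", k=n_chars))
    return filler + "\n" + base_text

# Two system prompts with (ideally) identical token counts, alternated per test.
system_prompts = ["You are assistant A. Answer concisely.",
                  "You are assistant B. Answer concisely."]

base = "word " * 500  # stand-in for the real prompt material
for i in range(4):
    system = system_prompts[i % 2]            # alternate every test
    prompt = cache_busting_prompt(base * (i + 1))  # each test is longer
```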
Some Mac models offer a High Power mode that prevents performance throttling. Without it, speed results may fluctuate significantly. To enable High Power mode:
- Open System Settings > Battery.
- Under "Energy Mode" for "On Power Adapter," select High Power.
Even with this setting, I noticed some throttling. You might want to increase `cooling` to 120 in `test.py`.
By default, macOS limits GPU memory usage to 2/3 or 3/4 of total system memory, depending on your model. To increase this limit, run the following command in the terminal before executing the script:

```bash
sudo sysctl iogpu.wired_limit_mb=57344
```
The setting resets on the next reboot. To make it persistent, add `iogpu.wired_limit_mb=57344` to `/etc/sysctl.conf` and reboot.
For a 64GB system, this allows the GPU to use up to 56GB, leaving 8GB for other processes.
Calculation: (64GB - 8GB) × 1024MB/GB = 57344MB
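To compute the value for a different amount of RAM, a minimal sketch; the 8GB reserve is this example's choice, not a fixed rule:

```python
def wired_limit_mb(total_gb: int, reserve_gb: int = 8) -> int:
    """MB value for iogpu.wired_limit_mb, leaving reserve_gb for other processes."""
    return (total_gb - reserve_gb) * 1024

print(wired_limit_mb(64))   # 57344
print(wired_limit_mb(128))  # 122880
```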