Update guidellm #6
Open
Chibukach wants to merge 91 commits into main from update_guidellm
Commits
f6aa8fe
simple change
18f41b1
test lmeval change
a425d43
update branch
6fc29f4
use main
956a12b
remove gcs
5e09fb7
readd gc
655f00e
remove gc
ba703b0
back to guidellm
b4deac8
simplified
6ed6862
simple vllm
b3f55bc
skip vllm
3a709da
pause vllm
02cac57
update benchmark report
a85bb4f
update ip
c3af0cf
update branch
ede7482
added base task param
87496ea
retry branch name
b64ffd8
repo branch
7dc5e48
readd branch
2d05c64
branch in base task
60e6e9e
optional branch
ee4d7c9
add branch choice
998a8bc
include benchmark
6944cb4
refactor default
6e4a5d5
moved generate text
41f3f21
test
850fd21
add debug
5e87674
add os lib
c9b63a8
use default scenario
4d68ea8
benchmark with scenario
0f07b28
overlap with guidellm vars
6a67050
check model and target
72094b4
add debugs
10180a3
list keys that overlap
9191f13
only replace model
1b0e4a4
update with scenario
7515a61
readd default scenario
e6318f5
readd default scenario
9f61d6e
pin to main
8c8c23e
readd vllm server
ec725d1
updated vllm server
5b22309
print the input vars
5e8053a
remove gpu count
af3ebaa
simple path
5c4f5b8
vllm print
b8a1e9f
added cwd
0365496
ensure setup uses branch
348fd82
add guide again
cb882af
readd gpu count
464591e
update vllm server
c0d0dba
revert target
81c62f7
install editable guidellm
97e36cb
print package list
063c8b9
added package print
d6ef266
older guidellm
8c64910
updated to use dev branch
7dee38b
redo with custom branch
263c2ff
repo override
90e461b
add packages to guidellm
4f00a5a
update setup.py
14f84ce
readd
ad2b423
before vllm
98eb6f8
removed vllm
10874d3
remove vllm
629d195
cleanup
768d135
back to base
09c3978
readd
e64fb12
readd start vllm server
873c222
use guidellm branch
16b83bc
base complete
432031e
test rag
e9117ea
clean up
9984a8c
base package as variable
b8b51e9
test default branch change
b99afec
update branch names
b2c2918
use main branch in config
d1e686b
print the scenario
5d3e3ff
modify tokens
3b0d86c
revert lmeval and setup.py, update vllm server log
a2d6eb5
readd default scenarios
81f5199
change default guidellm json
1550333
add config examples json
420137d
use original default
9d284c9
add log
e863516
include user scenario
3703e62
revert lmeval example
d1b985a
add file error handling
e60aab1
removed package prints
515a1db
default config
ac9ef63
readd output path
69638ea
onpremise settings
@@ -1,2 +1,4 @@
-DEFAULT_DOCKER_IMAGE = "498127099666.dkr.ecr.us-east-1.amazonaws.com/mlops/k8s-research-cuda12_5:latest"
-DEFAULT_OUTPUT_URI = "gs://neuralmagic-clearml"
+DEFAULT_DOCKER_IMAGE = "498127099666.dkr.ecr.us-east-1.amazonaws.com/mlops/k8s-research-cuda12_8:latest"
+DEFAULT_OUTPUT_URI = "gs://neuralmagic-clearml"
+DEFAULT_RESEARCH_BRANCH = "main"
+DEFAULT_GUIDELLM_SCENARIO = "chat"
13 changes: 13 additions & 0 deletions
src/automation/standards/benchmarking/benchmarking_128k.json
@@ -0,0 +1,13 @@
+{
+    "rate_type": "sweep",
+    "data": {
+        "prompt_tokens": 128000,
+        "prompt_tokens_stdev": 128,
+        "prompt_tokens_min": 1,
+        "prompt_tokens_max": 128000,
+        "output_tokens": 2048,
+        "output_tokens_stdev": 64,
+        "output_tokens_min": 1,
+        "output_tokens_max": 2048
+    }
+}
13 changes: 13 additions & 0 deletions
src/automation/standards/benchmarking/benchmarking_16k.json
@@ -0,0 +1,13 @@
+{
+    "rate_type": "sweep",
+    "data": {
+        "prompt_tokens": 16000,
+        "prompt_tokens_stdev": 128,
+        "prompt_tokens_min": 1,
+        "prompt_tokens_max": 16000,
+        "output_tokens": 2048,
+        "output_tokens_stdev": 64,
+        "output_tokens_min": 1,
+        "output_tokens_max": 2048
+    }
+}
13 changes: 13 additions & 0 deletions
src/automation/standards/benchmarking/benchmarking_32k.json
@@ -0,0 +1,13 @@
+{
+    "rate_type": "sweep",
+    "data": {
+        "prompt_tokens": 32000,
+        "prompt_tokens_stdev": 128,
+        "prompt_tokens_min": 1,
+        "prompt_tokens_max": 32000,
+        "output_tokens": 2048,
+        "output_tokens_stdev": 64,
+        "output_tokens_min": 1,
+        "output_tokens_max": 2048
+    }
+}
13 changes: 13 additions & 0 deletions
src/automation/standards/benchmarking/benchmarking_64k.json
@@ -0,0 +1,13 @@
+{
+    "rate_type": "sweep",
+    "data": {
+        "prompt_tokens": 64000,
+        "prompt_tokens_stdev": 128,
+        "prompt_tokens_min": 1,
+        "prompt_tokens_max": 64000,
+        "output_tokens": 2048,
+        "output_tokens_stdev": 64,
+        "output_tokens_min": 1,
+        "output_tokens_max": 2048
+    }
+}
13 changes: 13 additions & 0 deletions
src/automation/standards/benchmarking/benchmarking_chat.json
@@ -0,0 +1,13 @@
+{
+    "rate_type": "sweep",
+    "data": {
+        "prompt_tokens": 512,
+        "prompt_tokens_stdev": 128,
+        "prompt_tokens_min": 1,
+        "prompt_tokens_max": 512,
+        "output_tokens": 256,
+        "output_tokens_stdev": 64,
+        "output_tokens_min": 1,
+        "output_tokens_max": 256
+    }
+}
13 changes: 13 additions & 0 deletions
src/automation/standards/benchmarking/benchmarking_code_completion.json
@@ -0,0 +1,13 @@
+{
+    "rate_type": "sweep",
+    "data": {
+        "prompt_tokens": 256,
+        "prompt_tokens_stdev": 128,
+        "prompt_tokens_min": 1,
+        "prompt_tokens_max": 256,
+        "output_tokens": 1024,
+        "output_tokens_stdev": 64,
+        "output_tokens_min": 1,
+        "output_tokens_max": 1024
+    }
+}
13 changes: 13 additions & 0 deletions
src/automation/standards/benchmarking/benchmarking_code_fixing.json
@@ -0,0 +1,13 @@
+{
+    "rate_type": "sweep",
+    "data": {
+        "prompt_tokens": 1024,
+        "prompt_tokens_stdev": 128,
+        "prompt_tokens_min": 1,
+        "prompt_tokens_max": 1024,
+        "output_tokens": 1024,
+        "output_tokens_stdev": 64,
+        "output_tokens_min": 1,
+        "output_tokens_max": 1024
+    }
+}
13 changes: 13 additions & 0 deletions
src/automation/standards/benchmarking/benchmarking_docstring_generation.json
@@ -0,0 +1,13 @@
+{
+    "rate_type": "sweep",
+    "data": {
+        "prompt_tokens": 768,
+        "prompt_tokens_stdev": 128,
+        "prompt_tokens_min": 1,
+        "prompt_tokens_max": 768,
+        "output_tokens": 128,
+        "output_tokens_stdev": 64,
+        "output_tokens_min": 1,
+        "output_tokens_max": 128
+    }
+}
13 changes: 13 additions & 0 deletions
src/automation/standards/benchmarking/benchmarking_instruction.json
@@ -0,0 +1,13 @@
+{
+    "rate_type": "sweep",
+    "data": {
+        "prompt_tokens": 256,
+        "prompt_tokens_stdev": 128,
+        "prompt_tokens_min": 1,
+        "prompt_tokens_max": 256,
+        "output_tokens": 128,
+        "output_tokens_stdev": 64,
+        "output_tokens_min": 1,
+        "output_tokens_max": 128
+    }
+}
13 changes: 13 additions & 0 deletions
src/automation/standards/benchmarking/benchmarking_long_rag.json
@@ -0,0 +1,13 @@
+{
+    "rate_type": "sweep",
+    "data": {
+        "prompt_tokens": 10240,
+        "prompt_tokens_stdev": 128,
+        "prompt_tokens_min": 1,
+        "prompt_tokens_max": 10240,
+        "output_tokens": 1536,
+        "output_tokens_stdev": 64,
+        "output_tokens_min": 1,
+        "output_tokens_max": 1536
+    }
+}
13 changes: 13 additions & 0 deletions
src/automation/standards/benchmarking/benchmarking_rag.json
@@ -0,0 +1,13 @@
+{
+    "rate_type": "sweep",
+    "data": {
+        "prompt_tokens": 1024,
+        "prompt_tokens_stdev": 128,
+        "prompt_tokens_min": 1,
+        "prompt_tokens_max": 1024,
+        "output_tokens": 128,
+        "output_tokens_stdev": 64,
+        "output_tokens_min": 1,
+        "output_tokens_max": 128
+    }
+}
13 changes: 13 additions & 0 deletions
src/automation/standards/benchmarking/benchmarking_summarization.json
@@ -0,0 +1,13 @@
+{
+    "rate_type": "sweep",
+    "data": {
+        "prompt_tokens": 1024,
+        "prompt_tokens_stdev": 128,
+        "prompt_tokens_min": 1,
+        "prompt_tokens_max": 1024,
+        "output_tokens": 128,
+        "output_tokens_stdev": 64,
+        "output_tokens_min": 1,
+        "output_tokens_max": 128
+    }
+}
@@ -0,0 +1,13 @@
+{
+    "rate_type": "sweep",
+    "data": {
+        "prompt_tokens": 512,
+        "prompt_tokens_stdev": 128,
+        "prompt_tokens_min": 1,
+        "prompt_tokens_max": 1024,
+        "output_tokens": 256,
+        "output_tokens_stdev": 64,
+        "output_tokens_min": 1,
+        "output_tokens_max": 1024
+    }
+}
@@ -0,0 +1,13 @@
+{
+    "rate_type": "sweep",
+    "data": {
+        "prompt_tokens": 4096,
+        "prompt_tokens_stdev": 512,
+        "prompt_tokens_min": 2048,
+        "prompt_tokens_max": 6144,
+        "output_tokens": 512,
+        "output_tokens_stdev": 128,
+        "output_tokens_min": 1,
+        "output_tokens_max": 1024
+    }
+}
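Every scenario file above shares one shape: a `rate_type` plus a `data` block that pairs a mean token count with a stdev, min, and max for prompts and outputs. The sketch below is not part of this PR; it shows one way such a file could be sanity-checked before a benchmark run. The names `check_scenario` and `REQUIRED_DATA_KEYS` are hypothetical, and the `min <= mean <= max` rule is an assumption about what counts as a sensible scenario, not something GuideLLM is known to enforce.

```python
import json
from typing import List

# Keys that appear in the "data" block of every scenario file in this PR.
REQUIRED_DATA_KEYS = (
    "prompt_tokens", "prompt_tokens_stdev", "prompt_tokens_min", "prompt_tokens_max",
    "output_tokens", "output_tokens_stdev", "output_tokens_min", "output_tokens_max",
)


def check_scenario(scenario: dict) -> List[str]:
    """Return a list of problems found in a scenario dict (empty if none)."""
    problems = []
    if "rate_type" not in scenario:
        problems.append("missing rate_type")
    data = scenario.get("data", {})
    for key in REQUIRED_DATA_KEYS:
        if key not in data:
            problems.append(f"missing data.{key}")
    # Assumed rule: the mean should lie inside the [min, max] range.
    for prefix in ("prompt_tokens", "output_tokens"):
        try:
            mean, low, high = data[prefix], data[f"{prefix}_min"], data[f"{prefix}_max"]
        except KeyError:
            continue  # already reported as missing above
        if not low <= mean <= high:
            problems.append(f"{prefix}: mean {mean} outside [{low}, {high}]")
    return problems


# The contents of benchmarking_chat.json, inlined so the sketch is self-contained.
chat = json.loads("""
{
    "rate_type": "sweep",
    "data": {
        "prompt_tokens": 512, "prompt_tokens_stdev": 128,
        "prompt_tokens_min": 1, "prompt_tokens_max": 512,
        "output_tokens": 256, "output_tokens_stdev": 64,
        "output_tokens_min": 1, "output_tokens_max": 256
    }
}
""")
print(check_scenario(chat))  # prints []
```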
@@ -1,34 +1,41 @@
 from clearml import Task
 from typing import Sequence, Optional
-from automation.configs import DEFAULT_OUTPUT_URI
+from automation.configs import DEFAULT_OUTPUT_URI, DEFAULT_RESEARCH_BRANCH
 from automation.standards import STANDARD_CONFIGS
 import yaml
 import os

 class BaseTask():

-    base_packages = ["git+https://github.com/neuralmagic/research.git"]
+    #base_packages = ["git+https://github.com/neuralmagic/research.git"]
Review comment: Remove commented code
+    #base_packages = ["git+https://github.com/neuralmagic/research.git@update_guidellm"]

     def __init__(
         self,
         project_name: str,
         task_name: str,
         docker_image: str,
+        branch: Optional[str] = DEFAULT_RESEARCH_BRANCH,
         packages: Optional[Sequence[str]]=None,
         task_type: str="training",
     ):
+        branch_name = branch or DEFAULT_RESEARCH_BRANCH
+        base_packages = [f"git+https://github.com/neuralmagic/research.git@{branch_name}"]

         if packages is not None:
-            packages = list(set(packages + self.base_packages))
+            packages = list(set(packages + base_packages))
         else:
-            packages = self.base_packages
+            packages = base_packages

+        print(packages)

         self.project_name = project_name
         self.task_name = task_name
         self.docker_image = docker_image
         self.packages = packages
         self.task_type = task_type
         self.task = None
+        self.branch= branch
Review comment: Suggested change

         self.script_path = None
         self.callable_artifacts = None

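One detail in this hunk may be worth a second look: the constructor computes `branch_name = branch or DEFAULT_RESEARCH_BRANCH` for the package URL but stores the raw `branch` on `self.branch`, so a caller passing `branch=None` would later hand `None` to `create_task`. The sketch below is an illustration, not the PR's code. It resolves the branch once and returns both values. The helper name `resolve_packages` and the constant `RESEARCH_REPO` are hypothetical. Converting `packages` to a list first is an added assumption, since `packages + base_packages` raises `TypeError` when a tuple is passed.

```python
from typing import List, Optional, Sequence, Tuple

DEFAULT_RESEARCH_BRANCH = "main"  # mirrors the constant added in this PR
RESEARCH_REPO = "git+https://github.com/neuralmagic/research.git"  # hypothetical name


def resolve_packages(
    branch: Optional[str],
    packages: Optional[Sequence[str]] = None,
) -> Tuple[str, List[str]]:
    """Return the resolved branch name and the merged package list."""
    # Fall back to the default when branch is None or an empty string.
    branch_name = branch or DEFAULT_RESEARCH_BRANCH
    base_packages = [f"{RESEARCH_REPO}@{branch_name}"]
    # dict.fromkeys removes duplicates while keeping order, unlike set().
    merged = list(dict.fromkeys(list(packages or []) + base_packages))
    return branch_name, merged


branch_name, packages = resolve_packages(None, ("vllm", "vllm"))
print(branch_name)  # prints main
print(packages)
```

Storing the resolved `branch_name` rather than the raw argument would keep the installed package and the branch passed to `create_task` in agreement.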
@@ -50,8 +57,8 @@ def process_config(self, config):
             return yaml.safe_load(open(STANDARD_CONFIGS[config], "r"))
         elif os.path.exists(config):
             return yaml.safe_load(open(config, "r"))
-        elif os.path.exists(os.path.join("..", "standatrds", config)):
-            return yaml.safe_load(open(os.path.join("..", "standatrds", config)), "r")
+        elif os.path.exists(os.path.join("..", "standards", config)):
+            return yaml.safe_load(open(os.path.join("..", "standards", config)), "r")
         else:
             return yaml.safe_load(config)

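The hunk fixes the `standatrds` typo, but in both the old and new lines the `"r"` mode sits outside the `open(...)` call, so it is passed to `yaml.safe_load` as a second positional argument. `safe_load` takes only the stream, so that branch would raise `TypeError` if it were ever reached. The sketch below shows one possible shape for the same fallback chain with the parenthesis moved and the file handles closed. The function name `load_config` and the `standard_configs` parameter are hypothetical stand-ins, and the first branch's condition is an assumption because the diff does not show it.

```python
import os
from typing import Any, Dict, Optional

import yaml  # PyYAML, already imported by the module in this PR


def load_config(config: str, standard_configs: Optional[Dict[str, str]] = None) -> Any:
    """Resolve a config name, file path, or inline YAML string to parsed data."""
    standard_configs = standard_configs or {}
    relative_path = os.path.join("..", "standards", config)

    if config in standard_configs:      # assumed condition for the first branch
        path = standard_configs[config]
    elif os.path.exists(config):        # explicit path on disk
        path = config
    elif os.path.exists(relative_path): # path relative to ../standards
        path = relative_path
    else:
        # Nothing on disk matched, so treat the argument as inline YAML.
        return yaml.safe_load(config)

    # The mode goes to open(), and the handle is closed on exit.
    with open(path, "r") as handle:
        return yaml.safe_load(handle)


print(load_config('{"rate_type": "sweep"}'))
```

Because JSON is largely a subset of YAML, the same loader also handles the scenario files added under `src/automation/standards/benchmarking`.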
@@ -91,7 +98,7 @@ def create_task(self):
             add_task_init_call=True,
             script=self.script_path,
             repo="https://github.com/neuralmagic/research.git",
-            branch="main",
+            branch=self.branch,
         )
         self.task.output_uri = DEFAULT_OUTPUT_URI
         self.set_arguments()

Let us not define a default guidellm scenario. The user must always specify which benchmarking scenario they are running.
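A minimal sketch of what this request could look like in code, assuming scenarios are selected by name and map onto the `benchmarking_<name>.json` files added in this PR. The helper `scenario_path` and the constant `SCENARIO_DIR` are hypothetical, and how the PR actually resolves a scenario is not shown in the visible diff.

```python
from pathlib import Path
from typing import Optional

# Directory holding the scenario files added in this PR.
SCENARIO_DIR = Path("src/automation/standards/benchmarking")


def scenario_path(scenario: Optional[str]) -> Path:
    """Map a scenario name to its JSON file, refusing to guess a default."""
    if not scenario:
        # No fallback such as "chat": the caller has to choose explicitly.
        raise ValueError(
            "a guidellm scenario must be specified, for example 'chat' or 'rag'"
        )
    return SCENARIO_DIR / f"benchmarking_{scenario}.json"


print(scenario_path("chat").name)  # prints benchmarking_chat.json
```

With this shape, dropping `DEFAULT_GUIDELLM_SCENARIO` from the configs turns a forgotten argument into an immediate error rather than a silent chat benchmark.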