YichenG170
Contributor

This PR adds support for part of the WenetSpeech dataset to lmms-eval.

🔧 Core Implementation
New Dataset: WenetSpeech
2 new splits with around 25 hours of test audio

📊 Example Usage
TASK="wenet_speech"
TASK_SUFFIX="${TASK//,/_}"
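
For reference, `${TASK//,/_}` is Bash pattern substitution: it replaces every comma in `TASK` with an underscore, presumably so that a comma-separated task list can be reused as a suffix in log or output names. A minimal sketch of the effect; the second task name (`other_task`) and the `MULTI_*` variables are hypothetical and only there for illustration:

```shell
#!/usr/bin/env bash
# Single task: there are no commas, so the suffix equals the task name.
TASK="wenet_speech"
TASK_SUFFIX="${TASK//,/_}"
echo "$TASK_SUFFIX"

# Comma-separated list ("other_task" is a made-up name, not a real task).
MULTI_TASK="wenet_speech,other_task"
MULTI_SUFFIX="${MULTI_TASK//,/_}"   # every "," becomes "_"
echo "$MULTI_SUFFIX"
```

Note that the `//` form replaces all matches; a single `/` would replace only the first comma.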

🧩 File Changes
New:
wenet_speech/utils.py - Evaluation methods for each subset
4 YAML files - Detailed evaluation information for each task & task group

Comment on lines +13 to +22
from lmms_eval.llm_judge import ServerConfig, get_server

API_TYPE = os.getenv("API_TYPE", "openai")
# Use JUDGE_MODEL_VERSION instead of MODEL_VERSION
JUDGE_MODEL_VERSION = os.getenv("JUDGE_MODEL_VERSION", "gpt-4o-mini")

server_config = ServerConfig(
    model_name=JUDGE_MODEL_VERSION,
)
server = get_server(server_name=API_TYPE, config=server_config)
Collaborator

This part is redundant

@kcz358
Collaborator

kcz358 commented Sep 26, 2025

Hi, thank you for the PR. I reviewed the WenetSpeech part and most of it LGTM. However, the commit history is a bit chaotic, and the file changes include changes from your last PR. Would you mind including only the commit that contains the WenetSpeech changes? Thanks!

You can do that by checking out a new branch from a fresh main and then cherry-picking the commit. Thanks!
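
The suggested workflow can be sketched as follows. This is only an illustration under stated assumptions: it builds a throwaway repository in a temporary directory so it can be run safely, and the branch names (`mixed-branch`, `wenet-speech-only`) and file names are hypothetical. In the real repository you would start the new branch from an up-to-date main and cherry-pick the actual WenetSpeech commit.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository so the demo never touches a real checkout.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the initial branch "main"
git config user.email "demo@example.com"
git config user.name "Demo"
git config commit.gpgsign false

echo "base" > README.md
git add README.md
git commit -q -m "base commit on main"

# A branch that mixes an earlier PR's change with the change we want.
git checkout -q -b mixed-branch
echo "old" > previous_pr.txt
git add previous_pr.txt
git commit -q -m "changes from a previous PR"
echo "new" > wenet_speech.txt
git add wenet_speech.txt
git commit -q -m "add wenet speech task"
wanted_commit="$(git rev-parse HEAD)"

# Start a clean branch from main and bring over only the wanted commit.
git checkout -q -b wenet-speech-only main
git cherry-pick "$wanted_commit" > /dev/null

# The new branch is exactly one commit ahead of main.
git log --oneline main..wenet-speech-only
```

A new PR opened from the clean branch then shows only the cherry-picked change in its diff.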

@YichenG170
Contributor Author

Ahh sorry for this, I will make a new one!

@YichenG170 YichenG170 closed this Sep 26, 2025