Elo HeLLM: new project for ranking language models #12969

JohannesGaessler (Collaborator) · 2025-04-16T08:23:50Z

I started a new project called Elo HeLLM for evaluating model quality using the llama.cpp HTTP server. I intend to co-develop this project alongside the llama.cpp training code for quality control, since llama-perplexity is not suitable for determining whether a finetune is actually any good. Because the methods I'm using rely on generating tokens rather than evaluating the model on pre-existing text, the performance bottleneck is much more severe than with llama-perplexity. So I also intend to look into improving the performance of batched inference with the server, particularly on multiple GPUs.
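The post does not spell out the ranking method, so the following is only a sketch of the classic Elo rating update that the project name alludes to, not the project's actual code. The function names, the `k` factor of 32, and the 400-point scale are illustrative assumptions taken from the textbook formulation.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Textbook Elo: probability that A beats B given their ratings."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A wins, 0.0 if A loses, 0.5 for a draw.
    k (assumed value, not from the post) controls the step size.
    """
    e_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - e_a)
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two hypothetical models starting at the same rating; model A wins once.
    a, b = elo_update(1500.0, 1500.0, score_a=1.0)
    print(a, b)
```

In this formulation two equally rated models each have an expected score of 0.5, so a single win moves the winner up by `k / 2` points and the loser down by the same amount.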

ggerganov (Maintainer) · 2025-04-16T11:44:23Z

If you think it could be useful to get more eyes on the project, feel free to add a link to it or to this discussion in the hot topics of the readme.

1 reply

JohannesGaessler (Collaborator, Author) · Apr 16, 2025

I think that right now I would still get comparatively little benefit from more attention. At this early stage it would be useful if someone could point out flaws in my methodology (if there are any), since fixing those now would be less work than fixing them down the line. But I've had IRL discussions with colleagues about the methodology and I think it will be fine. Long-term, more attention would be useful, perhaps to crowd-source GPU time, but I think a prerequisite for that is improving the code so that more people than just me can realistically run it. So I may come back to that at some point, but right now I don't think it makes a difference.
