Skip to content

Conversation

pwilkin
Copy link
Contributor

@pwilkin pwilkin commented Aug 24, 2025

This one has been an absolute nightmare to implement (Seed OSS tool calling is basically Qwen Coder all over again), so I hope this actually works (testing this on my Q2_K_S which fails to call the tools properly every second time is a nightmare as well).

@github-actions github-actions bot added the testing Everything test related label Aug 24, 2025
@bfroemel
Copy link

fyi, small jinja template update (mostly comment translation, but one functional difference at the end): https://huggingface.co/ByteDance-Seed/Seed-OSS-36B-Instruct/commit/497f1dca95ebdec98e41d517b9f060ee753c902f#d2h-526183

@pwilkin
Copy link
Contributor Author

pwilkin commented Aug 27, 2025

fyi, small jinja template update (mostly comment translation, but one functional difference at the end): https://huggingface.co/ByteDance-Seed/Seed-OSS-36B-Instruct/commit/497f1dca95ebdec98e41d517b9f060ee753c902f#d2h-526183

Thanks, I'll retest with new template.

@pwilkin
Copy link
Contributor Author

pwilkin commented Aug 29, 2025

Okay, verified, both the tests and the real life tool calling test run on my trusty Q2_K_S quant pass fine.

@CISC any chance you could take a quick look at this? Apparently the model has gotten popular and some people want to get it merged :>

Copy link
Collaborator

@CISC CISC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pwilkin
Copy link
Contributor Author

pwilkin commented Aug 29, 2025

Aight, should be good to go.

@CISC CISC merged commit 60e5eee into ggml-org:master Aug 29, 2025
47 of 48 checks passed
@ExtReMLapin
Copy link
Contributor

I could be missing something but your builder.add_rule("root", string_join(tool_rules, " | ")); seems to lack the reasoning grammar.

Which means using tool_choice = "required" disables thinking.

@ExtReMLapin
Copy link
Contributor

ExtReMLapin commented Aug 29, 2025

./build/bin/llama-server -ngl 999 -fa -m Seed-OSS-36B-Instruct-Q5_K_M.gguf -c 131071 --jinja --reasoning-format deepseek --slots --n-predict 131071 --no-context-shift -ctk q8_0 -ctv q8_0 --parallel 1 --port 2483 --host 192.168.25.25 --chat-template-file models/templates/ByteDance-Seed-OSS.jinja


Gave a quick try, tool calling results are terrible with very simple example, but it's probably the model's fault.


Gave another try with a tool we use , long context and it's really doing something off :

  1. I feel like it's not streaming correctly, it's sending the whole tool call at once
  2. there is something weird going on with reasoning (after tool call ???)
image

With tool_choice at required we should have a reasoning_content, no content, and BEFORE the tool call

image

to run the same query you can use node and this file, please note you might need multiple tries :
oss_js_node_debug.js

@pwilkin
Copy link
Contributor Author

pwilkin commented Aug 29, 2025

Why do the meaningful testing comments always start just as the PR gets merged? 😆

@ExtReMLapin I noticed this type of behavior too, but I'm not sure if this is a mistake or whether this is an intended feature of tool calls happening within the reasoning, which would mean we'd have to rework the grammar.

I've only tried tool calling on my Q2_K_S, but it works without any problems, at least on the simple tool calls I've tried (web search / execute shell command).

@ExtReMLapin
Copy link
Contributor

ExtReMLapin commented Aug 29, 2025

Why do the meaningful testing comments always start just as the PR gets merged? 😆

Because i'm a lazy man and pulling it on my own branch is already too much effort !

As for the reasoning, I'm already surprised it happens, here with qwen, right now on master it doesn't happen because it's not allowed by grammar (in required tool calling mode).
If you want an example I opened a PR to support thinking AND tool calling @ required

#15248

I've only tried tool calling on my Q2_K_S, but it works without any problems,

I don't see why everyone seems so happy with this model, from my tests it's VERY VERY meh, nothing close to Qwen3

Maybe i'm just doing something wrong, but either way we should NOT see the thinking tags, it should either not think or parse the thinking tags and shove the text into reasoning_content

@pwilkin
Copy link
Contributor Author

pwilkin commented Aug 29, 2025

Maybe i'm just doing something wrong, but either way we should NOT see the thinking tags, it should either not think or parse the thinking tags and shove the text into reasoning_content

True - are you using the official updated template? (from https://huggingface.co/ByteDance-Seed/Seed-OSS-36B-Instruct/blob/main/chat_template.jinja)?

@ExtReMLapin
Copy link
Contributor

As you can see in the command I posted I'm using the one you pushed in your PR --chat-template-file models/templates/ByteDance-Seed-OSS.jinja

qnixsynapse pushed a commit to menloresearch/llama.cpp that referenced this pull request Aug 30, 2025
* Reasoning and tool-calling support for Seed OSS

* Fix grammar and partial parsing

* Whitespace

* New chat template

* Update common/chat.cpp

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Update common/chat.cpp

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Remove unused 'purge_healing_marker' helper

---------

Co-authored-by: Sigbjørn Skjæret <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
testing Everything test related
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants