-
Notifications
You must be signed in to change notification settings - Fork 12.9k
Model: Seed OSS thinking + tool call support #15552
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
fyi, small jinja template update (mostly comment translation, but one functional difference at the end): https://huggingface.co/ByteDance-Seed/Seed-OSS-36B-Instruct/commit/497f1dca95ebdec98e41d517b9f060ee753c902f#d2h-526183 |
Thanks, I'll retest with new template. |
Okay, verified, both the tests and the real life tool calling test run on my trusty Q2_K_S quant pass fine. @CISC any chance you could take a quick look at this? Apparently the model has gotten popular and some people want to get it merged :> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Co-authored-by: Sigbjørn Skjæret <[email protected]>
Co-authored-by: Sigbjørn Skjæret <[email protected]>
Aight, should be good to go. |
I could be missing something but your Which means using tool_choice = "required" disables thinking. |
Gave a quick try, tool calling results are terrible with very simple example, but it's probably the model's fault. Gave another try with a tool we use , long context and it's really doing something off :
![]() With tool_choice at ![]() to run the same query you can use |
Why do the meaningful testing comments always start just as the PR gets merged? 😆 @ExtReMLapin I noticed this type of behavior too, but I'm not sure if this is a mistake or whether this is an intended feature of tool calls happening within the reasoning, which would mean we'd have to rework the grammar. I've only tried tool calling on my Q2_K_S, but it works without any problems, at least on the simple tool calls I've tried (web search / execute shell command). |
Because i'm a lazy man and pulling it on my own branch is already too much effort ! As for the reasoning, I'm already surprised it happens, here with qwen, right now on master it doesn't happen because it's not allowed by grammar (in required tool calling mode).
I don't see why everyone seems so happy with this model, from my tests it's VERY VERY meh, nothing close to Qwen3 Maybe i'm just doing something wrong, but either way we should NOT see the thinking tags, it should either not think or parse the thinking tags and shove the text into |
True - are you using the official updated template? (from https://huggingface.co/ByteDance-Seed/Seed-OSS-36B-Instruct/blob/main/chat_template.jinja)? |
As you can see in the command I posted I'm using the one you pushed in your PR |
* Reasoning and tool-calling support for Seed OSS * Fix grammar and partial parsing * Whitespace * New chat template * Update common/chat.cpp Co-authored-by: Sigbjørn Skjæret <[email protected]> * Update common/chat.cpp Co-authored-by: Sigbjørn Skjæret <[email protected]> * Remove unused 'purge_healing_marker' helper --------- Co-authored-by: Sigbjørn Skjæret <[email protected]>
This one has been an absolute nightmare to implement (Seed OSS tool calling is basically Qwen Coder all over again), so I hope this actually works (testing this on my Q2_K_S which fails to call the tools properly every second time is a nightmare as well).