Include apply_patch tool for oss models from gpt-oss providers with different naming convention (e.g. openai/gpt-oss-*) #2811
base: main
Conversation
…ere gpt-oss model names from different providers have a prefix like openai/gpt-oss/120b
fac5f52 to 119ed46
hey @dylan-hurd-oai @bolinfest - would love to get any feedback on this, thanks!
One comment to discuss, but appreciate the simple suggestion!
codex-rs/core/src/model_family.rs
Outdated
@@ -96,7 +96,7 @@ pub fn find_family_for_model(slug: &str) -> Option<ModelFamily> {
         slug, "gpt-4.1",
         needs_special_apply_patch_instructions: true,
     )
-    } else if slug.starts_with("gpt-oss") {
+    } else if slug.contains("gpt-oss") {
This seems reasonable! My only concern is that (like most of our apply_patch setups) the apply_patch_tool is a relatively specific approach. I wonder if we instead want to do something like `slug.starts_with("gpt-oss") || slug.starts_with("openai/gpt-oss")`. @andrewtlw how consistent across providers is the prefix pattern you described? Do we need the flexibility of `contains` here?
Thanks @dylan-hurd-oai. I did think the same initially. The prefix pattern `openai/gpt-oss-*` is the most common (Vertex AI, OpenRouter, Hugging Face Inference, Groq, etc.), but there's also `@cf/openai/gpt-oss-*` for Cloudflare Workers AI and `openai.gpt-oss-*` on AWS Bedrock.

I did some testing, and Cloudflare and Bedrock actually don't support tool calling on the gpt-oss models, so they likely aren't going to be used as providers for codex-cli anytime soon.

Will make the change to the suggested `slug.starts_with("gpt-oss") || slug.starts_with("openai/gpt-oss")`.
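The check being agreed on here can be sketched as a small standalone helper (the function name `is_gpt_oss_slug` is hypothetical, not the actual codex-rs code, which inlines the condition in `find_family_for_model`):

```rust
// Sketch of the suggested prefix-based matching, assuming only the
// provider naming patterns discussed in this thread.
fn is_gpt_oss_slug(slug: &str) -> bool {
    // Matches bare Ollama-style slugs ("gpt-oss-120b") and the common
    // "openai/" provider prefix ("openai/gpt-oss-120b"). Unlike
    // `contains`, it deliberately rejects patterns such as
    // "@cf/openai/gpt-oss-*" (Cloudflare Workers AI) and
    // "openai.gpt-oss-*" (AWS Bedrock), which don't support tool calling.
    slug.starts_with("gpt-oss") || slug.starts_with("openai/gpt-oss")
}

fn main() {
    assert!(is_gpt_oss_slug("gpt-oss-120b"));
    assert!(is_gpt_oss_slug("openai/gpt-oss-120b"));
    assert!(!is_gpt_oss_slug("@cf/openai/gpt-oss-120b"));
    assert!(!is_gpt_oss_slug("openai.gpt-oss-120b"));
}
```

The trade-off versus `contains("gpt-oss")` is precision: `starts_with` cannot accidentally match an unrelated slug that merely embeds the substring, at the cost of needing an explicit alternative for each known prefix.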
Model providers like Groq, OpenRouter, AWS Bedrock, Vertex AI and others typically prefix the name of gpt-oss models with `openai`, e.g. `openai/gpt-oss-120b`. This PR matches the model name slug using `contains` instead of `starts_with` to ensure that the `apply_patch` tool is included in the tools for model names like `openai/gpt-oss-120b`.

Without this, the gpt-oss models will often try to call the `apply_patch` tool directly instead of via the `shell` command, leading to validation errors. I have run all the local checks.

Note: The gpt-oss models from non-Ollama providers are typically run via a profile with a different base_url (instead of with the `--oss` flag).
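To illustrate the note above, such a setup might look roughly like the following `config.toml` sketch (the provider id, URL, and env var shown are assumptions for illustration, not taken from this PR):

```toml
# Hypothetical example: running a prefixed gpt-oss model through a
# custom provider profile rather than the --oss flag.
[model_providers.groq]
name = "Groq"
base_url = "https://api.groq.com/openai/v1"
env_key = "GROQ_API_KEY"

[profiles.gpt-oss-groq]
model = "openai/gpt-oss-120b"
model_provider = "groq"
```

With a profile like this, the model slug arrives as `openai/gpt-oss-120b`, which is exactly the case the slug-matching change in this PR is meant to cover.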