Fixes #2513: Add support for Free-Form Function Calling and Context Free Grammar constraints over tools #2572
base: main
Conversation
the response from the model is not correctly handled yet
now produces appropriate warnings when misconfigured
I can now call the model with the following:

```python
from pydantic_ai import Agent
import asyncio
import json
from pydantic_core import to_jsonable_python
from pydantic_ai.models.openai import OpenAIResponsesModel

agent = Agent(OpenAIResponsesModel("gpt-5-mini"), model_settings={"openai_reasoning_effort": "minimal"})


@agent.tool_plain(free_form=True)
def execute_lucene_query(query: str) -> str:
    """Use this to run a lucene query against the system.
    YOU MUST ALWAYS RUN A QUERY BEFORE ANSWERING THE USER.

    Args:
        query: the lucene query to run

    Returns:
        the result of executing the query, or an error message
    """
    return "The query failed to execute, the solr server is unavailable"


async def run() -> None:
    response = await agent.run("Execute the lucene query text:IKEA and give me the results")
    history = response.all_messages()
    as_json = json.dumps(to_jsonable_python(history), indent=2)
    print(as_json)
    print(response.output)


asyncio.run(run())
```
I'm going to work on CI checks after getting something that at least supports the CFG tool calls.
will also validate the grammar if the dependency is installed
```diff
@@ -977,6 +980,9 @@ def tool(
     require_parameter_descriptions: bool = False,
     schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema,
     strict: bool | None = None,
+    free_form: bool | None = None,
+    grammar_syntax: Literal['regex', 'lark'] | None = None,
+    grammar_definition: str | None = None,
```
@matthewfranglen I know you still have some work planned on this PR before it's really ready for review, but please consider the API I proposed in #2513 (comment). I'd prefer one argument taking an object over 3 that need to be used together
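To make the comparison concrete, here is a rough sketch of the two shapes under discussion. The three keyword arguments come from this PR's current diff; the single-object form is the suggestion above, and the `FunctionTextFormat` name and positional signature are placeholders for illustration only, not a released API.

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel

# Deliberately tiny placeholder grammar; a real one would describe the target query language.
LUCENE_REGEX = r'[A-Za-z_]+:[A-Za-z0-9*?]+'

agent = Agent(OpenAIResponsesModel('gpt-5-mini'))


# Shape currently in the PR diff: three keyword arguments that only make sense together.
@agent.tool_plain(free_form=True, grammar_syntax='regex', grammar_definition=LUCENE_REGEX)
def execute_lucene_query(query: str) -> str:
    """Run a lucene query against the system."""
    return 'the solr server is unavailable'


# Suggested shape: a single argument carrying one configuration object, roughly
#   @agent.tool_plain(text_format=FunctionTextFormat('regex', LUCENE_REGEX))
# where the object name and positional signature are placeholders, not a final API.
```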
Sorry, I missed your intent with that part. It's a very good idea; I will certainly do that, and it will clean up what I have done so far. Thanks for the reminder.
I believe I have addressed this now
going to add the literal version now
it's provided by the init
pyright isn't happy with it
just coverage and documentation now, pending the actual review of course
@matthewfranglen Just a heads-up that I'll be out this coming week and will be back the 25th. Assuming this is not urgent I'll review it then. If it is, please ping
tool_defs already includes the output tools
Need to write the documentation. The code is ready for review.
```diff
@@ -66,7 +66,7 @@ dependencies = [

 [tool.hatch.metadata.hooks.uv-dynamic-versioning.optional-dependencies]
 # WARNING if you add optional groups, please update docs/install.md
-logfire = ["logfire[httpx]>=3.14.1"]
+logfire = ["logfire[httpx]>=3.16.1"]
```
This was needed because the lowest version test results in the error:

`Logfire.instrument_pydantic_ai() got an unexpected keyword argument 'version'`
the example tests do check it!
@DouweM I've implemented the functionality, with tests and documentation. I believe this is ready for review.
docs/models/openai.md (Outdated)
#### Context Free Grammar

Invoking tools using freeform function calling can result in errors when the tool expectations are not met. For example, a tool that queries an SQL database can only accept valid SQL. The freeform function calling of GPT-5 supports generation of valid SQL for this situation by constraining the generated text using a context free grammar.
> Invoking tools using freeform function calling can result in errors when the tool expectations are not met.
This is of course correct, but this is not unique to freeform functions, and not all freeform functions will validate their inputs (maybe we just want plain text). I'd drop this sentence and present this as a powerful feature to further constrain freeform tool input, rather than presenting it as a workaround for errors.
docs/models/openai.md (Outdated)

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel
from pydantic_ai.tools import FunctionTextFormat
```
Let's make the new types importable from `pydantic_ai` directly.
A context‑free grammar is a collection of production rules that define which strings belong to a language. Each rule rewrites a non‑terminal symbol into a sequence of terminals (literal tokens) and/or other non‑terminals, independent of surrounding context—hence context‑free. CFGs can capture the syntax of most programming languages and, in OpenAI custom tools, serve as contracts that force the model to emit only strings that the grammar accepts.

The grammar can be written as either a regular expression:
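As a rough illustration of a regex-constrained free-form tool, assuming `FunctionTextFormat` takes the syntax name and the grammar text positionally (that signature is an assumption based on the import and discussion above, not the merged API):

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel
from pydantic_ai.tools import FunctionTextFormat

agent = Agent(OpenAIResponsesModel('gpt-5'))


# The generated tool input is constrained to an ISO-8601 date such as 2024-01-31.
@agent.tool_plain(text_format=FunctionTextFormat('regex', r'\d{4}-\d{2}-\d{2}'))
def record_date(date: str) -> str:
    """Record the chosen date."""
    return f'recorded {date}'
```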
Let's have headings for Regular Expressions and LARK, so they're shown in the ToC on the right
docs/models/openai.md (Outdated)

1. An inline SQL grammar definition would be quite extensive, so this simplified version has been written. You can find an example SQL grammar [in the openai example](https://cookbook.openai.com/examples/gpt-5/gpt-5_new_params_and_tools#33-example---sql-dialect--ms-sql-vs-postgresql). There are also example grammars in the [lark repo](https://github.com/lark-parser/lark/blob/master/examples/composition/json.lark). Remember that a simpler grammar that matches your DDL will be easier for GPT-5 to work with and will result in fewer semantically invalid results.
2. Returning the input directly might seem odd; remember that it has been constrained to the provided grammar. This can be useful if you want GPT-5 to generate content according to a grammar that you then use extensively through your program.
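To give a sense of the kind of simplified grammar annotation 1 describes, here is an illustrative toy written in Lark syntax; it is not the grammar used in the PR's docs, and a real one would be tailored to your actual schema:

```python
# Toy grammar accepting only `SELECT <columns> FROM <table> [WHERE <col> = <value>]`.
# Matching the NAME terminal to your real table and column names would further
# reduce semantically invalid queries.
SIMPLE_SQL_GRAMMAR = r"""
start: "SELECT" columns "FROM" NAME where?
columns: "*" | NAME ("," NAME)*
where: "WHERE" NAME "=" value
value: STRING | NUMBER

NAME: /[A-Za-z_][A-Za-z0-9_]*/
STRING: /'[^']*'/
NUMBER: /[0-9]+/

%import common.WS
%ignore WS
"""
```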
I don't think we need the function; `ToolOutput(str, text_format=...)` should also be able to work.
##### Best Practices |
I'd prefer to link to OpenAI's docs instead of having to keep this up to date
```diff
@@ -216,6 +218,57 @@ class DeferredToolResults:
 A = TypeVar('A')


+@dataclass
+class FunctionTextFormat:
```
This is currently freeform tool calling and OpenAI-specific, but may not be in the future: #2623 (comment)

So I'd like this to be a bit more generic. What do you think of letting `text_format` directly take a `re.Pattern` or `lark.Lark` instead of having this special object?

Edit: Instead of requiring `lark` to be installed, we can also add and support our own `LarkGrammar` object.
I just noticed you commented on this; thank you for taking the time to review this PR. It's late here, however I wanted to respond to this specific comment.

I did consider using lark / regex objects directly. The problem is that you have to supply the text of the regular expression and you cannot export the text of a compiled `re.Pattern`. It's not stored on the pattern object.
> The problem is that you have to supply the text of the regular expression and you cannot export the text of a compiled re.Pattern. It's not stored on the pattern object.

Interesting, I didn't realize that. Then an object with string values makes sense.

Let's at least call it `TextFormat` to be less function-specific.

I also like the look of `RegexTextFormat(str)` and `LarkTextFormat(str)` better than `TextFormat('regex', str)`, but I don't feel too strongly about that. I could see us having future syntax-specific additional flags though, and that'd be awkward to mix on the same object. The fact that the class is currently internally branching on the syntax to do validation also suggests that what we really want are subclasses.

Either way `OpenAIModel` needs to verify that it supports the syntax in question, because we may add more in the future that OpenAI doesn't support but other models do.
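For illustration, the subclass idea could be shaped roughly like this; it is a sketch of the design being discussed, not code from the PR:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TextFormat:
    """Base class for constraints on the text of a free-form tool call."""

    definition: str


@dataclass
class RegexTextFormat(TextFormat):
    """Constrain the generated text with a regular expression."""


@dataclass
class LarkTextFormat(TextFormat):
    """Constrain the generated text with a Lark context-free grammar."""


def check_supported_text_format(text_format: TextFormat | None) -> None:
    # Each model implementation checks for the subclasses it knows how to send,
    # so new syntaxes can be added later without breaking models that lack them.
    if text_format is not None and not isinstance(text_format, (RegexTextFormat, LarkTextFormat)):
        raise NotImplementedError(f'{type(text_format).__name__} is not supported by this model')
```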
When `None` (the default), the model invokes the tool in the normal way and parallel tool calls are possible.

Note: this is currently only supported by OpenAI gpt-5 models.
Let's capitalize gpt-5 here and elsewhere, and be consistent in `free-form` vs `freeform`.
```python
        return self.single_string_argument_name is not None

    @property
    def single_string_argument_name(self) -> str | None:
```
Could we dedupe with this similar implementation somehow?

pydantic-ai/pydantic_ai_slim/pydantic_ai/_output.py, lines 860 to 864 in 96d895d:

```python
arguments_schema = self._function_schema.json_schema.get('properties', {})
argument_name = next(iter(arguments_schema.keys()), None)
if argument_name and arguments_schema.get(argument_name, {}).get('type') == 'string':
    self._str_argument_name = argument_name
    return
```
Maybe make both use a new function from `utils`.
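A sketch of what such a shared helper could look like, based on the `_output.py` snippet quoted above; the function name, module, and exact rule are placeholders:

```python
from __future__ import annotations

from typing import Any


def single_string_argument_name(parameters_json_schema: dict[str, Any]) -> str | None:
    """Return the name of the schema's single string argument, or None.

    Whether 'single' should mean the only argument or just the first one is a
    detail for the real implementation; this sketch requires exactly one.
    """
    properties = parameters_json_schema.get('properties', {})
    if len(properties) != 1:
        return None
    name, schema = next(iter(properties.items()))
    return name if schema.get('type') == 'string' else None
```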
```diff
@@ -98,6 +98,8 @@ ag-ui = ["ag-ui-protocol>=0.1.8", "starlette>=0.45.3"]
 retries = ["tenacity>=8.2.3"]
 # Temporal
 temporal = ["temporalio==1.17.0"]
+# free form function calling with lark context free grammar
+lark = ["lark>=1.2.2"]
```
Hmm, I don't think we need this: because we're just passing the value on, we can skip doing our own validation and save the user having to install something extra.
Co-authored-by: Douwe Maan <[email protected]>
This is the PR to implement free-form function calling and context free constraints over tools. This implements:
I'm hoping to get a working implementation done today, after which we can have the rounds of feedback and improvement required before this can be merged. I will try to keep in mind the feedback that I got on the last PR.
Since there is a specific company voice that you wish to maintain, when suggesting changes to documentation et al, please provide the exact wording that you desire.