Add an interface for surfacing tool calls #69


Draft · @cpsievert wants to merge 10 commits into main from surface-tool-calls

Conversation

@cpsievert (Collaborator) commented Mar 7, 2025

Addresses #33
Related posit-dev/shinychat#31

This PR adds a few things to help with surfacing tool requests and results in response content.

  1. A new ToolResult() class, which allows for:
    • Control over how results get formatted when sent to the model
    • Yielding of additional content for the user (i.e., the downstream consumer of a .stream() or .chat()) to display when the tool is called.
  2. A new on_request parameter for .register_tool(). When the tool is requested, this callback executes, and its result is yielded to the user.
  3. A new Chat.on_tool_request() method for registering a default tool request handler.
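
For (3), registering a default handler might look roughly like the snippet below; how exactly the handler is registered and what attributes the request object exposes are still open, so treat this as a sketch rather than settled API:

# Hypothetical sketch of a default tool request handler
chat_model.on_tool_request(
    lambda request: f"Requesting tool `{request.name}`...\n\n"
)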

Here is a basic Shiny example:

import asyncio
from shiny.express import ui
from chatlas import ChatOpenAI, ToolResult

chat_model = ChatOpenAI()

async def get_current_temperature(latitude: float, longitude: float):
    """Get the current temperature at the given coordinates."""
    # Simulate a slow operation
    await asyncio.sleep(4)

    return ToolResult(
        "72°F.",
        user="""
<details class="pb-3">
<summary>Tool result</summary>
The <code>get_current_temperature()</code> tool returned: 72°F.
</details>\n\n\n\n
""",
    )

chat_model.register_tool(
    get_current_temperature,
    on_request=lambda request: "Requesting the current temperature...\n\n",
)

chat = ui.Chat(id="chat", messages=["**Hello!** How can I help you today?"])
chat.ui()
chat.update_user_input(value="How is the weather in Duluth, MN today?")

@chat.on_user_submit
async def handle_user_input(user_input: str):
    response = await chat_model.stream_async(user_input)
    await chat.append_message_stream(response)

(Demo video: tool-call.mp4)

TODO

  • Do we care about implications for MCP?
  • Documentation and examples
  • Tests

@gadenbuie (Contributor)
I like the idea of a ToolResult class. I'm definitely conflicted about having to compose ToolResult inside the actual tool function, but the simplicity of the approach is convincing.

For completeness and generality, I think ToolResult should have three key properties:

  • .data: The actual data from the tool call. E.g. if you called a weather API, this would be the dictionary with the API response.
  • .result or .text (unsure of the name): The text returned to the LLM. We could also use a __str__ method that looks for this first or calls str(self.data).
  • .html: The HTML display output, which could be raw HTML or htmltools.Tags, etc.
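
A rough sketch of what that shape could look like (the names, defaults, and fallback behavior here are just my assumptions, not the PR's current API):

from typing import Any, Optional

class ToolResult:
    """Hypothetical sketch of a ToolResult carrying the three properties above."""

    def __init__(self, data: Any, text: Optional[str] = None, html: Optional[str] = None):
        self.data = data  # raw data from the tool call (e.g., a weather API response dict)
        self.text = text  # the text returned to the LLM, if different from the data
        self.html = html  # display output for the user (raw HTML, htmltools.Tags, etc.)

    def __str__(self) -> str:
        # Look for the LLM-facing text first, otherwise fall back to str(self.data)
        return self.text if self.text is not None else str(self.data)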

@cpsievert force-pushed the surface-tool-calls branch from bb829b5 to 0b6e5cc on March 7, 2025 16:16
…ponse

I don't think it was necessary in the first place, and it leads to inefficient use of memory.
@cpsievert force-pushed the surface-tool-calls branch from 27d2887 to 3a1bd9c on March 10, 2025 16:02
@cpsievert force-pushed the surface-tool-calls branch from b49c888 to 0ac76a8 on March 10, 2025 16:38
@wch commented Mar 11, 2025

I'm kind of late to the party for this one, but it would be very handy to be able to just send stuff to the chat UI directly, instead of having two different places where it happens.

Some issues I see with this:

  • There are two different places where UI content needs to be provided for it to reach the user: one is in .register_tool(), and the other is in the return value of the tool.
  • With the on_request function for .register_tool(), if you need to do anything that's not very simple, then you have to def another function.
  • It's not possible to stream content to the user as the tool call progresses.

For example, imagine a tool where you give it the name of a city, and the tool (1) looks up the coordinates for the city, and (2) looks up the weather for those coordinates. Suppose the conversation goes like this -- note that the parts in square brackets wouldn't be shown; I'm just using them here to annotate what's going on under the hood.

User:
  I'm in New York. Is today a nice day for a walk?

Assistant:
  I'll look up the weather for you.
  [Tool call begins]
  Looking up coordinates for New York...
  New York is at 40.71° N, 74.01° W
  Looking up weather...
  Current conditions in New York: 62 degrees and sunny.
  [Tool call ends]
  Yes, today is a nice day for a walk in New York.

With the current code in this PR, you can't display the middle two lines of the tool call phase. But here's how the rest would look:

async def get_current_temperature(city_name: str):
    lat, lon = await find_coordinates(city_name)
    temp, sun = await find_weather(lat, lon)

    return ToolResult(
        {"temperature": temp, "sun": sun},
        user=f"Current conditions in {city_name}: {temp} degrees and {sun}\n\n",
    )

chat_model.register_tool(
    get_current_temperature,
    on_request=lambda request: f"Looking up coordinates for {request['city_name']}...\n\n",
)

It would be nice if the tool itself had access to the stream and could yield to it, like:

async def get_current_temperature(city_name: str):
    yield f"Looking up coordinates for {city_name}...\n\n"
    lat, lon = await find_coordinates(city_name)
    yield f"{city_name} is at {lat}, {lon}\n\n"
    yield "Looking up weather...\n\n"
    temp, sun = await find_weather(lat, lon)
    yield f"Current conditions in {city_name}: {temp} degrees and {sun}.\n\n"

    # Note: an async generator can't `return` a value, so the final result
    # would have to be yielded (or signaled some other way):
    yield ToolResult({"temperature": temp, "sun": sun})

chat_model.register_tool(get_current_temperature)

Some other options instead of using yield:

  • A function get_current_chat() that uses some sort of context.

    async def get_current_temperature(city_name: str):
        chat = get_current_chat()
        chat.emit(f"Looking up coordinates for {city_name}...\n\n")
        ...

    And if the tool function is called in a non-chat context, it could just return a dummy chat object where chat.emit() is a no-op. Note that this is not the Shiny-level chat object -- this is at the chatlas layer, so that this tool would show progress in both a Shiny app and when used at the console. (A rough sketch of how this could work with a context variable follows after this list.)

  • Chatlas could pass in the chat object if the tool has a parameter named _chat.

        async def get_current_temperature(city_name: str, *, _chat: Chat):
            ...
  • Chatlas could pass in a generic metadata object if the tool has a parameter named _meta, and that object could have chat on it. Arbitrary other stuff could also be put on the _meta object.

        async def get_current_temperature(city_name: str, *, _meta):
            _meta.chat.emit(f"Looking up coordinates for {city_name}...\n\n")
            ...
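
For what it's worth, here is a rough sketch of how get_current_chat() could be wired up with a context variable; none of this exists in chatlas today, it's only meant to illustrate the idea:

import contextvars
from contextlib import contextmanager

# Holds the "current" chatlas Chat while a tool call is running
_current_chat: contextvars.ContextVar = contextvars.ContextVar("current_chat", default=None)

class _NoOpChat:
    """Dummy chat used outside of a chat context, so chat.emit() is a no-op."""
    def emit(self, content: str) -> None:
        pass

def get_current_chat():
    return _current_chat.get() or _NoOpChat()

@contextmanager
def use_chat(chat):
    # chatlas would wrap each tool invocation in this context
    token = _current_chat.set(chat)
    try:
        yield
    finally:
        _current_chat.reset(token)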

One final idea, which I kind of like: It could also be useful to let the user define other data to pass in. You could imagine a tool call where you want to provide the tool with some information, but you don't want to send that information to the LLM.

For example, in something like Sidebot, suppose you have a data set and you want the LLM to use tool calls to run SQL queries on the data. You want to send the user request and schema to the LLM, but you don't want to send it the data.

# Define the tool in a scope outside of the app's server code
def run_query(query: str, data: pd.DataFrame, chat: Chat):
    ...


def server(input, output):
    # Suppose the value of df can change over time
    df = pd.DataFrame(...)

    chat_model = ChatAnthropic(...)

    chat_model.register_tool(run_query, extra_args = {
        "data": lambda: df,
        "chat": lambda: chat_model,
    })

In this case, the LLM only sees and uses the query parameter, and the developer has full control over the extra args that are passed to the tool, namely data and chat.

In the example above, I used lambda for both of the args, because I was thinking that df could change over time. However, chat_model doesn't change over time, and maybe you'd want to support both dynamic and static arguments. In that case we could make it so the developer would mark the dynamic arguments, say, with a function called dynamic():

    chat_model.register_tool(run_query, extra_args = {
        "data": dynamic(lambda: df),
        "chat": chat_model,
    })
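
To make that concrete, here is roughly how the extra args could be resolved each time the tool is invoked; dynamic() and the resolution helper below are purely hypothetical:

class dynamic:
    """Marks an extra arg whose value should be re-computed on every tool call."""
    def __init__(self, fn):
        self.fn = fn

def _resolve_extra_args(extra_args: dict) -> dict:
    # dynamic values are re-evaluated at call time; static values pass through unchanged
    return {
        name: value.fn() if isinstance(value, dynamic) else value
        for name, value in extra_args.items()
    }

# At tool-call time, chatlas would merge these with the LLM-provided arguments, e.g.:
# result = tool_fn(**llm_args, **_resolve_extra_args(extra_args))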

@wch commented Mar 11, 2025

Oh, and one more thing about being able to pass in arbitrary objects to the tool. Suppose you want to use the same tool in a Shiny app, and in a console app.

Let's say the tool looks like this, where it takes an emit argument which is a function:

# Define the tool in a scope outside of the app's server code
async def run_query(query: str, data: pd.DataFrame, emit: Callable[[str], Awaitable[None]]):
    await emit(f"Starting query: {query}...")
    ...
    await emit("Running query...")
    ...
    await emit("Finished query...")
    return ...

The emit function needs to be provided to the tool, and it can take different forms.

In a console app, you might just pass in the chat_model.emit method:

df = pd.DataFrame(...)

chat_model = ChatAnthropic(...)

chat_model.register_tool(run_query, extra_args = {
    "data": dynamic(lambda: df),
    "emit": chat_model.emit,
})

But in Shiny, you might do something fancier with those messages. In this case, it might wrap them in <code> tags and send them to the Shiny chat UI stream, instead of going through the chatlas message stream:

def server(input, output):
    # Suppose the value of df can change over time
    df = pd.DataFrame(...)

    # This is the Shiny chat object
    chat = ui.Chat(...)

    async def append_code_to_chat(txt: str):
        await chat.append_message_stream(ui.tags.code(txt))


    # The chatlas chat object
    chat_model = ChatAnthropic(...)

    chat_model.register_tool(run_query, extra_args = {
        "data": dynamic(lambda: df),
        "emit": append_code_to_chat,
    })

This would also allow the tool caller to define their own functions for many purposes, like displaying progress or getting user input.

If you have a long computation that needs to display progress, then that progress could be implemented one way at the console, another way in Shiny, and yet another in Streamlit:

async def long_computation(x: int, progress: Callable[[int], Awaitable[None]]):
    await progress(0)
    ...
    await progress(33)
    ...
    await progress(66)
    ...
    await progress(100)
    return ...

Or say the tool needs to get user input to confirm something:

async def ask_user_yes_no(
    msg: str,
    confirm: Callable[[str], Awaitable[bool]]
):
    user_response = await confirm(msg)
    return user_response

In a discussion with @JCheng about this, he pointed out that we can already do some of these things, with currying:

# The tool function, defined outside of the Shiny app
async def run_query(
    query: str,
    data: pd.DataFrame,
    emit: Callable[[str], Awaitable[None]],
):
    """Runs a SQL query on data"""
    await emit(f"Starting query: {query}...")
    ...
    await emit("Running query...")
    ...
    await emit("Finished query...")
    return ...

# Function for wrapping data and emit
def make_run_query(
    data: Callable[[], pd.DataFrame],
    emit: Callable[[str], Awaitable[None]]
):
    async def wrapped_run_query(query: str):
        return await run_query(query, data(), emit)

    return wrapped_run_query

## Using in Shiny
def server():
    async def append_code_to_chat(txt: str):
        await chat.append_message_stream(ui.tags.code(txt))

    chat_model.register_tool(make_run_query(lambda: df, append_code_to_chat))


## Using in the terminal
chat_model.register_tool(make_run_query(lambda: df, print)) # Prints to console

Both of the uses above, in Shiny and the terminal, output directly to their respective UIs. But if we want to modify the chatlas object's output stream, it might be something like this:

## Emit to chat_model's output stream, at the chatlas level
chat_model.register_tool(make_run_query(lambda: df, chat_model.emit))

And finally, one other possibility that we discussed: if the tool function takes a parameter named _chat of type Chat, we automatically pass in the chatlas Chat object. Defining and registering the tool in this case is very simple:

# This version of run_query will emit to the chatlas object's output stream
async def run_query(query: str, _chat: Chat):
    """Runs a SQL query on data"""
    await _chat.emit(f"Starting query: {query}...")
    ...
    await _chat.emit("Running query...")
    ...
    await _chat.emit("Finished query...")
    return ...


chat_model.register_tool(run_query)
