support images in tool call output with OpenAI #772

@simonpcouch

This is a revival of #617.

In the docs for the completions API, the link to function calling heads to this page. Under "Tool call outputs - output we generate for the model", I see:

"The tool call output might return a JSON object (e.g., {"temperature": "25", "unit": "C"}, indicating a current temperature of 25 degrees), Image contents, or File contents."

The linked "Image contents" page is just the docs for the usual image input interface, the same one used for images in user chat input.

I decided to try the same interface for tool call output while working on an eval today, and I am seeing convincing results: the image counted as only ~1000 tokens, and the model showed reasonably strong visual understanding.
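
For illustration, here is a rough sketch of what I mean by "the same interface". It only builds the message payload and makes no API call. The helper name `image_tool_message` is hypothetical, and the message shape is an assumption: it reuses the `image_url` content part from user chat input inside a tool message, which the docs do not spell out for this case.

```python
import base64


def image_tool_message(tool_call_id: str, image_bytes: bytes,
                       mime_type: str = "image/png") -> dict:
    """Build a tool-result message whose content is an image.

    Assumption: the tool message accepts the same image content part
    that user chat input uses. This helper is hypothetical.
    """
    # Encode the raw image bytes as a base64 data URL.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime_type};base64,{encoded}"

    return {
        "role": "tool",
        "tool_call_id": tool_call_id,
        # Same content-part shape as an image in a user message (assumed
        # to be accepted here as well).
        "content": [
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }


# Usage: "call_123" is a placeholder ID; in practice it would come from
# the model's tool call request.
message = image_tool_message("call_123", b"abc")
```

The resulting dict would be appended to the conversation in place of the usual JSON string result. Whether every endpoint accepts this shape is exactly what this issue asks to confirm.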
