Skip to content

Conversation

rong-xyz
Copy link

@rong-xyz rong-xyz commented May 28, 2025

Basically, try send a base64 image (say from MCP) in ToolMessage you will easily get 900k token usage, which a lot of the user would assume it will just work if they transfermed from anthropic
This is a fix where the model behaves relatively good by faking role: function into a user but with tags
I made a similar fix to langchain-aws , where the model will simply skip translation if it is a ToolMessage, here everything is to_str() and image is easily 900k token usage langchain-ai/langchain-aws#495

As confimed by someone worked in Google:

Right now this is somewhat undefined, but model will probably figure it out if you send the image as a user role, and document this in the dev instruction. We have done bunch of work where wejust drop the image right after theFunctionResponse (so right after the\n and that seems to work well. More specifically the image has a nameso we print Wrote image: foo.pngas the textual function output then do
\n
foo.jpg

end quote.

The google data structure is a pain to use and work with compared to other providers, It will be too much work to write a test, as I will use this to patch our code base anyways, I will submit the PR here

Copy link
Collaborator

@lkuligin lkuligin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you fix linter and add a corresponding unit test, please?

@mdrxy
Copy link
Collaborator

mdrxy commented Sep 15, 2025

following up on this @rong-xyz

@mdrxy mdrxy changed the title [vertexai] Fix image support for ToolMessage in langchain-vertexai fix(vertex): image support for ToolMessage Sep 17, 2025
@mdrxy mdrxy added the vertex label Sep 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants