
Conversation

douglas-reid
Contributor

Adds support for locally-running Gemma3 models exposed via Ollama.
This will enable developers to run fully-local agent workflows, using
the larger two Gemma3 models.

Functionality is achieved by extending LiteLlm models and using a
custom prompt template with some request/response pre/post processing.
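For readers unfamiliar with the approach, the sketch below shows roughly what "extending LiteLlm with a custom prompt template" can look like. It is illustrative only: the class body, the `ollama/gemma3:12b` model string, and the use of litellm's `register_prompt_template` are assumptions made for the sketch, not the code added in this PR.

```python
# Illustrative sketch only -- not the implementation from this PR.
# Assumes litellm's register_prompt_template API; the turn markers follow the
# published Gemma chat format.
import litellm

from google.adk.models.lite_llm import LiteLlm


class Gemma3OllamaSketch(LiteLlm):
  """Toy subclass showing the prompt-template registration idea."""

  def __init__(self, model: str = "ollama/gemma3:12b", **kwargs):
    super().__init__(model=model, **kwargs)
    # Tell litellm how to serialize chat turns for a Gemma 3 model served by
    # Ollama (Gemma has no system role, so system text is sent as a user turn).
    litellm.register_prompt_template(
        model=model,
        roles={
            "system": {
                "pre_message": "<start_of_turn>user\n",
                "post_message": "<end_of_turn>\n",
            },
            "user": {
                "pre_message": "<start_of_turn>user\n",
                "post_message": "<end_of_turn>\n",
            },
            "assistant": {
                "pre_message": "<start_of_turn>model\n",
                "post_message": "<end_of_turn>\n",
            },
        },
        final_prompt_value="<start_of_turn>model\n",
    )
```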

As part of this work, the existing Gemma 3 support (via Gemini API) is
refactored to clarify functionality and support broader reuse across
both modes of interacting with the LLM.

A separate `hello_world_gemma3_ollama` example is provided to highlight
local usage.

NOTE: Adds an optional dependency on `instructor` for finding and parsing
JSON in Gemma3 response blocks.
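As a concrete illustration of what that helper does, `instructor.utils.extract_json_from_codeblock` pulls a JSON payload out of a fenced block in the model's text response. Only the helper call itself comes from this PR; the sample response text below is invented for the example.

```python
# Illustrative example of the instructor helper mentioned above; the sample
# Gemma3-style response text is made up for demonstration purposes.
import instructor

fence = "`" * 3  # avoid literal triple backticks in this snippet
response_text = (
    "I'll roll the die now.\n"
    f"{fence}json\n"
    '{"name": "roll_die", "args": {"sides": 100}}\n'
    f"{fence}"
)

json_candidate = instructor.utils.extract_json_from_codeblock(response_text)
print(json_candidate)  # expected: {"name": "roll_die", "args": {"sides": 100}}
```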

Testing

Test Plan

- add and run integration and unit tests
- manual run of both `hello_world_gemma` and `hello_world_gemma3_ollama` agents
- manual run of `multi_tool_agent` from quickstart using the new `Gemma3Ollama` LLM (a usage sketch follows below).
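For orientation, wiring the new class into a quickstart-style agent would look roughly like the following. The import path and constructor arguments for `Gemma3Ollama` are assumptions based on the files this PR touches, not the verified final API.

```python
# Hypothetical usage sketch: the Gemma3Ollama import path and constructor
# arguments are assumed from the files touched in this PR, not verified.
from google.adk.agents import Agent
from google.adk.models.gemma_llm import Gemma3Ollama


def get_weather(city: str) -> dict:
  """Toy stand-in for the quickstart's weather tool."""
  return {"status": "success", "report": f"It is sunny and 25 C in {city}."}


root_agent = Agent(
    name="weather_time_agent",
    model=Gemma3Ollama(model="ollama/gemma3:12b"),  # assumed constructor form
    instruction="Answer questions about the weather in a city.",
    tools=[get_weather],
)
```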

Automated Tests

| Test Command | Results |
|--------------|---------|
| pytest ./tests/unittests | 2779 passed, 2387 warnings in 63.03s |
| pytest ./tests/unittests/models/test_gemma_llm.py | 15 passed in 4.06s |
| pytest ./tests/integration/models/test_gemma_llm.py | 1 passed in 33.22s |

Manual Tests

Log of running `multi_tool_agent` with a locally-built wheel:

```
[user]: what is the weather in new york?
15:12:24 - LiteLLM:INFO: utils.py:3373 -
LiteLLM completion() model= gemma3:12b; provider = ollama
15:12:28 - LiteLLM:INFO: utils.py:3373 -
LiteLLM completion() model= gemma3:12b; provider = ollama
[weather_time_agent]: The weather in New York is sunny with a temperature of 25 degrees Celsius (77 degrees Fahrenheit).

[user]: what is the time in new york?
15:12:43 - LiteLLM:INFO: utils.py:3373 -
LiteLLM completion() model= gemma3:12b; provider = ollama
15:12:48 - LiteLLM:INFO: utils.py:3373 -
LiteLLM completion() model= gemma3:12b; provider = ollama
[weather_time_agent]: The current time in New York is 2025-10-08 18:12:48 EDT-0400.
```

`DEBUG` log snippet of an agent run:

```
2025-10-08 15:32:33,322 - DEBUG - lite_llm.py:810 -
LLM Request:
-----------------------------------------------------------
System Instruction:

      You roll dice and answer questions about the outcome of the dice rolls.
...

You are an agent. Your internal name is "data_processing_agent".
...

-----------------------------------------------------------
Contents:
{"parts":[{"text":"Hi, introduce yourself."}],"role":"user"}
{"parts":[{"text":"I am data_processing_agent, a hello world agent that can roll a dice of 8 sides and check prime numbers."}],"role":"model"}
{"parts":[{"text":"Roll a die with 100 sides and check if it is prime"}],"role":"user"}
{"parts":[{"text":"{\"args\":{\"sides\":100},\"name\":\"roll_die\"}"}],"role":"model"}
{"parts":[{"text":"Invoking tool `roll_die` produced: `{\"result\": 26}`."}],"role":"user"}
{"parts":[{"text":"{\"args\":{\"nums\":[26]},\"name\":\"check_prime\"}"}],"role":"model"}
{"parts":[{"text":"Invoking tool `check_prime` produced: `{\"result\": \"No prime numbers found.\"}`."}],"role":"user"}
{"parts":[{"text":"Okay, the roll was 26, and it is not a prime number."}],"role":"model"}
{"parts":[{"text":"Roll it again."}],"role":"user"}
{"parts":[{"text":"{\"args\":{\"sides\":100},\"name\":\"roll_die\"}"}],"role":"model"}
{"parts":[{"text":"Invoking tool `roll_die` produced: `{\"result\": 69}`."}],"role":"user"}
{"parts":[{"text":"{\"args\":{\"nums\":[69]},\"name\":\"check_prime\"}"}],"role":"model"}
{"parts":[{"text":"Invoking tool `check_prime` produced: `{\"result\": \"No prime numbers found.\"}`."}],"role":"user"}
{"parts":[{"text":"The roll was 69, and it is not a prime number."}],"role":"model"}
{"parts":[{"text":"What numbers did I get?"}],"role":"user"}
-----------------------------------------------------------
Functions:

-----------------------------------------------------------
```
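The Contents block above shows the request-side processing the description refers to: function calls and tool results are flattened into plain text turns before they reach the Ollama-served model. Below is a minimal, self-contained sketch of that flattening step; the helper is illustrative, not the code implemented in this PR.

```python
# Illustrative sketch of flattening function-call parts into the text-only
# turns seen in the log above; not the helper implemented in this PR.
import json

from google.genai import types


def flatten_part_for_gemma(part: types.Part) -> types.Part:
  """Renders tool-call and tool-result parts as plain text for the prompt."""
  if part.function_call:
    payload = {
        "args": dict(part.function_call.args or {}),
        "name": part.function_call.name,
    }
    return types.Part(text=json.dumps(payload))
  if part.function_response:
    result = json.dumps(part.function_response.response or {})
    return types.Part(
        text=f"Invoking tool `{part.function_response.name}` produced: `{result}`."
    )
  return part
```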

Summary of Changes

Hello @douglas-reid, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces support for Gemma3 models running locally via Ollama, refactors existing Gemma3 Gemini API integration, and adds a new example. Key changes include a new Gemma3Ollama class, renaming Gemma to Gemma3GeminiAPI, sharing function call processing logic, and using the instructor library for robust JSON parsing. The instructor library is now an optional dependency. The changes were verified through automated and manual tests.

Highlights

  • Intent: This pull request introduces support for Gemma3 models running locally via Ollama, enabling fully-local agent workflows. It also refactors the existing Gemma3 support (via Gemini API) for better clarity and reuse, and adds a new example for local usage.
  • Key Changes: A new Gemma3Ollama class has been added to integrate Gemma3 models with Ollama. The existing Gemma class has been renamed to Gemma3GeminiAPI and its function call processing logic has been refactored into reusable global helper functions. JSON parsing from model responses now uses the instructor library for improved robustness. A new example, hello_world_gemma3_ollama, demonstrates the local integration. The instructor library has been added as an optional dependency.
  • Detailed Changes: In src/google/adk/models/gemma_llm.py, the Gemma3Ollama class was introduced, inheriting from LiteLlm, and includes custom prompt template registration for Ollama. The former Gemma class was renamed to Gemma3GeminiAPI, and its methods for handling function calls (_move_function_calls_into_system_instruction, _extract_function_calls_from_response) were extracted into global helper functions. The JSON parsing within _extract_function_calls_from_response now utilizes instructor.utils.extract_json_from_codeblock. src/google/adk/models/__init__.py was updated to reflect the new class names and their registration. A new example, contributing/samples/hello_world_gemma3_ollama, was added, and the existing hello_world_gemma/agent.py was updated to use Gemma3GeminiAPI. The pyproject.toml file now includes instructor as an optional dependency. Unit and integration tests were updated to align with the class renames, model changes, and the new shared helper functions.
  • Testing: The changes were verified through unit and integration tests, specifically for test_gemma_llm.py. Manual runs of both the hello_world_gemma and the new hello_world_gemma3_ollama examples were performed. Additionally, the multi_tool_agent from the quickstart was manually run using the new Gemma3Ollama LLM, with logs demonstrating successful execution.

@adk-bot added the `bot triaged` and `models` ([Component] Issues related to model support) labels on Oct 8, 2025
@adk-bot
Collaborator

adk-bot commented Oct 8, 2025

Response from ADK Triaging Agent

Hello @douglas-reid, thank you for your contribution!

To help us track this new feature, could you please create a GitHub issue and link it to this PR? According to our contribution guidelines, all new features should have an associated issue.

Thanks for your help!


gemini-code-assist bot left a comment


Code Review

This is a great feature addition that enables local agent workflows with Gemma3 models via Ollama. The refactoring of the existing Gemma support to share logic between the Gemini API and Ollama integrations is well-executed and improves code maintainability.

I've identified a few areas for improvement, including a critical issue in the tests, better handling of an optional dependency, and some minor enhancements for clarity in the new example code. Please see my detailed comments below.

Comment on lines +125 to +127
```python
async def test_gemma_gemini_preprocess_request_with_tools(
    llm_request_with_tools,
):
```


critical

The @pytest.mark.asyncio decorator is missing for this async test function. Without it, pytest will not run this test correctly as a coroutine. This will likely result in a RuntimeWarning and the test's assertions will not be properly executed, as the coroutine is never awaited.

Suggested change:

```diff
+@pytest.mark.asyncio
 async def test_gemma_gemini_preprocess_request_with_tools(
     llm_request_with_tools,
 ):
```

Comment on lines +244 to +269
```python
  try:
    import instructor

    json_candidate = instructor.utils.extract_json_from_codeblock(response_text)

    if not json_candidate:
      return

    function_call_parsed = Gemma3FunctionCallModel.model_validate_json(
        json_candidate
    )
    function_call = types.FunctionCall(
        name=function_call_parsed.name,
        args=function_call_parsed.parameters,
    )
    function_call_part = Part(function_call=function_call)
    llm_response.content.parts = [function_call_part]
  except (json.JSONDecodeError, ValidationError) as e:
    logger.debug(
        f"Error attempting to parse JSON into function call. Leaving as text"
        f" response. %s",
        e,
    )
  except Exception as e:
    logger.warning("Error processing Gemma function call response: %s", e)
```


high

The current implementation uses a broad except Exception which can catch an ImportError if the optional instructor package isn't installed. This would result in a generic and potentially confusing log message for the user.

To provide a better user experience, it's best to explicitly handle ImportError and provide a clear message explaining that the optional dependency is missing and how to install it.

```python
  try:
    import instructor
  except ImportError:
    logger.warning(
        "The 'instructor' package is required for Gemma3 function calling but is not installed. "
        "Text response will be returned. To enable function calling, run: pip install \"google-adk[extensions]\""
    )
    return

  try:
    json_candidate = instructor.utils.extract_json_from_codeblock(response_text)

    if not json_candidate:
      return

    function_call_parsed = Gemma3FunctionCallModel.model_validate_json(
        json_candidate
    )
    function_call = types.FunctionCall(
        name=function_call_parsed.name,
        args=function_call_parsed.parameters,
    )
    function_call_part = Part(function_call=function_call)
    llm_response.content.parts = [function_call_part]
  except (json.JSONDecodeError, ValidationError) as e:
    logger.debug(
        f"Error attempting to parse JSON into function call. Leaving as text"
        f" response. %s",
        e,
    )
  except Exception as e:
    logger.warning("Error processing Gemma function call response: %s", e)
```

Comment on lines +73 to +76
```
You roll dice and answer questions about the outcome of the dice rolls.
You can roll dice of different sizes.
You can use multiple tools in parallel by calling functions in parallel(in one request and in one round).
It is ok to discuss previous dice roles, and comment on the dice rolls.
```


medium

There are a couple of typos in the instruction string that could affect model behavior or confuse readers.

  • parallel(in should be parallel (in
  • dice roles should be dice rolls
Suggested change:

```diff
 You roll dice and answer questions about the outcome of the dice rolls.
 You can roll dice of different sizes.
-You can use multiple tools in parallel by calling functions in parallel(in one request and in one round).
-It is ok to discuss previous dice roles, and comment on the dice rolls.
+You can use multiple tools in parallel by calling functions in parallel (in one request and in one round).
+It is ok to discuss previous dice rolls, and comment on the dice rolls.
```

Comment on lines +44 to +46
```python
session_11 = await session_service.create_session(
    app_name=app_name, user_id=user_id_1
)
```


medium

For better readability and consistency with user_id_1, consider renaming session_11 to session_1.

This change should also be applied to its usages on lines 64, 66, 68, and 69.

Suggested change:

```diff
-session_11 = await session_service.create_session(
+session_1 = await session_service.create_session(
     app_name=app_name, user_id=user_id_1
 )
```
