feat: add gemma3 ollama model support #3120
base: main
Conversation
Adds support for locally-running Gemma3 models exposed via Ollama. This enables developers to run fully-local agent workflows using the two larger Gemma3 models. Functionality is achieved by extending LiteLlm models and using a custom prompt template with request/response pre/post processing. As part of this work, the existing Gemma 3 support (via the Gemini API) is refactored to clarify functionality and support broader reuse across both modes of interacting with the LLM. A separate `hello_world_gemma3_ollama` example is provided to highlight local usage.

NOTE: Adds an optional dependency on `instructor` for finding and parsing JSON in Gemma3 response blocks.

Testing

Test Plan

- add and run integration and unit tests
- manual run of both the `hello_world_gemma` and `hello_world_gemma3_ollama` agents
- manual run of `multi_tool_agent` from the quickstart using the new `Gemma3Ollama` LLM

Automated Tests

| Test Command | Results |
|--------------|---------|
| `pytest ./tests/unittests` | 2779 passed, 2387 warnings in 63.03s |
| `pytest ./tests/unittests/models/test_gemma_llm.py` | 15 passed in 4.06s |
| `pytest ./tests/integration/models/test_gemma_llm.py` | 1 passed in 33.22s |

Manual Tests

Log of running `multi_tool_agent` with a locally-built wheel:

```
[user]: what is the weather in new york?
15:12:24 - LiteLLM:INFO: utils.py:3373 - LiteLLM completion() model= gemma3:12b; provider = ollama
15:12:28 - LiteLLM:INFO: utils.py:3373 - LiteLLM completion() model= gemma3:12b; provider = ollama
[weather_time_agent]: The weather in New York is sunny with a temperature of 25 degrees Celsius (77 degrees Fahrenheit).
[user]: what is the time in new york?
15:12:43 - LiteLLM:INFO: utils.py:3373 - LiteLLM completion() model= gemma3:12b; provider = ollama
15:12:48 - LiteLLM:INFO: utils.py:3373 - LiteLLM completion() model= gemma3:12b; provider = ollama
[weather_time_agent]: The current time in New York is 2025-10-08 18:12:48 EDT-0400.
```

`DEBUG` log snippet of an agent run:

```
2025-10-08 15:32:33,322 - DEBUG - lite_llm.py:810 - LLM Request:
-----------------------------------------------------------
System Instruction:
You roll dice and answer questions about the outcome of the dice rolls.
...
You are an agent. Your internal name is "data_processing_agent".
...
-----------------------------------------------------------
Contents:
{"parts":[{"text":"Hi, introduce yourself."}],"role":"user"}
{"parts":[{"text":"I am data_processing_agent, a hello world agent that can roll a dice of 8 sides and check prime numbers."}],"role":"model"}
{"parts":[{"text":"Roll a die with 100 sides and check if it is prime"}],"role":"user"}
{"parts":[{"text":"{\"args\":{\"sides\":100},\"name\":\"roll_die\"}"}],"role":"model"}
{"parts":[{"text":"Invoking tool `roll_die` produced: `{\"result\": 26}`."}],"role":"user"}
{"parts":[{"text":"{\"args\":{\"nums\":[26]},\"name\":\"check_prime\"}"}],"role":"model"}
{"parts":[{"text":"Invoking tool `check_prime` produced: `{\"result\": \"No prime numbers found.\"}`."}],"role":"user"}
{"parts":[{"text":"Okay, the roll was 26, and it is not a prime number."}],"role":"model"}
{"parts":[{"text":"Roll it again."}],"role":"user"}
{"parts":[{"text":"{\"args\":{\"sides\":100},\"name\":\"roll_die\"}"}],"role":"model"}
{"parts":[{"text":"Invoking tool `roll_die` produced: `{\"result\": 69}`."}],"role":"user"}
{"parts":[{"text":"{\"args\":{\"nums\":[69]},\"name\":\"check_prime\"}"}],"role":"model"}
{"parts":[{"text":"Invoking tool `check_prime` produced: `{\"result\": \"No prime numbers found.\"}`."}],"role":"user"}
{"parts":[{"text":"The roll was 69, and it is not a prime number."}],"role":"model"}
{"parts":[{"text":"What numbers did I get?"}],"role":"user"}
-----------------------------------------------------------
Functions:
-----------------------------------------------------------
```
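For orientation, here is a minimal sketch of how the new model might be wired into an agent, based on this PR's description. The `Gemma3Ollama` class name comes from the PR, but the import path and constructor arguments are assumptions; the `hello_world_gemma3_ollama` example in this PR is the authoritative reference.

```python
# Hypothetical usage sketch -- module path and constructor signature are
# assumptions, not confirmed by this PR's diff.
from google.adk.agents import Agent
from google.adk.models.gemma_llm import Gemma3Ollama  # assumed import path

root_agent = Agent(
    name="weather_time_agent",
    # Gemma3Ollama extends LiteLlm, so it presumably accepts an Ollama model
    # tag such as the one shown in the manual test logs above.
    model=Gemma3Ollama(model="gemma3:12b"),
    instruction="Answer questions about the weather and time in a city.",
)
```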
Summary of Changes

Hello @douglas-reid, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed: this pull request introduces support for Gemma3 models running locally via Ollama, refactors the existing Gemma3 Gemini API integration, and adds a new example.
Response from ADK Triaging Agent

Hello @douglas-reid, thank you for your contribution! To help us track this new feature, could you please create a GitHub issue and link it to this PR? According to our contribution guidelines, all new features should have an associated issue. Thanks for your help!
Code Review
This is a great feature addition that enables local agent workflows with Gemma3 models via Ollama. The refactoring of the existing Gemma support to share logic between the Gemini API and Ollama integrations is well-executed and improves code maintainability.
I've identified a few areas for improvement, including a critical issue in the tests, better handling of an optional dependency, and some minor enhancements for clarity in the new example code. Please see my detailed comments below.
```python
async def test_gemma_gemini_preprocess_request_with_tools(
    llm_request_with_tools,
):
```
The `@pytest.mark.asyncio` decorator is missing from this async test function. Without it, pytest will not run the test as a coroutine: the coroutine is never awaited, which typically produces a `RuntimeWarning`, and the test's assertions are never executed.
Suggested change:

```diff
+@pytest.mark.asyncio
 async def test_gemma_gemini_preprocess_request_with_tools(
     llm_request_with_tools,
 ):
```
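As an alternative to decorating each test individually, a module-level `pytestmark` applies the marker to every test in the file. This is a sketch, assuming the suite uses pytest-asyncio:

```python
import pytest

# Applies @pytest.mark.asyncio to all tests in this module, so individual
# async test functions no longer need the decorator.
pytestmark = pytest.mark.asyncio
```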
```python
try:
  import instructor

  json_candidate = instructor.utils.extract_json_from_codeblock(response_text)

  if not json_candidate:
    return

  function_call_parsed = Gemma3FunctionCallModel.model_validate_json(
      json_candidate
  )
  function_call = types.FunctionCall(
      name=function_call_parsed.name,
      args=function_call_parsed.parameters,
  )
  function_call_part = Part(function_call=function_call)
  llm_response.content.parts = [function_call_part]
except (json.JSONDecodeError, ValidationError) as e:
  logger.debug(
      f"Error attempting to parse JSON into function call. Leaving as text"
      f" response. %s",
      e,
  )
except Exception as e:
  logger.warning("Error processing Gemma function call response: %s", e)
```
The current implementation uses a broad `except Exception`, which can catch an `ImportError` when the optional `instructor` package isn't installed, producing a generic and potentially confusing log message. To provide a better user experience, handle `ImportError` explicitly with a clear message that explains the optional dependency is missing and how to install it.
```python
try:
  import instructor
except ImportError:
  logger.warning(
      "The 'instructor' package is required for Gemma3 function calling but"
      " is not installed. Text response will be returned. To enable function"
      " calling, run: pip install \"google-adk[extensions]\""
  )
  return

try:
  json_candidate = instructor.utils.extract_json_from_codeblock(response_text)
  if not json_candidate:
    return
  function_call_parsed = Gemma3FunctionCallModel.model_validate_json(
      json_candidate
  )
  function_call = types.FunctionCall(
      name=function_call_parsed.name,
      args=function_call_parsed.parameters,
  )
  function_call_part = Part(function_call=function_call)
  llm_response.content.parts = [function_call_part]
except (json.JSONDecodeError, ValidationError) as e:
  logger.debug(
      f"Error attempting to parse JSON into function call. Leaving as text"
      f" response. %s",
      e,
  )
except Exception as e:
  logger.warning("Error processing Gemma function call response: %s", e)
```
```
You roll dice and answer questions about the outcome of the dice rolls.
You can roll dice of different sizes.
You can use multiple tools in parallel by calling functions in parallel(in one request and in one round).
It is ok to discuss previous dice roles, and comment on the dice rolls.
```
There are a couple of typos in the instruction string that could affect model behavior or confuse readers:

- `parallel(in` should be `parallel (in`
- `dice roles` should be `dice rolls`
Suggested change:

```diff
 You roll dice and answer questions about the outcome of the dice rolls.
 You can roll dice of different sizes.
-You can use multiple tools in parallel by calling functions in parallel(in one request and in one round).
-It is ok to discuss previous dice roles, and comment on the dice rolls.
+You can use multiple tools in parallel by calling functions in parallel (in one request and in one round).
+It is ok to discuss previous dice rolls, and comment on the dice rolls.
```
```python
session_11 = await session_service.create_session(
    app_name=app_name, user_id=user_id_1
)
```
For better readability and consistency with `user_id_1`, consider renaming `session_11` to `session_1`. This change should also be applied to its usages on lines 64, 66, 68, and 69.
Suggested change:

```diff
-session_11 = await session_service.create_session(
+session_1 = await session_service.create_session(
     app_name=app_name, user_id=user_id_1
 )
```