⚡️ Speed up function model_request_stream_sync by 41% #31

Open: wants to merge 1 commit into `try-refinement`

Conversation

@codeflash-ai (bot) commented Jul 22, 2025

📄 41% (0.41x) speedup for `model_request_stream_sync` in `pydantic_ai_slim/pydantic_ai/direct.py`

⏱️ Runtime: 6.46 microseconds → 4.58 microseconds (best of 31 runs)

📝 Explanation and details

Here is the optimized version of your provided code. The main bottleneck from the profiler is `_prepare_model`, which is called on every invocation of `model_request_stream` and therefore of `model_request_stream_sync`. We can memoize (cache) the output of `_prepare_model` for each unique `(model, instrument)` combination to avoid repeated work, since model instantiation and instrumentation can be expensive and are typically invoked with the same arguments over and over in most applications.

Other improvements:

  • Avoid creating a `models.ModelRequestParameters()` object when it is not needed.
  • Move repeated attribute lookups out of the hot path.
  • Cache function lookups locally.

Optimized code:
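
Below is a minimal sketch of the memoization described above. It is not the exact code from `pydantic_ai_slim/pydantic_ai/direct.py`; the signature of `_prepare_model` and the helper names (`_prepare_model_uncached`, `_prepare_model_cached`) are assumptions for illustration.

```python
from functools import lru_cache


def _prepare_model_uncached(model, instrument):
    """Stand-in for the original _prepare_model in pydantic_ai/direct.py.

    The real function resolves/instantiates the model and applies
    instrumentation; its exact signature is assumed here.
    """
    ...


@lru_cache(maxsize=32)  # small cache by default; tune for your workload
def _prepare_model_cached(model, instrument):
    return _prepare_model_uncached(model, instrument)


def _prepare_model(model, instrument):
    try:
        # Fast path: reuse the prepared model for a repeated (model, instrument) pair.
        return _prepare_model_cached(model, instrument)
    except TypeError:
        # Fallback: unhashable arguments (e.g. a live model instance) cannot be
        # used as lru_cache keys, so revert to the original, uncached code path.
        return _prepare_model_uncached(model, instrument)
```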

Key points about the optimization:

  • Memoization: `_prepare_model` is wrapped in an `lru_cache` (small cache size by default; tune as needed).
  • Fallback: if `model` is not hashable (e.g. a live model instance), the code reverts to the original, uncached path.
  • Repeated attribute lookups and unnecessary object creation are avoided (see the short illustration below).
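
The "repeated attribute lookups" point refers to the common CPython pattern of binding a frequently used attribute or function to a local name before the hot path. A generic illustration (not code from `direct.py`; the names are made up):

```python
import math


def norms_slow(vectors):
    # math.sqrt is re-resolved (global + attribute lookup) on every iteration.
    return [math.sqrt(sum(x * x for x in v)) for v in vectors]


def norms_fast(vectors):
    sqrt = math.sqrt  # hoist the lookup out of the hot path
    return [sqrt(sum(x * x for x in v)) for v in vectors]
```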

You can further tune the memoization size and key logic for your production workload and for the hashability/uniqueness of your model objects. Under repeated calls the result is functionally identical and, per the profiling data, significantly faster.
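
For example, `functools.lru_cache` exposes `cache_info()` and `cache_clear()`, which can help size the cache for a given workload (`_prepare_model_cached` refers to the sketch above):

```python
# Inspect hit/miss counts to decide on an appropriate maxsize.
print(_prepare_model_cached.cache_info())
# e.g. CacheInfo(hits=98, misses=2, maxsize=32, currsize=2)

# Reset the cache between benchmark runs so timings stay comparable.
_prepare_model_cached.cache_clear()
```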

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 9 Passed |
| 🌀 Generated Regression Tests | 🔘 None Found |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

⚙️ Existing Unit Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| `test_direct.py::test_model_request_stream_sync_without_context_manager` | 6.46μs | 4.58μs | ✅ 40.9% |

To edit these changes, run `git checkout codeflash/optimize-model_request_stream_sync-mdexw8f6` and push your edits.

Codeflash
