
⚡️ Speed up function openai_model_profile by 241% #27

Open · codeflash-ai bot wants to merge 1 commit into try-refinement from codeflash/optimize-openai_model_profile-mdewbron

Conversation

codeflash-ai bot commented on Jul 22, 2025

📄 241% (2.41x) speedup for openai_model_profile in pydantic_ai_slim/pydantic_ai/profiles/openai.py

⏱️ Runtime: 1.20 milliseconds → 353 microseconds (best of 70 runs)

📝 Explanation and details

REFINEMENT Here's a version of your program optimized for runtime, based on the line-profiling results and analysis.
The major slow point in your program is the construction of OpenAIModelProfile, which is called thousands of times (7158 hits in profiling).

Observations:

  • If the OpenAIModelProfile constructor and the arguments passed to it are pure (no side effects, dependent only on the arguments) and only a small number of configurations are possible (here, is_reasoning_model is either True or False), the results can be cached.
  • The expression model_name.startswith('o') leads to only two possible outcomes for openai_supports_sampling_settings.
  • Every other OpenAIModelProfile constructor argument is constant.

Thus, memoizing/caching the return value on the boolean is_reasoning_model saves a lot of time.


Summary of optimizations:

  • Moved the expensive constructor into an @lru_cache-decorated helper function, so the OpenAIModelProfile object is created at most twice (once per value of is_reasoning_model).
  • Subsequent calls for the same value immediately return the cached object, making the function nearly instantaneous after the first call per value.
  • Preserved all original comments.

This will dramatically improve runtime for workloads where this function is called repeatedly.
If there are more than two options or you allow more variations in input, simply adjust the caching logic or the cache key.
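
Below is a minimal sketch of the caching pattern described above, not the exact diff from this PR: the OpenAIModelProfile field names are taken from the generated tests further down, while the constructor values and the `_cached_openai_profile` helper name are illustrative assumptions.

```python
from functools import lru_cache

from pydantic_ai.profiles.openai import OpenAIJsonSchemaTransformer, OpenAIModelProfile


@lru_cache(maxsize=2)  # only two possible keys: True / False
def _cached_openai_profile(is_reasoning_model: bool) -> OpenAIModelProfile:
    # Hypothetical helper; the keyword values below are illustrative assumptions,
    # not necessarily the exact arguments used in this PR.
    return OpenAIModelProfile(
        json_schema_transformer=OpenAIJsonSchemaTransformer,
        supports_json_schema_output=True,
        supports_json_object_output=True,
        openai_supports_sampling_settings=not is_reasoning_model,
    )


def openai_model_profile(model_name: str) -> OpenAIModelProfile:
    # Reasoning models (names starting with 'o') do not accept sampling settings.
    return _cached_openai_profile(model_name.startswith('o'))
```

Since lru_cache returns the same object for a repeated key, every non-reasoning model name shares one cached profile instance and every reasoning-model name shares the other, so at most two constructions ever happen.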

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 40 Passed |
| 🌀 Generated Regression Tests | 3538 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 1 Passed |
| 📊 Tests Coverage | 100.0% |
⚙️ Existing Unit Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| codeflash_concolic_6hzlqgv4/tmp0c4tax8d/test_concolic_coverage.py::test_openai_model_profile | 875ns | 333ns | ✅ 163% |
| models/test_openai.py::test_model_profile_strict_not_supported | 791ns | 750ns | ✅ 5.47% |
| models/test_openai_responses.py::test_model_profile_strict_not_supported | 750ns | 750ns | ✅ 0.000% |
🌀 Generated Regression Tests and Runtime
from dataclasses import dataclass

# imports
import pytest  # used for our unit tests
from pydantic_ai.profiles.openai import openai_model_profile


# Dummy implementation for testing purposes
@dataclass
class ModelProfile:
    json_schema_transformer: type
    supports_json_schema_output: bool
    supports_json_object_output: bool
    openai_supports_sampling_settings: bool

class OpenAIJsonSchemaTransformer:
    pass

class OpenAIModelProfile(ModelProfile):
    pass
from pydantic_ai.profiles.openai import openai_model_profile

# --------------------------
# Unit Tests for openai_model_profile
# --------------------------

# 1. Basic Test Cases

def test_basic_gpt4_model():
    """Test a typical GPT-4 model name."""
    codeflash_output = openai_model_profile("gpt-4"); profile = codeflash_output # 958ns -> 333ns (188% faster)

def test_basic_gpt35_model():
    """Test a typical GPT-3.5 model name."""
    codeflash_output = openai_model_profile("gpt-3.5-turbo"); profile = codeflash_output # 792ns -> 333ns (138% faster)

def test_basic_gpt4o_model():
    """Test a typical GPT-4o model name."""
    codeflash_output = openai_model_profile("gpt-4o"); profile = codeflash_output # 750ns -> 291ns (158% faster)

def test_basic_reasoning_model():
    """Test a model name that starts with 'o', indicating a reasoning model."""
    codeflash_output = openai_model_profile("o2000-reasoner"); profile = codeflash_output # 833ns -> 292ns (185% faster)

# 2. Edge Test Cases

@pytest.mark.parametrize("model_name", [
    "",  # Empty string
    "o",  # Single character, reasoning model
    "O",  # Uppercase, should not trigger reasoning model
    "openai",  # Lowercase o, reasoning model
    "OpenAI",  # Uppercase O, not reasoning model
    "0gpt-4",  # Starts with zero, not reasoning model
    "ogpt-4",  # Starts with o, reasoning model
    "gpt-o",   # Does not start with o, not reasoning model
    "o-",      # Just o and dash, reasoning model
])
def test_edge_cases_model_name(model_name):
    """Test various edge cases for the model name."""
    codeflash_output = openai_model_profile(model_name); profile = codeflash_output # 6.79μs -> 2.87μs (136% faster)
    # Only if model_name starts with lowercase 'o' should sampling settings be False
    if model_name.startswith('o'):
        assert not profile.openai_supports_sampling_settings
    else:
        assert profile.openai_supports_sampling_settings

def test_model_name_with_spaces():
    """Test model names with leading/trailing/embedded spaces."""
    codeflash_output = openai_model_profile(" o-model"); profile = codeflash_output # 666ns -> 291ns (129% faster)
    codeflash_output = openai_model_profile("gpt-4 "); profile2 = codeflash_output # 417ns -> 125ns (234% faster)

def test_model_name_non_ascii():
    """Test model names with non-ASCII characters."""
    codeflash_output = openai_model_profile("ø-model"); profile = codeflash_output # 625ns -> 291ns (115% faster)
    codeflash_output = openai_model_profile("o模型"); profile2 = codeflash_output # 500ns -> 208ns (140% faster)

def test_model_name_numeric():
    """Test model names that are purely numeric or start with a number."""
    codeflash_output = openai_model_profile("12345"); profile = codeflash_output # 750ns -> 334ns (125% faster)
    codeflash_output = openai_model_profile("o12345"); profile2 = codeflash_output # 458ns -> 166ns (176% faster)

def test_model_name_case_sensitivity():
    """Test that only lowercase 'o' at the start disables sampling settings."""
    codeflash_output = openai_model_profile("Omodel"); profile = codeflash_output # 709ns -> 292ns (143% faster)
    codeflash_output = openai_model_profile("omodel"); profile2 = codeflash_output # 458ns -> 166ns (176% faster)

# 3. Large Scale Test Cases

def test_large_scale_many_models():
    """Test the function with a large number of model names."""
    # Generate 500 model names, half starting with 'o', half not
    model_names = ["o_model_%d" % i for i in range(500)] + ["gpt_model_%d" % i for i in range(500)]
    for i, name in enumerate(model_names):
        codeflash_output = openai_model_profile(name); profile = codeflash_output # 326μs -> 99.6μs (228% faster)
        if name.startswith('o'):
            assert not profile.openai_supports_sampling_settings
        else:
            assert profile.openai_supports_sampling_settings

def test_large_scale_long_model_names():
    """Test with extremely long model names."""
    long_name = "o" + "x" * 998  # 999 characters, starts with 'o'
    codeflash_output = openai_model_profile(long_name); profile = codeflash_output # 750ns -> 333ns (125% faster)
    long_name2 = "gpt" + "y" * 997  # 1000 characters, does not start with 'o'
    codeflash_output = openai_model_profile(long_name2); profile2 = codeflash_output # 458ns -> 125ns (266% faster)

def test_large_scale_unique_transformer():
    """Test that the transformer class is always the same."""
    # Even with many calls, the transformer should always be OpenAIJsonSchemaTransformer
    for i in range(1000):
        name = "gpt-%d" % i
        codeflash_output = openai_model_profile(name); profile = codeflash_output # 336μs -> 93.2μs (261% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

from dataclasses import dataclass

# imports
import pytest  # used for our unit tests
from pydantic_ai.profiles.openai import openai_model_profile

# --- Mocked dependencies for self-contained testability ---

@dataclass
class OpenAIJsonSchemaTransformer:
    pass  # Dummy placeholder

@dataclass
class OpenAIModelProfile:
    json_schema_transformer: type
    supports_json_schema_output: bool
    supports_json_object_output: bool
    openai_supports_sampling_settings: bool
from pydantic_ai.profiles.openai import openai_model_profile

# --- Unit tests ---

# 1. BASIC TEST CASES

def test_basic_gpt_3_5_turbo():
    """Test a typical model name that does not start with 'o' (should support sampling settings)."""
    codeflash_output = openai_model_profile("gpt-3.5-turbo"); profile = codeflash_output # 791ns -> 333ns (138% faster)

def test_basic_gpt_4():
    """Test a typical GPT-4 model name."""
    codeflash_output = openai_model_profile("gpt-4"); profile = codeflash_output # 750ns -> 291ns (158% faster)

def test_basic_reasoning_model():
    """Test a model name that starts with 'o', indicating a reasoning model."""
    codeflash_output = openai_model_profile("o200k-base"); profile = codeflash_output # 791ns -> 291ns (172% faster)

def test_basic_gpt_4o_mini():
    """Test a model that is known to support structured outputs."""
    codeflash_output = openai_model_profile("gpt-4o-mini"); profile = codeflash_output # 709ns -> 292ns (143% faster)

# 2. EDGE TEST CASES

def test_empty_model_name():
    """Test with an empty string as the model name."""
    codeflash_output = openai_model_profile(""); profile = codeflash_output # 791ns -> 334ns (137% faster)

def test_model_name_is_only_o():
    """Test with model name 'o' (single character, reasoning model)."""
    codeflash_output = openai_model_profile("o"); profile = codeflash_output # 750ns -> 333ns (125% faster)

def test_model_name_starts_with_capital_O():
    """Test with model name starting with capital 'O' (should not be treated as reasoning model)."""
    codeflash_output = openai_model_profile("OpenAI-xyz"); profile = codeflash_output # 708ns -> 250ns (183% faster)

def test_model_name_starts_with_space_then_o():
    """Test with model name starting with space then 'o'."""
    codeflash_output = openai_model_profile(" oops"); profile = codeflash_output # 667ns -> 291ns (129% faster)

def test_model_name_starts_with_number():
    """Test with model name starting with a number."""
    codeflash_output = openai_model_profile("1o-model"); profile = codeflash_output # 708ns -> 291ns (143% faster)




def test_model_name_is_bytes():
    """Test with bytes as model name should raise TypeError."""
    with pytest.raises(TypeError):
        openai_model_profile(b"gpt-4o") # 1.33μs -> 1.29μs (3.17% faster)

def test_model_name_starts_with_o_but_long():
    """Test with a long model name starting with 'o'."""
    codeflash_output = openai_model_profile("o" + "x" * 100); profile = codeflash_output # 1.17μs -> 375ns (211% faster)

def test_model_name_with_leading_trailing_whitespace():
    """Test with leading/trailing whitespace."""
    codeflash_output = openai_model_profile("   gpt-4o   "); profile = codeflash_output # 834ns -> 333ns (150% faster)

def test_model_name_with_unicode_characters():
    """Test with unicode characters in model name."""
    codeflash_output = openai_model_profile("gpt-4o-αβγ"); profile = codeflash_output # 750ns -> 291ns (158% faster)

def test_model_name_with_special_characters():
    """Test with special characters in model name."""
    codeflash_output = openai_model_profile("gpt-4o-!@#$%^&*()"); profile = codeflash_output # 709ns -> 291ns (144% faster)

# 3. LARGE SCALE TEST CASES

def test_large_number_of_unique_model_names():
    """Test a large batch of unique model names for correct behavior and no state leakage."""
    for i in range(500):
        model_name = f"gpt-4o-{i}"
        codeflash_output = openai_model_profile(model_name); profile = codeflash_output # 164μs -> 45.3μs (264% faster)

def test_large_number_of_reasoning_models():
    """Test a large batch of reasoning models (names start with 'o')."""
    for i in range(500):
        model_name = f"o_reasoning_{i}"
        codeflash_output = openai_model_profile(model_name); profile = codeflash_output # 171μs -> 49.3μs (247% faster)

def test_performance_large_batch_mixed_models():
    """Test performance and correctness with a mix of reasoning and non-reasoning models."""
    for i in range(250):
        # Reasoning model
        codeflash_output = openai_model_profile(f"o_reasoning_{i}"); profile_r = codeflash_output # 86.4μs -> 26.8μs (223% faster)
        # Non-reasoning model
        codeflash_output = openai_model_profile(f"gpt-4o-{i}"); profile_n = codeflash_output # 86.0μs -> 24.8μs (248% faster)

def test_large_model_name_length():
    """Test with a very long model name (edge of typical string length)."""
    long_model_name = "gpt-4o-" + "x" * 900
    codeflash_output = openai_model_profile(long_model_name); profile = codeflash_output # 708ns -> 291ns (143% faster)



from pydantic_ai.profiles.openai import OpenAIJsonSchemaTransformer
from pydantic_ai.profiles.openai import openai_model_profile

def test_openai_model_profile():
    openai_model_profile('')

To edit these changes, check out the branch with `git checkout codeflash/optimize-openai_model_profile-mdewbron` and push.

Codeflash

codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) label on Jul 22, 2025
codeflash-ai bot requested a review from aseembits93 on Jul 22, 2025 at 18:56