
Conversation


@codeflash-ai codeflash-ai bot commented Nov 7, 2025

📄 18% (0.18x) speedup for JiraDataSource.submit_bulk_watch in backend/python/app/sources/external/jira/jira.py

⏱️ Runtime : 1.61 milliseconds → 1.36 milliseconds (best of 250 runs)

📝 Explanation and details

The optimization achieves an 18% runtime improvement by eliminating unnecessary object allocations and function calls in the submit_bulk_watch method, which is commonly used for bulk operations in Jira integrations.

Key Optimizations Applied

1. Pre-computed Empty Dictionary Reuse

  • Replaced repeated empty dict creation (_path: Dict[str, Any] = {}, _query: Dict[str, Any] = {}) with a shared constant _AS_STR_EMPTY_DICT
  • Eliminates 746 calls to _as_str_dict() for empty dictionaries (reduced from 1119 to 373 calls)
  • Saves ~36% of dictionary conversion overhead (see the sketch below)
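
A minimal sketch of this reuse, using the names from the bullets above but a simplified conversion helper (the real signatures may differ):

```python
from typing import Any, Dict

def _as_str_dict(d: Dict[str, Any]) -> Dict[str, str]:
    # Simplified stand-in for the real conversion helper.
    return {k: str(v) for k, v in d.items()}

# Converted once at import time and shared by every call that has no
# path/query parameters; callers must treat it as read-only.
_AS_STR_EMPTY_DICT: Dict[str, str] = _as_str_dict({})

def _params_or_shared(params: Dict[str, Any]) -> Dict[str, str]:
    # Reuse the shared constant instead of allocating and converting a
    # fresh empty dict on every request.
    return _AS_STR_EMPTY_DICT if not params else _as_str_dict(params)
```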

2. Static URL Path Construction

  • Removed expensive _safe_format_url() call since _path is always empty, making URL formatting unnecessary
  • Direct string concatenation (self.base_url + rel_path) replaces template formatting
  • Eliminates 373 calls to _safe_format_url() and associated _SafeDict object creation (see the sketch below)
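
Roughly, and assuming the endpoint path contains no placeholders (the path literal here is illustrative):

```python
def _build_url(base_url: str) -> str:
    # The relative path for this endpoint is static, so there is nothing
    # for _safe_format_url()/_SafeDict to substitute; concatenation suffices.
    rel_path = "/rest/api/3/issue/watching"  # illustrative, no "{placeholders}"
    return base_url + rel_path
```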

3. Conditional Header Processing

  • Only creates header dictionary copy when headers are actually provided
  • Most calls (370/373 in profiling) skip the dict(headers) allocation entirely
  • Uses direct assignment for the common case of no custom headers (see the sketch below)
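
A hedged sketch of that header handling (the default header set is an assumption drawn from the tests below):

```python
from typing import Dict, Optional

_DEFAULT_HEADERS: Dict[str, str] = {"Content-Type": "application/json"}

def _resolve_headers(headers: Optional[Dict[str, str]]) -> Dict[str, str]:
    if not headers:
        # Common case: no custom headers, so skip the dict() copy and
        # reuse the defaults via direct assignment.
        return _DEFAULT_HEADERS
    # Allocate a merged copy only when the caller supplied headers, letting
    # caller-provided values (e.g. Content-Type) override the defaults.
    merged = dict(_DEFAULT_HEADERS)
    merged.update(headers)
    return merged
```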

Performance Impact

The line profiler shows the optimizations target the most expensive operations:

  • URL formatting dropped from 19% to 2.1% of runtime
  • Dictionary conversions reduced from 24.4% (12.3% + 12.1%) to 1.3% + 1.2% = 2.5% of runtime
  • Total function time decreased from 7.88ms to 5.10ms
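
As a rough cross-check, the headline 18% corresponds to the benchmark ratio 1.61 ms / 1.36 ms ≈ 1.18; the profiled totals above come from a separate instrumented run, so their absolute values differ from the benchmark runtime.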

Test Case Benefits

The optimization particularly benefits:

  • High-volume concurrent operations (50-200 requests) where object allocation overhead compounds
  • Large issue lists (500-1000 items) where the per-call savings scale significantly
  • Throughput scenarios where the 18% per-call improvement translates directly to higher request rates

This optimization is especially valuable for Jira integrations that perform bulk operations, as each saved allocation and function call reduces both CPU usage and memory pressure in high-throughput scenarios.
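
For illustration, a usage sketch of such a bulk pattern, mirroring the generated tests below (client construction is elided; only the concurrent fan-out is shown):

```python
import asyncio
from typing import List

from app.sources.external.jira.jira import JiraDataSource

async def watch_in_bulk(ds: JiraDataSource, batches: List[List[str]]):
    # One submit_bulk_watch call per batch, issued concurrently; the
    # per-call savings compound across the gathered requests.
    return await asyncio.gather(
        *(ds.submit_bulk_watch(issue_ids) for issue_ids in batches)
    )

# e.g. responses = await watch_in_bulk(ds, [["ISSUE-1"], ["ISSUE-2", "ISSUE-3"]])
```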

Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 399 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 92.3% |
🌀 Generated Regression Tests and Runtime
import asyncio  # used to run async functions

import pytest  # used for our unit tests
from app.sources.external.jira.jira import JiraDataSource

# --- Minimal stubs for dependencies to allow isolated unit testing ---

class HTTPRequest:
    def __init__(self, method, url, headers, path_params, query_params, body):
        self.method = method
        self.url = url
        self.headers = headers
        self.path_params = path_params
        self.query_params = query_params
        self.body = body

class HTTPResponse:
    def __init__(self, data):
        self.data = data
    def json(self):
        return self.data

# Simulate the actual HTTP client behavior for unit testing
class DummyAsyncClient:
    def __init__(self):
        self.responses = []
        self.last_request = None

    async def request(self, method, url, **kwargs):
        # Save the request for inspection
        self.last_request = {
            "method": method,
            "url": url,
            "kwargs": kwargs
        }
        # Simulate a response object
        return DummyResponse({"status": "success", "url": url, "body": kwargs.get("json", {})})

class DummyResponse:
    def __init__(self, data):
        self._data = data
    def json(self):
        return self._data

class DummyHTTPClient:
    def __init__(self, base_url="https://jira.example.com", raise_on_execute=False):
        self.base_url = base_url
        self.client = DummyAsyncClient()
        self.raise_on_execute = raise_on_execute

    async def _ensure_client(self):
        return self.client

    async def execute(self, request, **kwargs):
        if self.raise_on_execute:
            raise RuntimeError("Dummy execute error")
        return HTTPResponse({
            "method": request.method,
            "url": request.url,
            "headers": request.headers,
            "body": request.body,
        })

    def get_base_url(self):
        return self.base_url

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        pass

class JiraRESTClientViaToken(DummyHTTPClient):
    pass

class JiraClient:
    def __init__(self, client):
        self.client = client
    def get_client(self):
        return self.client
from app.sources.external.jira.jira import JiraDataSource

# ---- Unit Tests ----

# 1. Basic Test Cases

@pytest.mark.asyncio
async def test_submit_bulk_watch_basic_success():
    """Test basic async/await functionality with valid input."""
    client = JiraClient(JiraRESTClientViaToken(base_url="https://jira.example.com"))
    ds = JiraDataSource(client)
    issues = ["ISSUE-1", "ISSUE-2"]
    resp = await ds.submit_bulk_watch(issues)

@pytest.mark.asyncio
async def test_submit_bulk_watch_empty_issues():
    """Test with empty issue list (should succeed and send empty list)."""
    client = JiraClient(JiraRESTClientViaToken(base_url="https://jira.example.com"))
    ds = JiraDataSource(client)
    issues = []
    resp = await ds.submit_bulk_watch(issues)

@pytest.mark.asyncio
async def test_submit_bulk_watch_custom_headers():
    """Test with custom headers provided."""
    client = JiraClient(JiraRESTClientViaToken(base_url="https://jira.example.com"))
    ds = JiraDataSource(client)
    issues = ["ISSUE-1"]
    custom_headers = {"Authorization": "Bearer testtoken"}
    resp = await ds.submit_bulk_watch(issues, headers=custom_headers)

# 2. Edge Test Cases

@pytest.mark.asyncio
async def test_submit_bulk_watch_none_client_raises():
    """Test that initializing with a None client raises ValueError."""
    class DummyNoneClient:
        def get_client(self):
            return None
    with pytest.raises(ValueError, match="HTTP client is not initialized"):
        JiraDataSource(DummyNoneClient())

@pytest.mark.asyncio
async def test_submit_bulk_watch_client_no_base_url_method():
    """Test that client without get_base_url raises ValueError."""
    class NoBaseUrlClient:
        def get_client(self):
            return object()  # No get_base_url method
    with pytest.raises(ValueError, match="HTTP client does not have get_base_url method"):
        JiraDataSource(NoBaseUrlClient())

@pytest.mark.asyncio
async def test_submit_bulk_watch_execute_exception():
    """Test that if execute raises, the exception is propagated."""
    client = JiraClient(JiraRESTClientViaToken(base_url="https://jira.example.com", raise_on_execute=True))
    ds = JiraDataSource(client)
    with pytest.raises(RuntimeError, match="Dummy execute error"):
        await ds.submit_bulk_watch(["ISSUE-1"])

@pytest.mark.asyncio
async def test_submit_bulk_watch_concurrent_execution():
    """Test concurrent execution of submit_bulk_watch (asyncio.gather)."""
    client = JiraClient(JiraRESTClientViaToken(base_url="https://jira.example.com"))
    ds = JiraDataSource(client)
    issue_lists = [
        ["ISSUE-A"],
        ["ISSUE-B", "ISSUE-C"],
        ["ISSUE-D"]
    ]
    # Run three requests concurrently
    responses = await asyncio.gather(
        *(ds.submit_bulk_watch(issues) for issues in issue_lists)
    )
    # Each response should be valid and correspond to its input
    for resp, issues in zip(responses, issue_lists):
        pass

@pytest.mark.asyncio
async def test_submit_bulk_watch_non_string_issue_keys():
    """Test with non-string issue keys (should be stringified)."""
    client = JiraClient(JiraRESTClientViaToken(base_url="https://jira.example.com"))
    ds = JiraDataSource(client)
    issues = [123, True, None]
    resp = await ds.submit_bulk_watch(issues)

# 3. Large Scale Test Cases

@pytest.mark.asyncio
async def test_submit_bulk_watch_large_issue_list():
    """Test with a large number of issue keys (up to 1000)."""
    client = JiraClient(JiraRESTClientViaToken(base_url="https://jira.example.com"))
    ds = JiraDataSource(client)
    issues = [f"ISSUE-{i}" for i in range(1000)]
    resp = await ds.submit_bulk_watch(issues)

@pytest.mark.asyncio
async def test_submit_bulk_watch_many_concurrent_requests():
    """Test many concurrent submit_bulk_watch calls (10 concurrent)."""
    client = JiraClient(JiraRESTClientViaToken(base_url="https://jira.example.com"))
    ds = JiraDataSource(client)
    issue_lists = [[f"ISSUE-{i}"] for i in range(10)]
    responses = await asyncio.gather(
        *(ds.submit_bulk_watch(issues) for issues in issue_lists)
    )
    for i, resp in enumerate(responses):
        pass

# 4. Throughput Test Cases

@pytest.mark.asyncio
async def test_submit_bulk_watch_throughput_small_load():
    """Throughput: Test submit_bulk_watch with small load (5 requests)."""
    client = JiraClient(JiraRESTClientViaToken(base_url="https://jira.example.com"))
    ds = JiraDataSource(client)
    issue_lists = [["ISSUE-1"], ["ISSUE-2"], ["ISSUE-3"], ["ISSUE-4"], ["ISSUE-5"]]
    responses = await asyncio.gather(*(ds.submit_bulk_watch(issues) for issues in issue_lists))
    for resp, issues in zip(responses, issue_lists):
        pass

@pytest.mark.asyncio
async def test_submit_bulk_watch_throughput_medium_load():
    """Throughput: Test submit_bulk_watch with medium load (50 requests)."""
    client = JiraClient(JiraRESTClientViaToken(base_url="https://jira.example.com"))
    ds = JiraDataSource(client)
    issue_lists = [[f"ISSUE-{i}"] for i in range(50)]
    responses = await asyncio.gather(*(ds.submit_bulk_watch(issues) for issues in issue_lists))
    for i, resp in enumerate(responses):
        pass

@pytest.mark.asyncio
async def test_submit_bulk_watch_throughput_large_load():
    """Throughput: Test submit_bulk_watch with large load (200 requests)."""
    client = JiraClient(JiraRESTClientViaToken(base_url="https://jira.example.com"))
    ds = JiraDataSource(client)
    issue_lists = [[f"ISSUE-{i}"] for i in range(200)]
    responses = await asyncio.gather(*(ds.submit_bulk_watch(issues) for issues in issue_lists))
    for i, resp in enumerate(responses):
        pass

@pytest.mark.asyncio
async def test_submit_bulk_watch_throughput_varied_issue_lists():
    """Throughput: Test with varied size issue lists in concurrent requests."""
    client = JiraClient(JiraRESTClientViaToken(base_url="https://jira.example.com"))
    ds = JiraDataSource(client)
    issue_lists = [
        ["ISSUE-1"],
        ["ISSUE-2", "ISSUE-3"],
        ["ISSUE-4", "ISSUE-5", "ISSUE-6"],
        [],
        ["ISSUE-7"] * 50
    ]
    responses = await asyncio.gather(*(ds.submit_bulk_watch(issues) for issues in issue_lists))
    for resp, issues in zip(responses, issue_lists):
        pass
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import asyncio  # used to run async functions

import pytest  # used for our unit tests
from app.sources.external.jira.jira import JiraDataSource

# ---- Minimal stubs for required classes (so tests run without external dependencies) ----

class HTTPResponse:
    """Stub for HTTPResponse, used to simulate HTTP responses in tests."""
    def __init__(self, status_code: int, json_data: dict):
        self.status_code = status_code
        self._json_data = json_data

    def json(self):
        return self._json_data

class HTTPRequest:
    """Stub for HTTPRequest, used to simulate HTTP requests in tests."""
    def __init__(self, method, url, headers, path_params, query_params, body):
        self.method = method
        self.url = url
        self.headers = headers
        self.path_params = path_params
        self.query_params = query_params
        self.body = body

# ---- Minimal JiraRESTClientViaToken stub ----

class JiraRESTClientViaToken:
    """Stub for JiraRESTClientViaToken, simulating HTTP client behavior."""
    def __init__(self, base_url: str, token: str, token_type: str = "Bearer") -> None:
        self.base_url = base_url
        self.token = token
        self.token_type = token_type

    def get_base_url(self) -> str:
        return self.base_url

    async def execute(self, req: HTTPRequest):
        # Simulate a response based on input
        # For testing, return a HTTPResponse with status 200 and echo the request body
        return HTTPResponse(
            status_code=200,
            json_data={
                "received": req.body,
                "method": req.method,
                "url": req.url,
                "headers": req.headers
            }
        )

class JiraClient:
    """Stub for JiraClient, wraps the REST client."""
    def __init__(self, client: JiraRESTClientViaToken):
        self.client = client

    def get_client(self):
        return self.client
from app.sources.external.jira.jira import JiraDataSource

# ---- Unit tests ----

# ---- 1. Basic Test Cases ----

@pytest.mark.asyncio
async def test_submit_bulk_watch_basic():
    """Test basic functionality with a normal list of issue IDs."""
    client = JiraClient(JiraRESTClientViaToken("http://jira.example.com", "token"))
    ds = JiraDataSource(client)
    issue_ids = ["ABC-1", "XYZ-2"]
    resp = await ds.submit_bulk_watch(issue_ids)

@pytest.mark.asyncio
async def test_submit_bulk_watch_empty_list():
    """Test with an empty list of issue IDs."""
    client = JiraClient(JiraRESTClientViaToken("http://jira.example.com", "token"))
    ds = JiraDataSource(client)
    issue_ids = []
    resp = await ds.submit_bulk_watch(issue_ids)

@pytest.mark.asyncio
async def test_submit_bulk_watch_custom_headers():
    """Test with custom headers provided."""
    client = JiraClient(JiraRESTClientViaToken("http://jira.example.com", "token"))
    ds = JiraDataSource(client)
    issue_ids = ["ISSUE-123"]
    custom_headers = {"X-Test-Header": "test-value"}
    resp = await ds.submit_bulk_watch(issue_ids, headers=custom_headers)

# ---- 2. Edge Test Cases ----

@pytest.mark.asyncio
async def test_submit_bulk_watch_none_client_raises():
    """Test that ValueError is raised if client is None."""
    class DummyClient:
        def get_client(self):
            return None
    with pytest.raises(ValueError, match="HTTP client is not initialized"):
        JiraDataSource(DummyClient())

@pytest.mark.asyncio
async def test_submit_bulk_watch_missing_get_base_url_raises():
    """Test that ValueError is raised if client does not have get_base_url."""
    class BadClient:
        def get_client(self):
            class NoBaseUrl:
                pass
            return NoBaseUrl()
    with pytest.raises(ValueError, match="HTTP client does not have get_base_url method"):
        JiraDataSource(BadClient())

@pytest.mark.asyncio
async def test_submit_bulk_watch_concurrent_calls():
    """Test concurrent execution of submit_bulk_watch with different inputs."""
    client = JiraClient(JiraRESTClientViaToken("http://jira.example.com", "token"))
    ds = JiraDataSource(client)
    issue_lists = [
        ["A-1", "A-2"],
        ["B-1"],
        ["C-1", "C-2", "C-3"],
        []
    ]
    # Run all concurrently
    results = await asyncio.gather(
        *(ds.submit_bulk_watch(lst) for lst in issue_lists)
    )
    # Check each result
    for i, resp in enumerate(results):
        pass

@pytest.mark.asyncio
async def test_submit_bulk_watch_headers_override_content_type():
    """Test that Content-Type header is overridden if provided."""
    client = JiraClient(JiraRESTClientViaToken("http://jira.example.com", "token"))
    ds = JiraDataSource(client)
    issue_ids = ["D-1"]
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    resp = await ds.submit_bulk_watch(issue_ids, headers=headers)

# ---- 3. Large Scale Test Cases ----

@pytest.mark.asyncio
async def test_submit_bulk_watch_large_issue_list():
    """Test with a large list of issue IDs (500 elements)."""
    client = JiraClient(JiraRESTClientViaToken("http://jira.example.com", "token"))
    ds = JiraDataSource(client)
    issue_ids = [f"ISSUE-{i}" for i in range(500)]
    resp = await ds.submit_bulk_watch(issue_ids)

@pytest.mark.asyncio
async def test_submit_bulk_watch_many_concurrent_large_lists():
    """Test many concurrent calls with large lists."""
    client = JiraClient(JiraRESTClientViaToken("http://jira.example.com", "token"))
    ds = JiraDataSource(client)
    # Prepare 10 concurrent calls, each with 100 unique issue IDs
    issue_lists = [[f"ISSUE-{i}-{j}" for j in range(100)] for i in range(10)]
    results = await asyncio.gather(*(ds.submit_bulk_watch(lst) for lst in issue_lists))
    for i, resp in enumerate(results):
        pass

# ---- 4. Throughput Test Cases ----

@pytest.mark.asyncio
async def test_submit_bulk_watch_throughput_small_load():
    """Throughput test: small load, 5 concurrent requests."""
    client = JiraClient(JiraRESTClientViaToken("http://jira.example.com", "token"))
    ds = JiraDataSource(client)
    issue_lists = [[f"SMALL-{i}-{j}" for j in range(2)] for i in range(5)]
    results = await asyncio.gather(*(ds.submit_bulk_watch(lst) for lst in issue_lists))
    for i, resp in enumerate(results):
        pass

@pytest.mark.asyncio
async def test_submit_bulk_watch_throughput_medium_load():
    """Throughput test: medium load, 20 concurrent requests."""
    client = JiraClient(JiraRESTClientViaToken("http://jira.example.com", "token"))
    ds = JiraDataSource(client)
    issue_lists = [[f"MEDIUM-{i}-{j}" for j in range(5)] for i in range(20)]
    results = await asyncio.gather(*(ds.submit_bulk_watch(lst) for lst in issue_lists))
    for i, resp in enumerate(results):
        pass

@pytest.mark.asyncio
async def test_submit_bulk_watch_throughput_high_volume():
    """Throughput test: high volume, 50 concurrent requests with 10 issues each."""
    client = JiraClient(JiraRESTClientViaToken("http://jira.example.com", "token"))
    ds = JiraDataSource(client)
    issue_lists = [[f"HIGH-{i}-{j}" for j in range(10)] for i in range(50)]
    results = await asyncio.gather(*(ds.submit_bulk_watch(lst) for lst in issue_lists))
    for i, resp in enumerate(results):
        pass
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, `git checkout codeflash/optimize-JiraDataSource.submit_bulk_watch-mhot8y3d` and push.


@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 7, 2025 12:07
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash and 🎯 Quality: High labels Nov 7, 2025