@codeflash-ai codeflash-ai bot commented Oct 30, 2025

📄 11% (0.11x) speedup for cosine_similarity in src/statistics/similarity.py

⏱️ Runtime: 1.35 milliseconds → 1.22 milliseconds (best of 587 runs)

📝 Explanation and details

The optimized code achieves an 11% speedup (1.35 ms → 1.22 ms overall) through two key improvements:

1. More efficient array conversion with np.asarray:

  • Replaced np.array() with np.asarray(), which avoids an unnecessary copy when an input is already a NumPy array
  • This helps most when callers pass NumPy arrays, including the mixed list/array inputs exercised in the test cases

2. Faster norm calculation using np.einsum:

  • Replaced np.linalg.norm(X, axis=1) with np.sqrt(np.einsum('ij,ij->i', X, X))
  • np.einsum('ij,ij->i', X, X) computes each row's sum of squares directly, with less overhead than the general-purpose linalg.norm
  • Line profiler shows the two norm calculations dropping from 646ms + 404ms to 427ms + 259ms (about 35% less time for this operation)
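Both changes can be checked in isolation. The following is a minimal sketch, not the code from this PR: `row_norms_einsum` is a hypothetical helper name, and the random 10×500 input is only an illustration.

```python
import numpy as np


def row_norms_einsum(X):
    """Row-wise L2 norms via einsum (hypothetical helper, not from the PR)."""
    X = np.asarray(X)  # no copy when X is already an ndarray
    # 'ij,ij->i' multiplies X by itself element-wise and sums over columns,
    # which gives the squared L2 norm of each row.
    return np.sqrt(np.einsum("ij,ij->i", X, X))


rng = np.random.default_rng(0)
arr = rng.standard_normal((10, 500))

# np.asarray returns the same object for an existing ndarray,
# while np.array copies by default.
same_object = np.asarray(arr) is arr
copied = np.array(arr) is not arr

# The einsum-based norms agree with np.linalg.norm up to floating-point rounding.
norms_match = np.allclose(row_norms_einsum(arr), np.linalg.norm(arr, axis=1))
print(same_object, copied, norms_match)
```

One caveat that may be worth checking: np.linalg.norm converts integer input to float before squaring, while np.einsum accumulates in the input's integer dtype, so the two could differ for very large integer values.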

The optimizations are most effective for:

  • Large dimensional vectors (37.9% speedup on 10×500 matrices)
  • Sparse vectors (29.6% speedup on sparse 100×50 matrices)
  • Mixed input types (14.2% speedup when combining lists and numpy arrays)
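To reproduce a comparison like the ones above, a small timing harness along these lines could be used. This is a sketch under assumptions: the function names are hypothetical, it times only the norm step rather than the full `cosine_similarity` call, and absolute numbers will vary by machine and NumPy version.

```python
import timeit

import numpy as np

rng = np.random.default_rng(123)
X = rng.standard_normal((10, 500))  # same shape as the 10×500 case above


def norm_linalg():
    # Baseline: general-purpose norm along each row.
    return np.linalg.norm(X, axis=1)


def norm_einsum():
    # Candidate: sum of squares per row via einsum, then sqrt.
    return np.sqrt(np.einsum("ij,ij->i", X, X))


# Taking the best of several repeats reduces noise from other processes.
calls = 1000
t_linalg = min(timeit.repeat(norm_linalg, number=calls, repeat=5))
t_einsum = min(timeit.repeat(norm_einsum, number=calls, repeat=5))

# Each timing is the total for `calls` invocations, in seconds.
print(f"linalg.norm: {t_linalg / calls * 1e6:.2f} µs per call")
print(f"einsum:      {t_einsum / calls * 1e6:.2f} µs per call")
```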

These improvements leave behavior unchanged on all 42 generated regression tests and 3 concolic tests while reducing computational overhead, especially for high-dimensional data or inputs that are already NumPy arrays.
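The optimized function itself is not shown in this conversation. As a rough illustration of how the two changes might fit together, here is a sketch reconstructed from the tests: the empty-input result, the zero-vector handling, and the name `cosine_similarity_sketch` are assumptions, while the error message follows the concolic test in this PR.

```python
import numpy as np


def cosine_similarity_sketch(X, Y):
    """Row-wise cosine similarity between two 2-D inputs (illustrative only)."""
    # Assumption: empty inputs short-circuit to an empty array.
    if len(X) == 0 or len(Y) == 0:
        return np.array([])
    X = np.asarray(X)  # avoids a copy when X is already an ndarray
    Y = np.asarray(Y)
    if X.shape[1] != Y.shape[1]:
        raise ValueError(
            "Number of columns in X and Y must be the same. "
            f"X has shape {X.shape} and Y has shape {Y.shape}."
        )
    # Row norms via einsum instead of np.linalg.norm(..., axis=1).
    X_norm = np.sqrt(np.einsum("ij,ij->i", X, X))
    Y_norm = np.sqrt(np.einsum("ij,ij->i", Y, Y))
    # Assumption: zero-norm rows produce 0 similarity rather than NaN or inf.
    with np.errstate(divide="ignore", invalid="ignore"):
        similarity = np.dot(X, Y.T) / np.outer(X_norm, Y_norm)
    similarity[~np.isfinite(similarity)] = 0.0
    return similarity


print(cosine_similarity_sketch([[1, 0], [0, 0]], [[1, 0], [0, 1]]))
```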

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 42 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 3 Passed |
| 🔮 Hypothesis Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
from typing import List, Union

import numpy as np
import pytest

# function to test
from src.statistics.similarity import cosine_similarity

Matrix = Union[List[List[float]], List[np.ndarray], np.ndarray]

# unit tests

# --- Basic Test Cases ---
def test_identical_vectors():
    # Cosine similarity between identical vectors should be 1
    X = [[1, 2, 3]]
    Y = [[1, 2, 3]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 17.6μs -> 15.7μs (12.5% faster)

def test_orthogonal_vectors():
    # Cosine similarity between orthogonal vectors should be 0
    X = [[1, 0]]
    Y = [[0, 1]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 17.2μs -> 15.2μs (12.9% faster)

def test_opposite_vectors():
    # Cosine similarity between opposite vectors should be -1
    X = [[1, 0]]
    Y = [[-1, 0]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 17.0μs -> 15.2μs (11.5% faster)

def test_multiple_vectors():
    # Test with multiple vectors in X and Y
    X = [[1, 0], [0, 1]]
    Y = [[1, 0], [0, 1]]
    expected = [[1.0, 0.0], [0.0, 1.0]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 18.4μs -> 16.5μs (11.4% faster)
    assert np.allclose(result, expected)

def test_non_normalized_vectors():
    # Test with non-normalized vectors
    X = [[2, 0], [0, 3]]
    Y = [[4, 0], [0, 6]]
    expected = [[1.0, 0.0], [0.0, 1.0]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 18.0μs -> 16.3μs (10.2% faster)
    assert np.allclose(result, expected)

def test_vectors_with_floats():
    # Test with floating point values
    X = [[0.5, 0.5]]
    Y = [[0.5, -0.5]]
    expected = [[0.0]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 15.2μs -> 13.9μs (9.59% faster)
    assert np.allclose(result, expected)

# --- Edge Test Cases ---
def test_empty_X():
    # Test with empty X
    X = []
    Y = [[1, 2, 3]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 875ns -> 875ns (0.000% faster)

def test_empty_Y():
    # Test with empty Y
    X = [[1, 2, 3]]
    Y = []
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 958ns -> 917ns (4.47% faster)

def test_both_empty():
    # Test with both X and Y empty
    X = []
    Y = []
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 875ns -> 875ns (0.000% faster)

def test_dimension_mismatch():
    # Test with different number of columns
    X = [[1, 2, 3]]
    Y = [[1, 2]]
    with pytest.raises(ValueError):
        cosine_similarity(X, Y) # 3.38μs -> 3.29μs (2.52% faster)

def test_zero_vector():
    # Test with zero vector in X
    X = [[0, 0, 0]]
    Y = [[1, 2, 3]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 22.2μs -> 20.2μs (9.67% faster)

def test_zero_vector_in_Y():
    # Test with zero vector in Y
    X = [[1, 2, 3]]
    Y = [[0, 0, 0]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 21.5μs -> 19.7μs (9.30% faster)

def test_zero_vectors_both():
    # Test with both vectors as zero
    X = [[0, 0, 0]]
    Y = [[0, 0, 0]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 21.2μs -> 19.5μs (8.55% faster)

def test_single_element_vectors():
    # Test with single-element vectors
    X = [[2]]
    Y = [[-2]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 17.2μs -> 15.1μs (13.5% faster)

def test_negative_values():
    # Test with negative values
    X = [[-1, -2, -3]]
    Y = [[-1, -2, -3]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 17.0μs -> 15.1μs (12.7% faster)

def test_mixed_types():
    # Test with np.ndarray and list
    X = np.array([[1, 0], [0, 1]])
    Y = [[1, 0], [0, 1]]
    expected = [[1.0, 0.0], [0.0, 1.0]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 17.7μs -> 15.5μs (14.0% faster)
    assert np.allclose(result, expected)

def test_highly_sparse_vectors():
    # Test with sparse vectors
    X = [[0, 0, 0, 1]]
    Y = [[0, 0, 1, 0]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 17.0μs -> 15.3μs (11.2% faster)

# --- Large Scale Test Cases ---
def test_large_number_of_vectors():
    # Test with 100 vectors of 10 dimensions each
    np.random.seed(42)
    X = np.random.rand(100, 10)
    Y = np.random.rand(100, 10)
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 48.6μs -> 43.3μs (12.3% faster)

def test_large_vector_dimensions():
    # Test with vectors of length 1000
    np.random.seed(123)
    X = np.random.rand(2, 1000)
    Y = np.random.rand(2, 1000)
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 22.7μs -> 20.4μs (11.2% faster)

def test_large_sparse_vectors():
    # Test with large sparse vectors
    X = np.zeros((10, 1000))
    Y = np.zeros((10, 1000))
    # Set one element in each row to 1
    for i in range(10):
        X[i, i] = 1
        Y[i, i] = 1
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 42.0μs -> 35.5μs (18.2% faster)
    # Diagonal should be 1, off-diagonal should be 0
    for i in range(10):
        for j in range(10):
            expected = 1.0 if i == j else 0.0
            assert np.isclose(result[i, j], expected)

def test_performance_on_large_input():
    # Test that function runs efficiently on large input (not a strict timing test)
    X = np.random.rand(100, 100)
    Y = np.random.rand(100, 100)
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 75.3μs -> 57.9μs (30.0% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from typing import List, Union

import numpy as np
import pytest  # used for our unit tests

# function to test
from src.statistics.similarity import cosine_similarity

Matrix = Union[List[List[float]], List[np.ndarray], np.ndarray]

# unit tests

# ----------- BASIC TEST CASES -----------

def test_identical_vectors():
    # Identical vectors should have cosine similarity 1
    X = [[1, 0, 0]]
    Y = [[1, 0, 0]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 17.7μs -> 15.9μs (11.0% faster)

def test_orthogonal_vectors():
    # Orthogonal vectors should have cosine similarity 0
    X = [[1, 0]]
    Y = [[0, 1]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 17.1μs -> 15.3μs (12.0% faster)

def test_opposite_vectors():
    # Opposite vectors should have cosine similarity -1
    X = [[1, 0]]
    Y = [[-1, 0]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 16.9μs -> 15.2μs (11.0% faster)

def test_multiple_vectors():
    # Multiple vectors in X and Y
    X = [[1, 0], [0, 1]]
    Y = [[1, 0], [0, 1]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 18.2μs -> 16.6μs (9.77% faster)

def test_non_normalized_vectors():
    # Vectors not normalized, but cosine similarity should still be correct
    X = [[2, 0]]
    Y = [[0, 2]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 16.9μs -> 14.9μs (13.1% faster)

def test_float_vectors():
    # Test with float values
    X = [[1.5, 2.5]]
    Y = [[3.0, 5.0]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 15.3μs -> 13.8μs (10.9% faster)

# ----------- EDGE TEST CASES -----------

def test_empty_X():
    # X is empty, should return empty array
    X = []
    Y = [[1, 2]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 958ns -> 916ns (4.59% faster)

def test_empty_Y():
    # Y is empty, should return empty array
    X = [[1, 2]]
    Y = []
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 959ns -> 917ns (4.58% faster)

def test_empty_both():
    # Both X and Y are empty, should return empty array
    X = []
    Y = []
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 875ns -> 875ns (0.000% faster)

def test_dimension_mismatch():
    # X and Y have different number of columns, should raise ValueError
    X = [[1, 2, 3]]
    Y = [[4, 5]]
    with pytest.raises(ValueError):
        cosine_similarity(X, Y) # 3.33μs -> 3.33μs (0.000% faster)

def test_zero_vector_in_X():
    # Zero vector in X, should result in 0 similarity for that row
    X = [[0, 0], [1, 0]]
    Y = [[1, 0], [0, 1]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 23.5μs -> 21.7μs (8.65% faster)

def test_zero_vector_in_Y():
    # Zero vector in Y, should result in 0 similarity for that column
    X = [[1, 0], [0, 1]]
    Y = [[0, 0], [1, 0]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 22.7μs -> 20.9μs (8.56% faster)

def test_all_zero_vectors():
    # All vectors are zero, should result in all zeros
    X = [[0, 0], [0, 0]]
    Y = [[0, 0], [0, 0]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 22.4μs -> 20.6μs (8.69% faster)

def test_negative_values():
    # Vectors with negative values
    X = [[-1, -1]]
    Y = [[1, 1]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 17.2μs -> 15.2μs (13.4% faster)

def test_mixed_types():
    # X is list of lists, Y is numpy array
    X = [[1, 2], [3, 4]]
    Y = np.array([[1, 0], [0, 1]])
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 17.5μs -> 15.3μs (14.2% faster)

def test_single_element_vectors():
    # Vectors with a single element
    X = [[1], [-1], [0]]
    Y = [[1], [-1], [0]]
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 23.5μs -> 21.7μs (8.65% faster)

# ----------- LARGE SCALE TEST CASES -----------

def test_large_number_of_vectors():
    # Test with 500 vectors of dimension 10
    np.random.seed(42)
    X = np.random.randn(500, 10)
    Y = np.random.randn(500, 10)
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 476μs -> 461μs (3.31% faster)

def test_large_dimension_vectors():
    # Test with 10 vectors of dimension 500
    np.random.seed(123)
    X = np.random.randn(10, 500)
    Y = np.random.randn(10, 500)
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 33.0μs -> 24.0μs (37.9% faster)

def test_large_scale_identical():
    # All vectors in X and Y are identical, so similarity should be 1 on diagonal
    X = np.ones((100, 20))
    Y = np.ones((100, 20))
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 51.8μs -> 44.9μs (15.3% faster)
    # Every entry, diagonal and off-diagonal, should be 1 since all vectors are identical
    assert np.allclose(result, 1.0)

def test_large_scale_zero_vectors():
    # All vectors are zero, should result in all zeros
    X = np.zeros((50, 30))
    Y = np.zeros((50, 30))
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 37.6μs -> 31.6μs (18.8% faster)

def test_large_scale_sparse_vectors():
    # Sparse vectors: mostly zeros, with one nonzero entry per row
    X = np.zeros((100, 50))
    Y = np.zeros((100, 50))
    for i in range(100):
        X[i, i % 50] = i + 1
        Y[i, i % 50] = i + 1
    codeflash_output = cosine_similarity(X, Y); result = codeflash_output # 62.5μs -> 48.2μs (29.6% faster)
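    # Similarity is 1 where rows share their nonzero column (i % 50 == j % 50), else 0
    for i in range(100):
        for j in range(100):
            expected = 1.0 if i % 50 == j % 50 else 0.0
            assert np.isclose(result[i, j], expected)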
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from src.statistics.similarity import cosine_similarity
import pytest

def test_cosine_similarity():
    cosine_similarity([[]], [[]])

def test_cosine_similarity_2():
    with pytest.raises(ValueError, match='Number\\ of\\ columns\\ in\\ X\\ and\\ Y\\ must\\ be\\ the\\ same\\.\\ X\\ has\\ shape\\ \\(1,\\ 0\\)\\ and\\ Y\\ has\\ shape\\ \\(1,\\ 1\\)\\.'):
        cosine_similarity([[]], [[0.0]])

def test_cosine_similarity_3():
    cosine_similarity([[]], [])
🔎 Concolic Coverage Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| codeflash_concolic_tok40xa7/tmpsg4ajw46/test_concolic_coverage.py::test_cosine_similarity | 19.5μs | 17.8μs | 9.36%✅ |
| codeflash_concolic_tok40xa7/tmpsg4ajw46/test_concolic_coverage.py::test_cosine_similarity_2 | 3.29μs | 3.29μs | 0.000%✅ |
| codeflash_concolic_tok40xa7/tmpsg4ajw46/test_concolic_coverage.py::test_cosine_similarity_3 | 1.00μs | 1.00μs | 0.000%✅ |

To edit these changes, run git checkout codeflash/optimize-cosine_similarity-mhd3k710, commit your edits, and push.

@codeflash-ai codeflash-ai bot requested a review from KRRT7 October 30, 2025 07:22
@codeflash-ai codeflash-ai bot added the labels "⚡️ codeflash" (Optimization PR opened by Codeflash AI) and "🎯 Quality: High" (Optimization Quality according to Codeflash) Oct 30, 2025