
@codeflash-ai codeflash-ai bot commented Nov 5, 2025

📄 36% (0.36x) speedup for _truncate_label in optuna/visualization/_parallel_coordinate.py

⏱️ Runtime : 287 microseconds → 211 microseconds (best of 250 runs)

📝 Explanation and details

The optimization replaces Python's .format() method with direct string concatenation using the + operator. Specifically, it changes "{}...".format(label[:17]) to label[:17] + "...".
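
As a minimal sketch of the change described above, the two variants can be put side by side. The function names and the surrounding conditional are assumptions inferred from the 20-character threshold mentioned in this report, not a copy of the Optuna source.

```python
def truncate_label_format(label: str) -> str:
    # Assumed original form: str.format() builds the truncated label.
    return label if len(label) < 20 else "{}...".format(label[:17])


def truncate_label_concat(label: str) -> str:
    # Assumed optimized form: slice and concatenate directly.
    return label if len(label) < 20 else label[:17] + "..."


# Both variants should agree for lengths on either side of the threshold.
for n in (0, 1, 17, 19, 20, 21, 100):
    label = "x" * n
    assert truncate_label_format(label) == truncate_label_concat(label)
```

Because both forms produce the same string for every input length, the change should be behavior-preserving for `str` inputs.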

Key Performance Impact:

  • .format() method involves overhead from format string parsing, placeholder substitution, and method call dispatch
  • Concatenating two str objects with + is a single binary operation, with no method lookup and no format string to parse
  • The line profiler shows a 23% reduction in execution time per hit (515.8ns → 396.8ns per hit)
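
The per-call difference listed above can be checked with a rough micro-benchmark. This is only a sketch: the label value and helper names are hypothetical, and absolute numbers will vary by machine and Python version.

```python
import timeit

# Hypothetical label, longer than the 20-character threshold.
LABEL = "learning_rate_schedule_warmup_steps"


def with_format(label: str = LABEL) -> str:
    # Truncation via str.format(), as in the original expression.
    return "{}...".format(label[:17])


def with_concat(label: str = LABEL) -> str:
    # Truncation via direct concatenation, as in the optimized expression.
    return label[:17] + "..."


def best_ns_per_call(func, number: int = 100_000, repeat: int = 5) -> float:
    # Take the best of `repeat` timing runs and convert to nanoseconds per call.
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number * 1e9


if __name__ == "__main__":
    print(f"format: {best_ns_per_call(with_format):.1f} ns/call")
    print(f"concat: {best_ns_per_call(with_concat):.1f} ns/call")
```

Taking the minimum over several repeats reduces the influence of scheduler noise, which matters for calls this short.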

Why This Matters:
The gains are confined to labels that are actually truncated (20 characters or more): individual truncating calls run roughly 36-128% faster in the test results below. Labels of 19 characters or fewer return before reaching the changed expression, so their timings differ only by measurement noise.

Test Case Performance Patterns:

  • Short labels: no real change; the recorded differences (from about 11% slower to 21% faster on calls of roughly 200-500ns) are measurement noise, since this path is untouched
  • Truncated labels: roughly 36-128% faster, where the optimization takes effect
  • Batch operations: 21% faster for a mixed batch of eight labels and 34% faster for 1,000 labels that all require truncation

This optimization is particularly valuable in visualization contexts where label truncation occurs frequently, such as parallel coordinate plots with many parameter names that exceed the display threshold.
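
To illustrate the visualization use case, the sketch below applies a concatenation-based truncation to a list of parameter names. The parameter names are hypothetical, and the helper is a stand-in assumed to mirror the optimized function rather than an import from Optuna.

```python
def truncate_label(label: str) -> str:
    # Stand-in for the optimized helper (assumed 20-character threshold).
    return label if len(label) < 20 else label[:17] + "..."


# Hypothetical hyperparameter names, as might be used for plot axis labels.
param_names = [
    "lr",
    "batch_size",
    "optimizer_weight_decay_coefficient",
    "dropout_rate_in_the_final_dense_layer",
]

axis_labels = [truncate_label(name) for name in param_names]

for name, shown in zip(param_names, axis_labels):
    print(f"{name!r} -> {shown!r}")
```

Short names pass through unchanged, while long names are cut to 17 characters plus an ellipsis, so no displayed label exceeds 20 characters.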

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 1059 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
from __future__ import annotations

# imports
import pytest  # used for our unit tests
from optuna.visualization._parallel_coordinate import _truncate_label

# unit tests

class TestTruncateLabel:
    # --- Basic Test Cases ---
    def test_short_label_returns_unchanged(self):
        # Label is shorter than 20 chars, should return unchanged
        codeflash_output = _truncate_label("short label") # 435ns -> 454ns (4.19% slower)

    def test_exactly_19_characters(self):
        # Label is exactly 19 chars, should return unchanged
        label = "a" * 19
        codeflash_output = _truncate_label(label) # 406ns -> 410ns (0.976% slower)

    def test_exactly_20_characters(self):
        # Label is exactly 20 chars, should be truncated
        label = "a" * 20
        expected = "a" * 17 + "..."
        codeflash_output = _truncate_label(label) # 1.29μs -> 686ns (87.6% faster)

    def test_long_label_truncation(self):
        # Label longer than 20 chars should be truncated
        label = "abcdefghijklmnopqrstuvwxyz"
        expected = "abcdefghijklmnopq..."
        codeflash_output = _truncate_label(label) # 1.05μs -> 675ns (56.0% faster)

    def test_label_with_spaces(self):
        # Label with spaces, longer than 20 chars
        label = "this is a very long label"
        expected = "this is a very lo..."
        codeflash_output = _truncate_label(label) # 993ns -> 636ns (56.1% faster)

    # --- Edge Test Cases ---
    def test_empty_string(self):
        # Empty string should return unchanged
        codeflash_output = _truncate_label("") # 392ns -> 440ns (10.9% slower)

    def test_single_character(self):
        # Single character label should return unchanged
        codeflash_output = _truncate_label("x") # 401ns -> 417ns (3.84% slower)

    def test_label_exactly_17_characters(self):
        # Label exactly 17 chars, no truncation
        label = "a" * 17
        codeflash_output = _truncate_label(label) # 385ns -> 400ns (3.75% slower)

    def test_label_exactly_18_characters(self):
        # Label exactly 18 chars, no truncation
        label = "b" * 18
        codeflash_output = _truncate_label(label) # 353ns -> 363ns (2.75% slower)

    def test_label_exactly_21_characters(self):
        # Label exactly 21 chars, truncation should occur
        label = "c" * 21
        expected = "c" * 17 + "..."
        codeflash_output = _truncate_label(label) # 1.14μs -> 666ns (71.2% faster)

    def test_label_with_unicode_characters(self):
        # Unicode label, longer than 20 chars
        label = "𝒜𝒷𝒸𝒹𝑒𝒻𝑔𝒽𝒾𝒿𝓀𝓁𝓂𝓃𝑜𝓅𝓆𝓇𝓈𝓉𝓊𝓋"
        # 22 unicode chars, should be truncated at 17
        expected = label[:17] + "..."
        codeflash_output = _truncate_label(label) # 1.23μs -> 539ns (128% faster)

    def test_label_with_newline(self):
        # Label with newline, longer than 20 chars
        label = "line1\nline2\nline3\nline4"
        expected = label[:17] + "..."
        codeflash_output = _truncate_label(label) # 840ns -> 514ns (63.4% faster)

    def test_label_with_tabs(self):
        # Label with tabs, longer than 20 chars
        label = "tab\tseparated\tlabel\tlong"
        expected = label[:17] + "..."
        codeflash_output = _truncate_label(label) # 878ns -> 507ns (73.2% faster)

    def test_label_with_mixed_whitespace(self):
        # Label with mixed whitespace, longer than 20 chars
        label = "a b\tc\nd e f g h i j k l m n o p"
        expected = label[:17] + "..."
        codeflash_output = _truncate_label(label) # 804ns -> 515ns (56.1% faster)

    # --- Large Scale Test Cases ---
    def test_very_long_label(self):
        # Very long label (1000 chars), should truncate to first 17 + "..."
        label = "x" * 1000
        expected = "x" * 17 + "..."
        codeflash_output = _truncate_label(label) # 943ns -> 633ns (49.0% faster)

    def test_very_long_label_with_varied_characters(self):
        # Very long label with varied characters
        label = "".join(chr(65 + (i % 26)) for i in range(1000))
        expected = label[:17] + "..."
        codeflash_output = _truncate_label(label) # 895ns -> 570ns (57.0% faster)

    def test_many_labels_in_batch(self):
        # Test truncation for a batch of labels of varying lengths
        labels = [
            "short",
            "medium length label",
            "a" * 17,
            "b" * 19,
            "c" * 20,
            "d" * 21,
            "e" * 50,
            "f" * 1000,
        ]
        expected = [
            "short",
            "medium length label",
            "a" * 17,
            "b" * 19,
            "c" * 17 + "...",
            "d" * 17 + "...",
            "e" * 17 + "...",
            "f" * 17 + "...",
        ]
        for lbl, exp in zip(labels, expected):
            codeflash_output = _truncate_label(lbl) # 2.39μs -> 1.98μs (20.8% faster)

    def test_performance_on_large_batch(self):
        # Performance test: truncate 1000 labels (all > 20 chars)
        labels = ["label_" + str(i) + "_" + "x" * 30 for i in range(1000)]
        for lbl in labels:
            codeflash_output = _truncate_label(lbl); result = codeflash_output # 245μs -> 182μs (33.9% faster)

    # --- Mutation-sensitive tests ---
    def test_truncation_exact_cut(self):
        # Test that truncation is exactly at 17 chars, not off-by-one
        label = "123456789012345678901234567890"
        expected = label[:17] + "..."
        codeflash_output = _truncate_label(label) # 1.24μs -> 598ns (107% faster)

    def test_truncation_preserves_first_17(self):
        # Ensure the first 17 chars are preserved exactly
        label = "abcdefgABCDEFG1234567890"
        expected = label[:17] + "..."
        codeflash_output = _truncate_label(label) # 870ns -> 492ns (76.8% faster)

    def test_truncation_does_not_add_extra_characters(self):
        # Ensure that the output is always exactly 20 chars for truncated labels
        label = "z" * 100
        codeflash_output = _truncate_label(label); result = codeflash_output # 774ns -> 570ns (35.8% faster)

    def test_truncation_with_non_ascii(self):
        # Non-ASCII chars, ensure correct truncation
        label = "你好世界" * 6  # 24 chars
        expected = label[:17] + "..."
        codeflash_output = _truncate_label(label) # 1.17μs -> 560ns (109% faster)

    def test_truncation_with_surrogate_pairs(self):
        # Emoji (code points outside the BMP; each is one character in a Python 3 str), ensure correct truncation
        label = "😀" * 25  # 25 emoji chars
        expected = label[:17] + "..."
        codeflash_output = _truncate_label(label) # 1.13μs -> 601ns (88.5% faster)
        codeflash_output = len(_truncate_label(label)) # 478ns -> 316ns (51.3% faster)

    # --- Determinism ---
    def test_deterministic_output(self):
        # Multiple calls with same input yield same output
        label = "deterministic test label which is long"
        codeflash_output = _truncate_label(label); out1 = codeflash_output # 840ns -> 592ns (41.9% faster)
        codeflash_output = _truncate_label(label); out2 = codeflash_output # 421ns -> 288ns (46.2% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import pytest  # used for our unit tests
from optuna.visualization._parallel_coordinate import _truncate_label

# unit tests

# --- Basic Test Cases ---

def test_short_label_unchanged():
    # Label shorter than 20 characters should remain unchanged
    codeflash_output = _truncate_label("short label") # 495ns -> 408ns (21.3% faster)
    codeflash_output = _truncate_label("1234567890123456789") # 210ns -> 214ns (1.87% slower)

def test_exactly_20_characters_truncated():
    # Label with exactly 20 characters should be truncated
    label = "12345678901234567890"  # 20 chars
    expected = "12345678901234567..."
    codeflash_output = _truncate_label(label) # 1.19μs -> 631ns (87.8% faster)

def test_long_label_truncated():
    # Label longer than 20 characters should be truncated
    label = "abcdefghijklmnopqrstuvwxyz"  # 26 chars
    expected = "abcdefghijklmnopq..."
    codeflash_output = _truncate_label(label) # 1.02μs -> 649ns (56.7% faster)

def test_label_with_spaces():
    # Spaces should count toward length
    label = "a" * 10 + " " * 10  # 20 chars
    expected = "a" * 10 + " " * 7 + "..."
    codeflash_output = _truncate_label(label + "x") # 1.01μs -> 606ns (66.5% faster)

def test_label_with_special_characters():
    # Special characters should be handled normally
    label = "!@#$%^&*()_+=-[]{}|;:,.<>/?"
    expected = "!@#$%^&*()_+=-[]{..."
    codeflash_output = _truncate_label(label) # 883ns -> 621ns (42.2% faster)

# --- Edge Test Cases ---

def test_empty_string():
    # Empty string should remain unchanged
    codeflash_output = _truncate_label("") # 396ns -> 414ns (4.35% slower)

def test_one_character_string():
    # Single character string should remain unchanged
    codeflash_output = _truncate_label("a") # 386ns -> 387ns (0.258% slower)

def test_seventeen_characters():
    # 17 chars should remain unchanged
    label = "12345678901234567"
    codeflash_output = _truncate_label(label) # 355ns -> 385ns (7.79% slower)

def test_nineteen_characters():
    # 19 chars should remain unchanged
    label = "1234567890123456789"
    codeflash_output = _truncate_label(label) # 357ns -> 396ns (9.85% slower)

def test_twenty_characters():
    # 20 chars should be truncated
    label = "12345678901234567890"
    expected = "12345678901234567..."
    codeflash_output = _truncate_label(label) # 1.21μs -> 667ns (81.9% faster)

def test_unicode_characters():
    # Unicode characters should be counted as single characters
    label = "你好世界" * 5  # 20 unicode chars
    expected = "你好世界" * 4 + "你..."
    codeflash_output = _truncate_label(label) # 1.54μs -> 944ns (63.3% faster)

def test_label_with_newline():
    # Newline characters should be counted
    label = "a\n" * 10  # 20 chars (including newlines)
    expected = "a\na\na\na\na\na\na\na\na..."
    codeflash_output = _truncate_label(label + "b") # 1.00μs -> 601ns (66.6% faster)

def test_label_with_tab():
    # Tab characters should be counted
    label = "a\t" * 10  # 20 chars (including tabs)
    expected = "a\ta\ta\ta\ta\ta\ta\ta\ta..."
    codeflash_output = _truncate_label(label + "b") # 879ns -> 597ns (47.2% faster)

def test_label_with_only_spaces():
    # Label of 21 spaces should be truncated
    label = " " * 21
    expected = "                 ..."
    codeflash_output = _truncate_label(label) # 874ns -> 605ns (44.5% faster)

# --- Large Scale Test Cases ---

def test_very_long_label():
    # Test label much longer than 20 characters
    label = "a" * 1000
    expected = "a" * 17 + "..."
    codeflash_output = _truncate_label(label) # 969ns -> 660ns (46.8% faster)

def test_large_label_with_special_characters():
    # Test label with 1000 special characters
    label = "!@#" * 333 + "!"  # 1000 chars
    expected = "!@#!@#!@#!@#!@#!@..."
    codeflash_output = _truncate_label(label) # 904ns -> 649ns (39.3% faster)

def test_large_label_with_unicode():
    # Test label with 1000 unicode characters
    label = "你" * 1000
    expected = "你" * 17 + "..."
    codeflash_output = _truncate_label(label) # 1.40μs -> 928ns (50.6% faster)

def test_large_label_with_mixed_characters():
    # Test label with a mix of ascii, unicode, and special chars
    label = ("abc你好!@#" * 100)[:1000]
    expected = label[:17] + "..."
    codeflash_output = _truncate_label(label) # 1.02μs -> 561ns (82.5% faster)

def test_label_length_just_below_cutoff():
    # 19 characters should not be truncated
    label = "x" * 19
    codeflash_output = _truncate_label(label) # 391ns -> 401ns (2.49% slower)

def test_label_length_just_above_cutoff():
    # 21 characters should be truncated
    label = "y" * 21
    expected = "y" * 17 + "..."
    codeflash_output = _truncate_label(label) # 993ns -> 631ns (57.4% faster)

# --- Determinism Test ---

def test_determinism():
    # The function should always return the same result for the same input
    label = "abcdefg" * 3  # 21 chars
    codeflash_output = _truncate_label(label); result1 = codeflash_output # 921ns -> 581ns (58.5% faster)
    codeflash_output = _truncate_label(label); result2 = codeflash_output # 461ns -> 313ns (47.3% faster)

# --- Type Robustness Test ---

def test_non_string_input_raises():
    # The function should raise a TypeError for non-string input
    with pytest.raises(TypeError):
        _truncate_label(123)
    with pytest.raises(TypeError):
        _truncate_label(None)
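    # A short list is deliberately not checked here: len() and slicing both accept
    # lists, so a list with fewer than 20 items would presumably be returned as-is
    # rather than raising TypeError.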

# --- Mutation Sensitivity Test ---

def test_mutation_sensitivity():
    # Changing the truncation logic should fail this test
    label = "abcdefghijklmnopqrstuvwx"
    expected = "abcdefghijklmnopq..."
    codeflash_output = _truncate_label(label) # 1.53μs -> 753ns (103% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run git checkout codeflash/optimize-_truncate_label-mhmj7cls and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 5, 2025 21:50
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Nov 5, 2025