⚡️ Speed up function `find_last_node` by 13,550% #66

codeflash-ai · 2025-07-30T02:37:47Z

📄 13,550% (135.50x) speedup for `find_last_node` in `src/dsa/nodes.py`

⏱️ Runtime: 53.2 milliseconds → 390 microseconds (best of 528 runs)

📝 Explanation and details

The optimization transforms an O(n*m) algorithm into an O(n+m) algorithm by eliminating redundant work through preprocessing.

Key Optimization: Set-based Preprocessing
The original code uses a nested loop: for each node, it scans the edges to check whether any edge has that node as its source. In the worst case this takes O(n*m) time, where n is the number of nodes and m is the number of edges.

The optimized version preprocesses all edge sources into a set (`sources = {e["source"] for e in edges}`), then performs an average-case O(1) set membership check (`n["id"] not in sources`) for each node. This reduces the overall complexity to O(n+m).
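The PR text quotes the key expressions but not the full function bodies. A minimal sketch of the two versions, assuming `find_last_node` returns the first node that is not the source of any edge, or `None` when no such node exists (the names `find_last_node_original` and `find_last_node_optimized` are hypothetical and only serve to show both side by side):

```python
def find_last_node_original(nodes, edges):
    # For each node, scan the edges; all() stops at the first edge
    # whose source equals the node's id.
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)),
        None,
    )


def find_last_node_optimized(nodes, edges):
    # One pass over the edges to collect every source id.
    sources = {e["source"] for e in edges}
    # One pass over the nodes with an average-case O(1) lookup each.
    return next((n for n in nodes if n["id"] not in sources), None)


nodes = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
edges = [{"source": "a", "target": "b"}, {"source": "b", "target": "c"}]
print(find_last_node_original(nodes, edges))   # {'id': 'c'}
print(find_last_node_optimized(nodes, edges))  # {'id': 'c'}
```

One behavioral difference worth keeping in mind: the set-based version needs every `source` value to be hashable, while the `!=` comparison in the nested version does not.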

Specific Changes:

1. Preprocessing step: Creates a set of all source node IDs from the edges in a single pass.
2. Lookup optimization: Replaces the `all(e["source"] != n["id"] for e in edges)` check with a fast set membership test.
3. Eliminates nested iteration: The original code had to iterate through the edges for every candidate node.

Why This Creates Massive Speedup:

- Set membership lookup is O(1) in the average case, versus an O(m) linear search through the edges.
- The O(m) preprocessing cost is paid only once, not n times.
- As shown in the line profiler, the original code spent 100% of its time in the nested loop, while the optimized version splits its time between preprocessing (57.6%) and the main loop (42.4%).
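To make the difference in work concrete, the sketch below counts operations instead of measuring time. It is an illustration only: the two counting helpers are hypothetical and mirror the control flow described above (short-circuiting `all(...)` versus a set build plus lookups), applied to the 1000-node linear chain used in the generated tests.

```python
def count_original_comparisons(nodes, edges):
    """Count the source/id comparisons the nested version performs."""
    comparisons = 0
    for n in nodes:
        is_sink = True
        for e in edges:
            comparisons += 1
            if e["source"] == n["id"]:
                is_sink = False
                break  # all() short-circuits here
        if is_sink:
            break  # next() stops at the first sink
    return comparisons


def count_optimized_operations(nodes, edges):
    """Count set insertions plus membership checks in the set-based version."""
    sources = {e["source"] for e in edges}
    operations = len(edges)  # one insertion attempt per edge
    for n in nodes:
        operations += 1  # one membership check per visited node
        if n["id"] not in sources:
            break
    return operations


# Linear chain 0 -> 1 -> ... -> 999
nodes = [{"id": str(i)} for i in range(1000)]
edges = [{"source": str(i), "target": str(i + 1)} for i in range(999)]

print(count_original_comparisons(nodes, edges))  # 500499
print(count_optimized_operations(nodes, edges))  # 1999
```

In the chain, node `i` is the source of edge `i`, so the nested version scans `i + 1` edges before giving up on that node. That sums to roughly n²/2 comparisons, against about n + m operations for the set-based version.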

Test Case Performance Patterns:

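- Linear chains, cycles, and graphs whose only sink comes late in the node list show dramatic improvements (19,000%+ speedup): in these cases the original code scans a large share of the edges for almost every node before it finds a sink or gives up.
- Small graphs with few edges show modest improvements (roughly 24-133% speedup): the preprocessing overhead is more noticeable, but set lookup is still faster. The 1000-node star graph falls in the same range (131-133%) because a sink appears right after the center node, so the original code short-circuits after very little work.
- Fully empty input shows a slight regression (about 10% slower): with no nodes and no edges, building the empty set is pure overhead. Inputs that have nodes but no edges are still about 11-33% faster.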
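These patterns can be checked locally with a small timing harness. The following is a rough sketch, not the benchmark Codeflash ran: the function names and graph builders are hypothetical, and absolute numbers will differ by machine.

```python
import timeit


def find_last_node_nested(nodes, edges):
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)),
        None,
    )


def find_last_node_set(nodes, edges):
    sources = {e["source"] for e in edges}
    return next((n for n in nodes if n["id"] not in sources), None)


def chain(size):
    # 0 -> 1 -> ... -> size-1: the only sink is the last node in the list.
    nodes = [{"id": str(i)} for i in range(size)]
    edges = [{"source": str(i), "target": str(i + 1)} for i in range(size - 1)]
    return nodes, edges


def star(size):
    # center -> every leaf: the first leaf (second node) is already a sink.
    nodes = [{"id": "center"}] + [{"id": f"leaf{i}"} for i in range(size - 1)]
    edges = [{"source": "center", "target": f"leaf{i}"} for i in range(size - 1)]
    return nodes, edges


shapes = {"chain": chain(1000), "star": star(1000), "empty": ([], [])}

for name, (nodes, edges) in shapes.items():
    # Both versions must agree before their timings are worth comparing.
    assert find_last_node_nested(nodes, edges) == find_last_node_set(nodes, edges)
    nested = min(timeit.repeat(lambda: find_last_node_nested(nodes, edges), number=5, repeat=3))
    with_set = min(timeit.repeat(lambda: find_last_node_set(nodes, edges), number=5, repeat=3))
    print(f"{name:>5}: nested {nested / 5:.2e}s, set {with_set / 5:.2e}s per call")
```

The chain and star graphs have the same node and edge counts, so any gap between them comes from where the first sink sits in the node list rather than from graph size.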

The optimization is particularly effective for graph analysis scenarios where edge density is high relative to the number of sink nodes (nodes with no outgoing edges).

✅ Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 39 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime

import pytest  # used for our unit tests
from src.dsa.nodes import find_last_node

# unit tests

# ------------------------
# Basic Test Cases
# ------------------------

def test_single_node_no_edges():
    # One node, no edges: node is last node
    nodes = [{"id": "a"}]
    edges = []
    codeflash_output = find_last_node(nodes, edges); result = codeflash_output # 833ns -> 666ns (25.1% faster)

def test_two_nodes_one_edge():
    # Two nodes, one edge: last node is the one not a source
    nodes = [{"id": "a"}, {"id": "b"}]
    edges = [{"source": "a", "target": "b"}]
    codeflash_output = find_last_node(nodes, edges); result = codeflash_output # 1.17μs -> 709ns (64.6% faster)

def test_three_nodes_linear_chain():
    # a -> b -> c; last node is c
    nodes = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
    edges = [{"source": "a", "target": "b"}, {"source": "b", "target": "c"}]
    codeflash_output = find_last_node(nodes, edges); result = codeflash_output # 1.50μs -> 750ns (100% faster)

def test_multiple_possible_last_nodes_returns_first():
    # a -> b, c is isolated, so b and c are both possible last nodes, but c comes first in nodes list
    nodes = [{"id": "c"}, {"id": "a"}, {"id": "b"}]
    edges = [{"source": "a", "target": "b"}]
    codeflash_output = find_last_node(nodes, edges); result = codeflash_output # 875ns -> 667ns (31.2% faster)

# ------------------------
# Edge Test Cases
# ------------------------

def test_empty_nodes_and_edges():
    # No nodes, no edges: should return None
    nodes = []
    edges = []
    codeflash_output = find_last_node(nodes, edges); result = codeflash_output # 375ns -> 417ns (10.1% slower)

def test_nodes_with_no_edges():
    # Multiple nodes, no edges: should return first node
    nodes = [{"id": "x"}, {"id": "y"}]
    edges = []
    codeflash_output = find_last_node(nodes, edges); result = codeflash_output # 791ns -> 625ns (26.6% faster)

def test_all_nodes_are_sources():
    # All nodes are sources in at least one edge: should return None
    nodes = [{"id": "a"}, {"id": "b"}]
    edges = [{"source": "a", "target": "b"}, {"source": "b", "target": "a"}]
    codeflash_output = find_last_node(nodes, edges); result = codeflash_output # 1.21μs -> 542ns (123% faster)

def test_cycle_graph():
    # Graph with a cycle: all nodes have outgoing edges, so None
    nodes = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
    edges = [{"source": "a", "target": "b"}, {"source": "b", "target": "c"}, {"source": "c", "target": "a"}]
    codeflash_output = find_last_node(nodes, edges); result = codeflash_output # 1.46μs -> 625ns (133% faster)

def test_disconnected_graph():
    # Disconnected components: some nodes have no outgoing edges
    nodes = [{"id": "a"}, {"id": "b"}, {"id": "c"}, {"id": "d"}]
    edges = [{"source": "a", "target": "b"}, {"source": "c", "target": "d"}]
    codeflash_output = find_last_node(nodes, edges); result = codeflash_output # 1.21μs -> 750ns (61.1% faster)

def test_node_with_self_loop():
    # Node with a self-loop: should not be considered last node
    nodes = [{"id": "a"}, {"id": "b"}]
    edges = [{"source": "a", "target": "a"}]
    codeflash_output = find_last_node(nodes, edges); result = codeflash_output # 1.17μs -> 708ns (64.7% faster)

def test_duplicate_edges():
    # Multiple edges from the same source to the same target
    nodes = [{"id": "a"}, {"id": "b"}]
    edges = [{"source": "a", "target": "b"}, {"source": "a", "target": "b"}]
    codeflash_output = find_last_node(nodes, edges); result = codeflash_output # 1.21μs -> 750ns (61.1% faster)

def test_node_with_empty_id():
    # Node with empty id string
    nodes = [{"id": ""}, {"id": "x"}]
    edges = [{"source": "", "target": "x"}]
    codeflash_output = find_last_node(nodes, edges); result = codeflash_output # 1.12μs -> 750ns (50.0% faster)

def test_edge_with_nonexistent_source():
    # Edge with source not in nodes: should not affect result
    nodes = [{"id": "a"}]
    edges = [{"source": "b", "target": "a"}]  # 'b' not present
    codeflash_output = find_last_node(nodes, edges); result = codeflash_output # 875ns -> 708ns (23.6% faster)

def test_edge_with_nonexistent_target():
    # Edge with target not in nodes: only sources matter; 'a' is a source, so the result is None
    nodes = [{"id": "a"}]
    edges = [{"source": "a", "target": "b"}]  # 'b' not present
    codeflash_output = find_last_node(nodes, edges); result = codeflash_output # 792ns -> 500ns (58.4% faster)

def test_nodes_with_additional_keys():
    # Nodes have extra attributes, should still work
    nodes = [{"id": "a", "data": 42}, {"id": "b", "name": "B"}]
    edges = [{"source": "a", "target": "b"}]
    codeflash_output = find_last_node(nodes, edges); result = codeflash_output # 1.08μs -> 708ns (53.0% faster)

def test_edges_with_additional_keys():
    # Edges have extra attributes, should still work
    nodes = [{"id": "a"}, {"id": "b"}]
    edges = [{"source": "a", "target": "b", "weight": 5}]
    codeflash_output = find_last_node(nodes, edges); result = codeflash_output # 1.12μs -> 708ns (58.9% faster)

# ------------------------
# Large Scale Test Cases
# ------------------------

def test_large_linear_chain():
    # 1000 nodes in a chain: 0 -> 1 -> 2 -> ... -> 999
    nodes = [{"id": str(i)} for i in range(1000)]
    edges = [{"source": str(i), "target": str(i+1)} for i in range(999)]
    codeflash_output = find_last_node(nodes, edges); result = codeflash_output # 12.8ms -> 64.0μs (19924% faster)

def test_large_star_graph():
    # One central node with 999 outgoing edges to leaf nodes
    nodes = [{"id": "center"}] + [{"id": f"leaf{i}"} for i in range(999)]
    edges = [{"source": "center", "target": f"leaf{i}"} for i in range(999)]
    codeflash_output = find_last_node(nodes, edges); result = codeflash_output # 24.2μs -> 10.5μs (131% faster)

def test_large_forest():
    # 10 trees of 100 nodes each, each a linear chain
    nodes = []
    edges = []
    for t in range(10):
        base = t * 100
        for i in range(100):
            nodes.append({"id": f"n{base+i}"})
            if i > 0:
                edges.append({"source": f"n{base+i-1}", "target": f"n{base+i}"})
    codeflash_output = find_last_node(nodes, edges); result = codeflash_output # 172μs -> 38.5μs (349% faster)

def test_large_disconnected_nodes():
    # 1000 nodes, no edges
    nodes = [{"id": str(i)} for i in range(1000)]
    edges = []
    codeflash_output = find_last_node(nodes, edges); result = codeflash_output # 833ns -> 750ns (11.1% faster)

def test_large_cycle():
    # 1000 nodes in a cycle: all nodes have outgoing edges, so None
    nodes = [{"id": str(i)} for i in range(1000)]
    edges = [{"source": str(i), "target": str((i+1)%1000)} for i in range(1000)]
    codeflash_output = find_last_node(nodes, edges); result = codeflash_output # 12.7ms -> 62.7μs (20226% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

import pytest  # used for our unit tests
from src.dsa.nodes import find_last_node

# unit tests

# ----------------------
# Basic Test Cases
# ----------------------

def test_single_node_no_edges():
    # One node, no edges: should return the node itself.
    nodes = [{"id": "A", "data": 1}]
    edges = []
    codeflash_output = find_last_node(nodes, edges) # 833ns -> 625ns (33.3% faster)

def test_two_nodes_one_edge():
    # Two nodes, one edge from A -> B: should return B as last node.
    nodes = [{"id": "A"}, {"id": "B"}]
    edges = [{"source": "A", "target": "B"}]
    codeflash_output = find_last_node(nodes, edges) # 1.12μs -> 708ns (58.9% faster)

def test_three_nodes_linear_chain():
    # A -> B -> C, should return C
    nodes = [{"id": "A"}, {"id": "B"}, {"id": "C"}]
    edges = [{"source": "A", "target": "B"}, {"source": "B", "target": "C"}]
    codeflash_output = find_last_node(nodes, edges) # 1.50μs -> 791ns (89.6% faster)

def test_multiple_possible_last_nodes():
    # A -> B, C (no edges): B and C both have no outgoing edges; B comes first in nodes, so B is returned
    nodes = [{"id": "A"}, {"id": "B"}, {"id": "C"}]
    edges = [{"source": "A", "target": "B"}]
    codeflash_output = find_last_node(nodes, edges)

def test_multiple_last_nodes_returns_first():
    # Two nodes with no outgoing edges: should return the first one found (order matters)
    nodes = [{"id": "A"}, {"id": "B"}]
    edges = []
    codeflash_output = find_last_node(nodes, edges); result = codeflash_output # 791ns -> 625ns (26.6% faster)

# ----------------------
# Edge Test Cases
# ----------------------

def test_empty_nodes_and_edges():
    # No nodes or edges: should return None
    codeflash_output = find_last_node([], []) # 375ns -> 416ns (9.86% slower)

def test_nodes_with_self_loop():
    # Node with a self-loop: should not be considered last node
    nodes = [{"id": "A"}]
    edges = [{"source": "A", "target": "A"}]
    codeflash_output = find_last_node(nodes, edges) # 833ns -> 541ns (54.0% faster)

def test_cycle_graph():
    # A -> B -> C -> A (cycle): no last node, should return None
    nodes = [{"id": "A"}, {"id": "B"}, {"id": "C"}]
    edges = [
        {"source": "A", "target": "B"},
        {"source": "B", "target": "C"},
        {"source": "C", "target": "A"},
    ]
    codeflash_output = find_last_node(nodes, edges) # 1.46μs -> 666ns (119% faster)

def test_disconnected_nodes():
    # C is not connected at all, but B also has no outgoing edges and comes first, so B is returned
    nodes = [{"id": "A"}, {"id": "B"}, {"id": "C"}]
    edges = [{"source": "A", "target": "B"}]
    codeflash_output = find_last_node(nodes, edges)

def test_node_with_multiple_outgoing_edges():
    # A has multiple outgoing edges; B and C have none
    nodes = [{"id": "A"}, {"id": "B"}, {"id": "C"}]
    edges = [
        {"source": "A", "target": "B"},
        {"source": "A", "target": "C"}
    ]
    # Both B and C have no outgoing edges, should return B (first found)
    codeflash_output = find_last_node(nodes, edges) # 1.21μs -> 750ns (61.1% faster)

def test_edges_with_unknown_nodes():
    # Edges refer to nodes not in the list: should not affect result
    nodes = [{"id": "A"}, {"id": "B"}]
    edges = [{"source": "A", "target": "B"}, {"source": "X", "target": "Y"}]
    codeflash_output = find_last_node(nodes, edges) # 1.17μs -> 709ns (64.6% faster)

def test_duplicate_node_ids():
    # Duplicate node ids: should return the first one with no outgoing edges
    nodes = [{"id": "A"}, {"id": "A"}, {"id": "B"}]
    edges = [{"source": "A", "target": "B"}]
    # Both "A" nodes have outgoing edges, only B is last node
    codeflash_output = find_last_node(nodes, edges) # 1.29μs -> 708ns (82.3% faster)

def test_node_with_incoming_but_no_outgoing():
    # Node with only incoming edges is a valid last node
    nodes = [{"id": "A"}, {"id": "B"}]
    edges = [{"source": "A", "target": "B"}]
    codeflash_output = find_last_node(nodes, edges) # 1.12μs -> 667ns (68.7% faster)

# ----------------------
# Large Scale Test Cases
# ----------------------



def test_large_linear_chain():
    # Large chain: A0 -> A1 -> ... -> A999
    N = 1000
    nodes = [{"id": f"A{i}"} for i in range(N)]
    edges = [{"source": f"A{i}", "target": f"A{i+1}"} for i in range(N-1)]
    codeflash_output = find_last_node(nodes, edges) # 13.0ms -> 64.0μs (20235% faster)

def test_large_star_topology():
    # One center node with outgoing edges to all others
    N = 1000
    nodes = [{"id": "center"}] + [{"id": f"leaf{i}"} for i in range(N-1)]
    edges = [{"source": "center", "target": f"leaf{i}"} for i in range(N-1)]
    # All leaves have no outgoing edges, so first leaf is returned
    codeflash_output = find_last_node(nodes, edges) # 25.0μs -> 10.7μs (133% faster)

def test_large_disconnected_nodes():
    # All nodes are disconnected (no edges)
    N = 1000
    nodes = [{"id": f"N{i}"} for i in range(N)]
    edges = []
    # Should return the first node
    codeflash_output = find_last_node(nodes, edges) # 875ns -> 750ns (16.7% faster)

def test_large_complete_graph():
    # Every node connects to every other node (no last node)
    N = 50  # keep small to avoid combinatorial explosion
    nodes = [{"id": f"N{i}"} for i in range(N)]
    edges = [{"source": f"N{i}", "target": f"N{j}"} for i in range(N) for j in range(N) if i != j]
    codeflash_output = find_last_node(nodes, edges) # 1.55ms -> 54.9μs (2723% faster)

def test_large_graph_with_one_last_node():
    # All nodes connect to one node, which has no outgoing edges
    N = 1000
    nodes = [{"id": f"N{i}"} for i in range(N)]
    edges = [{"source": f"N{i}", "target": f"N{N-1}"} for i in range(N-1)]
    # Only last node has no outgoing edges
    codeflash_output = find_last_node(nodes, edges) # 12.8ms -> 65.1μs (19599% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
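
The generated tests above record `codeflash_output` so that the original and optimized results can be compared. A minimal sketch of a similar check on random graphs, assuming the two implementations look like the expressions quoted in the explanation (the function names and the `random_graph` helper are hypothetical, not part of Codeflash):

```python
import random


def find_last_node_nested(nodes, edges):
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)),
        None,
    )


def find_last_node_set(nodes, edges):
    sources = {e["source"] for e in edges}
    return next((n for n in nodes if n["id"] not in sources), None)


def random_graph(rng, max_nodes=8, max_edges=12):
    ids = [f"n{i}" for i in range(rng.randint(0, max_nodes))]
    nodes = [{"id": node_id} for node_id in ids]
    # Include one id that is not in the node list, like the
    # "nonexistent source/target" tests above.
    pool = ids + ["ghost"]
    edges = [
        {"source": rng.choice(pool), "target": rng.choice(pool)}
        for _ in range(rng.randint(0, max_edges))
    ]
    return nodes, edges


rng = random.Random(0)  # fixed seed so a failure can be reproduced
for _ in range(1000):
    nodes, edges = random_graph(rng)
    assert find_last_node_nested(nodes, edges) == find_last_node_set(nodes, edges)
print("both versions agree on 1000 random graphs")
```

Random graphs include self-loops, duplicate edges, and empty inputs by chance, which overlaps with the hand-written edge cases in the two test files.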

To edit these changes, `git checkout codeflash/optimize-find_last_node-mdpcvjcb` and push.


codeflash-ai bot added the "⚡️ codeflash" label (Optimization PR opened by Codeflash AI) Jul 30, 2025

codeflash-ai bot requested a review from aseembits93 July 30, 2025 02:37
