⚡️ Speed up function cartesian_product by 41%
#299
📄 41% (0.41x) speedup for `cartesian_product` in `pandas/core/indexes/multi.py`
⏱️ Runtime: 4.81 milliseconds → 3.40 milliseconds (best of 141 runs)
📝 Explanation and details
The optimized version achieves a 41% speedup through several targeted micro-optimizations that reduce overhead in NumPy array operations:
Key Optimizations:
- Replaced `np.fromiter()` with a list comprehension: Changed `np.fromiter((len(x) for x in X), dtype=np.intp)` to `np.array([len(x) for x in X], dtype=np.intp)`. This eliminates generator dispatch overhead, since `len()` is fast and the input sizes are typically small to medium.
- Eliminated the `np.roll()` operation: Replaced the expensive `np.roll(cumprodX, 1)` with manual array allocation (`np.empty_like()`) and slice assignment. Rolling an entire array involves copying all elements, while the optimized version just copies a slice, reducing memory operations.
- Integer division instead of float division: Changed `b = cumprodX[-1] / cumprodX` to `b = prod_total // cumprodX`. This avoids float conversion overhead and maintains integer precision, which is more efficient for the subsequent `np.tile`/`np.repeat` operations that expect integer arguments.
- Early array conversion: Added `np.asarray(xi)` to ensure inputs are converted to arrays once per iteration, optimizing downstream NumPy operations.
Performance Impact:
The line profiler shows the most significant gains come from eliminating the costly `np.roll()` operation (17.5% of original runtime) and reducing overhead in array creation. The optimizations are particularly effective for the common use cases shown in tests: small-to-medium cartesian products with 2-3 dimensions, where the overhead reductions provide substantial relative benefits.
Test Case Performance:
The optimization shows consistent 50-80% speedups across most test cases, with particularly strong performance on basic cases (2-3 lists) and edge cases with mixed types, demonstrating the robustness of the optimizations across different input scenarios.
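The bookkeeping changes listed under Key Optimizations can be checked for equivalence with a small sketch. This is not the pandas source: the helper names `counts_with_roll`, `counts_with_slices`, and `product_sketch` are hypothetical, the sketch assumes every input is non-empty (so the total product is nonzero), and it casts the counts to `int` explicitly before calling `np.repeat`/`np.tile`.

```python
import numpy as np


def counts_with_roll(lengths):
    # Original-style bookkeeping as described above:
    # np.fromiter, np.roll, and float division.
    cumprod = np.cumprod(np.fromiter((n for n in lengths), dtype=np.intp))
    a = np.roll(cumprod, 1)
    a[0] = 1
    b = cumprod[-1] / cumprod  # true division yields float64
    return a, b


def counts_with_slices(lengths):
    # Optimized-style bookkeeping as described above:
    # np.array on a list, slice assignment, and integer division.
    cumprod = np.cumprod(np.array(list(lengths), dtype=np.intp))
    a = np.empty_like(cumprod)
    a[0] = 1
    a[1:] = cumprod[:-1]  # same values np.roll would shift into place
    b = cumprod[-1] // cumprod  # floor division keeps the integer dtype
    return a, b


def product_sketch(arrays, counts):
    # Assumption: all inputs are non-empty, so no zero-length handling here.
    a, b = counts([len(x) for x in arrays])
    return [
        np.tile(np.repeat(np.asarray(x), int(b[i])), int(a[i]))
        for i, x in enumerate(arrays)
    ]


inputs = [[1, 2], ["a", "b", "c"], [True, False]]
old = product_sketch(inputs, counts_with_roll)
new = product_sketch(inputs, counts_with_slices)
assert all(np.array_equal(o, n) for o, n in zip(old, new))
```

Both helpers produce the same tile and repeat counts; the only observable difference in this sketch is that the slice-based version keeps `b` as an integer array instead of `float64`.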
✅ Correctness verification report:
⚙️ Existing Unit Tests and Runtime
- indexes/multi/test_util.py::TestCartesianProduct.test_datetimeindex
- indexes/multi/test_util.py::TestCartesianProduct.test_empty
- indexes/multi/test_util.py::TestCartesianProduct.test_empty_input
- indexes/multi/test_util.py::TestCartesianProduct.test_exceed_product_space
- indexes/multi/test_util.py::TestCartesianProduct.test_invalid_input
- indexes/multi/test_util.py::TestCartesianProduct.test_simple
🌀 Generated Regression Tests and Runtime
To edit these changes, run `git checkout codeflash/optimize-cartesian_product-mholdb8r` and push.