mandlinsarah
Owner

This PR optimizes the interleave_kv method in CacheView to reduce memory overhead and improve performance. The updated method uses in-place operations and avoids unnecessary list conversions, while its behavior stays the same. The change is small, but it should make the caching mechanism more efficient in inference scenarios.
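
To illustrate the general idea, here is a minimal sketch of interleaving cached and new chunks by writing into one preallocated buffer instead of building intermediate lists. It is not the diff from this PR: `interleave_chunks` and its arguments are hypothetical names, and plain Python lists stand in for the key/value tensors that CacheView actually handles.

```python
from typing import List, Sequence


def interleave_chunks(
    cached: Sequence[Sequence[float]],
    new: Sequence[Sequence[float]],
) -> List[float]:
    """Return cached[0], new[0], cached[1], new[1], ... flattened into one list.

    Hypothetical helper for illustration only; plain lists stand in for tensors.
    """
    if len(cached) != len(new):
        raise ValueError("cached and new must hold the same number of chunks")

    # Allocate the output once, sized for every row of every chunk.
    total = sum(len(c) for c in cached) + sum(len(n) for n in new)
    out: List[float] = [0.0] * total

    pos = 0
    for c, n in zip(cached, new):
        for chunk in (c, n):
            # Same-length slice assignment overwrites in place,
            # so no intermediate interleaved list is built.
            out[pos:pos + len(chunk)] = chunk
            pos += len(chunk)
    return out


# Two sequences: the first has 2 cached rows and 1 new row,
# the second has 1 cached row and 2 new rows.
result = interleave_chunks([[1.0, 2.0], [3.0]], [[10.0], [20.0, 30.0]])
```

With a tensor library, the analogous approach would be to allocate the output tensor once and copy each chunk into its slice, rather than converting split results to lists and concatenating them afterwards.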

