-
Notifications
You must be signed in to change notification settings - Fork 389
[Store] Import Offset Allocator #641
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- remove SliceGuard and SliceBuffer RAII classes - replace SimpleAllocator with offset_allocator::Allocator - use AllocationHandle for automatic memory management - simplify buffer allocation with single contiguous allocations - add offset_allocator implementation in new directory - update CMakeLists to include new allocator source files - add comprehensive test suite for offset_allocator Signed-off-by: Jinyang Su <[email protected]>
One thing is important, related to a lifetime bug (#639). I believe we could accept a unique_ptr resource with the allocator, as they should share the same lifetime. And we can hold a shared_ptr in the handle. This way, we can ensure the handle is always valid? |
This bug occurs because |
Fine, so I would wrap the allocator to bind any managed allocator and handle the lifetime of the resources. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Separating the responsibilities of the Allocator from the actual allocation and deallocation of physical memory via malloc is reasonable. On both the Master and Client sides, the lifecycle of the Allocator may not align with the lifecycle of the actual physical memory. For example:
Currently, memory allocated by the Client itself is managed by SegmentDeleter, but any other memory (such as that allocated elsewhere) must still be managed by whoever performed the malloc for it. |
@@ -0,0 +1,176 @@ | |||
# Allocator Memory Utilization Benchmark |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Move it to doc dir?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. Will do in next PR
Content
Import a new offset allocator to address the following issue:
Compared with the original implementation, the imported OffsetAllocator is wrapped to support registering or allocating with a size larger than 3.75 GB, and is further optimized for LLM inference scenario. The optimization will:
a) slightly decrease the memory utilization ratio in general cases;
b) makes no difference when the allocated size is equal to a bin size;
c) largely improve the memory utilization ratio when the allocated size is mostly uniform and not equal to any bin size.
The LLM inference tasks, in most cases, will put objects with mostly uniform sizes, which falls into the latter two cases.
See the following benchmark reports for details.
Result
Uniform size, size equals power of 2
OffsetAllocator (After Optimization)
OffsetAllocator (Before Optimization)
Uniform size, size equals power of 2 +/- 17
OffsetAllocator (After Optimization)
OffsetAllocator (Before Optimization)
Uniform size, size equals power of 2 multiply 0.9 or 1.1
OffsetAllocator (After Optimization)
OffsetAllocator (Before Optimization)
Random Size
OffsetAllocator (After Optimization)
OffsetAllocator (Before Optimization)