update

qscqesze · qscqesze · commit 9298ff97e701 · 2025-06-26T14:38:32.000+08:00
Signed-off-by: qingjun <qingjun@minimaxi.com>
diff --git a/_posts/2025-06-26-minimax-m1.md b/_posts/2025-06-26-minimax-m1.md
@@ -127,9 +127,18 @@ These enhancements further boost the low-level computation efficiency of MiniMax
 
 As a cutting-edge attention mechanism, Lightning Attention is implemented in vLLM via Triton, leveraging its flexibility and high-performance computing features. A Triton-based execution framework fully supports Lightning Attention's core computation logic, enabling seamless integration and deployment within the vLLM ecosystem.
 
+## Future Work
+
+Looking ahead, the vLLM community is actively exploring further optimizations for hybrid architecture support. Notably, a hybrid allocator now in development is expected to enable more efficient memory management tailored to the requirements of models like MiniMax-M1.
+
+In addition, full support for vLLM v1 is planned, and the hybrid model architecture is expected to be migrated into the v1 framework. These changes are anticipated to unlock further performance improvements and provide a more robust foundation for future development.
+
 ## Conclusion
 
 The hybrid architecture of MiniMax-M1 paves the way for the next generation of large language models, offering powerful capabilities in long-context reasoning and complex task inference. vLLM complements this with highly optimized memory handling, robust batch request management, and deeply tuned backend performance.
 
 Together, MiniMax-M1 and vLLM form a strong foundation for efficient and scalable AI applications. As the ecosystem evolves, we anticipate this synergy will power more intelligent, responsive, and capable solutions across a wide range of use cases, including code generation, document analysis, and conversational AI.
 
+## Acknowledgement
+
+We would like to express our sincere gratitude to the vLLM community for their invaluable support and collaboration. In particular, we thank [Tyler Michael Smith](https://github.com/tlrmchlsmth), [Simon Mo](https://github.com/simon-mo), [Cyrus Leung](https://github.com/DarkLight1337), [Roger Wang](https://github.com/ywang96), and [Kaichao You](https://github.com/youkaichao) for their significant contributions. We also appreciate the efforts of the MiniMax engineering team, especially [Gangying Qing](https://github.com/ZZBoom), [Jun Qing](https://github.com/qscqesze), and [Jiaren Cai](https://github.com/sriting), whose dedication made this work possible.