
Conversation

@youkaichao (Member)

move #76 here to enable previews


cloudflare-workers-and-pages bot commented Sep 1, 2025

Deploying vllm-blog-source with Cloudflare Pages

Latest commit: 87c5b92
Status: ✅  Deploy successful!
Preview URL: https://9c8df877.vllm-blog-source.pages.dev
Branch Preview URL: https://add-vsr-blog.vllm-blog-source.pages.dev


@Xunzhuo changed the title from "Add vsr blog" to "Add vLLM Semantic Router Blog" on Sep 1, 2025
@rootfs commented Sep 1, 2025

@youkaichao thank you for reviewing! Is it ready to go for publishing today? Thank you!


@windsonsea left a comment


Removing AI-generated tags and style can make the content more inviting for humans to read.

@Xunzhuo (Member) commented Sep 4, 2025

Preview is here: https://c8e293a4.vllm-blog-source.pages.dev/2025/09/01/semantic-router


vercel bot commented Sep 10, 2025

The latest updates on your projects.

| Project | Deployment | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| vllm-project-github-io | Ready | Preview | Comment | Sep 10, 2025 3:57am |

@simon-mo (Contributor)

I reviewed the blog. Two comments:

  • I would suggest toning down the "business value" of the semantic router. I believe (1) it is speculation, and (2) it distracts from the technical value of this post.
  • Can you run ModernBERT with vLLM? If so, we should put that down as future work; if not, we should clearly explain why.

@Xunzhuo (Member) commented Sep 10, 2025

Cool, thanks for the review @simon-mo! Yep, for the first one, I agree we need to emphasize the technical part.

And for the second one, you raised a key point that is on our roadmap: a pluggable embedding model architecture. ModernBERT is lightweight and embedded inside the router, while other embedding models can be deployed by a vLLM engine and integrated with vSR through an external call.
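
To sketch what that pluggable design could look like (all names here, such as `EmbeddingProvider` and the two providers, are hypothetical illustrations, not the actual vllm-semantic-router code):

```python
# Hypothetical sketch of a pluggable embedding architecture for vSR.
# Illustration only -- not the actual router implementation.
from abc import ABC, abstractmethod

import requests


class EmbeddingProvider(ABC):
    """Common interface so the router can swap embedding backends."""

    @abstractmethod
    def embed(self, texts: list[str]) -> list[list[float]]:
        ...


class LocalModernBERTProvider(EmbeddingProvider):
    """Lightweight model embedded inside the router process."""

    def __init__(self, model_name: str = "answerdotai/ModernBERT-base"):
        # sentence-transformers is one way to run ModernBERT locally;
        # the real router may use a different runtime.
        from sentence_transformers import SentenceTransformer
        self.model = SentenceTransformer(model_name)

    def embed(self, texts: list[str]) -> list[list[float]]:
        return self.model.encode(texts).tolist()


class VLLMEmbeddingProvider(EmbeddingProvider):
    """External embedding model served by a vLLM engine."""

    def __init__(self, base_url: str, model: str):
        self.base_url = base_url
        self.model = model

    def embed(self, texts: list[str]) -> list[list[float]]:
        # vLLM exposes an OpenAI-compatible /v1/embeddings endpoint
        # when serving an embedding model.
        resp = requests.post(
            f"{self.base_url}/v1/embeddings",
            json={"model": self.model, "input": texts},
        )
        resp.raise_for_status()
        return [item["embedding"] for item in resp.json()["data"]]
```

The point of the interface is that the router's classification and caching logic never needs to know whether embeddings come from the in-process model or from an external vLLM deployment.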

@rootfs commented Sep 10, 2025

Thank you @simon-mo for the review!

> Can you run ModernBERT with vLLM? If so, we should put that down as future work; if not, we should clearly explain why.

At the moment, the semantic router uses ModernBERT for internal classification. However, we will explore more ways to get text embeddings for the semantic cache. Many of these models can be hosted by vLLM, and I believe this will be more extensible.
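
For illustration, pointing the semantic cache at a vLLM-hosted embedding model could look roughly like this (a sketch only: the model name, endpoint, and 0.95 threshold are placeholder assumptions, not vSR code):

```python
# Rough sketch: use a vLLM-hosted embedding model for semantic-cache
# lookups. Model choice and threshold are placeholders.
# Start the server first, e.g.:
#   vllm serve intfloat/e5-mistral-7b-instruct --task embed
# (the exact --task flag spelling depends on the vLLM version)
import numpy as np
import requests

VLLM_URL = "http://localhost:8000/v1/embeddings"  # OpenAI-compatible endpoint
MODEL = "intfloat/e5-mistral-7b-instruct"


def embed(text: str) -> np.ndarray:
    resp = requests.post(VLLM_URL, json={"model": MODEL, "input": [text]})
    resp.raise_for_status()
    return np.array(resp.json()["data"][0]["embedding"])


def cache_lookup(query: str, cache: dict[str, tuple[np.ndarray, str]],
                 threshold: float = 0.95) -> str | None:
    """Return a cached answer whose prompt embedding is close enough."""
    q = embed(query)
    q /= np.linalg.norm(q)
    for _, (vec, answer) in cache.items():
        # Cosine similarity against the cached prompt embedding.
        if float(q @ (vec / np.linalg.norm(vec))) >= threshold:
            return answer  # semantic cache hit
    return None  # miss: fall through to the full LLM call
```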

We'll detail these directions and use cases in the upcoming revision!
