Conversation

alogfans
Collaborator

@alogfans alogfans commented Aug 8, 2025

No description provided.

@ShangmingCai
Collaborator

@adiprerepa Seems like we added the sync in the wrong place. We think this new update might work.

@adiprerepa

Hmm, this didn't work for me either. I also added the sync before and after every transfer, and even added a time.sleep() in some places; none of this fixed it. I'm not sure this is a race. It's peculiar.

@alogfans
Collaborator Author

@adiprerepa I found that changing cudaDeviceSynchronize() to cudaEventSynchronize(events[id]) is effective in your sample, but I don't know how to turn that into a non-intrusive patch.
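
For readers following along, here is a minimal sketch of the kind of swap described above. It is not the actual patch: the `events` vector, `wait_for_transfer`, and `demo_single_transfer` are hypothetical names for illustration, and only standard CUDA runtime calls are used. The idea is to record one event per transfer in that transfer's stream, then wait on that event rather than on the whole device.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <vector>

// Blocks the host until ALL previously issued work on the device has finished.
cudaError_t wait_device_wide() { return cudaDeviceSynchronize(); }

// Blocks the host only until the work captured by events[id] has finished.
// (Hypothetical helper; `events` holds one event per transfer.)
cudaError_t wait_for_transfer(const std::vector<cudaEvent_t>& events, size_t id) {
    return cudaEventSynchronize(events[id]);
}

// Hypothetical demo: one async device-to-device copy guarded by one event.
bool demo_single_transfer(size_t bytes) {
    void* src = nullptr;
    void* dst = nullptr;
    cudaStream_t stream = nullptr;
    std::vector<cudaEvent_t> events(1, nullptr);

    bool ok = cudaMalloc(&src, bytes) == cudaSuccess
           && cudaMalloc(&dst, bytes) == cudaSuccess
           && cudaStreamCreate(&stream) == cudaSuccess
           && cudaEventCreateWithFlags(&events[0], cudaEventDisableTiming) == cudaSuccess
           // Enqueue the copy, then mark its completion point in the same stream.
           && cudaMemcpyAsync(dst, src, bytes, cudaMemcpyDeviceToDevice, stream) == cudaSuccess
           && cudaEventRecord(events[0], stream) == cudaSuccess
           // Wait for this transfer only, not for the whole device.
           && wait_for_transfer(events, 0) == cudaSuccess;

    // Release whatever was created, even if an earlier step failed.
    if (events[0]) cudaEventDestroy(events[0]);
    if (stream) cudaStreamDestroy(stream);
    cudaFree(dst);
    cudaFree(src);
    return ok;
}
```

Calling `demo_single_transfer(1 << 20)` on a machine with a CUDA device should return `true`. The sketch only shows the mechanics of the two calls; it does not explain why the event-based wait behaves differently in the sample.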

@alogfans
Collaborator Author

@adiprerepa Perhaps you can try the new approach.

@ShangmingCai
Collaborator

I have also opened a PR that transfers metadata over TCP, to temporarily avoid sending small blocks over the NVLink transport: sgl-project/sglang#9261

3 participants