fix(actix-ws): wake task when multiple continuation frames exist in WebSocket stream #607
base: main
Conversation
I think this is the wrong solution to this problem. The bug is not that the message stream fails to produce a value when asked. The problem is that the AggregatedMessageStream isn't asking enough. I took a quick look through the AggregatedMessageStream code and it does seem like rather than returning Poll::Pending for continuations, we should either loop in the current poll_next or recursively call poll_next. That way we could continue to process existing messages without inserting calls to wake that would affect more than just this case.
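For illustration, here is a minimal sketch of the looping approach suggested above. The frame and aggregator types are hypothetical simplifications for this example, not the real actix-ws internals:

```rust
use std::pin::Pin;
use std::task::{Context, Poll};

use futures_core::Stream;

// Hypothetical frame type standing in for the real continuation items.
enum Frame {
    First(String),
    Continue(String),
    Last(String),
}

// Simplified aggregator over an inner frame stream.
struct Aggregator<S> {
    inner: S,
    buffer: String,
}

impl<S> Stream for Aggregator<S>
where
    S: Stream<Item = Frame> + Unpin,
{
    type Item = String;

    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
        let this = self.get_mut();
        // Keep polling the inner stream in a loop rather than returning
        // Poll::Pending after a partial frame, so frames the inner stream has
        // already buffered are drained within a single poll.
        loop {
            match Pin::new(&mut this.inner).poll_next(cx) {
                Poll::Ready(Some(Frame::First(s))) => this.buffer = s,
                Poll::Ready(Some(Frame::Continue(s))) => this.buffer.push_str(&s),
                Poll::Ready(Some(Frame::Last(s))) => {
                    this.buffer.push_str(&s);
                    return Poll::Ready(Some(std::mem::take(&mut this.buffer)));
                }
                Poll::Ready(None) => return Poll::Ready(None),
                // Only propagate Pending when the inner stream is truly empty;
                // it has already registered the waker on our behalf.
                Poll::Pending => return Poll::Pending,
            }
        }
    }
}
```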
Thank you for your review. I actually considered the approach you suggested at first, but ultimately decided to go with the current solution for the following reasons:
But the bug is not with MessageStream. If you poll a message stream it will give you an item if it has one. If another future is not polling MessageStream, then it isn't MessageStream's job to decide to wake that other future. For basic implementations like the following, inserting a call to wake here will result in potentially waking this task when there is nothing to do, which could incur a performance cost. For an application handling many websockets, this could lead to a large number of needless wakes.

```rust
while let Some(message) = message_stream.try_next().await? {
    do_other_stuff(message).await;
}
```
You're absolutely right, and I appreciate you pointing this out. On reflection, I agree that my original fix isn't ideal. It introduces an extra wake in `MessageStream` that could fire when there is nothing to do. I was overly focused on fixing the symptom inside `MessageStream`. I now agree that the cleaner solution is to modify `AggregatedMessageStream` so it keeps polling instead of returning `Poll::Pending` for continuation frames. I'll update the PR with this approach. Thanks again for the thoughtful feedback!
I also don't appreciate the use of chatgpt to talk to me in this PR. I'd rather you just talk to me yourself.
I'm sorry. I can read English, but not well enough to express my ideas accurately, so I used AI to translate my content and proofread it repeatedly. What I can guarantee is that everything I post is my own genuine expression, not AI-generated (of course, the AI may use some phrasing that doesn't sound human, and I may not be able to tell).
I understand how you feel; it really is a waste of time talking to a bot. Going forward, I'll post content directly in Chinese and use AI for strict translation without any polishing or embellishment.
I've updated the PR, fixing the issue using the approach mentioned earlier. I performed a quick test with my own web project, and it indeed resolved the problem.
asonix
left a comment
looks good, thanks!
JohnTitor
left a comment
While the implementation itself makes sense, it'd also be great if you could add a regression test in some form, e.g. queueing and simulating messages like in your issue. Otherwise a regression could be introduced in future development.
This includes an ignored failing test for actix#607
There is now an ignored test for this in the main branch.
Sorry for the long silence; my work situation changed significantly recently, and I just didn't have the energy to focus on anything else. Do I still need to do anything here?
I think what's left to do is merge main into this branch, then enable the test and see if it passes |
PR Type
Bug Fix
PR Checklist
- Formatted with `cargo +nightly fmt`.

Overview
Current Behavior (Bug)
When using `AggregatedMessageStream` to process WebSocket continuation frames, the stream can become permanently stuck if multiple continuation frames are received in a single payload. Specifically:

- A fragmented message (`FirstText` + `Continue` + `Last`) arrives in one TCP packet
- `MessageStream` decodes all frames and stores them in its internal `messages` queue
- `AggregatedMessageStream.poll_next()` receives the `Continue` frame and returns `Poll::Pending` while waiting for the `Last` frame
- The `Last` frame remains in the `MessageStream.messages` queue unprocessed
- Because `MessageStream` doesn't wake the task when returning the first message, `AggregatedMessageStream` never gets polled again
- The application hangs forever on `AggregatedMessageStream.next().await`

New Behavior (Fix)
This PR fixes the issue by ensuring the task is woken up when `MessageStream` has additional messages queued after returning the first message. The change is minimal and surgical.
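For illustration, here is a minimal sketch of the wake-on-remaining-messages pattern described above. The `DecodedFrames` type and its `messages` field are hypothetical stand-ins for this example, not the actual actix-ws internals:

```rust
use std::collections::VecDeque;
use std::pin::Pin;
use std::task::{Context, Poll};

use futures_core::Stream;

// Hypothetical decoder stream; the `messages` queue stands in for the frames
// that the real decoder has already parsed out of the payload.
struct DecodedFrames {
    messages: VecDeque<String>,
}

impl Stream for DecodedFrames {
    type Item = String;

    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
        let this = self.get_mut();
        let next = this.messages.pop_front();

        // The wake described above: if more frames are still queued, wake the
        // task so the wrapping aggregator gets polled again instead of stalling.
        if !this.messages.is_empty() {
            cx.waker().wake_by_ref();
        }

        match next {
            Some(msg) => Poll::Ready(Some(msg)),
            // A real implementation would also register interest in new
            // payload data before returning Pending.
            None => Poll::Pending,
        }
    }
}
```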
Technical Impact
- Fixes the hang in `AggregatedMessageStream` when processing multiple continuation frames

Root Cause Analysis
The bug occurs due to a missing wakeup signal in the async coordination between:
- Producer (`MessageStream`): has buffered continuation frames ready
- Consumer (`AggregatedMessageStream`): waits for the `Last` frame after processing `Continue`

Without the wakeup, the consumer never knows more data is available in the producer's buffer.
Affected Scenarios
- Applications using `AggregatedMessageStream` for handling continuation frames

Risk Assessment
Risk Level: Very Low