-
Notifications
You must be signed in to change notification settings - Fork 85
Add RACK to SCTP #390
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Add RACK to SCTP #390
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #390 +/- ##
==========================================
+ Coverage 83.74% 83.83% +0.08%
==========================================
Files 51 52 +1
Lines 4527 3779 -748
==========================================
- Hits 3791 3168 -623
+ Misses 597 453 -144
- Partials 139 158 +19
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
3709326 to
877b27d
Compare
|
@enobufs Hi, I think you've previously tried to implement RACK for SCTP, would you happen to have any pointers for whether my current work looks like it's on the right track, or any notes/comments on things that you had in mind? If you also have a branch for me to use as reference if you've worked on it before then that would also be very appreciated! |
eb3bd88 to
686e5ec
Compare
|
To test this I used this branch's SCTP version in the following examples:
Each of the examples (and the listed versions) worked as the behavior matched the expected behavior, so I think it's looking good so far :) |
|
@philipch07 this is beyond amazing. You are doing something phenomenal and making it look easy :) Anything I can do to help? In the back of your head think about writing a blogpost.
|
6c3e8ba to
3914804
Compare
|
Thank you @Sean-Der! I think the only thing I need right now regarding this PR is more reviews and testing (ideally in real applications with some latency). I'm excited but also cautious about this since it seems to be a pretty desired feature! The next thing I have in mind at the moment for RACK would be #394 (adaptive burst mitigation) which only further improves RACK's performance. I hope that the two features would make for a compelling upgrade for users :) I'm also brainstorming some ideas for the blog post as you've mentioned but it may be tied closer to open source contributing as a whole instead of just about RACK (especially since this type of feature is a bit less flashy). With some time and luck, I think I can create an interesting narrative about my motives, interests, and journey through the contributions that I've made so far! |
|
Also I performed a short test with Linux's traffic control in my WSL setup to simulate 500ms latency for outbound traffic (so anything going from server -> client would be delayed) using the data-channels example from webrtc (latest version, so v4.1.6) and it didn't drop any messages in either direction, even when I sent a burst of messages to the server (all of which were instant), and with intermittent messages to the server. |
3914804 to
cc4e9d0
Compare
|
@philipch07 Could you test this with https://github.com/pion/example-webrtc-applications/tree/master/ebiten-game |
|
also this is off topic, but @philipch07 are you in the discord server? I'd like to talk you more casually about SCTP/datachannel stuff. |
This is the flashiest thing that could happen. If you can put up a graph that shows 'Under these network conditions file transfers happen X faster' and call out stuff that use DataChannels + Go (WebTorrent and Tor Snowflake) that is a huge thing. I would love to introduce you to companies with a flashy intro of 'This is the person that made WebRTC Datachannels n% faster'. This and ICE Renomination that @kevmo314 working on I think have biggest change of just instantly making peoples WebRTC sessions better with no effort on their side. |
Just finished testing this and confirmed that this works! I didn't test it with added latency though.
That's a fantastic idea, that makes me think of creating some sort of visual simulation to display the impact of RACK vs non-RACK implementation in practice!
That makes sense to me, and I agree that "free" performance with 0 changes/settings toggled is always a win for a user :) |
|
Whoops forgot to @Sean-Der in the above comment^ |
|
Chiming in to say that this is really cool! The paper is really interesting, so cool to see this in pion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really good, I reviewed it over days, there are no obvious issues in the code logic.
Really excited to see the benchmark.
Thank you a lot <3
8fc1911 to
d7f2b58
Compare
d7f2b58 to
63ec7e1
Compare
|
Going from low RTT to high RTT causes the packets to drop, but that's fixed after commit #244b97f It needed to go to extreme values tho, and it was more of a manual test, I'll try to get it as a CI test, but It's very hard to make in a way that's not flaky and under ~20s before: After: Thank you. |
|
Also there are many network conditions that makes the current sctp implementation drop packets but this implementation handles it so well. really amazing work. |
83a3b73 to
bb8671d
Compare
at-wat
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good to me! Added some minor comments
570c8e0 to
eee6720
Compare
at-wat
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
eee6720 to
e7436d8
Compare
|
@sirzooro Let me know if you're interested in reviewing this PR before I merge it and if you have any eta if so! |
e7436d8 to
914a271
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements RACK (Recent ACKnowledgement) loss detection and Tail Loss Probe (TLP) mechanisms for SCTP, providing more efficient and responsive loss recovery compared to traditional timeout-based retransmission. RACK is enabled by default but can be disabled via configuration.
Key changes:
- Adds windowedMin data structure to track minimum RTT over a sliding window
- Implements RACK state tracking and loss detection logic in Association
- Introduces unified timer system for RACK and PTO deadlines
- Updates test assertions to accommodate RACK-based recovery patterns
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| windowedmin.go | New data structure implementing sliding window minimum for RTT tracking |
| windowedmin_test.go | Comprehensive tests for windowedMin implementation |
| association.go | Core RACK/TLP implementation including state variables, timer management, and loss detection logic |
| association_test.go | New unit tests for RACK behavior and updated congestion control test expectations |
| vnet_test.go | Uncommented assertions to verify RACK reduces fast retransmits |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
914a271 to
c53eef3
Compare
c53eef3 to
a74c9fe
Compare
|
During some SCP testing, @JoeTurki found that the CPU utilization was a little higher than expected. Luckily there's in RFC 8985 sec 6.2 there's a section that I originally overlooked which addresses the concern:
The new commits address the issue and will be tested to see whether the more involved fix is worth it over the simple fix regarding CPU usage among other stats. |
3bc39ff to
aba8cd2
Compare
aba8cd2 to
b2c3746
Compare
Description
Adds RACK to SCTP (see the paper that this is based on here: https://icnp20.cs.ucr.edu/proceedings/nipaa/RACK%20for%20SCTP.pdf)
I would also like to add the variable burst mitigation as described in the above paper (section 5a), but that will be done in a separate PR as it doesn't require RACK to be implemented.
Reference issue
Closes #389
Upd: expanded version of the paper above: https://duepublico2.uni-due.de/servlets/MCRFileNodeServlet/duepublico_derivate_00073893/Diss_Weinrank.pdf#page=97