Fix TCP Transport Handshake Daemon Initialization #846
base: main
- Implement startHandshakeDaemon method in TcpTransport class
- Call handshake daemon during transport installation process

Signed-off-by: staryxchen <[email protected]>
TCP transport probably doesn't require starting the handshake daemon.
Without the handshake daemon, a notification cannot be sent to the peer when a transfer task finishes. See Mooncake/mooncake-transfer-engine/src/transfer_metadata.cpp, lines 720 to 739 in 785e939.
The daemon needs to listen on the RPC port and register a callback to process notify messages.
I see. So can we support P2PHANDSHAKE-based TCP transport with this feature?
No, even if we merge this patch, we still cannot support it. The root cause is the following code segment: Mooncake/mooncake-transfer-engine/src/transfer_engine.cpp Lines 103 to 118 in 785e939
I think there is no need to find another available TCP port when rpc_binding_method is in P2PHANDSHAKE mode. Simply binding to the port resolved from the local server name is enough. Like the following:
What do you think?
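For readers following the thread, the port-selection idea in this comment can be illustrated with a small standalone sketch. This is not the snippet from the comment and not Mooncake's code. `BindingModeSketch`, `portFromServerName`, and `chooseRpcPort` are hypothetical names, the local server name is assumed to have the form `host:port`, and `fallback_port` merely stands in for whatever "find another available port" logic exists today.

```cpp
#include <string>

// Hypothetical enum; the real configuration type may differ.
enum class BindingModeSketch { P2PHandshake, Other };

// Extract the port from a "host:port" name. Returns -1 if no valid port is present.
int portFromServerName(const std::string& name) {
    const std::string::size_type pos = name.rfind(':');
    if (pos == std::string::npos || pos + 1 >= name.size()) return -1;
    const std::string digits = name.substr(pos + 1);
    if (digits.size() > 5) return -1;  // a valid port has at most 5 digits
    int value = 0;
    for (char c : digits) {
        if (c < '0' || c > '9') return -1;
        value = value * 10 + (c - '0');
    }
    if (value == 0 || value > 65535) return -1;
    return value;
}

// In P2P handshake mode, reuse the port carried by the local server name.
// Otherwise, or when the name carries no usable port, return the fallback.
int chooseRpcPort(BindingModeSketch mode,
                  const std::string& local_server_name,
                  int fallback_port) {
    if (mode == BindingModeSketch::P2PHandshake) {
        const int port = portFromServerName(local_server_name);
        if (port > 0) return port;
    }
    return fallback_port;
}
```

With these assumptions, `chooseRpcPort(BindingModeSketch::P2PHandshake, "10.0.0.1:12345", 15000)` yields the port embedded in the name, so the peer can reach the daemon at the address it already knows.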
Problem
TCP transport did not initialize the handshake daemon during installation, so nothing was listening on the RPC port for incoming connections. As a result, notify messages from peers could not be delivered, which broke inter-node communication.
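The failure mode can be modeled in memory, without real sockets. The sketch below is an illustration under assumptions, not the actual Mooncake implementation. `HandshakeDaemonSketch`, `TransportSketch`, `registerNotifyCallback`, `start`, and `deliver` are hypothetical names, and `start()` stands in for "begin listening on the RPC port".

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a handshake daemon; not Mooncake's actual class.
class HandshakeDaemonSketch {
public:
    using NotifyCallback = std::function<void(const std::string&)>;

    // Register the handler that processes incoming notify messages.
    void registerNotifyCallback(NotifyCallback cb) { on_notify_ = std::move(cb); }

    // Models "start listening on the RPC port".
    void start() { listening_ = true; }

    // Models a peer sending a notify message. Returns true only if the
    // daemon is listening and a callback consumed the message.
    bool deliver(const std::string& message) {
        if (!listening_ || !on_notify_) return false;
        on_notify_(message);
        return true;
    }

private:
    bool listening_ = false;
    NotifyCallback on_notify_;
};

// Hypothetical transport; `start_daemon` toggles the step this PR adds.
struct TransportSketch {
    HandshakeDaemonSketch daemon;
    std::vector<std::string> received;

    void install(bool start_daemon) {
        // The callback captures `this`, so the object must not be copied or moved afterwards.
        daemon.registerNotifyCallback(
            [this](const std::string& msg) { received.push_back(msg); });
        if (start_daemon) daemon.start();
    }
};
```

In this model, calling `install(false)` leaves the callback registered but unreachable, so `deliver` returns `false` and the message is dropped. Calling `install(true)` lets the same message reach the callback.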
Target's log
Note that currently only the log is being printed; port listening has not actually begun.
Initiator's log
Solution