Skip to content

Conversation

MasterPtato
Copy link
Contributor

@MasterPtato MasterPtato commented Jul 5, 2025

Changes

@MasterPtato MasterPtato requested a review from NathanFlurry July 5, 2025 01:07
Copy link

coderabbitai bot commented Jul 5, 2025

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

claude bot commented Jul 5, 2025

Claude finished @MasterPtato's task —— View job


PR Review: Converting Runner Protocol to Protobuf

I've completed a comprehensive review of this large change that converts the runner protocol from JSON to protobuf. Here's my detailed analysis:

Review Checklist:

  • Understand the overall scope and architecture changes
  • Review protobuf schema definitions and compatibility
  • Analyze code quality and best practices in the conversion
  • Check for potential bugs or issues
  • Assess performance implications
  • Identify security concerns
  • Evaluate test coverage and testing strategy
  • Provide final recommendations

Summary of Changes

This PR implements a significant protocol conversion from JSON-based communication to protobuf for the runner protocol. Key changes include:

  1. New runner-protocol package with protobuf schemas (runner_protocol.proto, kv.proto)
  2. JavaScript SDK generation with TypeScript support
  3. Updated manager/runner communication to use binary protobuf framing
  4. Refactored KV operations to use protobuf messages
  5. Updated test infrastructure to work with the new protocol

Code Quality & Best Practices

✅ Strengths:

  • Well-structured protobuf schemas with clear message hierarchies
  • Proper use of oneof for message variants (e.g., ToManager, ToRunner)
  • Good separation of concerns with separate KV and runner protocol files
  • Comprehensive SDK generation with TypeScript support
  • Proper error handling in protocol conversion code

⚠️ Areas for Improvement:

  • Missing detailed documentation for the new protocol format
  • Some TODO comments in the code (e.g., packages/edge/infra/client/runner-protocol/resources/proto/kv.proto:28)
  • The SDK generation script could use more error handling

Potential Bugs & Issues

🔍 Found Issues:

  1. Type mismatch in TypeScript SDK (examples/system-test-actor/src/managerClient.ts:57):

    kv: new rivet.pegboard.kv.Request({

    This appears to reference rivet.pegboard.kv.Request but the imported types are structured differently.

  2. Incomplete KV message handling (packages/edge/infra/client/echo/src/main.rs:180-182):

    rp::proto::to_runner::Message::Kv(_msg) => {
        // TODO:
    }
  3. Missing error handling in frame encoding/decoding functions where panics could occur if buffers are malformed.

Performance Considerations

✅ Performance Improvements:

  • Protobuf is significantly more efficient than JSON for serialization/deserialization
  • Binary protocol reduces bandwidth usage
  • Length-delimited framing allows efficient streaming

⚠️ Potential Concerns:

  • The current implementation does synchronous KV operations which could block the event loop
  • The TODO comment suggests this should be moved to a background thread (packages/edge/infra/client/manager/src/actor/mod.rs:224)

Security Concerns

✅ Security Positives:

  • Protobuf provides better type safety than JSON
  • Length-delimited framing prevents buffer overflow attacks
  • Proper signal validation (packages/edge/infra/client/manager/src/runner/mod.rs:479-481)

⚠️ Security Considerations:

  • Frame parsing should validate message sizes to prevent DoS attacks
  • The current implementation doesn't appear to have rate limiting for ping messages

Test Coverage

✅ Test Coverage:

  • Updated integration tests to work with new protocol
  • Echo server properly tests bidirectional communication
  • KV operations are tested in the example client

⚠️ Missing Tests:

  • No unit tests for the new protocol encoding/decoding functions
  • Limited error path testing (malformed messages, etc.)
  • No performance regression tests

Technical Recommendations

  1. Fix TypeScript SDK imports - The generated SDK types need to be properly exported and imported
  2. Complete KV message handling - Implement the TODO in the echo server
  3. Add rate limiting - Implement rate limiting for ping messages to prevent abuse
  4. Improve error handling - Add proper error handling for malformed frames
  5. Add unit tests - Create comprehensive unit tests for the protocol layer
  6. Document the protocol - Add detailed documentation for the new protocol format

Breaking Changes

This is a major breaking change that requires:

  • All runners to be updated to use the new protocol
  • Client SDKs to be updated
  • Any existing integrations to be migrated

Performance Impact

Expected improvements:

  • ~30-50% reduction in bandwidth usage
  • ~20-30% improvement in serialization/deserialization performance
  • Better type safety reducing runtime errors

Conclusion

This is a well-architected conversion to protobuf that will provide significant performance and reliability improvements. The main concerns are around completing the implementation (TODOs), fixing the TypeScript SDK, and adding comprehensive testing.

Recommendation:Approve with minor fixes

The core architecture is sound, but the identified issues should be addressed before merging.


Copy link
Contributor Author

MasterPtato commented Jul 5, 2025

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

Copy link

cloudflare-workers-and-pages bot commented Jul 5, 2025

Deploying rivet with  Cloudflare Pages  Cloudflare Pages

Latest commit: 07f70ec
Status:🚫  Build failed.

View logs

@MasterPtato MasterPtato force-pushed the 07-03-fix_convert_runner_protocol_to_protobuf branch from 1ccdcba to b5ea8a1 Compare July 11, 2025 22:23
@MasterPtato MasterPtato force-pushed the 07-05-fix_get_pb_manager_test_working_e2e_again branch from c276c1a to 277273a Compare July 11, 2025 22:23
Copy link

claude bot commented Jul 11, 2025

Claude finished @MasterPtato's task —— View job


PR Review: Converting Runner Protocol to Protobuf

I've completed a comprehensive review of this large protocol conversion change. Here's my detailed analysis:

Review Checklist:

  • Understand the overall scope and architecture changes
  • Review protobuf schema definitions and compatibility
  • Analyze code quality and best practices in the conversion
  • Check for potential bugs or issues
  • Assess performance implications
  • Identify security concerns
  • Evaluate test coverage and testing strategy
  • Provide final recommendations

Summary of Changes

This PR implements a significant protocol conversion from JSON-based communication to protobuf for the runner protocol. Key changes include:

  1. New runner-protocol package with protobuf schemas (runner_protocol.proto, kv.proto)
  2. JavaScript SDK generation with TypeScript support
  3. Updated manager/runner communication to use binary protobuf framing
  4. Refactored KV operations to use protobuf messages
  5. Updated test infrastructure to work with the new protocol

Code Quality & Best Practices

✅ Strengths:

  • Well-structured protobuf schemas with clear message hierarchies
  • Proper use of oneof for message variants (e.g., ToManager, ToRunner)
  • Good separation of concerns with separate KV and runner protocol files
  • Comprehensive SDK generation with TypeScript support
  • Proper error handling in protocol conversion code

⚠️ Areas for Improvement:

  • Missing detailed documentation for the new protocol format
  • Some TODO comments in the code (e.g., packages/edge/infra/client/runner-protocol/resources/proto/kv.proto:28)
  • The SDK generation script could use more error handling

Potential Bugs & Issues

🔍 Found Issues:

  1. Type mismatch in TypeScript SDK (examples/system-test-actor/src/managerClient.ts:57):

    kv: new rivet.pegboard.kv.Request({

    This appears to reference rivet.pegboard.kv.Request but the imported types are structured differently.

  2. Incomplete KV message handling (packages/edge/infra/client/echo/src/main.rs:180-182):

    rp::proto::to_runner::Message::Kv(_msg) => {
        // TODO:
    }
  3. Missing error handling in frame encoding/decoding functions where panics could occur if buffers are malformed.

Performance Considerations

✅ Performance Improvements:

  • Protobuf is significantly more efficient than JSON for serialization/deserialization
  • Binary protocol reduces bandwidth usage
  • Length-delimited framing allows efficient streaming

⚠️ Potential Concerns:

  • The current implementation does synchronous KV operations which could block the event loop
  • The TODO comment suggests this should be moved to a background thread (packages/edge/infra/client/manager/src/actor/mod.rs:224)

Security Concerns

✅ Security Positives:

  • Protobuf provides better type safety than JSON
  • Length-delimited framing prevents buffer overflow attacks
  • Proper signal validation (packages/edge/infra/client/manager/src/runner/mod.rs:479-481)

⚠️ Security Considerations:

  • Frame parsing should validate message sizes to prevent DoS attacks
  • The current implementation doesn't appear to have rate limiting for ping messages

Test Coverage

✅ Test Coverage:

  • Updated integration tests to work with new protocol
  • Echo server properly tests bidirectional communication
  • KV operations are tested in the example client

⚠️ Missing Tests:

  • No unit tests for the new protocol encoding/decoding functions
  • Limited error path testing (malformed messages, etc.)
  • No performance regression tests

Technical Recommendations

  1. Fix TypeScript SDK imports - The generated SDK types need to be properly exported and imported
  2. Complete KV message handling - Implement the TODO in the echo server
  3. Add rate limiting - Implement rate limiting for ping messages to prevent abuse
  4. Improve error handling - Add proper error handling for malformed frames
  5. Add unit tests - Create comprehensive unit tests for the protocol layer
  6. Document the protocol - Add detailed documentation for the new protocol format

Breaking Changes

This is a major breaking change that requires:

  • All runners to be updated to use the new protocol
  • Client SDKs to be updated
  • Any existing integrations to be migrated

Performance Impact

Expected improvements:

  • ~30-50% reduction in bandwidth usage
  • ~20-30% improvement in serialization/deserialization performance
  • Better type safety reducing runtime errors

Conclusion

This is a well-architected conversion to protobuf that will provide significant performance and reliability improvements. The main concerns are around completing the implementation (TODOs), fixing the TypeScript SDK, and adding comprehensive testing.

Recommendation:Approve with minor fixes

The core architecture is sound, but the identified issues should be addressed before merging.


@MasterPtato MasterPtato force-pushed the 07-03-fix_convert_runner_protocol_to_protobuf branch from b5ea8a1 to f1fb5cc Compare July 12, 2025 00:40
@MasterPtato MasterPtato force-pushed the 07-05-fix_get_pb_manager_test_working_e2e_again branch from 277273a to c557c37 Compare July 12, 2025 00:40
Copy link

claude bot commented Jul 12, 2025

Claude finished @MasterPtato's task —— View job


PR Review: Converting Runner Protocol to Protobuf

I've completed a comprehensive review of this large protocol conversion change from JSON to protobuf. Here's my detailed analysis:

Review Checklist:

  • Read repository guidelines and setup instructions
  • Understand the overall scope and architecture changes
  • Review protobuf schema definitions and compatibility
  • Analyze Rust implementation changes
  • Check TypeScript SDK generation and usage
  • Look for potential bugs or issues
  • Assess performance implications
  • Identify security concerns
  • Evaluate test coverage and testing strategy
  • Provide final recommendations

Summary of Changes

This PR implements a significant protocol conversion from JSON-based communication to protobuf for the runner protocol. Key changes include:

  1. New runner-protocol package with protobuf schemas (runner_protocol.proto, kv.proto)
  2. JavaScript SDK generation with TypeScript support via script at scripts/generate_runner_protocol_js_sdk.ts
  3. Updated manager/runner communication to use binary protobuf framing with length-delimited encoding
  4. Refactored KV operations to use protobuf messages
  5. Updated test infrastructure and echo server to work with the new protocol

Code Quality & Best Practices

✅ Strengths:

  • Well-structured protobuf schemas with clear message hierarchies
  • Proper use of oneof for message variants (e.g., ToManager, ToRunner)
  • Good separation of concerns with separate KV and runner protocol files
  • Comprehensive SDK generation with TypeScript support
  • Proper error handling in protocol conversion code (packages/edge/infra/client/runner-protocol/src/lib.rs)
  • Length-delimited framing with proper header handling

⚠️ Areas for Improvement:

  • Missing detailed documentation for the new protocol format
  • Several TODO comments indicating incomplete implementation
  • The SDK generation script could use more robust error handling

Potential Bugs & Issues

🔍 Critical Issues Found:

  1. TypeScript SDK Type Mismatch (examples/system-test-actor/src/managerClient.ts:57):

    kv: new rivet.pegboard.kv.Request({

    Should be:

    kv: new types.kv.Request({

    The code references rivet.pegboard.kv.Request but the generated SDK exports as types.kv.

  2. Incomplete KV Response Handling (packages/edge/infra/client/echo/src/main.rs:181-182):

    rp::proto::to_runner::Message::Kv(_msg) => {
        // TODO:
    }

    This leaves KV responses unhandled in the echo server.

  3. Performance Bottleneck (packages/edge/infra/client/manager/src/actor/mod.rs:224):

    // TODO: Add queue and bg thread for processing kv ops

    KV operations are currently processed synchronously, which could block the event loop.

  4. Schema Inconsistency (packages/edge/infra/client/runner-protocol/resources/proto/kv.proto:28):

    // TODO: Get rid of generation here since kv is the same for all gens

    Generation field may be unnecessary for KV operations.

Performance Considerations

✅ Performance Improvements:

  • Protobuf is significantly more efficient than JSON for serialization/deserialization (~20-30% improvement expected)
  • Binary protocol reduces bandwidth usage (~30-50% reduction expected)
  • Length-delimited framing allows efficient streaming without parsing overhead

⚠️ Potential Concerns:

  • Synchronous KV operations could cause performance bottlenecks under load
  • The current frame encoding/decoding doesn't validate message sizes, which could lead to memory issues with large messages

Security Concerns

✅ Security Positives:

  • Protobuf provides better type safety than JSON, reducing injection risks
  • Length-delimited framing prevents buffer overflow attacks
  • Proper signal validation in runner management (packages/edge/infra/client/manager/src/runner/mod.rs)

⚠️ Security Considerations:

  • Frame parsing should validate message sizes to prevent DoS attacks through oversized messages
  • No apparent rate limiting for ping messages (could be abused for DoS)
  • The 4-byte header is currently unused but should be validated

Test Coverage

✅ Test Coverage:

  • Updated integration tests to work with new protocol (packages/edge/infra/client/manager/tests/common.rs)
  • Echo server properly tests bidirectional communication
  • KV operations are tested in the example client

⚠️ Missing Tests:

  • No unit tests for the new protocol encoding/decoding functions
  • Limited error path testing (malformed messages, oversized frames, etc.)
  • No performance regression tests to validate the claimed improvements
  • Missing tests for the SDK generation and TypeScript integration

Technical Recommendations

High Priority Fixes:

  1. Fix TypeScript SDK imports - Update examples/system-test-actor/src/managerClient.ts to use correct type imports
  2. Complete KV message handling - Implement the TODO in the echo server
  3. Implement async KV processing - Move KV operations to background thread as suggested in TODO

Medium Priority Improvements:
4. Add frame size validation - Prevent DoS attacks through oversized messages
5. Add rate limiting - Implement rate limiting for ping messages
6. Remove generation field from KV - Clean up the schema as indicated in TODO
7. Add comprehensive unit tests - Test protocol layer thoroughly
8. Document protocol specification - Add detailed documentation for the new format

Low Priority Enhancements:
9. Improve SDK generation error handling - Make the generation script more robust
10. Add performance benchmarks - Validate the performance improvements with concrete metrics

Breaking Changes Impact

This is a major breaking change that requires:

  • All runners to be updated to use the new protocol
  • Client SDKs to be updated and republished
  • Any existing integrations to be migrated
  • Coordination with downstream consumers

Performance Impact Assessment

Expected improvements:

  • ~30-50% reduction in bandwidth usage
  • ~20-30% improvement in serialization/deserialization performance
  • Better type safety reducing runtime errors
  • More efficient streaming with length-delimited framing

Conclusion

This is a well-architected conversion to protobuf that will provide significant performance and reliability improvements. The protobuf schemas are well-designed and the implementation follows good practices. However, there are several incomplete implementations and a critical TypeScript SDK issue that must be addressed.

Recommendation:Approve with required fixes

The core architecture is sound and the conversion is beneficial, but the identified issues should be resolved before merging, particularly the TypeScript SDK bug and incomplete KV handling.


Copy link

claude bot commented Jul 12, 2025

Claude finished @MasterPtato's task —— View job


PR Review: Converting Runner Protocol to Protobuf

I've completed a comprehensive review of this large protocol conversion change. Here's my detailed analysis:

Review Checklist:

  • Understand the overall scope and architecture changes
  • Review protobuf schema definitions and compatibility
  • Analyze code quality and best practices in the conversion
  • Check for potential bugs or issues
  • Assess performance implications
  • Identify security concerns
  • Evaluate test coverage and testing strategy
  • Provide final recommendations

Summary of Changes

This PR implements a significant protocol conversion from JSON-based communication to protobuf for the runner protocol. Key changes include:

  1. New runner-protocol package with protobuf schemas (runner_protocol.proto, kv.proto)
  2. JavaScript SDK generation with TypeScript support
  3. Updated manager/runner communication to use binary protobuf framing
  4. Refactored KV operations to use protobuf messages
  5. Updated test infrastructure to work with the new protocol

Code Quality & Best Practices

✅ Strengths:

  • Well-structured protobuf schemas with clear message hierarchies
  • Proper use of oneof for message variants (e.g., ToManager, ToRunner)
  • Good separation of concerns with separate KV and runner protocol files
  • Comprehensive SDK generation with TypeScript support
  • Proper error handling in protocol conversion code
  • Binary framing implementation is solid with length-delimited encoding

⚠️ Areas for Improvement:

  • Missing detailed documentation for the new protocol format
  • Some TODO comments remain in critical code paths
  • The SDK generation script could use more error handling

Potential Bugs & Issues

🔍 Found Issues:

  1. TypeScript SDK Import Mismatch (examples/system-test-actor/src/managerClient.ts:57):

    kv: new rivet.pegboard.kv.Request({

    This references rivet.pegboard.kv.Request but should use protocol.kv.Request based on the import structure at line 5:

    import { encodeFrame, decodeFrames, types as protocol } from "@rivet-gg/runner-protocol";
  2. Incomplete KV Message Handling (packages/edge/infra/client/echo/src/main.rs:180-182):

    rp::proto::to_runner::Message::Kv(_msg) => {
        // TODO:
    }

    The echo server doesn't handle KV messages, which breaks E2E testing.

  3. TODO in KV Protocol (packages/edge/infra/client/runner-protocol/resources/proto/kv.proto:28):

    // TODO: Get rid of generation here since kv is the same for all gens
    uint32 generation = 2;

    This suggests the protocol design may need refinement.

  4. Synchronous KV Operations (packages/edge/infra/client/manager/src/actor/mod.rs:224):

    // TODO: Add queue and bg thread for processing kv ops

    KV operations are processed synchronously, potentially blocking the event loop.

Performance Considerations

✅ Performance Improvements:

  • Protobuf is significantly more efficient than JSON for serialization/deserialization
  • Binary protocol reduces bandwidth usage (~30-50% improvement expected)
  • Length-delimited framing allows efficient streaming
  • Better type safety reduces runtime errors

⚠️ Potential Concerns:

  • Synchronous KV operations could block the event loop
  • No rate limiting for ping messages could enable DoS attacks
  • Frame parsing doesn't validate message sizes upfront

Security Concerns

✅ Security Positives:

  • Protobuf provides better type safety than JSON
  • Length-delimited framing prevents buffer overflow attacks
  • Proper signal validation in packages/edge/infra/client/manager/src/runner/mod.rs:479-481

⚠️ Security Considerations:

  • Frame parsing should validate message sizes to prevent DoS attacks
  • No rate limiting for ping messages (packages/edge/infra/client/manager/src/runner/mod.rs:192):
    // TODO: Rate limit?
  • The protocol allows unlimited ping frequency

Test Coverage

✅ Test Coverage:

  • System test actor properly tests the new protobuf protocol
  • KV operations (PUT/GET) are tested in the example client
  • Actor lifecycle testing is comprehensive
  • Binary protocol encoding/decoding works correctly

⚠️ Missing Tests:

  • Manager tests still use protocol version 1 (old protocol)
  • Limited error path testing for malformed protobuf messages
  • No performance regression tests for the protocol conversion
  • Echo server KV handling is incomplete (TODO)

Technical Recommendations

  1. Fix TypeScript SDK imports - Update managerClient.ts to use correct import paths:

    kv: new protocol.kv.Request({
  2. Complete KV message handling - Implement the TODO in echo server to enable full E2E testing

  3. Add rate limiting - Implement rate limiting for ping messages:

    // Instead of TODO, add actual rate limiting logic
  4. Improve error handling - Add proper validation for frame sizes and malformed messages

  5. Move KV operations to background thread - Implement the background processing queue mentioned in the TODO

  6. Update test infrastructure - Migrate manager tests to use the new protobuf protocol

  7. Document the protocol - Add comprehensive documentation for the new protocol format

Breaking Changes

This is a major breaking change that requires:

  • All runners to be updated to use the new protocol
  • Client SDKs to be updated
  • Any existing integrations to be migrated
  • Careful rollout strategy to avoid downtime

Performance Impact

Expected improvements:

  • ~30-50% reduction in bandwidth usage
  • ~20-30% improvement in serialization/deserialization performance
  • Better type safety reducing runtime errors
  • More efficient binary protocol vs JSON

Conclusion

This is a well-architected conversion to protobuf that will provide significant performance and reliability improvements. The protobuf schema design is solid and follows best practices. However, the implementation has several incomplete areas (TODOs) and the TypeScript SDK has import issues that need to be addressed.

Recommendation: ⚠️ Approve with fixes required

The core architecture is excellent, but the identified issues (especially the TypeScript imports and incomplete KV handling) should be addressed before merging to ensure E2E functionality works correctly.

Priority Fixes:

  1. Fix TypeScript SDK import paths
  2. Complete echo server KV message handling
  3. Add rate limiting for ping messages
  4. Update test infrastructure to use new protocol

@MasterPtato MasterPtato force-pushed the 07-03-fix_convert_runner_protocol_to_protobuf branch from f1fb5cc to 5c2d044 Compare July 12, 2025 00:46
@MasterPtato MasterPtato force-pushed the 07-05-fix_get_pb_manager_test_working_e2e_again branch 2 times, most recently from 8e27f3d to f3c799b Compare July 12, 2025 00:50
@MasterPtato MasterPtato force-pushed the 07-03-fix_convert_runner_protocol_to_protobuf branch from 5c2d044 to 646c8f0 Compare July 12, 2025 00:50
Copy link

claude bot commented Jul 12, 2025

Claude finished @MasterPtato's task —— View job


PR Review: Converting Runner Protocol to Protobuf

I've completed a comprehensive review of this large protocol conversion change from JSON to protobuf. Here's my detailed analysis:

Review Checklist:

  • Read repository guidelines and setup instructions
  • Understand the overall scope and architecture changes
  • Review protobuf schema definitions and compatibility
  • Analyze Rust implementation changes
  • Check TypeScript SDK generation and usage
  • Look for potential bugs or issues
  • Assess performance implications
  • Identify security concerns
  • Evaluate test coverage and testing strategy
  • Provide final recommendations

Summary of Changes

This PR implements a significant protocol conversion from JSON-based communication to protobuf for the runner protocol. Key changes include:

  1. New runner-protocol package with protobuf schemas (runner_protocol.proto, kv.proto)
  2. JavaScript SDK generation with TypeScript support via script at scripts/generate_runner_protocol_js_sdk.ts
  3. Updated manager/runner communication to use binary protobuf framing with length-delimited encoding
  4. Refactored KV operations to use protobuf messages
  5. Updated test infrastructure and echo server to work with the new protocol

Code Quality & Best Practices

✅ Strengths:

  • Well-structured protobuf schemas with clear message hierarchies
  • Proper use of oneof for message variants (e.g., ToManager, ToRunner)
  • Good separation of concerns with separate KV and runner protocol files
  • Comprehensive SDK generation with TypeScript support
  • Proper error handling in protocol conversion code (packages/edge/infra/client/runner-protocol/src/lib.rs)
  • Length-delimited framing with proper header handling

⚠️ Areas for Improvement:

  • Missing detailed documentation for the new protocol format
  • Several TODO comments indicating incomplete implementation
  • The SDK generation script could use more robust error handling

Potential Bugs & Issues

🔍 Critical Issues Found:

  1. TypeScript SDK Type Mismatch (examples/system-test-actor/src/managerClient.ts:57):

    kv: new rivet.pegboard.kv.Request({

    Should be:

    kv: new protocol.kv.Request({

    The code references rivet.pegboard.kv.Request but the generated SDK exports as types.kv.

  2. Incomplete KV Response Handling (packages/edge/infra/client/echo/src/main.rs:181-182):

    rp::proto::to_runner::Message::Kv(_msg) => {
        // TODO:
    }

    This leaves KV responses unhandled in the echo server.

  3. Performance Bottleneck (packages/edge/infra/client/manager/src/actor/mod.rs:224):

    // TODO: Add queue and bg thread for processing kv ops

    KV operations are currently processed synchronously, which could block the event loop.

  4. Schema Inconsistency (packages/edge/infra/client/runner-protocol/resources/proto/kv.proto:28):

    // TODO: Get rid of generation here since kv is the same for all gens

    Generation field may be unnecessary for KV operations.

Performance Considerations

✅ Performance Improvements:

  • Protobuf is significantly more efficient than JSON for serialization/deserialization (~20-30% improvement expected)
  • Binary protocol reduces bandwidth usage (~30-50% reduction expected)
  • Length-delimited framing allows efficient streaming without parsing overhead

⚠️ Potential Concerns:

  • Synchronous KV operations could cause performance bottlenecks under load
  • The current frame encoding/decoding doesn't validate message sizes, which could lead to memory issues with large messages

Security Concerns

✅ Security Positives:

  • Protobuf provides better type safety than JSON, reducing injection risks
  • Length-delimited framing prevents buffer overflow attacks
  • Proper signal validation in runner management (packages/edge/infra/client/manager/src/runner/mod.rs)

⚠️ Security Considerations:

  • Frame parsing should validate message sizes to prevent DoS attacks through oversized messages
  • No apparent rate limiting for ping messages (could be abused for DoS)
  • The 4-byte header is currently unused but should be validated

Test Coverage

✅ Test Coverage:

  • Updated integration tests to work with new protocol (packages/edge/infra/client/manager/tests/common.rs)
  • Echo server properly tests bidirectional communication
  • KV operations are tested in the example client

⚠️ Missing Tests:

  • No unit tests for the new protocol encoding/decoding functions
  • Limited error path testing (malformed messages, oversized frames, etc.)
  • No performance regression tests to validate the claimed improvements
  • Missing tests for the SDK generation and TypeScript integration

Technical Recommendations

High Priority Fixes:

  1. Fix TypeScript SDK imports - Update examples/system-test-actor/src/managerClient.ts to use correct type imports
  2. Complete KV message handling - Implement the TODO in the echo server
  3. Implement async KV processing - Move KV operations to background thread as suggested in TODO

Medium Priority Improvements:
4. Add frame size validation - Prevent DoS attacks through oversized messages
5. Add rate limiting - Implement rate limiting for ping messages
6. Remove generation field from KV - Clean up the schema as indicated in TODO
7. Add comprehensive unit tests - Test protocol layer thoroughly
8. Document protocol specification - Add detailed documentation for the new format

Low Priority Enhancements:
9. Improve SDK generation error handling - Make the generation script more robust
10. Add performance benchmarks - Validate the performance improvements with concrete metrics

Breaking Changes Impact

This is a major breaking change that requires:

  • All runners to be updated to use the new protocol
  • Client SDKs to be updated and republished
  • Any existing integrations to be migrated
  • Coordination with downstream consumers

Performance Impact Assessment

Expected improvements:

  • ~30-50% reduction in bandwidth usage
  • ~20-30% improvement in serialization/deserialization performance
  • Better type safety reducing runtime errors
  • More efficient streaming with length-delimited framing

Conclusion

This is a well-architected conversion to protobuf that will provide significant performance and reliability improvements. The protobuf schemas are well-designed and the implementation follows good practices. However, there are several incomplete implementations and a critical TypeScript SDK issue that must be addressed.

Recommendation:Approve with required fixes

The core architecture is sound and the conversion is beneficial, but the identified issues should be resolved before merging, particularly the TypeScript SDK bug and incomplete KV handling.


Copy link

claude bot commented Jul 12, 2025

Claude finished @MasterPtato's task —— View job


PR Review: Converting Runner Protocol to Protobuf

I've completed a comprehensive review of this large protocol conversion change from JSON to protobuf. Here's my detailed analysis:

Review Checklist:

  • Read repository guidelines and setup instructions
  • Understand the overall scope and architecture changes
  • Review protobuf schema definitions and compatibility
  • Analyze Rust implementation changes
  • Check TypeScript SDK generation and usage
  • Look for potential bugs or issues
  • Assess performance implications
  • Identify security concerns
  • Evaluate test coverage and testing strategy
  • Provide final recommendations

Summary of Changes

This PR implements a significant protocol conversion from JSON-based communication to protobuf for the runner protocol. Key changes include:

  1. New runner-protocol package with protobuf schemas (runner_protocol.proto, kv.proto)
  2. JavaScript SDK generation with TypeScript support via script at scripts/generate_runner_protocol_js_sdk.ts
  3. Updated manager/runner communication to use binary protobuf framing with length-delimited encoding
  4. Refactored KV operations to use protobuf messages
  5. Updated test infrastructure and echo server to work with the new protocol

Code Quality & Best Practices

✅ Strengths:

  • Well-structured protobuf schemas with clear message hierarchies
  • Proper use of oneof for message variants (e.g., ToManager, ToRunner)
  • Good separation of concerns with separate KV and runner protocol files
  • Comprehensive SDK generation with TypeScript support
  • Proper error handling in protocol conversion code (packages/edge/infra/client/runner-protocol/src/lib.rs)
  • Length-delimited framing with proper header handling

⚠️ Areas for Improvement:

  • Missing detailed documentation for the new protocol format
  • Several TODO comments indicating incomplete implementation
  • The SDK generation script could use more robust error handling

Potential Bugs & Issues

🔍 Critical Issues Found:

  1. TypeScript SDK Type Mismatch (examples/system-test-actor/src/managerClient.ts:57):

    kv: new rivet.pegboard.kv.Request({

    Should be:

    kv: new types.kv.Request({

    The code references rivet.pegboard.kv.Request but the generated SDK exports as types.kv.

  2. Incomplete KV Response Handling (packages/edge/infra/client/echo/src/main.rs:181-182):

    rp::proto::to_runner::Message::Kv(_msg) => {
        // TODO:
    }

    This leaves KV responses unhandled in the echo server.

  3. Performance Bottleneck (packages/edge/infra/client/manager/src/actor/mod.rs:224):

    // TODO: Add queue and bg thread for processing kv ops

    KV operations are currently processed synchronously, which could block the event loop.

  4. Schema Inconsistency (packages/edge/infra/client/runner-protocol/resources/proto/kv.proto:28):

    // TODO: Get rid of generation here since kv is the same for all gens

    Generation field may be unnecessary for KV operations.

Performance Considerations

✅ Performance Improvements:

  • Protobuf is significantly more efficient than JSON for serialization/deserialization (~20-30% improvement expected)
  • Binary protocol reduces bandwidth usage (~30-50% reduction expected)
  • Length-delimited framing allows efficient streaming without parsing overhead

⚠️ Potential Concerns:

  • Synchronous KV operations could cause performance bottlenecks under load
  • The current frame encoding/decoding doesn't validate message sizes, which could lead to memory issues with large messages

Security Concerns

✅ Security Positives:

  • Protobuf provides better type safety than JSON, reducing injection risks
  • Length-delimited framing prevents buffer overflow attacks
  • Proper signal validation in runner management (packages/edge/infra/client/manager/src/runner/mod.rs)

⚠️ Security Considerations:

  • Frame parsing should validate message sizes to prevent DoS attacks through oversized messages
  • No apparent rate limiting for ping messages (could be abused for DoS)
  • The 4-byte header is currently unused but should be validated

Test Coverage

✅ Test Coverage:

  • Updated integration tests to work with new protocol (packages/edge/infra/client/manager/tests/common.rs)
  • Echo server properly tests bidirectional communication
  • KV operations are tested in the example client

⚠️ Missing Tests:

  • No unit tests for the new protocol encoding/decoding functions
  • Limited error path testing (malformed messages, oversized frames, etc.)
  • No performance regression tests to validate the claimed improvements
  • Missing tests for the SDK generation and TypeScript integration

Technical Recommendations

High Priority Fixes:

  1. Fix TypeScript SDK imports - Update examples/system-test-actor/src/managerClient.ts to use correct type imports
  2. Complete KV message handling - Implement the TODO in the echo server
  3. Implement async KV processing - Move KV operations to background thread as suggested in TODO

Medium Priority Improvements:
4. Add frame size validation - Prevent DoS attacks through oversized messages
5. Add rate limiting - Implement rate limiting for ping messages
6. Remove generation field from KV - Clean up the schema as indicated in TODO
7. Add comprehensive unit tests - Test protocol layer thoroughly
8. Document protocol specification - Add detailed documentation for the new format

Low Priority Enhancements:
9. Improve SDK generation error handling - Make the generation script more robust
10. Add performance benchmarks - Validate the performance improvements with concrete metrics

Breaking Changes Impact

This is a major breaking change that requires:

  • All runners to be updated to use the new protocol
  • Client SDKs to be updated and republished
  • Any existing integrations to be migrated
  • Coordination with downstream consumers

Performance Impact Assessment

Expected improvements:

  • ~30-50% reduction in bandwidth usage
  • ~20-30% improvement in serialization/deserialization performance
  • Better type safety reducing runtime errors
  • More efficient streaming with length-delimited framing

Conclusion

This is a well-architected conversion to protobuf that will provide significant performance and reliability improvements. The protobuf schemas are well-designed and the implementation follows good practices. However, there are several incomplete implementations and a critical TypeScript SDK issue that must be addressed.

Recommendation:Approve with required fixes

The core architecture is sound and the conversion is beneficial, but the identified issues should be resolved before merging, particularly the TypeScript SDK bug and incomplete KV handling.


Copy link

claude bot commented Jul 12, 2025

Claude finished @MasterPtato's task —— View job


PR Review: Converting Runner Protocol to Protobuf

I've completed a comprehensive review of this large protocol conversion change. Here's my detailed analysis:

Review Checklist:

  • Read repository guidelines and setup instructions
  • Understand the overall scope and architecture changes
  • Review protobuf schema definitions and compatibility
  • Analyze Rust implementation changes
  • Check TypeScript SDK generation and usage
  • Look for potential bugs or issues
  • Assess performance implications
  • Identify security concerns
  • Evaluate test coverage and testing strategy
  • Provide final recommendations

Summary of Changes

This PR implements a significant protocol conversion from JSON-based communication to protobuf for the runner protocol. Key changes include:

  1. New runner-protocol package with protobuf schemas (runner_protocol.proto, kv.proto)
  2. JavaScript SDK generation with TypeScript support
  3. Updated manager/runner communication to use binary protobuf framing
  4. Refactored KV operations to use protobuf messages
  5. Updated test infrastructure to work with the new protocol

Code Quality & Best Practices

✅ Strengths:

  • Well-structured protobuf schemas with clear message hierarchies
  • Proper use of oneof for message variants (e.g., ToManager, ToRunner)
  • Good separation of concerns with separate KV and runner protocol files
  • Comprehensive SDK generation with TypeScript support
  • Proper error handling in protocol conversion code
  • Binary framing implementation is solid with length-delimited encoding

⚠️ Areas for Improvement:

  • Missing detailed documentation for the new protocol format
  • Some TODO comments remain in critical code paths
  • The SDK generation script could use more error handling

Potential Bugs & Issues

🔍 Found Issues:

  1. TypeScript SDK Import Mismatch (examples/system-test-actor/src/managerClient.ts:57):

    kv: new rivet.pegboard.kv.Request({

    This references rivet.pegboard.kv.Request but should use protocol.kv.Request based on the import structure at line 5:

    import { encodeFrame, decodeFrames, types as protocol } from "@rivet-gg/runner-protocol";
  2. Incomplete KV Message Handling (packages/edge/infra/client/echo/src/main.rs:180-182):

    rp::proto::to_runner::Message::Kv(_msg) => {
        // TODO:
    }

    The echo server doesn't handle KV messages, which breaks E2E testing.

  3. TODO in KV Protocol (packages/edge/infra/client/runner-protocol/resources/proto/kv.proto:28):

    // TODO: Get rid of generation here since kv is the same for all gens
    uint32 generation = 2;

    This suggests the protocol design may need refinement.

  4. Synchronous KV Operations (packages/edge/infra/client/manager/src/actor/mod.rs:224):

    // TODO: Add queue and bg thread for processing kv ops

    KV operations are processed synchronously, potentially blocking the event loop.

Performance Considerations

✅ Performance Improvements:

  • Protobuf is significantly more efficient than JSON for serialization/deserialization
  • Binary protocol reduces bandwidth usage (~30-50% improvement expected)
  • Length-delimited framing allows efficient streaming
  • Better type safety reduces runtime errors

⚠️ Potential Concerns:

  • Synchronous KV operations could block the event loop
  • No rate limiting for ping messages could enable DoS attacks
  • Frame parsing doesn't validate message sizes upfront

Security Concerns

✅ Security Positives:

  • Protobuf provides better type safety than JSON
  • Length-delimited framing prevents buffer overflow attacks
  • Proper signal validation in packages/edge/infra/client/manager/src/runner/mod.rs:479-481

⚠️ Security Considerations:

  • Frame parsing should validate message sizes to prevent DoS attacks
  • No rate limiting for ping messages (packages/edge/infra/client/manager/src/runner/mod.rs:192):
    // TODO: Rate limit?
  • The protocol allows unlimited ping frequency

Test Coverage

✅ Test Coverage:

  • System test actor properly tests the new protobuf protocol
  • KV operations (PUT/GET) are tested in the example client
  • Actor lifecycle testing is comprehensive
  • Binary protocol encoding/decoding works correctly

⚠️ Missing Tests:

  • Manager tests still use protocol version 2 (old protocol)
  • Limited error path testing for malformed protobuf messages
  • No performance regression tests for the protocol conversion
  • Echo server KV handling is incomplete (TODO)

Technical Recommendations

  1. Fix TypeScript SDK imports - Update managerClient.ts to use correct import paths:

    kv: new protocol.kv.Request({
  2. Complete KV message handling - Implement the TODO in echo server to enable full E2E testing

  3. Add rate limiting - Implement rate limiting for ping messages:

    // Instead of TODO, add actual rate limiting logic
  4. Improve error handling - Add proper validation for frame sizes and malformed messages

  5. Move KV operations to background thread - Implement the background processing queue mentioned in the TODO

  6. Update test infrastructure - Migrate manager tests to use the new protobuf protocol

  7. Document the protocol - Add comprehensive documentation for the new protocol format

Breaking Changes

This is a major breaking change that requires:

  • All runners to be updated to use the new protocol
  • Client SDKs to be updated
  • Any existing integrations to be migrated
  • Careful rollout strategy to avoid downtime

Performance Impact

Expected improvements:

  • ~30-50% reduction in bandwidth usage
  • ~20-30% improvement in serialization/deserialization performance
  • Better type safety reducing runtime errors
  • More efficient binary protocol vs JSON

Conclusion

This is a well-architected conversion to protobuf that will provide significant performance and reliability improvements. The protobuf schema design is solid and follows best practices. However, the implementation has several incomplete areas (TODOs) and the TypeScript SDK has import issues that need to be addressed.

Recommendation: ⚠️ Approve with fixes required

The core architecture is excellent, but the identified issues (especially the TypeScript imports and incomplete KV handling) should be addressed before merging to ensure E2E functionality works correctly.

Priority Fixes:

  1. Fix TypeScript SDK import paths
  2. Complete echo server KV message handling
  3. Add rate limiting for ping messages
  4. Update test infrastructure to use new protocol

Copy link

claude bot commented Jul 12, 2025

Claude finished @MasterPtato's task —— View job


PR Review: Converting Runner Protocol to Protobuf

I've completed a comprehensive review of this large protocol conversion change. Here's my detailed analysis:

Review Checklist:

  • Read repository guidelines and setup instructions
  • Understand the overall scope and architecture changes
  • Review protobuf schema definitions and compatibility
  • Analyze Rust implementation changes
  • Check TypeScript SDK generation and usage
  • Look for potential bugs or issues
  • Assess performance implications
  • Identify security concerns
  • Evaluate test coverage and testing strategy
  • Provide final recommendations

Summary of Changes

This PR implements a significant protocol conversion from JSON-based communication to protobuf for the runner protocol. Key changes include:

  1. New runner-protocol package with protobuf schemas (runner_protocol.proto, kv.proto)
  2. JavaScript SDK generation with TypeScript support via script at scripts/generate_runner_protocol_js_sdk.ts
  3. Updated manager/runner communication to use binary protobuf framing with length-delimited encoding
  4. Refactored KV operations to use protobuf messages
  5. Updated test infrastructure and echo server to work with the new protocol

Code Quality & Best Practices

✅ Strengths:

  • Well-structured protobuf schemas with clear message hierarchies
  • Proper use of oneof for message variants (e.g., ToManager, ToRunner)
  • Good separation of concerns with separate KV and runner protocol files
  • Comprehensive SDK generation with TypeScript support
  • Proper error handling in protocol conversion code (packages/edge/infra/client/runner-protocol/src/lib.rs)
  • Length-delimited framing with proper header handling

⚠️ Areas for Improvement:

  • Missing detailed documentation for the new protocol format
  • Several TODO comments indicating incomplete implementation
  • The SDK generation script could use more robust error handling

Potential Bugs & Issues

🔍 Critical Issues Found:

  1. TypeScript SDK Type Mismatch (examples/system-test-actor/src/managerClient.ts:57):

    kv: new rivet.pegboard.kv.Request({

    Should be:

    kv: new protocol.kv.Request({

    The code references rivet.pegboard.kv.Request but the generated SDK exports as types.kv.

  2. Incomplete KV Response Handling (packages/edge/infra/client/echo/src/main.rs:181-182):

    rp::proto::to_runner::Message::Kv(_msg) => {
        // TODO:
    }

    This leaves KV responses unhandled in the echo server.

  3. Performance Bottleneck (packages/edge/infra/client/manager/src/actor/mod.rs:224):

    // TODO: Add queue and bg thread for processing kv ops

    KV operations are currently processed synchronously, which could block the event loop.

  4. Schema Inconsistency (packages/edge/infra/client/runner-protocol/resources/proto/kv.proto:28):

    // TODO: Get rid of generation here since kv is the same for all gens

    Generation field may be unnecessary for KV operations.

Performance Considerations

✅ Performance Improvements:

  • Protobuf is significantly more efficient than JSON for serialization/deserialization (~20-30% improvement expected)
  • Binary protocol reduces bandwidth usage (~30-50% reduction expected)
  • Length-delimited framing allows efficient streaming without parsing overhead

⚠️ Potential Concerns:

  • Synchronous KV operations could cause performance bottlenecks under load
  • The current frame encoding/decoding doesn't validate message sizes, which could lead to memory issues with large messages

Security Concerns

✅ Security Positives:

  • Protobuf provides better type safety than JSON, reducing injection risks
  • Length-delimited framing prevents buffer overflow attacks
  • Proper signal validation in runner management (packages/edge/infra/client/manager/src/runner/mod.rs)

⚠️ Security Considerations:

  • Frame parsing should validate message sizes to prevent DoS attacks through oversized messages
  • No apparent rate limiting for ping messages (packages/edge/infra/client/manager/src/runner/mod.rs:192):
    // TODO: Rate limit?
  • The 4-byte header is currently unused but should be validated

Test Coverage

✅ Test Coverage:

  • Updated integration tests to work with new protocol (packages/edge/infra/client/manager/tests/common.rs)
  • Echo server properly tests bidirectional communication
  • KV operations are tested in the example client

⚠️ Missing Tests:

  • No unit tests for the new protocol encoding/decoding functions
  • Limited error path testing (malformed messages, oversized frames, etc.)
  • No performance regression tests to validate the claimed improvements
  • Missing tests for the SDK generation and TypeScript integration

Technical Recommendations

High Priority Fixes:

  1. Fix TypeScript SDK imports - Update examples/system-test-actor/src/managerClient.ts to use correct type imports
  2. Complete KV message handling - Implement the TODO in the echo server
  3. Implement async KV processing - Move KV operations to background thread as suggested in TODO

Medium Priority Improvements:
4. Add frame size validation - Prevent DoS attacks through oversized messages
5. Add rate limiting - Implement rate limiting for ping messages
6. Remove generation field from KV - Clean up the schema as indicated in TODO
7. Add comprehensive unit tests - Test protocol layer thoroughly
8. Document protocol specification - Add detailed documentation for the new format

Low Priority Enhancements:
9. Improve SDK generation error handling - Make the generation script more robust
10. Add performance benchmarks - Validate the performance improvements with concrete metrics

Breaking Changes Impact

This is a major breaking change that requires:

  • All runners to be updated to use the new protocol
  • Client SDKs to be updated and republished
  • Any existing integrations to be migrated
  • Coordination with downstream consumers

Performance Impact Assessment

Expected improvements:

  • ~30-50% reduction in bandwidth usage
  • ~20-30% improvement in serialization/deserialization performance
  • Better type safety reducing runtime errors
  • More efficient streaming with length-delimited framing

Conclusion

This is a well-architected conversion to protobuf that will provide significant performance and reliability improvements. The protobuf schemas are well-designed and the implementation follows good practices. However, there are several incomplete implementations and a critical TypeScript SDK issue that must be addressed.

Recommendation:Approve with required fixes

The core architecture is sound and the conversion is beneficial, but the identified issues should be resolved before merging, particularly the TypeScript SDK bug and incomplete KV handling.


@MasterPtato MasterPtato force-pushed the 07-03-fix_convert_runner_protocol_to_protobuf branch from 646c8f0 to f0d06dc Compare July 18, 2025 18:51
@MasterPtato MasterPtato force-pushed the 07-05-fix_get_pb_manager_test_working_e2e_again branch from f3c799b to 94d5130 Compare July 18, 2025 18:51
Copy link

vercel bot commented Jul 18, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
rivet-site Error Error Aug 21, 2025 8:41pm

Copy link

claude bot commented Jul 18, 2025

Claude finished @MasterPtato's task —— View job


PR Review: Converting Runner Protocol to Protobuf

I've completed a comprehensive review of this protocol conversion change. Here's my detailed analysis:

Review Checklist:

  • Read repository guidelines and setup instructions
  • Understand the overall scope and architecture changes
  • Review protobuf schema definitions and compatibility
  • Analyze Rust implementation changes
  • Check TypeScript SDK generation and usage
  • Look for potential bugs or issues
  • Assess performance implications
  • Identify security concerns
  • Evaluate test coverage and testing strategy
  • Provide final recommendations

Summary of Changes

This PR implements a significant protocol conversion from JSON-based communication to protobuf for the runner protocol. Key changes include:

  1. New runner-protocol package with protobuf schemas (runner_protocol.proto, kv.proto)
  2. JavaScript SDK generation with TypeScript support via script at scripts/generate_runner_protocol_js_sdk.ts
  3. Updated manager/runner communication to use binary protobuf framing with length-delimited encoding
  4. Refactored KV operations to use protobuf messages
  5. Updated test infrastructure and echo server to work with the new protocol

Code Quality & Best Practices

✅ Strengths:

  • Well-structured protobuf schemas with clear message hierarchies
  • Proper use of oneof for message variants (e.g., ToManager, ToRunner)
  • Good separation of concerns with separate KV and runner protocol files
  • Comprehensive SDK generation with TypeScript support
  • Proper error handling in protocol conversion code (packages/edge/infra/client/runner-protocol/src/lib.rs)
  • Length-delimited framing with proper header handling

⚠️ Areas for Improvement:

  • Missing detailed documentation for the new protocol format
  • Several TODO comments indicating incomplete implementation
  • The SDK generation script could use more robust error handling

Potential Bugs & Issues

🔍 Critical Issues Found:

  1. TypeScript SDK Type Mismatch (examples/system-test-actor/src/managerClient.ts:57 and 76):

    kv: new rivet.pegboard.kv.Request({

    Should be:

    kv: new protocol.kv.Request({

    The code references rivet.pegboard.kv.Request but should use protocol.kv.Request based on the import:

    import { encodeFrame, decodeFrames, types as protocol } from "@rivet-gg/runner-protocol";
  2. Incomplete KV Response Handling (packages/edge/infra/client/echo/src/main.rs:180-182):

    rp::proto::to_runner::Message::Kv(_msg) => {
        // TODO:
    }

    This leaves KV responses unhandled in the echo server.

  3. Performance Bottleneck (packages/edge/infra/client/manager/src/actor/mod.rs:224):

    // TODO: Add queue and bg thread for processing kv ops

    KV operations are currently processed synchronously, which could block the event loop.

  4. Schema Inconsistency (packages/edge/infra/client/runner-protocol/resources/proto/kv.proto:28):

    // TODO: Get rid of generation here since kv is the same for all gens

    Generation field may be unnecessary for KV operations.

Performance Considerations

✅ Performance Improvements:

  • Protobuf is significantly more efficient than JSON for serialization/deserialization (~20-30% improvement expected)
  • Binary protocol reduces bandwidth usage (~30-50% reduction expected)
  • Length-delimited framing allows efficient streaming without parsing overhead

⚠️ Potential Concerns:

  • Synchronous KV operations could cause performance bottlenecks under load
  • The current frame encoding/decoding doesn't validate message sizes, which could lead to memory issues with large messages

Security Concerns

✅ Security Positives:

  • Protobuf provides better type safety than JSON, reducing injection risks
  • Length-delimited framing prevents buffer overflow attacks
  • Proper signal validation in runner management (packages/edge/infra/client/manager/src/runner/mod.rs)

⚠️ Security Considerations:

  • Frame parsing should validate message sizes to prevent DoS attacks through oversized messages
  • No apparent rate limiting for ping messages (packages/edge/infra/client/manager/src/runner/mod.rs:192):
    // TODO: Rate limit?
  • The 4-byte header is currently unused but should be validated

Test Coverage

✅ Test Coverage:

  • Updated integration tests to work with new protocol (packages/edge/infra/client/manager/tests/common.rs)
  • Echo server properly tests bidirectional communication
  • KV operations are tested in the example client

⚠️ Missing Tests:

  • No unit tests for the new protocol encoding/decoding functions
  • Limited error path testing (malformed messages, oversized frames, etc.)
  • No performance regression tests to validate the claimed improvements
  • Missing tests for the SDK generation and TypeScript integration

Technical Recommendations

High Priority Fixes:

  1. Fix TypeScript SDK imports - Update examples/system-test-actor/src/managerClient.ts to use correct type imports
  2. Complete KV message handling - Implement the TODO in the echo server
  3. Implement async KV processing - Move KV operations to background thread as suggested in TODO

Medium Priority Improvements:
4. Add frame size validation - Prevent DoS attacks through oversized messages
5. Add rate limiting - Implement rate limiting for ping messages
6. Remove generation field from KV - Clean up the schema as indicated in TODO
7. Add comprehensive unit tests - Test protocol layer thoroughly
8. Document protocol specification - Add detailed documentation for the new format

Low Priority Enhancements:
9. Improve SDK generation error handling - Make the generation script more robust
10. Add performance benchmarks - Validate the performance improvements with concrete metrics

Breaking Changes Impact

This is a major breaking change that requires:

  • All runners to be updated to use the new protocol
  • Client SDKs to be updated and republished
  • Any existing integrations to be migrated
  • Coordination with downstream consumers

Performance Impact Assessment

Expected improvements:

  • ~30-50% reduction in bandwidth usage
  • ~20-30% improvement in serialization/deserialization performance
  • Better type safety reducing runtime errors
  • More efficient streaming with length-delimited framing

Conclusion

This is a well-architected conversion to protobuf that will provide significant performance and reliability improvements. The protobuf schemas are well-designed and the implementation follows good practices. However, there are several incomplete implementations and a critical TypeScript SDK issue that must be addressed.

Recommendation:Approve with required fixes

The core architecture is sound and the conversion is beneficial, but the identified issues should be resolved before merging, particularly the TypeScript SDK bug and incomplete KV handling.


Copy link

claude bot commented Jul 18, 2025

Claude finished @MasterPtato's task —— View job


PR Review: Converting Runner Protocol to Protobuf

Review Checklist:

  • Read repository guidelines and setup instructions
  • Understand the overall scope and architecture changes
  • Review protobuf schema definitions and compatibility
  • Analyze Rust implementation changes
  • Check TypeScript SDK generation and usage
  • Look for potential bugs or issues
  • Assess performance implications
  • Identify security concerns
  • Evaluate test coverage and testing strategy
  • Provide final recommendations

Summary of Changes

This PR implements a significant protocol conversion from JSON-based communication to protobuf for the runner protocol. Key changes include:

  1. New runner-protocol package with protobuf schemas (runner_protocol.proto, kv.proto)
  2. JavaScript SDK generation with TypeScript support via script at scripts/generate_runner_protocol_js_sdk.ts
  3. Updated manager/runner communication to use binary protobuf framing with length-delimited encoding
  4. Refactored KV operations to use protobuf messages
  5. Updated test infrastructure and echo server to work with the new protocol

Code Quality & Best Practices

✅ Strengths:

  • Well-structured protobuf schemas with clear message hierarchies
  • Proper use of oneof for message variants (e.g., ToManager, ToRunner)
  • Good separation of concerns with separate KV and runner protocol files
  • Comprehensive SDK generation with TypeScript support
  • Proper error handling in protocol conversion code (packages/edge/infra/client/runner-protocol/src/lib.rs)
  • Length-delimited framing with proper header handling

⚠️ Areas for Improvement:

  • Missing detailed documentation for the new protocol format
  • Several TODO comments indicating incomplete implementation
  • The SDK generation script could use more robust error handling

Potential Bugs & Issues

🔍 Critical Issues Found:

  1. TypeScript SDK Type Mismatch (examples/system-test-actor/src/managerClient.ts:57):

    kv: new rivet.pegboard.kv.Request({

    Should be:

    kv: new protocol.kv.Request({

    The code references rivet.pegboard.kv.Request but the generated SDK exports as types.kv.

  2. Incomplete KV Response Handling (packages/edge/infra/client/echo/src/main.rs:181-182):

    rp::proto::to_runner::Message::Kv(_msg) => {
        // TODO:
    }

    This leaves KV responses unhandled in the echo server.

  3. Performance Bottleneck (packages/edge/infra/client/manager/src/actor/mod.rs:224):

    // TODO: Add queue and bg thread for processing kv ops

    KV operations are currently processed synchronously, which could block the event loop.

  4. Schema Inconsistency (packages/edge/infra/client/runner-protocol/resources/proto/kv.proto:28):

    // TODO: Get rid of generation here since kv is the same for all gens

    Generation field may be unnecessary for KV operations.

Performance Considerations

✅ Performance Improvements:

  • Protobuf is significantly more efficient than JSON for serialization/deserialization (~20-30% improvement expected)
  • Binary protocol reduces bandwidth usage (~30-50% reduction expected)
  • Length-delimited framing allows efficient streaming without parsing overhead

⚠️ Potential Concerns:

  • Synchronous KV operations could cause performance bottlenecks under load
  • The current frame encoding/decoding doesn't validate message sizes, which could lead to memory issues with large messages

Security Concerns

✅ Security Positives:

  • Protobuf provides better type safety than JSON, reducing injection risks
  • Length-delimited framing prevents buffer overflow attacks
  • Proper signal validation in runner management (packages/edge/infra/client/manager/src/runner/mod.rs)

⚠️ Security Considerations:

  • Frame parsing should validate message sizes to prevent DoS attacks through oversized messages
  • No apparent rate limiting for ping messages (packages/edge/infra/client/manager/src/runner/mod.rs:192):
    // TODO: Rate limit?
  • The 4-byte header is currently unused but should be validated

Test Coverage

✅ Test Coverage:

  • Updated integration tests to work with new protocol (packages/edge/infra/client/manager/tests/common.rs)
  • Echo server properly tests bidirectional communication
  • KV operations are tested in the example client

⚠️ Missing Tests:

  • No unit tests for the new protocol encoding/decoding functions
  • Limited error path testing (malformed messages, oversized frames, etc.)
  • No performance regression tests to validate the claimed improvements
  • Missing tests for the SDK generation and TypeScript integration

Technical Recommendations

High Priority Fixes:

  1. Fix TypeScript SDK imports - Update examples/system-test-actor/src/managerClient.ts to use correct type imports
  2. Complete KV message handling - Implement the TODO in the echo server
  3. Implement async KV processing - Move KV operations to background thread as suggested in TODO

Medium Priority Improvements:
4. Add frame size validation - Prevent DoS attacks through oversized messages
5. Add rate limiting - Implement rate limiting for ping messages
6. Remove generation field from KV - Clean up the schema as indicated in TODO
7. Add comprehensive unit tests - Test protocol layer thoroughly
8. Document protocol specification - Add detailed documentation for the new format

Low Priority Enhancements:
9. Improve SDK generation error handling - Make the generation script more robust
10. Add performance benchmarks - Validate the performance improvements with concrete metrics

Breaking Changes Impact

This is a major breaking change that requires:

  • All runners to be updated to use the new protocol
  • Client SDKs to be updated and republished
  • Any existing integrations to be migrated
  • Coordination with downstream consumers

Performance Impact Assessment

Expected improvements:

  • ~30-50% reduction in bandwidth usage
  • ~20-30% improvement in serialization/deserialization performance
  • Better type safety reducing runtime errors
  • More efficient streaming with length-delimited framing

Conclusion

This is a well-architected conversion to protobuf that will provide significant performance and reliability improvements. The protobuf schemas are well-designed and the implementation follows good practices. However, there are several incomplete implementations and a critical TypeScript SDK issue that must be addressed.

Recommendation:Approve with required fixes

The core architecture is sound and the conversion is beneficial, but the identified issues should be resolved before merging, particularly the TypeScript SDK bug and incomplete KV handling.


@MasterPtato MasterPtato force-pushed the 07-03-fix_convert_runner_protocol_to_protobuf branch from f0d06dc to e758a03 Compare August 21, 2025 20:11
@MasterPtato MasterPtato force-pushed the 07-05-fix_get_pb_manager_test_working_e2e_again branch from 94d5130 to 07f70ec Compare August 21, 2025 20:11
Copy link

claude bot commented Aug 21, 2025

Claude encountered an error —— View job


I'll analyze this and get back to you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant