-
Notifications
You must be signed in to change notification settings - Fork 9
agents: implement gRPC retry policies #387
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: node-v24.x-nsolid-v6.x
Are you sure you want to change the base?
Conversation
689c432 to
60e7889
Compare
WalkthroughAdds configurable gRPC deadlines via NSOLID_GRPC_DEADLINE and cached parsing helper; new GrpcClient MakeClientContext overload that accepts deadline seconds; channels receive a JSON service-config for retry policy; AssetStream and send_exit set per-call deadlines; GrpcAgent centralizes OTLP exporter/client setup; tests' timeouts adjusted. Changes
Sequence Diagram(s)sequenceDiagram
participant Env as Environment
participant Util as parse_env_var_int()
participant GrpcClient as GrpcClient
participant Channel as gRPC Channel
participant Context as ClientContext
participant Asset as AssetStream
participant Server as gRPC Server
Env->>Util: read NSOLID_GRPC_DEADLINE
Util->>Util: parse, validate & cache seconds
Note over GrpcClient,Context: MakeClientContext with deadline
GrpcClient->>GrpcClient: get_grpc_deadline_seconds()
GrpcClient->>Context: MakeClientContext(agent_id, saas, deadline)
Context->>Context: set_deadline(now + seconds(deadline))
Context->>Context: add agent_id / saas metadata
Note over Channel: Channel created with service-config (retry JSON)
GrpcClient->>Channel: SetServiceConfigJSON(retry policy)
Asset->>Context: AssetStream uses context with deadline
Context->>Server: Invoke RPC (deadline enforced)
Server-->>Context: respond or deadline error
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~30 minutes
Poem
Pre-merge checks and finishing touches✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
agents/grpc/src/grpc_client.cc (1)
53-91: Review service config JSON formatting for maintainability.The current approach uses fixed-width placeholders (
%0000000000u,%0000000000.1f) and snprintf with clamping to format the retry policy configuration. While this works correctly, it has some maintainability concerns:
- The placeholder format strings must exactly match the allocation size
- Changes to the template require careful coordination with buffer sizing
- The
const_castis necessary but somewhat obscure (though commented)Consider using a more maintainable approach in the future, such as:
- JSON library serialization (nlohmann::json is already in use elsewhere)
- std::ostringstream for type-safe formatting
- C++20 std::format when available
Example alternative:
nlohmann::json service_config = { {"methodConfig", {{ {"name", {{}}}, {"retryPolicy", { {"maxAttempts", options.retry_policy_max_attempts}, {"initialBackoff", std::to_string(initial_backoff) + "s"}, {"maxBackoff", std::to_string(max_backoff) + "s"}, {"backoffMultiplier", backoff_multiplier}, {"retryableStatusCodes", {"CANCELLED", "DEADLINE_EXCEEDED", ...}} }} }}} }; grpc_arguments.SetServiceConfigJSON(service_config.dump());However, the current implementation is correct and handles edge cases properly with the min/max clamping.
agents/grpc/src/grpc_agent.cc (1)
996-1068: Excellent refactoring! Centralized exporter configuration.The new configure_grpc_exporters() method effectively consolidates OTLP exporter setup, making the code more maintainable. The retry policy configuration (max_attempts=5, initial_backoff=500ms, max_backoff=5s, multiplier=2.0) is correctly applied to both the client and exporters.
Optional: Extract retry policy constants to reduce duplication.
The retry policy parameters are duplicated at lines 1018-1021 and 1045-1048. Consider extracting these to named constants if you want to improve maintainability:
// At class or file scope static constexpr size_t kRetryMaxAttempts = 5; static constexpr auto kRetryInitialBackoff = std::chrono::milliseconds(500); static constexpr auto kRetryMaxBackoff = std::chrono::seconds(5); static constexpr float kRetryBackoffMultiplier = 2.0f;Then reference these constants in both locations. This is a nice-to-have improvement that can be deferred.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
agents/grpc/src/asset_stream.cc(3 hunks)agents/grpc/src/grpc_agent.cc(4 hunks)agents/grpc/src/grpc_agent.h(1 hunks)agents/grpc/src/grpc_client.cc(4 hunks)agents/grpc/src/grpc_client.h(2 hunks)deps/opentelemetry-cpp/otlp-http-exporter.gyp(2 hunks)src/nsolid/nsolid_util.h(2 hunks)test/parallel/test-nsolid-grpc-heap-profile.js(2 hunks)test/parallel/test-nsolid-grpc-heap-sampling.js(3 hunks)test/parallel/test-nsolid-grpc-profile.js(3 hunks)
🧰 Additional context used
🧠 Learnings (6)
📚 Learning: 2025-07-15T16:09:04.521Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 327
File: test/addons/nsolid-code-event-hook/nsolid-code-event-hook.js:47-73
Timestamp: 2025-07-15T16:09:04.521Z
Learning: In the nsolid test suite, hardcoded timeouts are acceptable for timing-sensitive tests. The preferred approach when tests become flaky is to increase timeout values rather than implementing more complex synchronization mechanisms.
Applied to files:
test/parallel/test-nsolid-grpc-profile.jstest/parallel/test-nsolid-grpc-heap-sampling.jstest/parallel/test-nsolid-grpc-heap-profile.js
📚 Learning: 2025-07-08T14:46:47.806Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 339
File: test/addons/nsolid-elf-utils/nsolid-elf-utils.js:13-14
Timestamp: 2025-07-08T14:46:47.806Z
Learning: In the nsolid test suite, native addon bindings are expected to be built beforehand; tests do not add fs.existsSync guards to skip when bindings are missing.
Applied to files:
test/parallel/test-nsolid-grpc-profile.jstest/parallel/test-nsolid-grpc-heap-sampling.jstest/parallel/test-nsolid-grpc-heap-profile.js
📚 Learning: 2025-07-08T14:47:34.724Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 339
File: test/addons/nsolid-elf-utils/nsolid-elf-utils.js:16-18
Timestamp: 2025-07-08T14:47:34.724Z
Learning: In the nsolid test suite, external tool invocations (e.g., execSync to run `readelf`) are intentionally left uncaught so that any failure causes the test to fail rather than being skipped.
Applied to files:
test/parallel/test-nsolid-grpc-profile.jstest/parallel/test-nsolid-grpc-heap-sampling.jstest/parallel/test-nsolid-grpc-heap-profile.js
📚 Learning: 2025-07-08T14:48:04.827Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 339
File: test/addons/nsolid-elf-utils/binding.cc:13-28
Timestamp: 2025-07-08T14:48:04.827Z
Learning: In nsolid test native addons (e.g., `test/addons/*/binding.cc`), additional JavaScript-facing argument validation is typically omitted because the tests supply well-formed inputs.
Applied to files:
test/parallel/test-nsolid-grpc-profile.jstest/parallel/test-nsolid-grpc-heap-sampling.jstest/parallel/test-nsolid-grpc-heap-profile.js
📚 Learning: 2025-06-10T09:52:49.401Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 327
File: test/addons/nsolid-code-event-hook/binding.cc:64-78
Timestamp: 2025-06-10T09:52:49.401Z
Learning: In the nsolid codebase, test files in test/addons/ consistently use assert() for input validation and runtime checks, not CHECK() macros. This is the established testing convention.
Applied to files:
test/parallel/test-nsolid-grpc-profile.jstest/parallel/test-nsolid-grpc-heap-sampling.jstest/parallel/test-nsolid-grpc-heap-profile.js
📚 Learning: 2025-10-14T10:18:23.123Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 377
File: test/parallel/test-nsolid-heap-profile-stream.js:81-88
Timestamp: 2025-10-14T10:18:23.123Z
Learning: In nsolid.heapProfileStream() and nsolid.heapSamplingStream(), synchronous throws occur for argument validation errors and immediate state validation checks (like assets being disabled), while async error events are emitted for runtime errors that manifest during execution (like invalid threadId).
Applied to files:
test/parallel/test-nsolid-grpc-profile.jstest/parallel/test-nsolid-grpc-heap-sampling.jstest/parallel/test-nsolid-grpc-heap-profile.js
🧬 Code graph analysis (6)
agents/grpc/src/grpc_agent.h (1)
agents/grpc/src/grpc_agent.cc (2)
configure_grpc_exporters(996-1068)configure_grpc_exporters(996-997)
agents/grpc/src/asset_stream.cc (1)
agents/grpc/src/grpc_client.cc (2)
get_grpc_deadline_seconds(27-33)get_grpc_deadline_seconds(27-27)
agents/grpc/src/grpc_client.cc (1)
src/nsolid/nsolid_util.h (3)
parse_env_var_int(72-88)string(91-99)string(146-150)
agents/grpc/src/grpc_agent.cc (2)
agents/grpc/src/grpc_client.cc (6)
MakeNSolidServiceStub(140-143)MakeNSolidServiceStub(141-141)MakeClientContext(113-118)MakeClientContext(114-115)MakeClientContext(123-135)MakeClientContext(124-126)src/nsolid/nsolid_util.h (1)
parse_env_var_int(72-88)
test/parallel/test-nsolid-grpc-heap-profile.js (2)
agents/grpc/src/grpc_client.h (1)
nsolid(26-162)src/nsolid/nsolid_util.h (1)
nsolid(40-277)
agents/grpc/src/grpc_client.h (1)
agents/grpc/src/grpc_client.cc (6)
MakeClientContext(113-118)MakeClientContext(114-115)MakeClientContext(123-135)MakeClientContext(124-126)get_grpc_deadline_seconds(27-33)get_grpc_deadline_seconds(27-27)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: test-macOS
- GitHub Check: test-linux (ubuntu-24.04)
- GitHub Check: test-linux (ubuntu-24.04-arm)
- GitHub Check: coverage-linux-without-intl
- GitHub Check: lint-js-and-md
- GitHub Check: lint-cpp
- GitHub Check: coverage-windows
- GitHub Check: coverage-linux
- GitHub Check: build-tarball
🔇 Additional comments (16)
deps/opentelemetry-cpp/otlp-http-exporter.gyp (1)
66-66: LGTM! Retry preview feature flag enabled correctly.The
ENABLE_OTLP_RETRY_PREVIEWdefine is properly added to both the target compilation settings and direct dependent settings, which will enable the OTLP retry functionality as intended.Also applies to: 79-79
test/parallel/test-nsolid-grpc-heap-profile.js (2)
3-5: Good practice: Environment variable set before module requires.Setting
NSOLID_GRPC_DEADLINEbefore requiring modules ensures the cached deadline value is initialized correctly, sinceget_grpc_deadline_seconds()caches the value on first access.
113-132: LGTM! Timing adjustment aligns with new gRPC deadline behavior.The 2000ms delay ensures the previous heap profile operation completes before starting new profile tests. This is consistent with the shortened 2-second gRPC deadline and follows the established test suite pattern of using hardcoded timeouts for timing-sensitive tests.
Based on learnings.
test/parallel/test-nsolid-grpc-heap-sampling.js (1)
3-5: LGTM! Consistent deadline configuration and timing adjustments.The test follows the same pattern as other gRPC tests in this PR, setting a shortened deadline and adjusting wait times accordingly.
Also applies to: 156-175
src/nsolid/nsolid_util.h (2)
16-18: Required includes added for environment variable parsing.The new includes support the
parse_env_var_intutility function implementation.
72-88: Robust environment variable parsing utility.The implementation properly validates integer environment variables with:
- Empty string handling
strtolparsing with errno checking- Range validation
- Debug logging for invalid values
agents/grpc/src/grpc_client.cc (2)
25-33: Verify: Cached deadline cannot be changed at runtime.The static cache in
get_grpc_deadline_seconds()means the deadline value is read once and cached forever. If runtime reconfiguration of the deadline is needed in the future, this implementation will need to be updated.Is the immutable caching behavior intentional? If so, consider documenting this limitation.
114-135: LGTM! Deadline support added with clean overload pattern.The two-overload design cleanly separates the default (environment-driven) deadline behavior from explicit deadline specification, and the forwarding implementation avoids code duplication.
agents/grpc/src/asset_stream.cc (1)
2-2: LGTM! Asset stream deadline configured consistently.The deadline is applied in the constructor, ensuring all asset stream RPCs respect the configured gRPC timeout. This is consistent with the deadline handling in other gRPC client contexts.
Also applies to: 12-13, 33-35
test/parallel/test-nsolid-grpc-profile.js (1)
3-5: LGTM! Timing adjustments consistent across gRPC test suite.This test follows the same deadline configuration and timing adjustment pattern as the other gRPC tests modified in this PR (test-nsolid-grpc-heap-profile.js and test-nsolid-grpc-heap-sampling.js).
Based on learnings.
Also applies to: 101-120
agents/grpc/src/grpc_agent.h (1)
205-205: Good refactoring: Centralized exporter configuration.The new
configure_grpc_exportersmethod consolidates OTLP exporter setup logic, improving maintainability by reducing code duplication in the configuration flow.agents/grpc/src/grpc_client.h (2)
71-77: LGTM! Clean deadline support added.The new MakeClientContext overload with configurable deadline is well-documented and integrates cleanly with the existing API. The implementation correctly forwards the 2-parameter version to this 3-parameter version using the environment-driven deadline.
159-160: LGTM! Helper function properly declared.The helper function declaration is clear and will enable consistent deadline configuration across the codebase via the NSOLID_GRPC_DEADLINE environment variable.
agents/grpc/src/grpc_agent.cc (3)
2-2: LGTM! Include added for utility function.The nsolid_util.h include is correctly added to access the parse_env_var_int utility used on line 1098.
1070-1104: LGTM! Config method improved with helper call and safer parsing.The changes correctly refactor exporter setup to use the new configure_grpc_exporters() helper, and the switch to utils::parse_env_var_int() provides safer environment variable parsing with proper bounds checking.
1802-1802: LGTM! Appropriate short deadline for exit event.Using a 1-second deadline for the exit event is a smart choice to ensure fast shutdown without hanging. This is a good use of the new deadline parameter.
fc50f8f to
a41deaf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/nsolid/nsolid_util.h (1)
72-91: Inconsistency with AI summary regarding logging.The AI summary states "Logging: logs an invalid-value message via Debug when parsing fails or value is out of range," but the function does not contain any logging code. Consider adding logging to help with debugging invalid environment variable values, or update the summary.
Consider using
longinstead ofint64_tfor the parsed value.The function uses
std::strtolwhich returnslong, but assigns the result toint64_t. While this works correctly (overflow is caught by theerrnocheck), it's clearer to uselongas the intermediate type to matchstrtol's return type.Apply this diff for better type consistency:
errno = 0; char* endptr = nullptr; - int64_t parsed = std::strtol(raw.c_str(), &endptr, 10); + long parsed = std::strtol(raw.c_str(), &endptr, 10); if (errno == 0 && endptr != nullptr && *endptr == '\0' && parsed >= min_value && parsed <= max_value) { return static_cast<int>(parsed); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
agents/grpc/src/asset_stream.cc(3 hunks)agents/grpc/src/grpc_agent.cc(5 hunks)agents/grpc/src/grpc_agent.h(1 hunks)agents/grpc/src/grpc_client.cc(4 hunks)agents/grpc/src/grpc_client.h(2 hunks)deps/opentelemetry-cpp/otlp-http-exporter.gyp(2 hunks)src/nsolid/nsolid_util.h(2 hunks)test/parallel/test-nsolid-grpc-heap-profile.js(2 hunks)test/parallel/test-nsolid-grpc-heap-sampling.js(3 hunks)test/parallel/test-nsolid-grpc-profile.js(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- test/parallel/test-nsolid-grpc-profile.js
- deps/opentelemetry-cpp/otlp-http-exporter.gyp
- agents/grpc/src/asset_stream.cc
- test/parallel/test-nsolid-grpc-heap-sampling.js
🧰 Additional context used
🧠 Learnings (6)
📚 Learning: 2025-07-15T16:09:04.521Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 327
File: test/addons/nsolid-code-event-hook/nsolid-code-event-hook.js:47-73
Timestamp: 2025-07-15T16:09:04.521Z
Learning: In the nsolid test suite, hardcoded timeouts are acceptable for timing-sensitive tests. The preferred approach when tests become flaky is to increase timeout values rather than implementing more complex synchronization mechanisms.
Applied to files:
test/parallel/test-nsolid-grpc-heap-profile.js
📚 Learning: 2025-07-08T14:46:47.806Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 339
File: test/addons/nsolid-elf-utils/nsolid-elf-utils.js:13-14
Timestamp: 2025-07-08T14:46:47.806Z
Learning: In the nsolid test suite, native addon bindings are expected to be built beforehand; tests do not add fs.existsSync guards to skip when bindings are missing.
Applied to files:
test/parallel/test-nsolid-grpc-heap-profile.js
📚 Learning: 2025-07-08T14:48:04.827Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 339
File: test/addons/nsolid-elf-utils/binding.cc:13-28
Timestamp: 2025-07-08T14:48:04.827Z
Learning: In nsolid test native addons (e.g., `test/addons/*/binding.cc`), additional JavaScript-facing argument validation is typically omitted because the tests supply well-formed inputs.
Applied to files:
test/parallel/test-nsolid-grpc-heap-profile.js
📚 Learning: 2025-07-08T14:47:34.724Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 339
File: test/addons/nsolid-elf-utils/nsolid-elf-utils.js:16-18
Timestamp: 2025-07-08T14:47:34.724Z
Learning: In the nsolid test suite, external tool invocations (e.g., execSync to run `readelf`) are intentionally left uncaught so that any failure causes the test to fail rather than being skipped.
Applied to files:
test/parallel/test-nsolid-grpc-heap-profile.js
📚 Learning: 2025-06-10T09:52:49.401Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 327
File: test/addons/nsolid-code-event-hook/binding.cc:64-78
Timestamp: 2025-06-10T09:52:49.401Z
Learning: In the nsolid codebase, test files in test/addons/ consistently use assert() for input validation and runtime checks, not CHECK() macros. This is the established testing convention.
Applied to files:
test/parallel/test-nsolid-grpc-heap-profile.js
📚 Learning: 2025-10-14T10:18:23.123Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 377
File: test/parallel/test-nsolid-heap-profile-stream.js:81-88
Timestamp: 2025-10-14T10:18:23.123Z
Learning: In nsolid.heapProfileStream() and nsolid.heapSamplingStream(), synchronous throws occur for argument validation errors and immediate state validation checks (like assets being disabled), while async error events are emitted for runtime errors that manifest during execution (like invalid threadId).
Applied to files:
test/parallel/test-nsolid-grpc-heap-profile.js
🧬 Code graph analysis (5)
test/parallel/test-nsolid-grpc-heap-profile.js (2)
agents/grpc/src/grpc_client.h (1)
nsolid(26-162)src/nsolid/nsolid_util.h (1)
nsolid(40-280)
agents/grpc/src/grpc_client.cc (1)
src/nsolid/nsolid_util.h (3)
parse_env_var_int(72-91)string(94-102)string(149-153)
agents/grpc/src/grpc_agent.h (1)
agents/grpc/src/grpc_agent.cc (2)
configure_grpc_exporters(1001-1073)configure_grpc_exporters(1001-1002)
agents/grpc/src/grpc_client.h (1)
agents/grpc/src/grpc_client.cc (6)
MakeClientContext(113-118)MakeClientContext(114-115)MakeClientContext(123-135)MakeClientContext(124-126)get_grpc_deadline_seconds(27-33)get_grpc_deadline_seconds(27-27)
agents/grpc/src/grpc_agent.cc (2)
agents/grpc/src/grpc_client.cc (6)
MakeNSolidServiceStub(140-143)MakeNSolidServiceStub(141-141)MakeClientContext(113-118)MakeClientContext(114-115)MakeClientContext(123-135)MakeClientContext(124-126)src/nsolid/nsolid_util.h (1)
parse_env_var_int(72-91)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: test-macOS
- GitHub Check: test-linux (ubuntu-24.04-arm)
- GitHub Check: test-linux (ubuntu-24.04)
- GitHub Check: build-tarball
- GitHub Check: coverage-windows
- GitHub Check: coverage-linux-without-intl
- GitHub Check: coverage-linux
- GitHub Check: lint-js-and-md
🔇 Additional comments (17)
src/nsolid/nsolid_util.h (1)
16-18: LGTM!The new includes are necessary for the
parse_env_var_intfunction:<cerrno>for errno handling,<cstdlib>forstrtol, and<climits>forINT_MIN/INT_MAXconstants.agents/grpc/src/grpc_agent.h (1)
205-205: LGTM! Clean refactoring to centralize exporter configuration.The new public method appropriately extracts OTLP exporter and client setup logic from the
config()method, improving code organization and maintainability.test/parallel/test-nsolid-grpc-heap-profile.js (2)
3-5: LGTM! Appropriate test optimization.Setting a shorter gRPC deadline (2 seconds) for tests avoids unnecessary delays while still providing sufficient time for operations to complete.
113-132: LGTM! Necessary timing adjustment for new deadline behavior.The 2-second delay ensures the previous heap profile operation completes before starting new profile tests. This aligns with the shortened gRPC deadline and prevents
EInProgressErrorfrom concurrent profile attempts.agents/grpc/src/grpc_client.h (2)
71-77: LGTM! Well-designed API extension.The new overload provides flexibility for custom per-call deadlines while preserving backward compatibility through the existing two-argument version that forwards to this implementation.
159-160: LGTM! Appropriate helper function.The helper function encapsulates deadline retrieval from the environment with caching, promoting code reuse across the gRPC subsystem.
agents/grpc/src/grpc_client.cc (5)
5-7: LGTM! Necessary headers for deadline support.The added headers support the environment variable parsing and provide necessary constants for the deadline implementation.
18-19: LGTM! Solid deadline caching implementation.The environment-based deadline support with static caching is thread-safe and provides sensible defaults (10 seconds) with appropriate bounds (1 second minimum).
Also applies to: 25-33
113-118: LGTM! Appropriate delegation pattern.The two-argument overload correctly forwards to the new three-argument version using the environment-configured deadline, preserving backward compatibility while enabling the new functionality.
120-135: LGTM! Clean implementation.The three-argument overload correctly sets the gRPC context deadline based on the provided seconds value and maintains the existing metadata handling.
53-91: Code is safe; original review concern was unfounded.The verification confirms the buffer allocation is sufficient. The format specifiers
%0000000000uand%0000000000.1feffectively reserve space within the template string (490 bytes). Even with worst-case values (max uint32 and floats clamped to 999999999.0), the formatted JSON output (479 bytes) fits safely within the allocated buffer with an 11-byte safety margin. No buffer overflow or truncation occurs. The code requires no changes.agents/grpc/src/grpc_agent.cc (6)
2-2: LGTM! Necessary include.The include provides the
parse_env_var_intutility function used later in the file.
66-70: LGTM! Appropriate retry configuration.The retry policy constants are well-defined and match the PR description (max 5 attempts, 500ms to 5s exponential backoff with 2x multiplier). These values align with gRPC retry best practices.
1001-1073: LGTM! Solid refactoring.The new method effectively centralizes OTLP exporter and client setup, reducing code duplication through the
set_common_optionslambda. The retry policy is consistently applied across all exporters, and the comment about CommandStream creation order helps prevent potential race conditions.
1075-1075: LGTM! Improved type safety and error handling.Line 1075: Making the JSON type explicit removes ambiguity and improves readability.
Line 1103: Using
parse_env_var_intinstead ofstoullprovides safer environment variable parsing with bounds checking (0-1 for the boolean insecure flag) and graceful error handling.Also applies to: 1103-1103
1109-1109: LGTM! Clean refactoring.Replacing inline exporter setup with the call to
configure_grpc_exporters()improves code organization while maintaining the same functionality.
1807-1808: LGTM! Appropriate short deadline for exit.Using a 1-second deadline for the exit event ensures the process terminates quickly rather than waiting for the default 10-second timeout. This is reasonable since the process is shutting down regardless of success or failure.
a41deaf to
3c83261
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
agents/grpc/src/asset_stream.cc(3 hunks)agents/grpc/src/grpc_agent.cc(5 hunks)agents/grpc/src/grpc_agent.h(1 hunks)agents/grpc/src/grpc_client.cc(4 hunks)agents/grpc/src/grpc_client.h(2 hunks)deps/opentelemetry-cpp/otlp-http-exporter.gyp(2 hunks)src/nsolid/nsolid_util.h(2 hunks)test/parallel/test-nsolid-grpc-heap-profile.js(2 hunks)test/parallel/test-nsolid-grpc-heap-sampling.js(3 hunks)test/parallel/test-nsolid-grpc-profile.js(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- agents/grpc/src/grpc_client.h
- test/parallel/test-nsolid-grpc-heap-sampling.js
- agents/grpc/src/grpc_agent.h
- agents/grpc/src/asset_stream.cc
🧰 Additional context used
🧠 Learnings (6)
📚 Learning: 2025-07-15T16:09:04.521Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 327
File: test/addons/nsolid-code-event-hook/nsolid-code-event-hook.js:47-73
Timestamp: 2025-07-15T16:09:04.521Z
Learning: In the nsolid test suite, hardcoded timeouts are acceptable for timing-sensitive tests. The preferred approach when tests become flaky is to increase timeout values rather than implementing more complex synchronization mechanisms.
Applied to files:
test/parallel/test-nsolid-grpc-heap-profile.jstest/parallel/test-nsolid-grpc-profile.js
📚 Learning: 2025-07-08T14:46:47.806Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 339
File: test/addons/nsolid-elf-utils/nsolid-elf-utils.js:13-14
Timestamp: 2025-07-08T14:46:47.806Z
Learning: In the nsolid test suite, native addon bindings are expected to be built beforehand; tests do not add fs.existsSync guards to skip when bindings are missing.
Applied to files:
test/parallel/test-nsolid-grpc-heap-profile.jstest/parallel/test-nsolid-grpc-profile.js
📚 Learning: 2025-07-08T14:48:04.827Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 339
File: test/addons/nsolid-elf-utils/binding.cc:13-28
Timestamp: 2025-07-08T14:48:04.827Z
Learning: In nsolid test native addons (e.g., `test/addons/*/binding.cc`), additional JavaScript-facing argument validation is typically omitted because the tests supply well-formed inputs.
Applied to files:
test/parallel/test-nsolid-grpc-heap-profile.jstest/parallel/test-nsolid-grpc-profile.js
📚 Learning: 2025-07-08T14:47:34.724Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 339
File: test/addons/nsolid-elf-utils/nsolid-elf-utils.js:16-18
Timestamp: 2025-07-08T14:47:34.724Z
Learning: In the nsolid test suite, external tool invocations (e.g., execSync to run `readelf`) are intentionally left uncaught so that any failure causes the test to fail rather than being skipped.
Applied to files:
test/parallel/test-nsolid-grpc-heap-profile.jstest/parallel/test-nsolid-grpc-profile.js
📚 Learning: 2025-06-10T09:52:49.401Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 327
File: test/addons/nsolid-code-event-hook/binding.cc:64-78
Timestamp: 2025-06-10T09:52:49.401Z
Learning: In the nsolid codebase, test files in test/addons/ consistently use assert() for input validation and runtime checks, not CHECK() macros. This is the established testing convention.
Applied to files:
test/parallel/test-nsolid-grpc-heap-profile.jstest/parallel/test-nsolid-grpc-profile.js
📚 Learning: 2025-10-14T10:18:23.123Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 377
File: test/parallel/test-nsolid-heap-profile-stream.js:81-88
Timestamp: 2025-10-14T10:18:23.123Z
Learning: In nsolid.heapProfileStream() and nsolid.heapSamplingStream(), synchronous throws occur for argument validation errors and immediate state validation checks (like assets being disabled), while async error events are emitted for runtime errors that manifest during execution (like invalid threadId).
Applied to files:
test/parallel/test-nsolid-grpc-heap-profile.jstest/parallel/test-nsolid-grpc-profile.js
🧬 Code graph analysis (3)
test/parallel/test-nsolid-grpc-heap-profile.js (2)
agents/grpc/src/grpc_client.h (1)
nsolid(26-162)src/nsolid/nsolid_util.h (1)
nsolid(41-281)
agents/grpc/src/grpc_agent.cc (2)
agents/grpc/src/grpc_client.cc (6)
MakeNSolidServiceStub(140-143)MakeNSolidServiceStub(141-141)MakeClientContext(113-118)MakeClientContext(114-115)MakeClientContext(123-135)MakeClientContext(124-126)src/nsolid/nsolid_util.h (1)
parse_env_var_int(73-92)
agents/grpc/src/grpc_client.cc (1)
src/nsolid/nsolid_util.h (3)
parse_env_var_int(73-92)string(95-103)string(150-154)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: coverage-windows
- GitHub Check: lint-js-and-md
- GitHub Check: test-macOS
- GitHub Check: test-linux (ubuntu-24.04-arm)
- GitHub Check: test-linux (ubuntu-24.04)
- GitHub Check: coverage-linux-without-intl
- GitHub Check: coverage-linux
- GitHub Check: build-tarball
🔇 Additional comments (2)
deps/opentelemetry-cpp/otlp-http-exporter.gyp (2)
62-67: LGTM!The
ENABLE_OTLP_RETRY_PREVIEWflag is correctly added to the top-level defines, enabling the retry preview feature at compile time. The syntax and placement are consistent with existing flags.
75-80: LGTM!Correctly propagates
ENABLE_OTLP_RETRY_PREVIEWto direct dependent targets, ensuring the retry feature is enabled consistently across the build. This follows the same pattern as other feature flags in the configuration.
9c2e3f4 to
3c83261
Compare
Enable them in opentelemetry-cpp by defining `ENABLE_OTLP_RETRY_PREVIEW` and also them to our GrpcClient. The values we're using with a default deadline of 10 seconds are: - max_attemps: 5 - initial_backoff: 500ms - max_backoff: 5s - backoff_multiplier: 2 Add NSOLID_GRPC_DEADLINE env var for configurable gRPC timeouts. Specially useful for testing. Refactor and dry code when creating grpc exporters inside the new GrpcAgent::configure_grpc_exporters() helper function. Create parse_env_var_int() utility in nsolid_util.h for safe env parsing.
3c83261 to
e74e183
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
agents/grpc/src/asset_stream.cc(3 hunks)agents/grpc/src/grpc_agent.cc(5 hunks)agents/grpc/src/grpc_agent.h(1 hunks)agents/grpc/src/grpc_client.cc(4 hunks)agents/grpc/src/grpc_client.h(2 hunks)deps/opentelemetry-cpp/otlp-http-exporter.gyp(2 hunks)src/nsolid/nsolid_util.h(2 hunks)test/parallel/test-nsolid-grpc-heap-profile.js(2 hunks)test/parallel/test-nsolid-grpc-heap-sampling.js(3 hunks)test/parallel/test-nsolid-grpc-profile.js(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
- test/parallel/test-nsolid-grpc-heap-sampling.js
- test/parallel/test-nsolid-grpc-heap-profile.js
- agents/grpc/src/grpc_agent.h
- deps/opentelemetry-cpp/otlp-http-exporter.gyp
- agents/grpc/src/grpc_client.h
- src/nsolid/nsolid_util.h
🧰 Additional context used
🧠 Learnings (9)
📚 Learning: 2025-07-15T16:09:04.521Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 327
File: test/addons/nsolid-code-event-hook/nsolid-code-event-hook.js:47-73
Timestamp: 2025-07-15T16:09:04.521Z
Learning: In the nsolid test suite, hardcoded timeouts are acceptable for timing-sensitive tests. The preferred approach when tests become flaky is to increase timeout values rather than implementing more complex synchronization mechanisms.
Applied to files:
test/parallel/test-nsolid-grpc-profile.js
📚 Learning: 2025-07-08T14:46:47.806Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 339
File: test/addons/nsolid-elf-utils/nsolid-elf-utils.js:13-14
Timestamp: 2025-07-08T14:46:47.806Z
Learning: In the nsolid test suite, native addon bindings are expected to be built beforehand; tests do not add fs.existsSync guards to skip when bindings are missing.
Applied to files:
test/parallel/test-nsolid-grpc-profile.js
📚 Learning: 2025-07-08T14:47:34.724Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 339
File: test/addons/nsolid-elf-utils/nsolid-elf-utils.js:16-18
Timestamp: 2025-07-08T14:47:34.724Z
Learning: In the nsolid test suite, external tool invocations (e.g., execSync to run `readelf`) are intentionally left uncaught so that any failure causes the test to fail rather than being skipped.
Applied to files:
test/parallel/test-nsolid-grpc-profile.js
📚 Learning: 2025-07-08T14:48:04.827Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 339
File: test/addons/nsolid-elf-utils/binding.cc:13-28
Timestamp: 2025-07-08T14:48:04.827Z
Learning: In nsolid test native addons (e.g., `test/addons/*/binding.cc`), additional JavaScript-facing argument validation is typically omitted because the tests supply well-formed inputs.
Applied to files:
test/parallel/test-nsolid-grpc-profile.js
📚 Learning: 2025-06-10T09:52:49.401Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 327
File: test/addons/nsolid-code-event-hook/binding.cc:64-78
Timestamp: 2025-06-10T09:52:49.401Z
Learning: In the nsolid codebase, test files in test/addons/ consistently use assert() for input validation and runtime checks, not CHECK() macros. This is the established testing convention.
Applied to files:
test/parallel/test-nsolid-grpc-profile.js
📚 Learning: 2025-10-14T10:18:23.123Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 377
File: test/parallel/test-nsolid-heap-profile-stream.js:81-88
Timestamp: 2025-10-14T10:18:23.123Z
Learning: In nsolid.heapProfileStream() and nsolid.heapSamplingStream(), synchronous throws occur for argument validation errors and immediate state validation checks (like assets being disabled), while async error events are emitted for runtime errors that manifest during execution (like invalid threadId).
Applied to files:
test/parallel/test-nsolid-grpc-profile.js
📚 Learning: 2025-07-08T16:05:45.341Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 339
File: src/nsolid/nsolid_elf_utils.cc:18-18
Timestamp: 2025-07-08T16:05:45.341Z
Learning: In the NSolid project, the `GetBuildId` function in `src/nsolid/nsolid_elf_utils.cc` is designed to be called only from the NSolid thread, so the static cache doesn't require thread safety mechanisms like mutex protection.
Applied to files:
agents/grpc/src/grpc_client.cc
📚 Learning: 2025-11-05T13:30:16.195Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 387
File: agents/grpc/src/grpc_client.cc:81-97
Timestamp: 2025-11-05T13:30:16.195Z
Learning: In gRPC's ChannelArguments::SetServiceConfigJSON(), trailing NUL bytes in the std::string argument don't cause issues because downstream gRPC code uses c_str() to treat it as a C NUL-terminated string, reading only up to the first NUL terminator.
Applied to files:
agents/grpc/src/grpc_client.cc
📚 Learning: 2025-08-20T11:17:56.528Z
Learnt from: santigimeno
Repo: nodesource/nsolid PR: 354
File: test/agents/test-grpc-blocked-loop.mjs:224-224
Timestamp: 2025-08-20T11:17:56.528Z
Learning: In the gRPC agent token parsing (agents/grpc/src/grpc_agent.cc), CONSOLE_ID_SIZE = 36 is used for generating console IDs, not for validating console ID length during SAAS token parsing. Token parsing does not enforce console ID length constraints, so test tokens with console ID portions of different lengths (like 32 characters) work correctly.
Applied to files:
agents/grpc/src/grpc_client.cc
🔇 Additional comments (2)
agents/grpc/src/asset_stream.cc (2)
2-2: LGTM: Include and using declarations are appropriate.The include of
grpc_client.hand the using declarations for chrono types are necessary for the deadline functionality and follow standard C++ practices.Also applies to: 12-13
33-35: Verify deadline behavior for long-running streaming RPCs.The deadline is set once in the constructor and applies to the entire RPC lifetime. For streaming RPCs like
ExportContinuousProfile(line 39), this means the stream will terminate when the deadline is reached (default 10 seconds per PR summary).Please verify:
- Are continuous profiling streams designed to be short-lived and periodically recreated?
- If streams need to stay open longer than the deadline, consider either:
- Using different deadline strategies per RPC type
- Relying on gRPC keepalive settings for long-running streams
- Documenting that
NSOLID_GRPC_DEADLINEmust be set appropriately for continuous profiling
Enable them in opentelemetry-cpp by defining
ENABLE_OTLP_RETRY_PREVIEWand also them to our GrpcClient. The values we're using with a default deadline of 10 seconds are:Add NSOLID_GRPC_DEADLINE env var for configurable gRPC timeouts. Specially useful for testing.
Refactor and dry code when creating grpc exporters inside the new GrpcAgent::configure_grpc_exporters() helper function.
Create parse_env_var_int() utility in nsolid_util.h for safe env parsing.
Summary by CodeRabbit
New Features
Bug Fixes
Tests