[Management/Client] Trigger debug bundle runs from API/Dashboard (#4592) #4832
base: main
Conversation
Walkthrough

Introduces a per‑peer job subsystem (create/send/receive jobs), adds debug bundle generation/upload helpers, threads logPath/profile config through client engine and run flows, refactors status conversion to use proto.FullStatus, updates protobuf/OpenAPI and wires JobManager across server, client, and tests.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor Mgmt as Management API
    participant AM as AccountManager / JobManager
    participant GRPC as Mgmt gRPC (Job stream)
    participant Peer as Client (Engine)
    participant Executor as Job Executor / Bundler
    Mgmt->>AM: CreatePeerJob(account, peer, job)
    AM->>GRPC: enqueue job event for peer
    GRPC->>Peer: stream JobRequest (encrypted)
    Peer->>Peer: receiveJobEvents()
    Peer->>Executor: BundleJob(...) -- generate & upload
    Executor->>Executor: collect logs/status, create bundle
    Executor->>Mgmt: upload returns key (JobResponse)
    Peer->>GRPC: stream JobResponse (encrypted)
    GRPC->>AM: HandleResponse(job response) -> persist completion
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
client/status/status.go (1)
234-315: ToProtoFullStatus mapping is good; add nil‑safety when reading latency/handshake

The new `ToProtoFullStatus` helper correctly maps:

- Management and signal state (URL, connected, error).
- Local peer state (IP, pub key, kernel, FQDN, Rosenpass flags, networks from route keys).
- Per‑peer fields (status, timestamps, transfer stats, Rosenpass, networks, latency, SSH host key).
- Relays and DNS group state, including error strings.

Downstream, `mapPeers` assumes `Latency` and `LastWireguardHandshake` are always set:

```go
lastHandshake := pbPeerState.GetLastWireguardHandshake().AsTime().Local()
...
Latency: pbPeerState.GetLatency().AsDuration(),
```

That's safe for `FullStatus` objects produced by `ToProtoFullStatus`, but will panic if a `PeerState` arrives with these fields unset (e.g., older daemon or other producer). To harden this:

```go
var lastHandshake time.Time
if ts := pbPeerState.GetLastWireguardHandshake(); ts != nil {
	lastHandshake = ts.AsTime().Local()
}

var latency time.Duration
if d := pbPeerState.GetLatency(); d != nil {
	latency = d.AsDuration()
}
```

and then use those locals when building `PeerStateDetailOutput`.

Also applies to: 549-635
client/internal/engine.go (2)
34-53: Engine config/struct wiring for jobs is fine, but `c *profilemanager.Config` is unused

The new debug/job additions to `EngineConfig` (`ProfileConfig`, `LogPath`) and `Engine` (`jobExecutor`, `jobExecutorWG`) are wired in cleanly, and creating `jobExecutor` in `NewEngine` is reasonable.

However, the extra parameter `c *profilemanager.Config` on `NewEngine` is never used in the function body. Unlike an unused local variable, an unused parameter does compile in Go, but it is misleading dead weight. Either remove the parameter or actually thread it into the engine configuration, e.g.:

```go
config.ProfileConfig = c // or engine.config.ProfileConfig = c
```

depending on where you intend to own that reference.

Also applies to: 83-142, 145-222, 235-276
278-366: Job stream consumption and bundle handling are generally solid; watch for restart behavior and nil profile config

The new job flow on the client side looks good overall:

- `receiveJobEvents` hooks into `mgmClient.Job` with a per‑message handler that defaults responses to `JobStatus_failed` and branches on `WorkloadParameters` – currently only `Bundle` is handled, everything else returns `ErrJobNotImplemented`.
- `jobExecutorWG` ensures `Stop()` waits for the Job stream goroutine to exit before tearing down the engine.
- `handleBundle` builds `debug.GeneratorDependencies` from engine state, calls `jobExecutor.BundleJob`, and wraps the resulting upload key in a `JobResponse_Bundle`.

Two points to be aware of:

1. When `mgmClient.Job` returns any error (including when `e.ctx` is canceled during `Stop()`), you treat it as a hard failure and call `CtxGetState(e.ctx).Wrap(ErrResetConnection)` + `e.clientCancel()`, which triggers a full client restart. That mirrors how Sync/Signal errors are handled but also means a graceful "down" or engine stop will trigger a restart path instead of a quiet shutdown. If that's not intended, you may want to distinguish context cancellation from other errors.
2. `handleBundle` assumes `e.config.ProfileConfig` and `e.config.ProfileConfig.ManagementURL` are non‑nil:

```go
InternalConfig: e.config.ProfileConfig,
...
uploadKey, err := e.jobExecutor.BundleJob(..., e.config.ProfileConfig.ManagementURL.String())
```

If `ProfileConfig` or its `ManagementURL` can be unset (e.g., in tests or some platforms), this will panic. A defensive check that returns a failed `JobResponse` with a clear reason (sketched below) would make the behavior safer.

Also applies to: 942-1012
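On the second point, a guard at the top of `handleBundle` could fail the job cleanly instead of panicking. This is only a sketch: the `mgmProto` alias, the `req` variable name, and the exact return shape are assumptions based on the handler contract described in this review.

```go
// Hedged sketch, not the PR's code: short-circuit with a failed JobResponse
// when the profile config is incomplete. Field names (ID, Status, Reason) and
// JobStatus_failed follow the proto types referenced in this review.
if e.config.ProfileConfig == nil || e.config.ProfileConfig.ManagementURL == nil {
	return &mgmProto.JobResponse{
		ID:     req.ID,
		Status: mgmProto.JobStatus_failed,
		Reason: []byte("profile config or management URL is not set"),
	}
}
```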
🧹 Nitpick comments (38)
shared/management/client/grpc.go (5)
112-160: Centralized withMgmtStream helper looks good; double‑check backoff choice and final logging

The `withMgmtStream` wrapper nicely unifies readiness checks, server key fetching, and retry for both `Sync` and `Job`. One thing to double‑check is whether you explicitly want streaming calls to use `defaultBackoff(ctx)` while other RPCs still use `nbgrpc.Backoff(...)`; if not, consider reusing the same backoff helper for consistency, or documenting the intentional difference.

Also, `withMgmtStream` logs `"unrecoverable error"` for any non‑nil `err` after `backoff.Retry`, including `context.Canceled`/`DeadlineExceeded`. If you expect normal shutdowns via context cancellation, you may want to special‑case those errors before logging at warn to avoid noisy logs.
162-216: Job stream error handling is solid; refine connection notifications for better state reporting

The overall `handleJobStream` flow (open stream → handshake → recv/process/respond loop) and error classification by gRPC code look good.

A couple of refinements around connection state notifications:

- `notifyDisconnected(err)` is called for every receive error, including `codes.Unimplemented`. In that case, the server is reachable but doesn't support `Job`; marking management as "disconnected" can mislead consumers of `ConnStateNotifier`. Consider moving `notifyDisconnected` into the `switch` and skipping it for `codes.Unimplemented` (and possibly `codes.Canceled`, which usually indicates local shutdown) — sketched below.
- The Job path never calls `notifyConnected()`, so this stream can only move the state toward "disconnected". If `ConnStateNotifier` is used for user‑visible connectivity, you might want to call `notifyConnected()` once the stream and handshake succeed (or when the first job is successfully received) to keep the state transitions balanced.
- On `sendJobResponse` failure you return the error but don't notify disconnection; if this error typically indicates a broken stream, it may be worth also invoking `notifyDisconnected(err)` there.

These are behavioral tweaks rather than correctness issues but would make connection state reporting more accurate.
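As a sketch of the first point, the receive-error handling could branch on the gRPC code before touching the connection state. The helper name `notifyDisconnected` is taken from this file as described above; the receiver type and surrounding loop are assumptions.

```go
// Sketch only: classify Job-stream receive errors before notifying state.
func (c *GrpcClient) handleJobRecvError(err error) error {
	switch gstatus.Code(err) { // gstatus = "google.golang.org/grpc/status"
	case codes.Unimplemented:
		// Server reachable but without Job support: don't mark management as disconnected.
		log.Debugf("management server does not implement the Job stream")
		return nil
	case codes.Canceled:
		// Local shutdown via context cancellation: leave the loop quietly.
		return nil
	default:
		c.notifyDisconnected(err)
		return err
	}
}
```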
218-252: Job helpers mirror existing encryption patterns; consider cleaning up unused ctx parameters

`sendHandshake` and `receiveJobRequest` correctly follow the existing `EncryptMessage`/`DecryptMessage` pattern and use the WireGuard public key in the `EncryptedMessage` as elsewhere in this file.

Right now the `ctx` parameter passed into these helpers isn't used inside them; the only effective cancellation point is the stream's own context from `realClient.Job(ctx)`. That's fine behaviorally, but for clarity you could either:

- Drop `ctx` from these helper signatures, or
- Start using it explicitly (e.g., check `ctx.Err()` before/after blocking operations, or for future timeouts / per‑job cancellation).

Given this is internal code, I'd treat it as a readability/maintenance cleanup to do when convenient.
254-295: Ensure JobResponse is always well‑formed for correlation and logging

`processJobRequest` nicely guards against a `nil` handler result by synthesizing a `JobResponse` with `Status: JobStatus_failed` and a reason. Two minor robustness tweaks you might consider:

- If the handler returns a non‑nil `JobResponse` but forgets to populate `ID`, you could default it to `jobReq.ID` so the management server can always correlate responses:

```go
jobResp := msgHandler(jobReq)
if jobResp == nil {
	jobResp = &proto.JobResponse{
		ID:     jobReq.ID,
		Status: proto.JobStatus_failed,
		Reason: []byte("handler returned nil response"),
	}
} else if len(jobResp.ID) == 0 {
	jobResp.ID = jobReq.ID
}
```

- For logging, `string(jobReq.ID)`/`string(resp.ID)` assumes the IDs are valid UTF‑8. If these are ever changed to non‑textual bytes, consider logging them as `%x` or via `hex.EncodeToString` to avoid odd output.

Not blockers, but they make the job channel a bit more defensive and easier to debug.
297-395: Sync refactor via connectToSyncStream/receiveUpdatesEvents looks consistent with existing patterns

The split into `connectToSyncStream` and `receiveUpdatesEvents` reads clean and matches the encryption/decryption contract used elsewhere (`EncryptMessage(serverPubKey, c.key, req)` and `DecryptMessage(serverPubKey, c.key, update.Body, resp)`).

A small optional improvement: `GetNetworkMap` now reimplements the "single `Recv` + decrypt `SyncResponse`" logic that's very similar to what `receiveUpdatesEvents` does in a loop. If you find yourself touching this code again, you might consider a tiny shared helper (e.g., `decryptSyncUpdate(serverPubKey, update *proto.EncryptedMessage) (*proto.SyncResponse, error)`) to keep the decryption path fully DRY.

Functionally, the new sync connection flow looks correct.
client/internal/debug/upload_test.go (1)
1-1: Test now exercises the exported UploadDebugBundle correctly

Switching to `package debug` and calling `UploadDebugBundle(context.Background(), testURL+types.GetURLPath, testURL, file)` matches the new API and keeps the expectations (`getURLHash(testURL)` prefix, stored file content) intact.

You might consider using a context with timeout here (and/or a simple readiness wait on `srv.Start`) to make this integration-style test more robust under slow CI environments — see the sketch below.

Also applies to: 41-48
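A bounded variant might look like this (sketch only: `testURL`, `types.GetURLPath`, `file`, and `getURLHash` come from the existing test; the 30‑second budget and the use of `require` are assumptions):

```go
// Give the upload a deadline so a slow CI runner fails fast instead of hanging.
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()

key, err := UploadDebugBundle(ctx, testURL+types.GetURLPath, testURL, file)
require.NoError(t, err)
require.True(t, strings.HasPrefix(key, getURLHash(testURL)))
```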
client/cmd/debug.go (1)
338-360: Debug bundle generator now uses LogPath and shared default upload URL

Passing `LogPath: logFilePath` into `debug.NewBundleGenerator` matches the updated `GeneratorDependencies` and ensures the bundle generator knows where to look for logs. Reusing `types.DefaultBundleURL` as the default for `--upload-bundle-url` in both `debug bundle` and `debug for` keeps the CLI consistent with the upload-server configuration.

Consider clarifying in the flag help text that the default is the NetBird-hosted debug upload service, so self‑hosters know they should override it if needed.

Also applies to: 369-379
client/server/server_test.go (1)
21-21: JobManager wiring into test management server is correct

Creating a `jobManager := job.NewJobManager(nil, store)` and passing it into both `server.BuildManager(...)` and `nbgrpc.NewServer(...)` matches the updated constructor signatures and ensures job flows are exercised in this test setup.

The `if err != nil { return nil, "", err }` right after `eventStore := &activity.InMemoryEventStore{}` still refers to the earlier `err` from `NewTestStoreFromSQL`, which has already been handled and is guaranteed nil here. It can be removed to avoid confusion.

Also applies to: 298-329
client/internal/debug/debug.go (1)
30-34: Status seeding via FullStatus/nbstatus is coherent with anonymization strategy

Using `statusRecorder.GetFullStatus()` → `nbstatus.ToProtoFullStatus` → `ConvertToStatusOutputOverview` → `ParseToFullDetailSummary` aligns the `status.txt` contents with the regular `netbird status` output and correctly injects event history and profile name. Seeding the bundle anonymizer via `seedFromStatus(g.anonymizer, &fullStatus)` leverages this richer status snapshot to anonymize later artifacts more consistently.

One minor nit: `seedFromStatus` runs even when `g.anonymize` is false, which is harmless but unnecessary work; you could optionally guard it with `if g.anonymize { ... }` for clarity.

Also applies to: 382-405, 852-880
shared/management/http/api/generate.sh (1)
14-16: oapi-codegen v2 install path looks correct; consider pinning a version

The switch to `github.com/oapi-codegen/oapi-codegen/v2/cmd/oapi-codegen@latest` matches the v2 module layout and should work fine with recent Go toolchains. For more reproducible codegen in CI, you might eventually want to pin a specific version instead of `@latest`.

client/embed/embed.go (1)

175-185: Recorder integration and updated Run call look consistent

Wiring a `peer.NewRecorder` into `NewConnectClient` and updating `Run` to `Run(run, "")` matches the new client API and should keep embedded startup behavior intact. If you later surface a log/debug path in `Options`, this is the right place to thread it through instead of an empty string.

management/server/http/testing/testing_tools/channel/channel.go (1)

19-39: JobManager wiring into BuildManager is correct; reuse metrics instead of nil

Injecting a concrete `job.Manager` into `BuildManager` here is a nice improvement in test realism. Given you already construct `metrics` in this helper, you can pass it into the job manager instead of `nil` so job‑related metrics behavior is also covered in tests:

```diff
- peersUpdateManager := update_channel.NewPeersUpdateManager(nil)
- jobManager := job.NewJobManager(nil, store)
+ peersUpdateManager := update_channel.NewPeersUpdateManager(nil)
+ jobManager := job.NewJobManager(metrics, store)
  ...
- am, err := server.BuildManager(ctx, nil, store, networkMapController, jobManager, nil, "", &activity.InMemoryEventStore{}, geoMock, false, validatorMock, metrics, proxyController, settingsManager, permissionsManager, false)
+ am, err := server.BuildManager(ctx, nil, store, networkMapController, jobManager, nil, "", &activity.InMemoryEventStore{}, geoMock, false, validatorMock, metrics, proxyController, settingsManager, permissionsManager, false)
```

Also applies to: 53-79
management/server/account_test.go (1)
35-50: Account test manager now wires a JobManager; prefer passing metrics instead of nil

Importing `job` and passing a concrete `JobManager` into `BuildManager` in `createManager` matches the new account manager wiring and lets tests exercise job‑related paths. Since you already compute `metrics` in this helper, you can avoid a nil metrics field on the JobManager and better mirror production setup by doing:

```diff
- updateManager := update_channel.NewPeersUpdateManager(metrics)
- requestBuffer := NewAccountRequestBuffer(ctx, store)
- networkMapController := controller.NewController(ctx, store, metrics, updateManager, requestBuffer, MockIntegratedValidator{}, settingsMockManager, "netbird.cloud", port_forwarding.NewControllerMock(), &config.Config{})
- manager, err := BuildManager(ctx, nil, store, networkMapController, job.NewJobManager(nil, store), nil, "", eventStore, nil, false, MockIntegratedValidator{}, metrics, port_forwarding.NewControllerMock(), settingsMockManager, permissionsManager, false)
+ updateManager := update_channel.NewPeersUpdateManager(metrics)
+ requestBuffer := NewAccountRequestBuffer(ctx, store)
+ networkMapController := controller.NewController(ctx, store, metrics, updateManager, requestBuffer, MockIntegratedValidator{}, settingsMockManager, "netbird.cloud", port_forwarding.NewControllerMock(), &config.Config{})
+ manager, err := BuildManager(ctx, nil, store, networkMapController, job.NewJobManager(metrics, store), nil, "", eventStore, nil, false, MockIntegratedValidator{}, metrics, port_forwarding.NewControllerMock(), settingsMockManager, permissionsManager, false)
```

Also applies to: 2931-2969
management/server/route_test.go (1)
22-34: Route tests correctly inject a JobManager; reuse metrics for consistency

The additional `job` import and passing `job.NewJobManager(nil, store)` into `BuildManager` in `createRouterManager` are aligned with the new constructor and ensure route tests run with job support enabled. Since this helper already initializes `metrics`, you can tighten the fidelity of the test setup by wiring it into the JobManager too:

```diff
- updateManager := update_channel.NewPeersUpdateManager(metrics)
- requestBuffer := NewAccountRequestBuffer(ctx, store)
- networkMapController := controller.NewController(ctx, store, metrics, updateManager, requestBuffer, MockIntegratedValidator{}, settingsMockManager, "netbird.selfhosted", port_forwarding.NewControllerMock(), &config.Config{})
-
- am, err := BuildManager(context.Background(), nil, store, networkMapController, job.NewJobManager(nil, store), nil, "", eventStore, nil, false, MockIntegratedValidator{}, metrics, port_forwarding.NewControllerMock(), settingsMockManager, permissionsManager, false)
+ updateManager := update_channel.NewPeersUpdateManager(metrics)
+ requestBuffer := NewAccountRequestBuffer(ctx, store)
+ networkMapController := controller.NewController(ctx, store, metrics, updateManager, requestBuffer, MockIntegratedValidator{}, settingsMockManager, "netbird.selfhosted", port_forwarding.NewControllerMock(), &config.Config{})
+
+ am, err := BuildManager(context.Background(), nil, store, networkMapController, job.NewJobManager(metrics, store), nil, "", eventStore, nil, false, MockIntegratedValidator{}, metrics, port_forwarding.NewControllerMock(), settingsMockManager, permissionsManager, false)
```

Also applies to: 1258-1302
management/server/peer_test.go (1)
35-35: Prefer initializing `job.Manager` with real metrics in tests

The wiring of `BuildManager` looks correct (argument order matches the updated signature), but all these test setups construct the job manager with `job.NewJobManager(nil, s)` while `metrics` is already available. This leaves `job.Manager.metrics` nil in tests, which can (a) hide metric-related bugs and (b) potentially panic later if the manager starts using metrics without nil checks.

Consider passing the same metrics instance you already create:

```diff
- am, err := BuildManager(context.Background(), nil, s, networkMapController, job.NewJobManager(nil, s), nil, "", eventStore, nil, false, MockIntegratedValidator{}, metrics, port_forwarding.NewControllerMock(), settingsMockManager, permissionsManager, false)
+ am, err := BuildManager(context.Background(), nil, s, networkMapController, job.NewJobManager(metrics, s), nil, "", eventStore, nil, false, MockIntegratedValidator{}, metrics, port_forwarding.NewControllerMock(), settingsMockManager, permissionsManager, false)
```

Apply the same pattern to the other `BuildManager` invocations in this file for consistency.

Also applies to: 1296-1296, 1381-1381, 1534-1534, 1614-1614
management/server/nameserver_test.go (1)
19-19: Use the existing metrics instance when creating `JobManager`

`createNSManager` correctly wires the new `jobManager` dependency into `BuildManager`, but you're passing `job.NewJobManager(nil, store)` even though a real `metrics` instance is already created and used for `update_channel.NewPeersUpdateManager`.

To avoid a nil `metrics` inside `job.Manager` in tests (and potential surprises if metrics are used later), prefer:

```diff
- return BuildManager(context.Background(), nil, store, networkMapController, job.NewJobManager(nil, store), nil, "", eventStore, nil, false, MockIntegratedValidator{}, metrics, port_forwarding.NewControllerMock(), settingsMockManager, permissionsManager, false)
+ return BuildManager(context.Background(), nil, store, networkMapController, job.NewJobManager(metrics, store), nil, "", eventStore, nil, false, MockIntegratedValidator{}, metrics, port_forwarding.NewControllerMock(), settingsMockManager, permissionsManager, false)
```

Also applies to: 798-799
management/server/dns_test.go (1)
17-17: Align test `JobManager` construction with real metrics

`createDNSManager` correctly injects a `JobManager` into `BuildManager`, but uses `job.NewJobManager(nil, store)` despite having a `metrics` instance already.

For more realistic tests and to avoid a nil metrics field inside `job.Manager`, consider:

```diff
- return BuildManager(context.Background(), nil, store, networkMapController, job.NewJobManager(nil, store), nil, "", eventStore, nil, false, MockIntegratedValidator{}, metrics, port_forwarding.NewControllerMock(), settingsMockManager, permissionsManager, false)
+ return BuildManager(context.Background(), nil, store, networkMapController, job.NewJobManager(metrics, store), nil, "", eventStore, nil, false, MockIntegratedValidator{}, metrics, port_forwarding.NewControllerMock(), settingsMockManager, permissionsManager, false)
```

Also applies to: 229-230
management/server/management_proto_test.go (1)
32-32: JobManager wiring in tests looks correct; consider reusing metrics

The test correctly instantiates `job.Manager` and threads it through `BuildManager` and `nbgrpc.NewServer`, matching the new signatures. To exercise metrics inside the job manager during tests, you might consider passing the `metrics` instance instead of `nil` when creating `jobManager`, but that's optional.

Also applies to: 342-343, 369-371, 380-381
client/cmd/testutil_test.go (1)
19-19: Consistent JobManager setup in test helper; minor polish possible

The JobManager is correctly created and passed into `BuildManager` and `nbgrpc.NewServer`, aligning with the new signatures. Two small nits you could optionally address later:

- Pass the `metrics` instance to `job.NewJobManager` instead of `nil` if you want job metrics in tests.
- The `if err != nil { return nil, nil }` check after `eventStore := &activity.InMemoryEventStore{}` is dead code and can be removed.

Also applies to: 92-96, 123-124, 129-130
management/server/account.go (1)
18-18: JobManager injection is clean; clarify non‑nil contract

Adding `jobManager *job.Manager` to `DefaultAccountManager` and threading it via `BuildManager` is consistent and keeps wiring explicit. However, methods like `CreatePeerJob` dereference `am.jobManager` without nil checks, so `BuildManager` effectively requires a non‑nil JobManager in all call sites. Consider either:

- Documenting that `jobManager` must be non‑nil (and enforcing via tests), or
- Adding a defensive nil check that returns a clear `status.Internal`/`status.PreconditionFailed` error instead of panicking if it's miswired.

Would you like a small `ast-grep`/`rg` script to verify that all `BuildManager` call sites now pass a non‑nil JobManager?

Also applies to: 68-76, 179-197, 203-224
management/server/management_test.go (1)
31-31: Test server wiring with JobManager is correct

The test server now correctly constructs a JobManager and passes it through to `BuildManager` and `nbgrpc.NewServer`, matching the new APIs. As with other tests, you could optionally pass `metrics` instead of `nil` into `job.NewJobManager` if you want job metrics exercised, but the current setup is functionally fine.

Also applies to: 183-185, 212-228, 235-247
management/server/http/handlers/peers/peers_handler.go (1)
31-41: New peer job HTTP endpoints are well‑structured; tighten validation and rely on fixed ownership check

The new `/peers/{peerId}/jobs` and `/peers/{peerId}/jobs/{jobId}` handlers are consistent with existing patterns (auth from context, `util.WriteError`/`WriteJSONObject`, and conversion helpers). A few small points:

- Once `GetPeerJobByID` is fixed to assert `job.PeerID == peerID` (see account manager comment), these handlers will correctly enforce per‑peer job ownership.
- For consistency with other handlers in this file (e.g. `HandlePeer`, `GetAccessiblePeers`), you may want to add an explicit empty‑`peerId` check in `CreateJob`, `ListJobs`, and `GetJob` to return a clear `InvalidArgument` instead of silently passing `""` through (see the sketch below).
- `toSingleJobResponse` cleanly maps the domain `Job` to the HTTP `JobResponse`, including optional `FailedReason` and typed `Workload`; that looks good.

Overall, the HTTP surface for jobs is in good shape once the backend ownership check is tightened.

Also applies to: 51-142, 618-638
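For the empty‑`peerId` point, a guard along these lines would mirror the existing handlers (sketch only: the `mux.Vars` lookup, the `util.WriteError` argument order, and the status helper usage are assumptions based on the patterns this review references):

```go
// Reject requests with an empty peer ID up front, as other peer handlers do.
peerID := mux.Vars(r)["peerId"]
if peerID == "" {
	util.WriteError(r.Context(), status.Errorf(status.InvalidArgument, "invalid peer ID"), w)
	return
}
```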
shared/management/proto/management.proto (1)
52-53: Refine Job proto types for clarity and forward compatibility

The Job RPC and related messages look structurally sound and align with the existing EncryptedMessage pattern. A few tweaks would make them easier to use:

- `JobResponse.Reason` as `bytes` is unusual for a human-readable explanation; a `string` would be more ergonomic unless you explicitly plan to ship arbitrary binary blobs.
- `JobStatus` only has `unknown_status`, `succeeded`, `failed`. If you intend to reflect persistent job lifecycle (pending/running vs completed) it may be worth adding an explicit in‑progress state instead of overloading `unknown_status` as a placeholder.
- Consider documenting expected encoding/format for `JobRequest.ID`/`JobResponse.ID` (stringified UUID vs raw bytes) so store/types.Job integration remains consistent.

Functionally this is fine, these are just protocol polish suggestions.

Also applies to: 66-89
client/internal/debug/upload.go (1)
15-101: Upload helper is solid; consider small robustness improvements

The end-to-end flow (get presigned URL → size check → PUT upload) looks good and side‑effect free. A few minor refinements:

- In `upload`, prefer `%w` when wrapping the HTTP error to preserve the error chain: `return fmt.Errorf("upload failed: %w", err)`
- `getUploadURL` builds the query as `url+"?id="+id`. If `url` might ever include its own query string, using `net/url` to append `id` would be safer (see the sketch below).
- Both `getUploadURL` and `upload` strictly require `StatusOK`. If the upload service ever returns another 2xx (e.g. 201/204), this will be treated as failure. If you control that service and guarantee 200, current code is fine; otherwise consider `if putResp.StatusCode/100 != 2`.

None of these are blockers; the current implementation should work as intended.
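For the query construction, something along these lines with `net/url` would be safer (sketch only; `mgmtURL` and `id` stand in for whatever `getUploadURL` receives today):

```go
// Build the presigned-URL request target via net/url instead of string
// concatenation, so a base URL that already carries a query string still works.
u, err := url.Parse(mgmtURL)
if err != nil {
	return "", fmt.Errorf("parse upload URL: %w", err)
}
q := u.Query()
q.Set("id", id)
u.RawQuery = q.Encode()
// u.String() is then used as the GET target for the presigned-URL request.
```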
client/server/debug.go (1)
27-39: Use the incoming context (and possibly avoid holding the mutex) during upload

The switch to `debug.UploadDebugBundle` and `LogPath: s.logFile` looks correct.

Two follow‑up improvements worth considering:

- `DebugBundle` ignores the incoming context and calls `UploadDebugBundle` with `context.Background()`. That means client cancellation/timeouts won't stop the upload. Renaming the parameter to `ctx context.Context` and passing `ctx` through would make this RPC better behaved under cancellation.
- The upload is done while `s.mutex` is held. For large bundles or slow networks this can block other server operations behind this lock. If state safety allows it, you might generate the bundle under the lock, then release the mutex before starting the network upload.

Behavior is fine for a rarely used debug path, but these tweaks would make it more responsive.

Also applies to: 49-57
management/server/account/manager.go (1)
127-129: Job APIs on Manager look good; consider pagination/filters

The new job methods are consistent with the rest of the Manager interface (accountID/userID/peerID ordering, types.Job usage).

Two design points to keep in mind for implementations:

- `GetAllPeerJobs` can grow unbounded over time; if jobs are expected to accumulate, consider adding pagination and/or server‑side filtering (e.g. by status or time range) at some point.
- Ensure implementations consistently enforce `userID` authorization, since these methods expose per‑peer operational history.

No changes required here, just points to watch in the concrete manager.
client/internal/engine_test.go (1)
1591-1592: JobManager wiring in startManagement test mirrors production; consider reusing metrics

The new `jobManager := job.NewJobManager(nil, store)` and its injection into `BuildManager` and `nbgrpc.NewServer` properly mirror the production wiring, so tests now exercise the job pipeline end‑to‑end.

Since you already create `metrics, err := telemetry.NewDefaultAppMetrics(...)` a few lines later, you may want to:

- Construct `metrics` first, then
- Call `job.NewJobManager(metrics, store)`

This would keep metrics behavior consistent between the JobManager and the rest of the management stack even in tests. Not required for correctness, but a bit cleaner.

Also applies to: 1622-1623, 1628-1629
shared/management/http/api/openapi.yml (2)
41-121: Align bundle/workload schema with actual bundle generator config

The Bundle/Workload schemas look reasonable, but there are a couple of potential contract mismatches to double‑check:

- `BundleParameters` exposes `bundle_for` and `bundle_for_time` while the current `debug.BundleConfig` only has `Anonymize`, `IncludeSystemInfo`, and `LogFileCount`. There is no way for callers to control `IncludeSystemInfo`, and the semantics of `bundle_for`/`bundle_for_time` vs. what the client actually does (currently just a wait + regular bundle) should be clarified or adjusted so the API is not misleading.
- You're using a discriminator on `WorkloadRequest`/`WorkloadResponse` with `propertyName: type`. Make sure the generated clients you care about support this pattern where the discriminator property lives only in the concrete types (`BundleWorkloadRequest`/`Response`) and not also declared at the base level; some generators are picky here.

I'd suggest either:

- Extending `BundleConfig` + implementation to truly honor all declared parameters, or
- Tightening the OpenAPI description and fields to only what is actually used today, to avoid surprising API consumers.

35-38: Clarify experimental nature and pagination/filtering for Jobs endpoints

The `Jobs` tag is marked `x-experimental: true`, but the individual operations are not. If you rely on tooling that inspects operation/vendor extensions rather than tags, you may want to duplicate `x-experimental: true` on the new `/api/peers/{peerId}/jobs` and `/api/peers/{peerId}/jobs/{jobId}` operations.

Additionally, listing jobs currently returns an unbounded array with no filtering or pagination parameters. If job counts can become large per peer, it's worth at least documenting ordering (e.g. newest first) and planning for `page`/`page_size`/`status` filters, even if you defer implementing them now.

Also applies to: 2353-2456
shared/management/client/client_test.go (1)
25-26: JobManager wiring in tests looks correct; consider reusing metrics if you need job metrics

The new `jobManager := job.NewJobManager(nil, store)` and its injection into both `BuildManager` and `nbgrpc.NewServer` align with the updated constructor signatures and keep the tests realistic.

If you ever want to exercise job‑related metrics in tests, you could pass the already‑created `metrics` instead of `nil` to `NewJobManager`; otherwise this setup is fine as‑is for functional tests.

Also applies to: 76-77, 123-124, 130-132
management/server/store/sql_store.go (1)
136-207: Job CRUD helpers mostly fine; consider tightening error handling and update scope

Functionality looks correct overall (ID scoping, account/peer filters, status transitions), but a few details could be improved:

- `CompletePeerJob` updates by `id` only; since `ID` is the primary key this is safe, but for consistency with `GetPeerJobByID` you may want to include `account_id` in the `WHERE` as well.
- `CreatePeerJob`, `CompletePeerJob`, and `MarkPendingJobsAsFailed` correctly wrap errors with `status.Internal`, but `GetPeerJobByID`/`GetPeerJobs` return raw GORM errors on internal failure. Consider wrapping those in `status.Errorf(status.Internal, ...)` as done elsewhere in this store so callers don't depend on storage-specific errors (see the sketch below).
- If `types.Job.ApplyResponse` only populates a subset of fields, `CompletePeerJob`'s `Updates(job)` call is fine; if it ever starts zeroing other fields, you may want to switch to an explicit field list/map to avoid unintended overwrites.
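A sketch of the wrapping suggested in the second point, assuming the usual GORM result handling in this store (`status.Internal` is the same helper the other job methods already use; the exact query shape is not shown here):

```go
// Wrap internal storage failures like the rest of sql_store.go, while letting
// "record not found" stay distinguishable for callers.
if err := result.Error; err != nil {
	if errors.Is(err, gorm.ErrRecordNotFound) {
		return nil, err // callers can map this to their own not-found error
	}
	return nil, status.Errorf(status.Internal, "failed to get peer job from store: %v", err)
}
```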
management/internals/shared/grpc/server.go (1)

184-219: Job stream handling is generally sound; consider surfacing send failures

The new `Job` RPC and helpers hold up well:

- Handshake reads a single encrypted `JobRequest` and reuses the existing `parseRequest` path, so message framing and crypto stay consistent with Sync/Login.
- Account/peer lookup mirrors the Sync flow and correctly maps missing peers to `PermissionDenied`/`Unauthenticated`.
- `CreateJobChannel` + deferred `CloseChannel` provide a clear per-peer lifecycle, and `startResponseReceiver` cleanly demultiplexes `JobResponse` messages into the job manager.

One point to consider tightening:

- In `sendJobsLoop`, any failure in `sendJob` is logged and then the method returns `nil`, so the gRPC stream terminates with an OK status even if job delivery to the client failed. For parity with `handleUpdates`/`sendUpdate`, you may want to return the error (or a mapped status error) so callers can observe transport failures on the Job stream.

Also applies to: 333-397, 450-467
management/server/job/manager.go (2)
44-151: Pending‑job cleanup semantics are coarse; minor concurrency detail worth double‑checking

Overall behavior is coherent (fail stuck jobs, enqueue new ones, complete on response), but a few aspects could be sharpened:

- `CreateJobChannel` calls `MarkPendingJobsAsFailed` for `(accountID, peerID)` before taking the lock. That's fine functionally, but means "stuck too long" in the DB is actually "any pending job when a fresh channel is created" — if you later add explicit TTL semantics, you may want a more targeted query.
- In `SendJob`, you look up the `Channel` under `RLock` and then call `ch.AddEvent` after releasing the lock. If `CreateJobChannel` or `CloseChannel` can call `ch.Close()` concurrently, make sure `Channel.AddEvent` is written to safely detect a closed channel and return `ErrJobChannelClosed` rather than panic on send to a closed chan.
- Both `cleanup` and `CloseChannel` use `MarkPendingJobsAsFailed(accountID, peerID, reason)`, which marks all pending DB jobs for that peer, not just the specific `jobID`. That's acceptable if you only ever have a single pending job per peer, but if you later support multiple in‑flight jobs, you'll likely want a job‑scoped failure path in the store.
- In `CloseChannel`, the loop over `jm.pending` calls `MarkPendingJobsAsFailed` once per event for the same peer; a minor optimization is to call it once per peerID, then delete all relevant `pending` entries.

These are mostly behavioral clarifications and small optimizations rather than blockers.
90-117: HandleResponse flow is correct but could simplify error‑response coupling

`HandleResponse` correctly:

- Looks up the pending event by `jobID`.
- Builds a `types.Job` from the response via `ApplyResponse`.
- Calls `Store.CompletePeerJob` and always removes the entry from `pending`.

Given that the `event.Response` field is only set when `CompletePeerJob` succeeds and then immediately dropped from `pending`, you could omit `event.Response` entirely or set it before calling the store to simplify reasoning. Current behavior is valid; this is mainly an internal API cleanup opportunity.

client/status/status.go (1)
122-169: Overview mapping from FullStatus looks correct; consider a nil guard

Switching `ConvertToStatusOutputOverview` to use `*proto.FullStatus` directly and threading in `daemonVersion` matches how the proto is structured and keeps the output fields aligned with the underlying status (management, signal, local peer, relays, DNS, events, SSH).

One small robustness improvement: if `pbFullStatus` can ever be nil (e.g., from an older daemon or error path), `pbFullStatus.Get...()` calls will panic. A quick upfront check like `if pbFullStatus == nil { return OutputOverview{} }` would make the function safer against unexpected inputs.
management/server/types/job.go (2)
105-138: Minor: preserve wrapped error instead of using `err.Error()`

In `BuildWorkloadResponse`, you wrap bundle build errors as:

```go
if err := j.buildBundleResponse(&wl); err != nil {
	return nil, status.Errorf(status.InvalidArgument, err.Error())
}
```

Using `err.Error()` as the format string loses the original error as a distinct argument and makes further wrapping/debugging slightly harder. Consider:

```diff
- return nil, status.Errorf(status.InvalidArgument, err.Error())
+ return nil, status.Errorf(status.InvalidArgument, "%v", err)
```

to keep the original error value intact and consistent with the rest of the file.
140-162: Bundle parameter validation is clear; consider tightening semantics only if needed

The validation for `BundleForTime` (1–5 minutes when `BundleFor` is true) and `LogFileCount` (1–1000) is straightforward and matches the API docs. You also normalize stored parameters to the `BundleParameters` JSON and initialize `Result` to `{}`, which keeps `BuildWorkloadResponse` happy for pending/failed jobs.

If you ever want to hard‑fail obviously nonsensical inputs when `BundleFor` is false (e.g., negative `BundleForTime`), you could extend the check here, but that's optional and not strictly required by the current contract.

shared/management/http/api/types.gen.go (1)
2112-2169: WorkloadRequest helpers match server usage; consider using the WorkloadType constant

The `As*`/`From*`/`Merge*` helpers and `Discriminator`/`ValueByDiscriminator` for `WorkloadRequest` are consistent with how `NewJob` and `validateAndBuildBundleParams` consume the workload. One small optional improvement would be to use `WorkloadTypeBundle` instead of the literal `"bundle"` to avoid drift if the constant ever changes:

```diff
-func (t *WorkloadRequest) FromBundleWorkloadRequest(v BundleWorkloadRequest) error {
-	v.Type = "bundle"
+func (t *WorkloadRequest) FromBundleWorkloadRequest(v BundleWorkloadRequest) error {
+	v.Type = WorkloadTypeBundle
```

Same for `MergeBundleWorkloadRequest` and the switch in `ValueByDiscriminator`.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (4)
- client/proto/daemon.pb.go is excluded by !**/*.pb.go
- go.sum is excluded by !**/*.sum
- shared/management/proto/management.pb.go is excluded by !**/*.pb.go
- shared/management/proto/management_grpc.pb.go is excluded by !**/*.pb.go
📒 Files selected for processing (52)
.github/workflows/wasm-build-validation.yml (1 hunks), client/cmd/debug.go (1 hunks), client/cmd/status.go (1 hunks), client/cmd/testutil_test.go (3 hunks), client/cmd/up.go (1 hunks), client/embed/embed.go (1 hunks), client/internal/connect.go (7 hunks), client/internal/debug/debug.go (7 hunks), client/internal/debug/upload.go (1 hunks), client/internal/debug/upload_test.go (2 hunks), client/internal/engine.go (13 hunks), client/internal/engine_test.go (9 hunks), client/jobexec/executor.go (1 hunks), client/proto/daemon.proto (0 hunks), client/server/debug.go (2 hunks), client/server/server.go (4 hunks), client/server/server_test.go (3 hunks), client/status/status.go (4 hunks), client/status/status_test.go (1 hunks), client/ui/debug.go (2 hunks), go.mod (2 hunks), management/internals/server/boot.go (1 hunks), management/internals/server/controllers.go (3 hunks), management/internals/server/modules.go (1 hunks), management/internals/shared/grpc/server.go (11 hunks), management/server/account.go (4 hunks), management/server/account/manager.go (1 hunks), management/server/account_test.go (2 hunks), management/server/activity/codes.go (2 hunks), management/server/dns_test.go (2 hunks), management/server/http/handlers/peers/peers_handler.go (3 hunks), management/server/http/testing/testing_tools/channel/channel.go (3 hunks), management/server/job/channel.go (1 hunks), management/server/job/manager.go (1 hunks), management/server/management_proto_test.go (4 hunks), management/server/management_test.go (4 hunks), management/server/mock_server/account_mock.go (2 hunks), management/server/nameserver_test.go (2 hunks), management/server/peer.go (1 hunks), management/server/peer_test.go (5 hunks), management/server/route_test.go (2 hunks), management/server/store/sql_store.go (3 hunks), management/server/store/store.go (1 hunks), management/server/types/job.go (1 hunks), shared/management/client/client.go (1 hunks), shared/management/client/client_test.go (3 hunks), shared/management/client/grpc.go (8 hunks), shared/management/client/mock.go (2 hunks), shared/management/http/api/generate.sh (1 hunks), shared/management/http/api/openapi.yml (2 hunks), shared/management/http/api/types.gen.go (8 hunks), shared/management/proto/management.proto (2 hunks)
💤 Files with no reviewable changes (1)
- client/proto/daemon.proto
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-11-13T00:29:53.247Z
Learnt from: lixmal
Repo: netbirdio/netbird PR: 4015
File: client/cmd/ssh_exec_unix.go:53-74
Timestamp: 2025-11-13T00:29:53.247Z
Learning: In client/ssh/server/executor_unix.go, the method ExecuteWithPrivilegeDrop(ctx context.Context, config ExecutorConfig) has a void return type (no error return). It handles failures by exiting the process directly with appropriate exit codes rather than returning errors to the caller.
Applied to files:
client/jobexec/executor.go
📚 Learning: 2025-11-14T13:05:31.729Z
Learnt from: lixmal
Repo: netbirdio/netbird PR: 4015
File: client/ssh/server/userswitching_windows.go:89-139
Timestamp: 2025-11-14T13:05:31.729Z
Learning: In client/ssh/server/executor_windows.go, the WindowsExecutorConfig struct's Pty, PtyWidth, and PtyHeight fields are intentionally left unused for now and will be implemented in a future update.
Applied to files:
client/jobexec/executor.go
🧬 Code graph analysis (42)
shared/management/client/client.go (3)
management/server/types/job.go (1)
Job(34-58)shared/management/http/api/types.gen.go (2)
JobRequest(708-710)JobResponse(713-721)shared/management/proto/management.pb.go (6)
JobRequest(388-398)JobRequest(413-413)JobRequest(428-430)JobResponse(463-475)JobResponse(490-490)JobResponse(505-507)
management/internals/server/controllers.go (3)
management/internals/server/server.go (1)
BaseServer(45-68)management/server/job/manager.go (2)
Manager(23-30)NewJobManager(32-42)management/internals/server/container.go (1)
Create(6-10)
client/status/status_test.go (1)
client/status/status.go (1)
ConvertToStatusOutputOverview(122-169)
client/internal/debug/upload_test.go (2)
client/internal/debug/upload.go (1)
UploadDebugBundle(17-28)upload-server/types/upload.go (1)
GetURLPath(9-9)
management/server/store/store.go (1)
management/server/types/job.go (1)
Job(34-58)
management/internals/server/boot.go (1)
management/internals/shared/grpc/server.go (1)
NewServer(84-147)
client/internal/debug/upload.go (1)
upload-server/types/upload.go (3)
GetURLResponse(15-18)ClientHeader(5-5)ClientHeaderValue(7-7)
client/cmd/up.go (1)
util/log.go (1)
FindFirstLogPath(77-84)
management/server/account_test.go (2)
management/server/account.go (1)
BuildManager(180-268)management/server/job/manager.go (1)
NewJobManager(32-42)
management/server/peer.go (7)
management/server/account.go (1)
DefaultAccountManager(68-113)management/server/types/job.go (2)
Job(34-58)Workload(60-64)management/server/permissions/modules/module.go (1)
Peers(7-7)management/server/permissions/operations/operation.go (1)
Delete(9-9)shared/management/status/error.go (5)
NewPermissionValidationError(213-215)NewPermissionDeniedError(209-211)NewPeerNotPartOfAccountError(105-107)Errorf(70-75)Type(46-46)management/server/store/store.go (3)
Store(50-211)LockingStrengthNone(47-47)LockingStrengthUpdate(43-43)management/server/activity/codes.go (1)
JobCreatedByUser(183-183)
management/server/http/testing/testing_tools/channel/channel.go (2)
management/server/job/manager.go (1)
NewJobManager(32-42)management/server/account.go (1)
BuildManager(180-268)
management/internals/server/modules.go (1)
management/server/account.go (1)
BuildManager(180-268)
shared/management/client/mock.go (3)
shared/management/http/api/types.gen.go (2)
JobRequest(708-710)JobResponse(713-721)shared/management/proto/management.pb.go (6)
JobRequest(388-398)JobRequest(413-413)JobRequest(428-430)JobResponse(463-475)JobResponse(490-490)JobResponse(505-507)management/server/types/job.go (1)
Job(34-58)
management/server/account/manager.go (1)
management/server/types/job.go (1)
Job(34-58)
client/embed/embed.go (1)
client/internal/connect.go (1)
NewConnectClient(51-62)
client/cmd/status.go (1)
client/status/status.go (1)
ConvertToStatusOutputOverview(122-169)
management/server/management_test.go (1)
management/server/job/manager.go (1)
NewJobManager(32-42)
management/server/peer_test.go (2)
management/server/account.go (1)
BuildManager(180-268)management/server/job/manager.go (1)
NewJobManager(32-42)
client/server/server.go (1)
client/status/status.go (1)
ToProtoFullStatus(549-635)
management/server/mock_server/account_mock.go (1)
management/server/types/job.go (1)
Job(34-58)
management/server/nameserver_test.go (2)
management/server/account.go (1)
BuildManager(180-268)management/server/job/manager.go (1)
NewJobManager(32-42)
management/server/job/channel.go (1)
management/server/job/manager.go (1)
Event(17-21)
management/server/http/handlers/peers/peers_handler.go (4)
management/server/context/auth.go (1)
GetUserAuthFromContext(25-30)shared/management/http/util/util.go (3)
WriteError(84-120)WriteErrorResponse(70-80)WriteJSONObject(27-35)shared/management/http/api/types.gen.go (3)
JobRequest(708-710)JobResponse(713-721)JobResponseStatus(724-724)management/server/types/job.go (3)
NewJob(67-103)Job(34-58)Workload(60-64)
client/jobexec/executor.go (3)
client/internal/debug/debug.go (3)
GeneratorDependencies(238-243)BundleConfig(232-236)NewBundleGenerator(245-264)client/internal/debug/upload.go (1)
UploadDebugBundle(17-28)upload-server/types/upload.go (1)
DefaultBundleURL(11-11)
client/server/debug.go (1)
client/internal/debug/upload.go (1)
UploadDebugBundle(17-28)
client/internal/engine.go (6)
client/internal/profilemanager/config.go (1)
Config(89-160)client/jobexec/executor.go (3)
Executor(23-24)NewExecutor(26-28)ErrJobNotImplemented(20-20)shared/management/client/client.go (1)
Client(14-27)shared/management/proto/management.pb.go (21)
JobRequest(388-398)JobRequest(413-413)JobRequest(428-430)JobResponse(463-475)JobResponse(490-490)JobResponse(505-507)JobStatus_failed(30-30)JobRequest_Bundle(457-459)JobRequest_Bundle(461-461)JobStatus_succeeded(29-29)BundleParameters(554-563)BundleParameters(578-578)BundleParameters(593-595)JobResponse_Bundle(548-550)JobResponse_Bundle(552-552)SyncResponse(721-738)SyncResponse(753-753)SyncResponse(768-770)BundleResult(625-631)BundleResult(646-646)BundleResult(661-663)client/internal/state.go (1)
CtxGetState(31-33)client/internal/debug/debug.go (2)
GeneratorDependencies(238-243)BundleConfig(232-236)
management/internals/shared/grpc/server.go (6)
management/server/account/manager.go (1)
Manager(27-131)management/server/job/manager.go (2)
Manager(23-30)Event(17-21)shared/management/proto/management_grpc.pb.go (1)
ManagementService_JobServer(427-431)shared/management/proto/management.pb.go (9)
JobRequest(388-398)JobRequest(413-413)JobRequest(428-430)JobResponse(463-475)JobResponse(490-490)JobResponse(505-507)EncryptedMessage(322-333)EncryptedMessage(348-348)EncryptedMessage(363-365)management/server/job/channel.go (2)
Channel(18-21)ErrJobChannelClosed(15-15)encryption/message.go (1)
EncryptMessage(10-24)
management/server/store/sql_store.go (1)
management/server/types/job.go (3)
Job(34-58)JobStatusPending(18-18)JobStatusFailed(20-20)
management/server/job/manager.go (3)
shared/management/proto/management.pb.go (6)
JobRequest(388-398)JobRequest(413-413)JobRequest(428-430)JobResponse(463-475)JobResponse(490-490)JobResponse(505-507)management/server/job/channel.go (2)
Channel(18-21)NewChannel(23-29)management/server/types/job.go (1)
Job(34-58)
shared/management/client/client_test.go (3)
management/server/job/manager.go (1)
NewJobManager(32-42)management/server/account.go (1)
BuildManager(180-268)management/internals/shared/grpc/server.go (1)
NewServer(84-147)
client/internal/connect.go (2)
client/internal/engine.go (2)
NewEngine(235-276)EngineConfig(83-142)client/internal/profilemanager/config.go (1)
Config(89-160)
management/server/dns_test.go (2)
management/server/account.go (1)
BuildManager(180-268)management/server/job/manager.go (1)
NewJobManager(32-42)
shared/management/client/grpc.go (3)
shared/management/proto/management.pb.go (13)
JobRequest(388-398)JobRequest(413-413)JobRequest(428-430)JobResponse(463-475)JobResponse(490-490)JobResponse(505-507)EncryptedMessage(322-333)EncryptedMessage(348-348)EncryptedMessage(363-365)JobStatus_failed(30-30)SyncResponse(721-738)SyncResponse(753-753)SyncResponse(768-770)shared/management/proto/management_grpc.pb.go (2)
ManagementService_JobClient(169-173)ManagementService_SyncClient(89-92)encryption/message.go (2)
EncryptMessage(10-24)DecryptMessage(27-40)
management/server/types/job.go (3)
shared/management/proto/management.pb.go (22)
JobStatus(25-25)JobStatus(57-59)JobStatus(61-63)JobStatus(70-72)JobRequest(388-398)JobRequest(413-413)JobRequest(428-430)BundleParameters(554-563)BundleParameters(578-578)BundleParameters(593-595)BundleResult(625-631)BundleResult(646-646)BundleResult(661-663)JobResponse(463-475)JobResponse(490-490)JobResponse(505-507)JobStatus_succeeded(29-29)JobStatus_failed(30-30)JobResponse_Bundle(548-550)JobResponse_Bundle(552-552)JobRequest_Bundle(457-459)JobRequest_Bundle(461-461)shared/management/http/api/types.gen.go (8)
JobRequest(708-710)WorkloadResponse(1948-1950)BundleParameters(361-373)BundleResult(376-378)BundleWorkloadResponse(391-399)WorkloadTypeBundle(195-195)WorkloadRequest(1943-1945)JobResponse(713-721)shared/management/status/error.go (4)
Errorf(70-75)BadRequest(36-36)InvalidArgument(27-27)Error(54-57)
client/server/server_test.go (3)
management/server/job/manager.go (1)
NewJobManager(32-42)management/server/account.go (1)
BuildManager(180-268)management/internals/shared/grpc/server.go (1)
NewServer(84-147)
client/internal/debug/debug.go (4)
util/log.go (1)
SpecialLogs(25-28)client/internal/profilemanager/profilemanager.go (1)
NewProfileManager(56-58)client/status/status.go (3)
ToProtoFullStatus(549-635)ConvertToStatusOutputOverview(122-169)ParseToFullDetailSummary(532-547)version/version.go (1)
NetbirdVersion(18-20)
client/internal/engine_test.go (5)
client/internal/engine.go (2)
NewEngine(235-276)EngineConfig(83-142)shared/management/client/mock.go (1)
MockClient(13-24)management/server/job/manager.go (1)
NewJobManager(32-42)management/server/account.go (1)
BuildManager(180-268)management/internals/shared/grpc/server.go (1)
NewServer(84-147)
shared/management/http/api/types.gen.go (2)
shared/management/proto/management.pb.go (9)
BundleParameters(554-563)BundleParameters(578-578)BundleParameters(593-595)BundleResult(625-631)BundleResult(646-646)BundleResult(661-663)JobRequest(388-398)JobRequest(413-413)JobRequest(428-430)management/server/types/job.go (1)
Workload(60-64)
management/server/route_test.go (2)
management/server/account.go (1)
BuildManager(180-268)management/server/job/manager.go (1)
NewJobManager(32-42)
client/status/status.go (2)
client/proto/daemon.pb.go (21)
FullStatus(1994-2008)FullStatus(2021-2021)FullStatus(2036-2038)ManagementState(1682-1689)ManagementState(1702-1702)ManagementState(1717-1719)SignalState(1621-1628)SignalState(1641-1641)SignalState(1656-1658)LocalPeerState(1528-1539)LocalPeerState(1552-1552)LocalPeerState(1567-1569)PeerState(1347-1369)PeerState(1382-1382)PeerState(1397-1399)RelayState(1743-1750)RelayState(1763-1763)RelayState(1778-1780)NSGroupState(1803-1811)NSGroupState(1824-1824)NSGroupState(1839-1841)client/internal/peer/status.go (6)
FullStatus(151-161)ManagementState(128-132)SignalState(121-125)LocalPeerState(106-112)RosenpassState(135-138)NSGroupState(142-148)
management/server/management_proto_test.go (3)
management/server/job/manager.go (1)
NewJobManager(32-42)management/server/account.go (1)
BuildManager(180-268)management/internals/shared/grpc/server.go (1)
NewServer(84-147)
client/cmd/testutil_test.go (3)
management/server/job/manager.go (1)
NewJobManager(32-42)management/server/account.go (1)
BuildManager(180-268)management/internals/shared/grpc/server.go (1)
NewServer(84-147)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: release_ui
- GitHub Check: release_ui_darwin
- GitHub Check: release
- GitHub Check: iOS / Build
- GitHub Check: Android / Build
- GitHub Check: JS / Lint
- GitHub Check: Windows
- GitHub Check: Linux
- GitHub Check: Darwin
- GitHub Check: Client / Unit
- GitHub Check: Client / Unit
- GitHub Check: Build Cache
- GitHub Check: Check External GPL/AGPL Licenses
- GitHub Check: Client / Unit
```go
const (
	MaxBundleWaitTime = 60 * time.Minute // maximum wait time for bundle generation (1 hour)
)
```
Respect context cancellation during pre‑bundle wait
waitFor currently swallows ctx.Done() and only logs:
```go
select {
case <-time.After(duration):
case <-ctx.Done():
	log.Infof("wait cancelled: %v", ctx.Err())
}
```

`BundleJob` ignores whether the wait was cancelled and always proceeds to generate and upload the bundle. That means a cancelled job (e.g. client disconnected or management cancelled the job) will still cause heavy work on the client.
To make bundle jobs behave like other context‑aware operations, consider letting `waitFor` return an `error` (or bool) and, in `BundleJob`, short‑circuiting if `ctx.Err() != nil` after the wait:
```diff
-func waitFor(ctx context.Context, duration time.Duration) {
+func waitFor(ctx context.Context, duration time.Duration) error {
 	log.Infof("wait for %v minutes before executing debug bundle", duration.Minutes())
 	select {
 	case <-time.After(duration):
-	case <-ctx.Done():
-		log.Infof("wait cancelled: %v", ctx.Err())
+		return nil
+	case <-ctx.Done():
+		log.Infof("wait cancelled: %v", ctx.Err())
+		return ctx.Err()
 	}
+	return nil
 }

-	if waitForDuration > 0 {
-		waitFor(ctx, waitForDuration)
-	}
+	if waitForDuration > 0 {
+		if err := waitFor(ctx, waitForDuration); err != nil {
+			return "", err
+		}
+	}
```

This keeps the behavior for successful waits while correctly aborting on cancellation.
Also applies to: 30-57, 59-66
```go
// todo consider the channel buffer size when we allow to run multiple jobs
const jobChannelBuffer = 1

var (
	ErrJobChannelClosed = errors.New("job channel closed")
)

type Channel struct {
	events chan *Event
	once   sync.Once
}
```
🧩 Analysis chain
Guard against send‑on‑closed channel and reconsider where timeout is enforced
Two concerns around Channel’s concurrency and semantics:
-
Potential panic on send after Close
Close()closesjc.events, andAddEventalways doesjc.events <- eventin theselectcase without checking for closure. IfClosecan race withAddEvent(which is likely when job streams disconnect while new jobs are being queued), a send on a closed channel will panic the process.
Consider adding internal state to mark the channel as closed and haveAddEventshort‑circuit withErrJobChannelClosed(or similar) instead of ever sending to a potentially closed channel, or ensure by design that only the sender closes and that no furtherAddEventcalls happen afterClose. -
Timeout at enqueue vs. job response (already noted by your TODO)
The currentresponseWaittimeout only limits how long you wait to enqueue into the buffered channel, not how long you wait for the peer’s job response. WithjobChannelBuffer = 1, this effectively prevents a second job from being queued while the first one is still pending, but does not clean uppendingjobs when a peer never responds. Moving timeout handling to the job lifecycle (pending → failed) in the manager would better match the comment.
I’d treat the send‑on‑closed‑channel risk as something to fix before merging; the timeout refactor can follow.
Also applies to: 31-41, 43-59
I'll examine the full implementation to validate the concurrency concerns.
---
🏁 Script executed:
```shell
#!/bin/bash
# Retrieve the full channel.go file to examine Close(), AddEvent(), and Event() implementations
find . -name "channel.go" -path "*/job/*" -type f | head -5 | xargs -I {} cat {}
Length of output: 1183
Guard against send-on-closed channel and reconsider where timeout is enforced
Two concerns around Channel's concurrency and semantics:
-
Potential panic on send after Close
Close()usessync.Onceto closejc.eventsexactly once, but this only prevents double-close, not send-on-closed-channel panics. IfAddEvent()is called concurrently withClose(), theselectcase that sends tojc.eventscan execute after the channel is closed, causing a panic. Thesync.Onceguard does not prevent this race.
Add an internal closed flag (guarded by mutex) thatAddEventchecks before entering the select, or ensure by documented contract and design thatAddEventcannot be called afterClose(). This must be fixed before merging. -
Timeout at enqueue vs. job response (confirmed by TODO)
TheresponseWaittimeout only limits how long to wait to enqueue into the buffered channel, not how long to wait for the peer's job response. WithjobChannelBuffer = 1, this prevents a second job from being queued while the first is pending, but does not clean up pending jobs when a peer never responds. The TODO comment (lines 34–35) confirms this is a known issue. Move timeout handling to the job lifecycle (pending → failed) in the manager.
🤖 Prompt for AI Agents
In management/server/job/channel.go around lines 11–21, AddEvent can panic if it
races with Close because closing the channel is guarded only by sync.Once; add
an internal boolean closed guarded by a mutex (or RWMutex) that Close sets and
AddEvent checks before attempting to send, returning ErrJobChannelClosed if
closed to avoid send-on-closed panics; also make the send non-blocking/select
with a default path to return an error if the channel is closed or full.
Separately, move the timeout logic out of the enqueue path into the job
lifecycle in the manager: stop using responseWait to gate channel enqueueing
(keep enqueue quick or increase buffer), and implement a manager-side
pending→failed transition with a timer for each job so unresponsive peers are
timed out and cleaned up rather than relying on the channel buffer to enforce
timeouts.
func (am *DefaultAccountManager) CreatePeerJob(ctx context.Context, accountID, peerID, userID string, job *types.Job) error {
	// todo: Create permissions for job
	allowed, err := am.permissionsManager.ValidateUserPermissions(ctx, accountID, userID, modules.Peers, operations.Delete)
	if err != nil {
		return status.NewPermissionValidationError(err)
	}
	if !allowed {
		return status.NewPermissionDeniedError()
	}

	peerAccountID, err := am.Store.GetAccountIDByPeerID(ctx, store.LockingStrengthNone, peerID)
	if err != nil {
		return err
	}

	if peerAccountID != accountID {
		return status.NewPeerNotPartOfAccountError()
	}

	// check if peer connected
	if !am.jobManager.IsPeerConnected(peerID) {
		return status.Errorf(status.BadRequest, "peer not connected")
	}

	// check if already has pending jobs
	// todo: The job checks here are not protected. The user can run this function from multiple threads,
	// and each thread can think there is no job yet. This means entries in the pending job map will be overwritten,
	// and only one will be kept, but potentially another one will overwrite it in the queue.
	if am.jobManager.IsPeerHasPendingJobs(peerID) {
		return status.Errorf(status.BadRequest, "peer already has pending job")
	}

	jobStream, err := job.ToStreamJobRequest()
	if err != nil {
		return status.Errorf(status.BadRequest, "invalid job request %v", err)
	}

	// try sending job first
	if err := am.jobManager.SendJob(ctx, accountID, peerID, jobStream); err != nil {
		return status.Errorf(status.Internal, "failed to send job: %v", err)
	}

	var peer *nbpeer.Peer
	var eventsToStore func()

	// persist job in DB only if send succeeded
	err = am.Store.ExecuteInTransaction(ctx, func(transaction store.Store) error {
		peer, err = transaction.GetPeerByID(ctx, store.LockingStrengthUpdate, accountID, peerID)
		if err != nil {
			return err
		}
		if err := transaction.CreatePeerJob(ctx, job); err != nil {
			return err
		}

		jobMeta := map[string]any{
			"for_peer_name": peer.Name,
			"job_type":      job.Workload.Type,
		}

		eventsToStore = func() {
			am.StoreEvent(ctx, userID, peer.ID, accountID, activity.JobCreatedByUser, jobMeta)
		}
		return nil
	})
	if err != nil {
		return err
	}
	eventsToStore()
	return nil
}
Peer job management: ownership check bug, nil-safety, and ordering concerns
The new peer-job methods are a good start, but there are a few issues worth addressing:
1. Job ownership validation bug (path vs data mismatch)
   - GetPeerJobByID ignores the peerID argument when returning the job: job, err := am.Store.GetPeerJobByID(ctx, accountID, jobID) // no check that job.PeerID == peerID
   - The HTTP handler GET /peers/{peerId}/jobs/{jobId} passes both peerID and jobID, but the account manager currently only checks that the peer belongs to the account, not that the job belongs to that peer.
   - Result: a caller can request /peers/{peerA}/jobs/{jobOfPeerB} (within the same account) and see the job for peerB.
   - Fix: after loading the job, assert job.PeerID == peerID and return status.NotFound or status.PermissionDenied otherwise:

     func (am *DefaultAccountManager) GetPeerJobByID(ctx context.Context, accountID, userID, peerID, jobID string) (*types.Job, error) {
         // ... existing permission + peerAccountID checks ...
         job, err := am.Store.GetPeerJobByID(ctx, accountID, jobID)
         if err != nil {
             return nil, err
         }
         if job.PeerID != peerID {
             return nil, status.NewPeerNotPartOfAccountError()
         }
         return job, nil
     }

2. Potential nil-pointer on am.jobManager
   - CreatePeerJob assumes am.jobManager is non-nil: if !am.jobManager.IsPeerConnected(peerID) { ... }, if am.jobManager.IsPeerHasPendingJobs(peerID) { ... }, if err := am.jobManager.SendJob(...); err != nil { ... }
   - If BuildManager is ever called with a nil JobManager (e.g., in older tests or misconfigured environments), this will panic on first use of the job API.
   - Consider a defensive check up front: if am.jobManager == nil { return status.Errorf(status.Internal, "job manager is not configured") }
   - Or make the non-nil requirement explicit and enforced in construction/tests.
3. Send-before-persist ordering can desync DB and in-memory job state
   - The method currently: (1) checks for pending jobs, (2) builds jobStream and calls SendJob, which will enqueue/send work to the peer, and (3) only then writes the job to the DB via CreatePeerJob inside a transaction.
   - If step (3) fails (DB error, constraint, etc.), CreatePeerJob returns an error but the job has already been dispatched to the peer, potentially leaving a job executing on the client but absent from persistent storage and from /jobs listings.
   - Consider flipping the order: persist the job first (status = pending), then send it to the peer, and if sending fails, update the job to failed with an appropriate FailedReason. That keeps DB and runtime state consistent at the cost of occasionally having "failed to send" jobs in the DB (a rough sketch of this ordering follows at the end of this comment).
4. Permissions reused from Peers/Delete
   - Using modules.Peers + operations.Delete as the gate for creating and viewing jobs is strict but workable short-term. The TODO about introducing dedicated job permissions is valid; when you add those, these sites will be the key ones to update.

Overall structure is solid; tightening ownership checks and jobManager assumptions will make the new job API much safer.
Also applies to: 389-441, 618-638
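A rough sketch of the persist-first ordering from point 3; it reuses the calls visible in the diff above, while createAndDispatchJob and markJobFailed are hypothetical names introduced only for illustration:

```go
// Sketch only: persist the job as pending first, then dispatch it, and
// record a failure instead of leaving an orphaned in-flight job.
func (am *DefaultAccountManager) createAndDispatchJob(ctx context.Context, accountID, peerID string, job *types.Job) error {
	// 1. Persist the job (status = pending) inside a transaction.
	err := am.Store.ExecuteInTransaction(ctx, func(tx store.Store) error {
		return tx.CreatePeerJob(ctx, job)
	})
	if err != nil {
		return err
	}

	// 2. Build the stream request and try to send it to the peer.
	jobStream, err := job.ToStreamJobRequest()
	if err == nil {
		err = am.jobManager.SendJob(ctx, accountID, peerID, jobStream)
	}

	// 3. On failure, flip the persisted job to failed rather than deleting it,
	// so the DB and the peer's runtime state stay consistent.
	if err != nil {
		am.markJobFailed(ctx, accountID, job.ID, err) // hypothetical helper
		return err
	}
	return nil
}
```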
// ApplyResponse validates and maps a proto.JobResponse into the Job fields.
func (j *Job) ApplyResponse(resp *proto.JobResponse) error {
	if resp == nil {
		return nil
	}

	j.ID = string(resp.ID)
	now := time.Now().UTC()
	j.CompletedAt = &now
	switch resp.Status {
	case proto.JobStatus_succeeded:
		j.Status = JobStatusSucceeded
	case proto.JobStatus_failed:
		j.Status = JobStatusFailed
		if len(resp.Reason) > 0 {
			reason := string(resp.Reason)
			if len(resp.Reason) > MaxJobReasonLength {
				reason = string(resp.Reason[:MaxJobReasonLength]) + "... (truncated)"
			}
			j.FailedReason = fmt.Sprintf("Client error: '%s'", reason)
		}
		return nil
	default:
		return fmt.Errorf("unexpected job status: %v", resp.Status)
	}

	// Handle workload results (oneof)
	var err error
	switch r := resp.WorkloadResults.(type) {
	case *proto.JobResponse_Bundle:
		if j.Workload.Result, err = json.Marshal(r.Bundle); err != nil {
			return fmt.Errorf("failed to marshal workload results: %w", err)
		}
	default:
		return fmt.Errorf("unsupported workload response type: %T", r)
	}
	return nil
}
Likely JSON shape mismatch between proto.BundleResult and api.BundleResult
In ApplyResponse, for a succeeded bundle job you serialize the proto result directly:
case *proto.JobResponse_Bundle:
if j.Workload.Result, err = json.Marshal(r.Bundle); err != nil {
return fmt.Errorf("failed to marshal workload results: %w", err)
}

Later, buildBundleResponse does:
var r api.BundleResult
if err := json.Unmarshal(j.Workload.Result, &r); err != nil {
return fmt.Errorf("invalid result for bundle job: %w", err)
}

However, the types differ:
- proto.BundleResult has UploadKey String (a protobuf message type, not a plain string).
- api.BundleResult has UploadKey *string.
Encoding a protobuf message with encoding/json will typically produce an object like:
{ "upload_key": { "value": "..." } }
whereas api.BundleResult expects:
{ "upload_key": "..." }
This will cause json.Unmarshal in buildBundleResponse to fail (cannot unmarshal object into Go value of type string), meaning succeeded jobs will not be convertible to HTTP WorkloadResponse.
I’d strongly recommend normalizing here by mapping the proto result into the HTTP shape instead of serializing the proto struct directly, e.g.:
case *proto.JobResponse_Bundle:
- if j.Workload.Result, err = json.Marshal(r.Bundle); err != nil {
- return fmt.Errorf("failed to marshal workload results: %w", err)
- }
+ apiResult := api.BundleResult{}
+ if r.Bundle != nil {
+ // Assuming proto.String carries a `Value` or similar; adapt field access accordingly.
+ key := r.Bundle.UploadKey // map to string as appropriate
+ apiResult.UploadKey = &key
+ }
+ if j.Workload.Result, err = json.Marshal(apiResult); err != nil {
+ return fmt.Errorf("failed to marshal workload results: %w", err)
+ }

(Adjust field access/unwrap logic to whatever String actually exposes.)
This keeps the stored JSON shape aligned with api.BundleResult and avoids runtime errors in BuildWorkloadResponse.
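For illustration, a self-contained round-trip of the normalized shape; the bundleResult struct below is a stand-in for api.BundleResult, not the generated type:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// bundleResult mirrors the HTTP shape: a plain string upload key,
// not a nested protobuf wrapper object.
type bundleResult struct {
	UploadKey *string `json:"upload_key,omitempty"`
}

func main() {
	key := "example-upload-key"
	stored, err := json.Marshal(bundleResult{UploadKey: &key})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(stored)) // {"upload_key":"example-upload-key"}

	var decoded bundleResult
	if err := json.Unmarshal(stored, &decoded); err != nil {
		fmt.Println("unmarshal failed:", err)
		return
	}
	fmt.Println(*decoded.UploadKey) // example-upload-key
}
```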
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
client/internal/connect.go (7 hunks)
client/internal/engine.go (13 hunks)
client/internal/engine_test.go (9 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
client/internal/engine_test.go (5)
  client/internal/engine.go (2): NewEngine (235-276), EngineConfig (83-142)
  shared/management/client/mock.go (1): MockClient (13-24)
  management/server/job/manager.go (1): NewJobManager (32-42)
  management/server/account.go (1): BuildManager (180-268)
  management/internals/shared/grpc/server.go (1): NewServer (84-147)
client/internal/engine.go (7)
  client/internal/profilemanager/config.go (1): Config (89-160)
  client/jobexec/executor.go (3): Executor (23-24), NewExecutor (26-28), ErrJobNotImplemented (20-20)
  shared/management/client/client.go (1): Client (14-27)
  shared/management/proto/management.pb.go (19): JobRequest (388-398), JobRequest (413-413), JobRequest (428-430), JobResponse (463-475), JobResponse (490-490), JobResponse (505-507), JobStatus_failed (30-30), JobRequest_Bundle (457-459), JobRequest_Bundle (461-461), JobStatus_succeeded (29-29), BundleParameters (554-563), BundleParameters (578-578), BundleParameters (593-595), SyncResponse (721-738), SyncResponse (753-753), SyncResponse (768-770), BundleResult (625-631), BundleResult (646-646), BundleResult (661-663)
  client/internal/state.go (1): CtxGetState (31-33)
  client/internal/debug/debug.go (2): GeneratorDependencies (238-243), BundleConfig (232-236)
  client/server/server.go (1): New (96-107)
client/internal/connect.go (3)
  client/internal/engine.go (2): NewEngine (235-276), EngineConfig (83-142)
  client/internal/profilemanager/config.go (1): Config (89-160)
  shared/management/proto/management.pb.go (3): PeerConfig (2082-2098), PeerConfig (2113-2113), PeerConfig (2128-2130)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (23)
- GitHub Check: Relay / Unit (amd64, -race)
- GitHub Check: Management / Unit (amd64, mysql)
- GitHub Check: Management / Unit (amd64, postgres)
- GitHub Check: Management / Unit (amd64, sqlite)
- GitHub Check: Relay / Unit (386)
- GitHub Check: Management / Integration (amd64, postgres)
- GitHub Check: Management / Benchmark (API) (amd64, postgres)
- GitHub Check: Client (Docker) / Unit
- GitHub Check: Management / Benchmark (API) (amd64, sqlite)
- GitHub Check: Client / Unit (amd64)
- GitHub Check: Management / Benchmark (amd64, sqlite)
- GitHub Check: Management / Benchmark (amd64, postgres)
- GitHub Check: Client / Unit (386)
- GitHub Check: Client / Unit
- GitHub Check: release
- GitHub Check: release_ui_darwin
- GitHub Check: Android / Build
- GitHub Check: Linux
- GitHub Check: Client / Unit
- GitHub Check: Darwin
- GitHub Check: Windows
- GitHub Check: JS / Lint
- GitHub Check: Client / Unit
🔇 Additional comments (16)
client/internal/connect.go (5)
65-66: LGTM! logPath parameter threading is clean. The public Run signature now accepts logPath and forwards it correctly. Mobile variants appropriately default to empty string, which is sensible given platform-specific logging constraints.
Also applies to: 87-87, 105-105
108-108: LGTM! Internal run signature correctly updated to receive logPath.
251-251: LGTM! Call to createEngineConfig correctly passes logPath parameter.
275-275: LGTM! NewEngine call correctly updated with profile config parameter, enabling debug bundle generation support.
414-414: LGTM! EngineConfig enriched for debug bundle support. The function signature and field assignments correctly thread logPath and ProfileConfig through to EngineConfig. Note that the entire profilemanager.Config is stored in ProfileConfig; this appears intentional for debug bundle generation, which may need access to ManagementURL and other config details.
Also applies to: 449-452
client/internal/engine_test.go (3)
28-28: LGTM! Import added to support JobManager usage in test setup.
256-256: LGTM! Test updates are consistent. All NewEngine calls correctly updated with two additional nil parameters (checks, profileConfig). Using nil is appropriate for tests that don't exercise debug bundle or checks functionality.
Also applies to: 418-424, 637-643, 802-808, 1004-1010, 1536-1536
1595-1595: LGTM! Test infrastructure correctly wired. JobManager created and threaded through BuildManager and NewServer, enabling job subsystem testing. Using nil for metrics is appropriate for test context.
Also applies to: 1626-1626, 1632-1632
client/internal/engine.go (8)
34-34: LGTM! Imports added to support debug bundle generation and job execution.
Also applies to: 52-52
138-141: LGTM! EngineConfig fields added to support debug bundle generation. Clear comment documents the purpose.
206-207: LGTM! Proper concurrency primitives added. syncRespMux provides dedicated locking for sync response persistence, separate from syncMsgMux. jobExecutorWG ensures clean shutdown of job-related goroutines.
Also applies to: 220-221
234-234: LGTM! NewEngine signature correctly extended with profile config parameter. jobExecutor initialized during construction for handling management-service job requests.
Also applies to: 254-254
335-335: LGTM! Proper shutdown ordering. jobExecutorWG.Wait() ensures job-related goroutines complete before resource cleanup. The preceding e.cancel() will signal job goroutines to exit via context cancellation.
523-523: LGTM! receiveJobEvents follows established patterns (a generic sketch of the pattern follows this list). The function correctly:
- Increments jobExecutorWG before spawning goroutine
- Handles job requests synchronously via callback
- Triggers client restart on Job stream errors (consistent with Management/Signal event handlers)
- Returns proper error codes for unimplemented job types
Also applies to: 942-977
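A generic, standalone sketch of that WaitGroup-guarded receive loop; the channel and handler types are illustrative rather than the engine's actual fields:

```go
// runJobReceiver consumes jobs until the context is cancelled or the
// channel closes; the WaitGroup lets a shutdown path wait for it to finish.
func runJobReceiver(ctx context.Context, wg *sync.WaitGroup, jobs <-chan string, handle func(string) error) {
	wg.Add(1)
	go func() {
		defer wg.Done()
		for {
			select {
			case <-ctx.Done():
				return
			case job, ok := <-jobs:
				if !ok {
					return
				}
				if err := handle(job); err != nil {
					// The real engine triggers a client restart on stream
					// errors; this sketch just reports the failure.
					fmt.Printf("job handling failed: %v\n", err)
				}
			}
		}
	}()
}
```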
803-814: LGTM! Proper lock separation for sync response persistence. The code correctly uses syncRespMux (not syncMsgMux) to protect sync response persistence state. RLock for reading the flag, then separate write lock for storing the response; a proper lock upgrade pattern.
1874-1886: LGTM! Thread-safe sync response persistence. Both functions correctly use syncRespMux for protection. GetLatestSyncResponse properly (see the sketch after this list):
- Reads state under RLock
- Releases lock before expensive clone operation
- Returns a clone to prevent external mutation
Also applies to: 1889-1911
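A standalone sketch of that read-under-RLock, clone-outside-the-lock pattern; the cache type and names are placeholders, with proto.Clone from google.golang.org/protobuf/proto used for the copy:

```go
type syncCache struct {
	mu     sync.RWMutex
	latest proto.Message
}

// Latest reads the pointer under RLock, releases the lock, then clones
// outside the critical section so callers cannot mutate the cached message.
func (c *syncCache) Latest() (proto.Message, error) {
	c.mu.RLock()
	msg := c.latest
	c.mu.RUnlock()

	if msg == nil {
		return nil, errors.New("no sync response stored yet")
	}
	return proto.Clone(msg), nil
}

// Store replaces the cached message under the write lock.
func (c *syncCache) Store(msg proto.Message) {
	c.mu.Lock()
	c.latest = msg
	c.mu.Unlock()
}
```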
func (e *Engine) handleBundle(params *mgmProto.BundleParameters) (*mgmProto.JobResponse_Bundle, error) {
	log.Infof("handle remote debug bundle request: %s", params.String())
	syncResponse, err := e.GetLatestSyncResponse()
	if err != nil {
		log.Warnf("get latest sync response: %v", err)
	}

	bundleDeps := debug.GeneratorDependencies{
		InternalConfig: e.config.ProfileConfig,
		StatusRecorder: e.statusRecorder,
		SyncResponse:   syncResponse,
		LogPath:        e.config.LogPath,
	}

	bundleJobParams := debug.BundleConfig{
		Anonymize:         params.Anonymize,
		IncludeSystemInfo: true,
		LogFileCount:      uint32(params.LogFileCount),
	}

	waitFor := time.Duration(params.BundleForTime) * time.Minute

	uploadKey, err := e.jobExecutor.BundleJob(e.ctx, bundleDeps, bundleJobParams, waitFor, e.config.ProfileConfig.ManagementURL.String())
	if err != nil {
		return nil, err
	}

	response := &mgmProto.JobResponse_Bundle{
		Bundle: &mgmProto.BundleResult{
			UploadKey: uploadKey,
		},
	}
	return response, nil
}
Potential nil pointer dereference on Line 1001.
The code accesses e.config.ProfileConfig.ManagementURL.String() without checking if ProfileConfig is nil. While tests pass nil, they don't exercise this path. For robustness, consider adding a nil check.
Apply this defensive check:
+ if e.config.ProfileConfig == nil || e.config.ProfileConfig.ManagementURL == nil {
+ return nil, errors.New("profile config or management URL not available for bundle generation")
+ }
+
uploadKey, err := e.jobExecutor.BundleJob(e.ctx, bundleDeps, bundleJobParams, waitFor, e.config.ProfileConfig.ManagementURL.String())
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
func (e *Engine) handleBundle(params *mgmProto.BundleParameters) (*mgmProto.JobResponse_Bundle, error) {
	log.Infof("handle remote debug bundle request: %s", params.String())
	syncResponse, err := e.GetLatestSyncResponse()
	if err != nil {
		log.Warnf("get latest sync response: %v", err)
	}

	bundleDeps := debug.GeneratorDependencies{
		InternalConfig: e.config.ProfileConfig,
		StatusRecorder: e.statusRecorder,
		SyncResponse:   syncResponse,
		LogPath:        e.config.LogPath,
	}

	bundleJobParams := debug.BundleConfig{
		Anonymize:         params.Anonymize,
		IncludeSystemInfo: true,
		LogFileCount:      uint32(params.LogFileCount),
	}

	waitFor := time.Duration(params.BundleForTime) * time.Minute

	if e.config.ProfileConfig == nil || e.config.ProfileConfig.ManagementURL == nil {
		return nil, errors.New("profile config or management URL not available for bundle generation")
	}

	uploadKey, err := e.jobExecutor.BundleJob(e.ctx, bundleDeps, bundleJobParams, waitFor, e.config.ProfileConfig.ManagementURL.String())
	if err != nil {
		return nil, err
	}

	response := &mgmProto.JobResponse_Bundle{
		Bundle: &mgmProto.BundleResult{
			UploadKey: uploadKey,
		},
	}
	return response, nil
}
🤖 Prompt for AI Agents
In client/internal/engine.go around lines 979 to 1012, the call to
e.config.ProfileConfig.ManagementURL.String() can panic if ProfileConfig or its
ManagementURL is nil; add a defensive nil-check before calling String() and pass
a safe fallback (e.g., empty string or configured default) into
e.jobExecutor.BundleJob. Specifically, ensure e.config and
e.config.ProfileConfig are non-nil and that ProfileConfig.ManagementURL is
non-nil, compute a local managementURL string variable accordingly, and use that
variable in the BundleJob call so the function never dereferences a nil pointer.




Bugfixes
Describe your changes
Issue ticket number and link
Stack
Checklist
Documentation
Select exactly one:
Docs PR URL (required if "docs added" is checked)
Paste the PR link from https://github.com/netbirdio/docs here:
https://github.com/netbirdio/docs/pull/__
Summary by CodeRabbit
New Features
API Enhancements
Chores
✏️ Tip: You can customize this high-level summary in your review settings.