-
Notifications
You must be signed in to change notification settings - Fork 808
fix(langchain): callback handler overwrites custom names for task and w… #3324
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
fix(langchain): callback handler overwrites custom names for task and w… #3324
Conversation
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughThe callback handler now preserves pre-existing Traceloop context by reading Changes
Sequence Diagram(s)sequenceDiagram
participant CallbackHandler as LangChain CallbackHandler
participant OTEL as OpenTelemetry Context
participant Tracer as Tracer
participant Span as WORKFLOW Span
CallbackHandler->>OTEL: get_value(workflow_name), get_value(entity_path)
alt context contains existing values
CallbackHandler->>Tracer: start WORKFLOW span (use existing_workflow_name)
Tracer->>Span: create span
CallbackHandler->>Span: set entity_path = existing_entity_path
else no existing values
CallbackHandler->>Tracer: start WORKFLOW span (use computed name)
Tracer->>Span: create span
CallbackHandler->>CallbackHandler: get_entity_path() -> derive from parent spans
CallbackHandler->>Span: set derived entity_path
end
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Suggested reviewers
Poem
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. 📜 Recent review detailsConfiguration used: CodeRabbit UI 💡 Knowledge Base configuration:
You can enable these sources in your CodeRabbit configuration. 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
✨ Finishing Touches
🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution
Changes requested ❌
Reviewed everything up to 04a29a9 in 1 minute and 36 seconds. Click for details.
- Reviewed
46
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
2
draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:392
- Draft comment:
Good addition: checking for existing 'workflow_name' and 'entity_path' in context. Ensure that tasks (non-WORKFLOW kind) are handled as intended when custom context values are present. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =0%
<= threshold50%
The comment is informative and suggests ensuring that tasks are handled as intended, which violates the rule against asking the PR author to ensure behavior is intended. It doesn't provide a specific suggestion or ask for a specific test to be written.
2. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:731
- Draft comment:
In get_entity_path, checking for an existing 'entity_path' from context prevents overwriting custom values. Consider validating or type casting the value if non-string types could be present. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% The suggestion seems overly cautious. The entity_path is used as a span attribute, and OpenTelemetry already handles type conversion of span attributes through _set_span_attribute(). Additionally, if an invalid value somehow got into the context, it would be better to let that error surface than silently convert it. The current code is simple and follows fail-fast principles. The comment raises a valid concern about type safety. There could be edge cases where non-string values get into the context that we haven't considered. While type safety is important, OpenTelemetry's span attribute system already handles type conversion, and adding extra validation here would be redundant and could mask real issues. The comment should be deleted. The suggested change adds unnecessary complexity and could mask real issues, while the current code is simple and relies on OpenTelemetry's built-in type handling.
Workflow ID: wflow_3ICDlVKEHc84Vjo8
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
LLMResult, | ||
) | ||
from opentelemetry import context as context_api | ||
from opentelemetry.context import get_value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New import of get_value is added. Consider using a consistent approach (e.g., context_api.get_value) throughout the file for clarity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (2)
30-30
: Unify context access to a single API (stylistic consistency).You already import
context as context_api
and usecontext_api.get_value
elsewhere. Importingget_value
directly introduces a second way to do the same thing. Prefer one approach for clarity and to avoid subtle divergence.Apply this diff to drop the direct import (and use
context_api.get_value
at call sites as suggested in other comments):-from opentelemetry.context import get_value
410-413
: Potential entity_name/workflow_name mismatch for WORKFLOW spans can affect derived entity_path.When
kind == WORKFLOW
, you setworkflow_name = existing_workflow_name or name
, but the entity_name passed later to_create_task_span
(Line 423) remainsname
. Downstream,get_entity_path
uses a special-case check that assumesentity_name == workflow_name
to keep the root path empty. If they differ, child paths may begin with the callback’sname
instead of the desired workflow name unless the context suppliesentity_path
for every descendant.If the intent is to keep the top-level path empty when a workflow is active, consider aligning the entity_name to the chosen workflow_name for WORKFLOW spans, or adjust
get_entity_path
to not rely on equality for that special case.Example change (outside this hunk): pass
workflow_name
as theentity_name
when creating WORKFLOW spans:# Around Lines 417-426 entity_name_arg = ( workflow_name if kind == TraceloopSpanKindValues.WORKFLOW else name ) span = self._create_task_span( run_id, parent_run_id, name, kind, workflow_name, entity_name_arg, entity_path, metadata, )Please add/adjust tests to assert:
- With a decorator setting
workflow_name
andentity_path=""
, child spans keep an empty entity_path (unless the context explicitly changes it).- With a custom workflow_name differing from the callback
name
, child entity paths still behave as intended.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py
(4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit Inference Engine (CLAUDE.md)
**/*.py
: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules
Files:
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Build Packages (3.11)
- GitHub Check: Lint
🔇 Additional comments (1)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (1)
395-398
: Consider usingcontext_api.get_value
and confirm upstream setting of context keysYou’re correctly preserving any decorator‐provided
workflow_name
andentity_path
before computing new values to avoid overwrites. Two follow-ups:
- Consistency: Switch the unqualified
get_value("…")
calls tocontext_api.get_value("…")
to match the import pattern.- Verification: I didn’t find any in-repo calls to
set_value("workflow_name", …)
orset_value("entity_path", …)
. Please ensure these keys are populated upstream (e.g., by your decorators or middleware), otherwise this branch will never read existing values.
# Check for existing Traceloop context first | ||
existing_entity_path = get_value("entity_path") | ||
if existing_entity_path: | ||
return existing_entity_path | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix: honor explicitly empty entity_path from context (truthiness bug).
Using a truthy check will ignore an explicitly empty string set by upstream context (a valid, intentional “root” path) and fall back to deriving a path. This reintroduces the overwrite you’re trying to avoid.
Use an is not None
check so that ""
is preserved.
Apply this diff:
- # Check for existing Traceloop context first
- existing_entity_path = get_value("entity_path")
- if existing_entity_path:
- return existing_entity_path
+ # Check for existing Traceloop context first
+ existing_entity_path = context_api.get_value("entity_path")
+ # Important: honor empty string ("") as a valid, explicit root path
+ if existing_entity_path is not None:
+ return existing_entity_path
Note: If you adopt this, you can remove the direct get_value
import as suggested above.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
# Check for existing Traceloop context first | |
existing_entity_path = get_value("entity_path") | |
if existing_entity_path: | |
return existing_entity_path | |
# Check for existing Traceloop context first | |
existing_entity_path = context_api.get_value("entity_path") | |
# Important: honor empty string ("") as a valid, explicit root path | |
if existing_entity_path is not None: | |
return existing_entity_path |
🤖 Prompt for AI Agents
In
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py
around lines 734 to 738, the code uses a truthy check on existing_entity_path
which treats an explicitly empty string ("") as false and incorrectly falls back
to deriving a path; change the condition to check `is not None` so that an
intentionally empty string is preserved (i.e., if get_value("entity_path") is
not None: return that value), and if you implement this change remove the
now-unnecessary direct get_value import noted earlier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Important
Looks good to me! 👍
Reviewed ae2fabd in 1 minute and 24 seconds. Click for details.
- Reviewed
74
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
5
draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:27
- Draft comment:
Good removal of the direct 'get_value' import in favor of using context_api.get_value. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =0%
<= threshold50%
This comment is purely informative, as it praises the removal of a direct import without providing any actionable feedback or suggestions. It doesn't align with the rules for useful comments.
2. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:393
- Draft comment:
Replacing get_value calls with context_api.get_value improves consistency. - Reason this comment was not posted:
Comment looked like it was already resolved.
3. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:412
- Draft comment:
Using _build_hybrid_entity_path to combine existing Traceloop context with the LangChain component prevents accidental overwrites of custom names. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =0%
<= threshold50%
This comment is purely informative and does not provide any actionable feedback or suggestions for improvement. It does not align with the rules for good comments, as it does not ask for confirmation of intention, suggest improvements, or identify potential issues.
4. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:731
- Draft comment:
Refactored get_entity_path now acts solely as a fallback when no Traceloop context exists; ensure its behavior is well documented for consumers. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =0%
<= threshold50%
The comment is asking the author to ensure that the behavior of a refactored function is well documented. This falls under the rule of not asking the author to ensure something is done, which is not allowed.
5. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:750
- Draft comment:
The new _build_hybrid_entity_path method cleanly merges the Traceloop context with LangChain component names. Consider adding type validations and unit tests to cover edge cases (e.g., non-string or missing values). - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% While adding tests is generally good practice, this method is already well-typed and handles edge cases. The logic is straightforward string manipulation with clear null handling. The comment doesn't point out any specific issues or edge cases that need testing. It's more of a general "good practice" suggestion without concrete actionable feedback. The suggestion for tests could be valuable since this is a new method handling path construction that could affect tracing. Edge cases like special characters in component names aren't explicitly handled. The method is simple string concatenation with clear types and null handling. Any string validation should be handled at the API boundary where component names are first received, not in this internal utility method. The comment should be deleted as it makes a general suggestion without identifying specific issues or providing actionable feedback. The method is already well-typed and handles the key edge cases.
Workflow ID: wflow_xTORxamwbINVJIYq
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
…orkflows
feat(instrumentation): ...
orfix(instrumentation): ...
.Important
Fixes bug in
callback_handler.py
to preserve custom task and workflow names by maintaining existing Traceloop context.workflow_name
andentity_path
) inon_chain_start()
to prevent overwriting during tracing.entity_path
from context in_build_hybrid_entity_path()
to maintain continuity across workflows.This description was created by
for ae2fabd. You can customize this summary. It will automatically update as commits are pushed.
Summary by CodeRabbit