fix: LangfuseTracer is not thread-safe, causing mixed traces in async environments #2188
@vblagoje Sure thing! I had some trouble updating Haystack versions in my use cases last week, but I will try this today.
Nice, thanks. Please do. Note that this fix breaks a single long agent run down into individual traces. We agreed internally not to do that, and to keep the hierarchy within a single trace as we do now. I'll be working on this and making sure that traces from relatively regular pipeline runs (such as those in the examples directory) and from complex agent runs both work OK. If you can confirm that you no longer see the mixed async traces you saw before, that would be the missing piece of confirmation we need to integrate this one soon. Let's keep a tight loop on this issue for the next few days @LastRemote
# Using session_id to group related traces
response = pipe.run(
data={"prompt_builder": {"template_variables": {"location": "Paris"}, "template": messages}},
tracer={"session_id": "user_123_conversation_456"}
)
@vblagoje Is this supported in the current
My mistake, tracer should go inside data! And no, it is not supported yet; I'm testing it on this branch only.
@vblagoje No problem. I am still running into trouble with the Haystack version upgrade, given how many Agent changes we've had from 2.12 to 2.16. Luckily these should most likely be resolved today, so I can kick off the test tomorrow. The testing part is a bit tricky, since I would need actual traffic for it. For the sake of simplicity I decided to create a DummyWait component that echoes the input after an hour. I will try to stress-run against this to see how it goes.
Looks good to me!
I have one question (not strictly related to this PR) about _session_root_traces: it seems that it will grow indefinitely without cleanup. Do you see this as a problem? I am thinking of long-running applications, for example.
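To make the concern about unbounded growth concrete, here is a minimal sketch of one possible mitigation: a size-capped registry that evicts the least recently used session. The class name `BoundedTraceRegistry`, its `max_sessions` parameter, and the `factory` callback are all hypothetical and do not appear in this PR; whether eviction is acceptable for traces that are still in use is a separate question this sketch does not answer.

```python
from collections import OrderedDict


class BoundedTraceRegistry:
    """Hypothetical size-capped mapping of session_id -> root trace (not from the PR)."""

    def __init__(self, max_sessions=1000):
        self._max_sessions = max_sessions
        self._traces = OrderedDict()

    def get_or_create(self, session_id, factory):
        # Reuse an existing root trace and mark it as most recently used.
        if session_id in self._traces:
            self._traces.move_to_end(session_id)
            return self._traces[session_id]
        # Otherwise create one, then evict the oldest entry if over capacity.
        trace = factory(session_id)
        self._traces[session_id] = trace
        if len(self._traces) > self._max_sessions:
            self._traces.popitem(last=False)
        return trace

    def __contains__(self, session_id):
        return session_id in self._traces

    def __len__(self):
        return len(self._traces)


# Usage: a dict stands in for a real root trace object.
registry = BoundedTraceRegistry(max_sessions=2)
for sid in ("a", "b", "c"):
    registry.get_or_create(sid, lambda s: {"session_id": s})
```

With a cap of two sessions, adding a third drops the oldest one, so memory stays bounded in a long-running process.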
Why:
Purpose: Fixes a race condition where spans from concurrent pipeline runs become intertwined, corrupting tracing data and making debugging unreliable in production web server environments.
LangfuseTracer is not thread-safe, causing mixed traces in async environments #2140
What:
- Replaced the self._context list with ContextVar for thread-safe span stack management
- Updated the LangfuseTracer.trace() method to use context-local span storage
- Updated the current_span() method to read from context-local state
- Added the span_context_var ContextVar for isolated span contexts per execution thread
How can it be used:
Nothing changes in client code except that we don't corrupt traces in concurrent async pipelines. All existing usage patterns continue to work exactly as before, but now with proper thread safety.
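The idea behind the ContextVar change can be illustrated with a small standalone sketch. It is not the LangfuseTracer code: `span_stack_var`, `trace`, `current_span`, and `run_pipeline` here are simplified stand-ins, and span names are plain strings. It only shows why a context-local span stack keeps concurrent asyncio runs from seeing each other's spans, which a shared list on the tracer instance cannot guarantee.

```python
import asyncio
from contextlib import contextmanager
from contextvars import ContextVar

# Stand-in for the tracer's span stack. An immutable tuple is used so that
# no two execution contexts ever share a mutable object.
span_stack_var = ContextVar("span_stack", default=())


@contextmanager
def trace(name):
    # Push a span onto the stack of the current context only.
    token = span_stack_var.set(span_stack_var.get() + (name,))
    try:
        yield name
    finally:
        # Restore the stack to what it was before this span was opened.
        span_stack_var.reset(token)


def current_span():
    stack = span_stack_var.get()
    return stack[-1] if stack else None


async def run_pipeline(run_id):
    with trace(f"pipeline-{run_id}"):
        await asyncio.sleep(0.01)  # yield so the runs interleave
        with trace(f"component-{run_id}"):
            await asyncio.sleep(0.01)
            inner = current_span()
        outer = current_span()
    return inner, outer


async def main():
    # asyncio.gather wraps each coroutine in a Task, and each Task runs in
    # its own copy of the current context.
    return await asyncio.gather(*(run_pipeline(i) for i in range(3)))


results = asyncio.run(main())
```

Each run only ever observes its own `pipeline-N` and `component-N` spans, even though the three runs interleave at every `await`.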
How did you test it:
Notes for the reviewer:
@LastRemote would you please test this branch on your use case?