-
Notifications
You must be signed in to change notification settings - Fork 807
feat(flow): show tool usage in flow conversations #6771
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Deploying windmill with
|
Latest commit: |
12b5d7e
|
Status: | ✅ Deploy successful! |
Preview URL: | https://504c67ea.windmill.pages.dev |
Branch Preview URL: | https://fg-toolsinchat.windmill.pages.dev |
/updatesqlx |
Starting sqlx update... |
/updatesqlx |
Starting sqlx update... |
Successfully ran sqlx update |
/ai-fast review this PR |
Claude finished @centdix's task —— View job PR Review: Tool Usage Display in Flow ConversationsTodo List
SummaryThis PR adds support for displaying tool usage in flow conversations, allowing users to see when AI agents invoke tools during conversations. The implementation adds a new "tool" message type and tracks which flow step generated each message. Database Changes ✅The migration properly extends the conversation system:
Note: The migration looks clean and follows best practices. Backend ChangesPositive Aspects ✅
Areas for Improvement 🔧
Frontend ChangesPositive Aspects ✅
Areas for Improvement 🔧
Security Considerations 🔒
Recommendations
Overall AssessmentThe implementation is solid with proper separation of concerns and good error handling. The suggested improvements are mostly optimizations and polish items. The feature adds valuable debugging capabilities for AI agent flows. Verdict: Approved with minor suggestions ✅ The code is production-ready but would benefit from the performance optimizations and cleanup mentioned above. |
/updatesqlx |
Starting sqlx update... |
Successfully ran sqlx update |
/ai-fast review the changes since last review |
Claude finished @centdix's task —— View job Review of Changes Since Last ReviewTodo List
Summary of New ChangesSince my last review, the following improvements have been made:
Detailed Review✅ Excellent Improvements
🎯 All Previous Concerns AddressedI'm pleased to see that all the performance and error handling issues from my previous review have been addressed:
💡 Minor Suggestions (Optional)
Overall AssessmentAll requested changes have been implemented effectively. The code now properly caches chat settings, handles nested agents correctly, and provides a cleaner user experience with proper error suppression. The performance optimization I suggested has been implemented elegantly with the Verdict: Ready to merge ✅ Great work addressing all the feedback! The implementation is now robust and performant. |
Claude finished @centdix's task —— View job PR Review: Tool Usage Display in Flow ConversationsTodo List
SummaryThis PR implements a comprehensive feature to display tool usage in flow conversations, allowing users to see when AI agents invoke tools during conversations. The implementation adds a new "tool" message type and tracks which flow step generated each message, with success/failure indicators. Review Findings✅ Excellent Implementations1. Database Design (backend/migrations/20251007123506_update_conversation_message_types.up.sql:4-8)
2. Code Organization (backend/windmill-common/src/flow_conversations.rs:17-50)
3. Performance Optimization (backend/windmill-worker/src/ai_executor.rs:103-145)
4. Non-blocking Design (backend/windmill-worker/src/ai_executor.rs:1126-1141, 1238-1252)
5. Frontend Architecture (frontend/src/lib/components/flows/conversations/FlowChatManager.svelte.ts)
6. Visual Design (frontend/src/lib/components/flows/conversations/FlowChatMessage.svelte:46-51)
🔧 Areas for Improvement1. Error Message Content (backend/windmill-worker/src/ai_executor.rs:1232-1236)let content = if success {
format!("Used {} tool", tool_name)
} else {
format!("Error executing {}", tool_name)
}; Issue: Generic tool success messages provide minimal debugging value. 2. OpenAPI Schema Completeness (backend/windmill-api/openapi.yaml)The FlowConversationMessage schema includes the new fields but could be more descriptive: step_name:
type: string
description: The step name that produced that message
success:
type: boolean
description: Whether the message is a success Suggestion: Add examples and clarify that 3. Type Safety Enhancement (frontend/src/lib/components/flows/conversations/FlowChatManager.svelte.ts:100-103)if (this.#conversationsCache[conversationIdToUse]) {
// Cache check happens before API call Issue: Cache invalidation strategy is unclear. 4. Step Name Extraction Robustness (backend/windmill-worker/src/ai_executor.rs)fn get_step_name_from_flow(flow_value: &FlowValue, flow_step_id: Option<&str>) -> Option<String> {
let flow_step_id = flow_step_id?;
let module = flow_value.modules.iter().find(|m| m.id == flow_step_id)?;
Some(
module
.summary
.clone()
.unwrap_or_else(|| format!("AI Agent Step {}", module.id)),
)
} Good: Proper fallback handling with descriptive default names. 🔒 Security Assessment✅ All security practices followed:
📊 Performance Analysis✅ Well optimized:
🧪 Code QualityFollows repository conventions:
🚀 Recommendations
Overall AssessmentVerdict: Excellent Implementation ✅ This is a well-architected feature that demonstrates strong technical execution. The code follows best practices, handles edge cases properly, and integrates cleanly with the existing system. All previous performance and error handling concerns from earlier reviews have been addressed effectively. The implementation provides valuable debugging capabilities for AI agent flows while maintaining good performance and reliability standards. Testing InstructionsTo test the tool usage display feature: Navigate to any flow that contains AI Agent steps with tool usage enabled. Create or open a flow conversation by clicking the chat icon. Run the flow with agent steps that use tools (like API calls, database queries, or script executions). In the conversation interface, you should see new "tool" messages appearing with:
The tool messages will appear in real-time as the flow executes, providing immediate feedback on which tools are being used and whether they succeed or fail. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Important
Looks good to me! 👍
Reviewed everything up to 12b5d7e in 2 minutes and 29 seconds. Click for details.
- Reviewed
2241
lines of code in22
files - Skipped
0
files when reviewing. - Skipped posting
9
draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. frontend/src/lib/components/flows/conversations/FlowChatInterface.svelte:32
- Draft comment:
The initialization effect cleanly sets up and later cleans up the FlowChatManager instance. Consider adding a brief comment clarifying the purpose of each $effect block (one for initialization and one for conversationId updates) to ease future maintenance. - Reason this comment was not posted:
Comment looked like it was already resolved.
2. frontend/src/lib/components/flows/conversations/FlowChatInterface.svelte:47
- Draft comment:
The cleanup function (manager.cleanup) is properly called on unmount. It may be useful to include a comment regarding error handling inside cleanup, in case asynchronous errors occur during manager cleanup. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% The comment is speculative - it suggests what "may be useful" without evidence of actual issues. We don't see the implementation of manager.cleanup() so we can't know if error handling is actually needed. The cleanup pattern shown is standard and simple. Adding error handling here would be premature optimization without evidence of problems. Perhaps there are known issues with cleanup functions in this codebase that would justify this suggestion? The cleanup could involve async operations that could fail. Without seeing the cleanup implementation or evidence of actual issues, suggesting error handling is speculative. The cleanup pattern shown is standard and working code should not be complicated without justification. The comment should be deleted as it makes a speculative suggestion without evidence of actual issues, violating our rule against speculative comments.
3. frontend/src/lib/components/flows/conversations/FlowChatManager.svelte.ts:21
- Draft comment:
The reactive state properties are initialized using $state. For improved readability and type safety, consider adding explicit type annotations for these state variables so that future developers understand their types immediately. - Reason this comment was not posted:
Comment looked like it was already resolved.
4. frontend/src/lib/components/flows/conversations/FlowChatManager.svelte.ts:95
- Draft comment:
In the loadMessages method, a Promise with setTimeout is used twice to delay DOM updates. Consider extracting this delay into a helper function to improve code reuse and readability. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% The two setTimeout delays serve different purposes and have different timing values. One is for scrolling to bottom after reset (100ms), the other for restoring scroll position after loading more messages (50ms). Extracting them to a helper wouldn't reduce much code duplication since they're used in different contexts. The timing values are intentionally different. The comment has a point about code reuse, but the two timeouts serve different purposes and have different timing requirements. Combining them might actually make the code less clear by obscuring their distinct purposes. While code reuse is generally good, in this case the timeouts are sufficiently different in purpose and timing that combining them wouldn't meaningfully improve the code. The comment should be deleted as the suggested refactor would not meaningfully improve code quality or readability, and might actually reduce clarity by combining two distinct timing operations.
5. frontend/src/lib/components/flows/conversations/FlowChatManager.svelte.ts:254
- Draft comment:
The startPolling method uses a hardcoded interval (500 ms) and stops polling after 2 minutes. Consider making these values configurable or documenting the rationale behind these specific values for easier adjustment in the future. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% The polling values seem reasonable and well-commented. Making them configurable would add complexity without clear benefit. The 500ms polling interval is a standard reasonable default that balances responsiveness with server load. The 2-minute timeout prevents infinite polling. These aren't likely to need frequent adjustment. The comment has a point - hardcoded values can be problematic if requirements change. Maybe some use cases need different polling frequencies or timeout periods. While technically valid, making these configurable would add unnecessary complexity. The current values are reasonable defaults and the comments explain their purpose. If changes are needed later, they can be modified then. Delete the comment. The suggestion would add complexity without clear benefit, and the current implementation with documented magic numbers is sufficient.
6. frontend/src/lib/components/flows/conversations/FlowChatManager.svelte.ts:175
- Draft comment:
In pollJobResult, the catch block only logs the error. For better robustness, consider if additional error handling or user notification might be needed when waiting for a job’s result fails. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% The error handling seems appropriate for this context because: 1. It's an internal polling function called by sendMessage() which has user-facing error handling 2. The function continues execution in finally block to clean up state 3. Adding more error handling here would be redundant with parent 4. The error is logged for debugging I could be wrong about the importance of error handling in polling functions - perhaps there are edge cases where the polling error should be handled differently than other errors. While polling errors could be special cases, the existing error handling pattern (log error, continue cleanup) seems appropriate for an internal polling function. The parent already handles user notification. The comment should be deleted. The error handling is appropriate for an internal polling function, with errors logged and parent function handling user notification.
7. frontend/src/lib/components/flows/conversations/FlowChatMessage.svelte:16
- Draft comment:
The FlowChatMessage component conditionally applies classes based on message type and status. Consider adding ARIA roles or additional accessibility attributes to improve accessibility for screen readers. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
8. backend/migrations/20251007123506_update_conversation_message_types.up.sql:1
- Draft comment:
Typo: The comment on line 1 reads "Add up migration script here". Consider revising it for clarity – for example, if it is meant to indicate setting up the migration script, you might simply write "Add migration script here" or "Setup migration script here." - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50% 1. "Up migration" is correct technical terminology in the context of database migrations. 2. The comment is suggesting a change that would actually make the header less precise. 3. This is a purely cosmetic comment about a comment, not about actual code. 4. The existing comment is clear and technically accurate. Could there be a house style guide that prefers different migration comment formats? Even if there were a style guide, this kind of minor comment about comment wording is not important enough to warrant a PR comment. The comment should be deleted as it suggests changing correct technical terminology to less precise wording, and comments about comment formatting are not important enough for PR review.
9. frontend/src/lib/components/flows/conversations/FlowChatMessage.svelte:29
- Draft comment:
There's a possible typographical issue with the closingtag for the step_name block. The tag is split across two lines (line 29 and 30), which may lead to unexpected behavior. Consider combining them into a single correctly formatted closing tag, e.g.,</div>
. - Reason this comment was not posted:
Comment looked like it was already resolved.
Workflow ID: wflow_R6wxS8R9mNEb6hJ1
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Important
This PR adds functionality to display tool usage in flow conversations, updating both backend and frontend components to handle and display tool-related messages.
step_name
andsuccess
columns toflow_conversation_message
table inmigrations/20251007123506_update_conversation_message_types.up.sql
.add_message_to_conversation_tx()
inflow_conversations.rs
to includestep_name
andsuccess
.run_agent()
inai_executor.rs
to handle tool messages and update conversation with tool usage.FlowChatManager
inFlowChatManager.svelte.ts
to manage chat state and message handling.FlowChatInterface.svelte
to useFlowChatManager
for managing chat interactions.FlowChatMessage.svelte
to display tool usage with success indicators.This description was created by
for 12b5d7e. You can customize this summary. It will automatically update as commits are pushed.