@cg-jl cg-jl commented Sep 3, 2025

Taking a look at how the current implementation of BAML Graphs works, & working
through data separation to make DX & understanding the pipeline easier.

So far:

  • Java-style "hoard-it-all" builder that owns everything. Not bad on its own, just not correctly applied here.
  • Lots of .clone() calls where a & borrow is enough. A couple of places are still not converted because they need their lifetimes separated first.
  • Too much coupling in builder between pre-built data structures & builder-local DS.
  • Seemingly, too many stages/abstractions in the pipeline:
    • It is unclear how useful HeaderIndex is. The tree structure it contains is not homogeneous enough
      to traverse directly, & that shows up in GraphBuilder.
    • Graph is a useful separation, but it is not needed yet.
  • Currently simplifying the logic. That will show how much information is required.
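
As a rough illustration of the `.clone()` point above, here is a minimal sketch of a lookup that hands out a borrow instead of a fresh allocation. The `Index` type and both function names are hypothetical and are not taken from the BAML codebase.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a pre-built index that owns its strings.
struct Index {
    titles: HashMap<u32, String>,
}

/// Clone-heavy version: allocates a new String on every successful lookup.
fn title_cloned(index: &Index, id: u32) -> Option<String> {
    index.titles.get(&id).cloned()
}

/// Borrowing version: the returned &str is tied to the borrow of `index`,
/// so nothing is allocated.
fn title_borrowed(index: &Index, id: u32) -> Option<&str> {
    index.titles.get(&id).map(String::as_str)
}

fn main() {
    let mut titles = HashMap::new();
    titles.insert(1, "Load data".to_string());
    let index = Index { titles };

    // Both lookups see the same text; only the first one copies it.
    println!("{:?}", title_cloned(&index, 1));
    println!("{:?}", title_borrowed(&index, 1));
}
```

The borrowing version only works while the index outlives its callers, which is presumably where the lifetime separation mentioned above comes in.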

Important

Refactor BAML Graphs by modularizing diagram generation, simplifying graph logic, and renaming for clarity.

  • Refactoring:
    • Move BamlVisDiagramGenerator logic to diagram_generator module in baml_vis.
    • Split graph logic into graph.rs for better separation of concerns.
  • Renaming:
    • Rename BamlVisDiagramGenerator to diagram_generator in imports and usage across generate_mermaid_headers.rs, runtime_interface.rs, and mermaid_graph_tests.rs.
  • Code Simplification:
    • Simplify graph building logic by reducing unnecessary abstractions and stages.
    • Remove redundant .clone() calls where & references suffice.
  • Miscellaneous:
    • Update SHOW_CALL_NODES constant usage in diagram_generator.rs and graph.rs.

This description was created by Ellipsis for 9ec5b21.

@cg-jl cg-jl temporarily deployed to boundary-tools-dev September 3, 2025 17:07 — with GitHub Actions Inactive

@cg-jl cg-jl marked this pull request as draft September 3, 2025 17:07

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed everything up to 9ec5b21 in 2 minutes and 38 seconds.
  • Reviewed 1648 lines of code in 8 files
  • Skipped 0 files when reviewing.
  • Skipped posting 8 draft comments. View those below.
1. engine/baml-lib/ast/examples/generate_mermaid_headers.rs:3
  • Draft comment:
    Good update – using 'diagram_generator' instead of the deprecated builder. Ensure example reflects the new API.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 30% <= threshold 50% The comment is informative and suggests ensuring that examples reflect the new API. However, it doesn't ask for a specific test or code change, nor does it provide a specific suggestion or question about the code. It seems to be more of a reminder than a specific actionable comment.
2. engine/baml-lib/ast/src/ast.rs:32
  • Draft comment:
    Re-exporting the new 'diagram_generator' module replaces the old BamlVisDiagramGenerator – looks correct.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative and does not provide any actionable feedback or suggestions. It simply states that the re-export looks correct, which is not useful for the PR author.
3. engine/baml-lib/ast/src/ast/baml_vis.rs:18
  • Draft comment:
    Deprecated code for BamlVisDiagramGenerator has been removed in favor of the new module. This cleanup is clear.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative, stating that deprecated code has been removed. It doesn't provide any actionable feedback or suggestions for improvement.
4. engine/baml-lib/ast/src/ast/baml_vis/graph.rs:604
  • Draft comment:
    Unreachable code detected in 'merge_by_pos': the early 'return' prevents the subsequent iterator implementation from ever executing. Consider removing the dead code.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50% In Rust, when you define a type and implement traits for it within a function body, those definitions are part of the function's scope and are used to define the return type. The return statement here returns a value of type State, which needs the following definitions to be valid. This is not dead code - it's required for the function to work correctly. Could this be a case where the linter is correctly identifying that code after a return statement is technically unreachable, even if the code is needed for type definitions? No, this is a misunderstanding of Rust's scoping rules. The struct and impl blocks are declarations that are part of the function's scope and are required for the return type to be valid. They are not executable code that would be unreachable. The comment is incorrect. The code after the return statement contains necessary type definitions that are part of the function's scope and are required for the returned iterator type to work.
5. engine/baml-lib/baml/tests/mermaid_graph_tests.rs:4
  • Draft comment:
    Test update: Using 'diagram_generator' module now; snapshot tests align with the new API.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative, providing details about a test update and alignment with a new API. It doesn't suggest any changes or ask for confirmation on specific code behavior.
6. engine/baml-runtime/src/runtime/runtime_interface.rs:175
  • Draft comment:
    Function 'function_graph' now uses 'diagram_generator::generate_headers_flowchart' correctly; verify that this integration reflects intended graph-generation behavior.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% The comment is asking the PR author to verify the integration, which violates the rule against asking for confirmation of intention or behavior. It does not provide a specific suggestion or point out a specific issue with the code.
7. engine/baml-lib/ast/src/ast/baml_vis/graph.rs:107
  • Draft comment:
    Typo in comment: "precomuted" should be "precomputed".
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% The rules say not to make purely informative comments and to only comment if there is clearly a code change required. While this is technically a code change, it's an extremely minor typo fix in a doc comment that doesn't affect functionality. The rules suggest focusing on more substantive issues. The typo could theoretically cause confusion for future developers reading the docs. Documentation quality is important. While docs are important, this typo is extremely minor and obvious - any reader would understand the intended meaning. The cost of the review comment outweighs the tiny benefit. This comment should be deleted as it violates the rule about not making purely informative comments and focusing only on substantive issues that require code changes.
8. engine/baml-lib/ast/src/ast/baml_vis/graph.rs:240
  • Draft comment:
    Typo in comment: "wont" should be "want".
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50% The rules state not to make purely informative comments and to avoid obvious or unimportant changes. While this is technically about a change (new file), fixing a typo in a comment has no functional impact and is a very minor issue. Comments should focus on code quality issues that require action. The typo could be confusing to future readers since "wont" is a real word with a different meaning than "want". Documentation clarity has value. While clear documentation is good, this is an internal comment about implementation details. The typo does not significantly impact understanding since the meaning is clear from context. The comment should be deleted as it points out a trivial issue that does not meaningfully impact code quality or functionality.
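
The reasoning in draft comment 4 can be checked with a small sketch: in Rust, `struct` and `impl` items inside a function body are declarations, so they stay in scope even when they appear after an early `return`. The function below is a hypothetical merge of two sorted slices, written only to show the pattern. It is not the actual `merge_by_pos` from graph.rs.

```rust
/// Merge two sorted slices into one sorted iterator.
/// Hypothetical example; names and types are assumptions.
fn merge_sorted<'a>(a: &'a [u32], b: &'a [u32]) -> impl Iterator<Item = u32> + 'a {
    return State { a, b };

    // Items are declarations, not executable statements, so placing them
    // after `return` does not make them dead code. The return type above
    // depends on them.
    struct State<'s> {
        a: &'s [u32],
        b: &'s [u32],
    }

    impl<'s> Iterator for State<'s> {
        type Item = u32;

        fn next(&mut self) -> Option<u32> {
            let next_a = self.a.first().copied();
            let next_b = self.b.first().copied();
            match (next_a, next_b) {
                // Take from `a` when it is smaller or when `b` is exhausted.
                (Some(x), Some(y)) if x <= y => {
                    self.a = &self.a[1..];
                    Some(x)
                }
                (Some(x), None) => {
                    self.a = &self.a[1..];
                    Some(x)
                }
                // Otherwise take from `b`.
                (_, Some(y)) => {
                    self.b = &self.b[1..];
                    Some(y)
                }
                (None, None) => None,
            }
        }
    }
}

fn main() {
    let merged: Vec<u32> = merge_sorted(&[1, 4, 9], &[2, 3, 10]).collect();
    println!("{:?}", merged);
}
```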

@cg-jl cg-jl temporarily deployed to boundary-tools-dev September 3, 2025 18:04 — with GitHub Actions Inactive
@cg-jl cg-jl temporarily deployed to boundary-tools-dev September 4, 2025 11:12 — with GitHub Actions Inactive
@cg-jl cg-jl temporarily deployed to boundary-tools-dev September 4, 2025 11:15 — with GitHub Actions Inactive
@cg-jl cg-jl temporarily deployed to boundary-tools-dev September 5, 2025 22:12 — with GitHub Actions Inactive

sxlijin and others added 24 commits October 7, 2025 14:39
There's no need to make it live as much.

remove get_by_hid guards in GraphBuilder

We build a manual map that contains all entries.
Why not just use that?

make scope_root map local to `build`

move nested_targets to `build`

make cluster & node ids copyable

Should relieve a ton of copies.
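
A minimal sketch of what copyable ids could look like, assuming they are thin wrappers around an integer. `NodeId` and `Graph` here are hypothetical and the real definitions may differ. Cluster ids would presumably follow the same pattern.

```rust
/// Hypothetical newtype id. Deriving `Copy` lets it be passed around by
/// value without any `.clone()` calls.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct NodeId(u32);

#[derive(Debug, Default)]
struct Graph {
    node_labels: Vec<String>,
    edges: Vec<(NodeId, NodeId)>,
}

impl Graph {
    /// Ids are indices into `node_labels`, handed out in insertion order.
    fn add_node(&mut self, label: &str) -> NodeId {
        let id = NodeId(self.node_labels.len() as u32);
        self.node_labels.push(label.to_owned());
        id
    }

    fn add_edge(&mut self, from: NodeId, to: NodeId) {
        self.edges.push((from, to));
    }

    fn label(&self, id: NodeId) -> &str {
        &self.node_labels[id.0 as usize]
    }
}

fn main() {
    let mut graph = Graph::default();
    let a = graph.add_node("a");
    let b = graph.add_node("b");

    // `a` and `b` are used several times; each use is a cheap bitwise copy.
    graph.add_edge(a, b);
    graph.add_edge(b, a);
    println!("{} -> {}, {} edges", graph.label(a), graph.label(b), graph.edges.len());
}
```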

borrow strings from index

It makes no sense to keep cloning strings when they all come from the
header index. Especially for call_node_cache!
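
One way this could look, as a sketch: graph nodes carry a `&str` that points into the index, so the node type gains a lifetime parameter. `HeaderIndex` below is a simplified hypothetical stand-in with a single field, not the real type.

```rust
/// Simplified, hypothetical header index that owns every title string.
struct HeaderIndex {
    titles: Vec<String>,
}

/// A graph node that borrows its label from the index instead of owning
/// a cloned String.
#[derive(Debug, PartialEq)]
struct Node<'index> {
    label: &'index str,
}

/// Build nodes whose labels point into `index`. No String is allocated.
fn build_nodes(index: &HeaderIndex) -> Vec<Node<'_>> {
    index
        .titles
        .iter()
        .map(|title| Node { label: title.as_str() })
        .collect()
}

fn main() {
    let index = HeaderIndex {
        titles: vec!["Fetch".to_string(), "Summarize".to_string()],
    };
    let nodes = build_nodes(&index);
    println!("{:?}", nodes);
}
```

The trade-off is that the nodes cannot outlive the index, which the compiler enforces through the `'index` lifetime.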

make pos_tuple return Path ref

Another clone down :]

make BamlVisDiagramGenerator a module

baml_vis: move graph stuff under a module

move mermaid diagram generator to a different function

clippy + fix example

Separate outputs from precompute() as cached graph data

We can now use a new lifetime to separate read & write locks inside the
struct.
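
A sketch of the idea, under the assumption that the precomputed data becomes its own read-only struct and the builder only holds a shared reference to it. `Cached`, `Builder` and their fields are hypothetical names. Because the reference has its own lifetime `'c`, it can be copied out of the builder and read while the builder itself is mutated.

```rust
/// Hypothetical read-only output of a precompute step:
/// children[i] lists the child ids of node i.
struct Cached {
    children: Vec<Vec<usize>>,
}

/// Builder-local, mutable state plus a shared borrow of the cached data.
struct Builder<'c> {
    cached: &'c Cached,
    order: Vec<usize>,
}

impl<'c> Builder<'c> {
    fn visit(&mut self, id: usize) {
        // Copy the shared reference out. It lives for 'c, independently of
        // the `&mut self` borrow, so no clone of the child list is needed.
        let cached: &'c Cached = self.cached;
        self.order.push(id);
        for &child in &cached.children[id] {
            self.visit(child);
        }
    }
}

fn main() {
    let cached = Cached {
        children: vec![vec![1, 2], vec![3], vec![], vec![]],
    };
    let mut builder = Builder { cached: &cached, order: Vec::new() };
    builder.visit(0);
    println!("{:?}", builder.order);
}
```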

Do not clone `nested_children` on each `get`

`.clone()` as an escape hatch is a big mistake.

annotate & arrange processing order (dependency lists)

Only done for if branches. Should be done for the rest, slowly.

Also removed a couple of inefficiencies wrt collect.

Make `build_header` return unit

Apart from removing some clones, this is to separate the visit process
from the data so it is easier to extract phases.

Visit header nodes only once

The cache check was hiding bugs! Visits are pre/post-order through a
tree, so we can't possibly visit a node twice by following edges.

remove visited_scopes

Now that we make sure to visit nodes only once, it proves unnecessary.
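
The argument in the two commits above can be illustrated with a small sketch: when each node owns its children, every node is reachable by exactly one path, so a pre/post-order walk needs no visited set or cache check. The `Header` type and `visit` function are hypothetical and much simpler than the real header structures.

```rust
/// Hypothetical header tree. Each node owns its children, so every node
/// has exactly one parent.
struct Header {
    title: &'static str,
    children: Vec<Header>,
}

/// Pre/post-order walk that records an event on entering and leaving each
/// node. No visited set: child edges in a tree never lead back to a node
/// that was already seen.
fn visit(header: &Header, events: &mut Vec<String>) {
    events.push(format!("enter {}", header.title)); // pre-order hook
    for child in &header.children {
        visit(child, events);
    }
    events.push(format!("leave {}", header.title)); // post-order hook
}

fn main() {
    let tree = Header {
        title: "root",
        children: vec![
            Header { title: "a", children: vec![] },
            Header { title: "b", children: vec![] },
        ],
    };
    let mut events = Vec::new();
    visit(&tree, &mut events);
    println!("{:?}", events);
}
```

If a node did show up twice in such a walk, the structure would not be a tree, which is presumably the kind of bug the cache check was hiding.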

move render_header_calls out of recursive visit

remove call_node_cache

Now we can easily prove that each (hid, callee) pair is only visited
once!

move node & cluster id creation to graph.

separate adding graph edges from recursive visit

move baml_vis stuff to baml_vis; make scope traversal always source order

simplify getting top-level scopes

add header filter

Now the recursion uses the result from the header filter.