Skip to content

Conversation

chezsmithy
Copy link
Contributor

@chezsmithy chezsmithy commented Jun 27, 2025

Description

Claude 4 sonnet is very excited about creating project structure display in readme files. When this was attempted the markdown display was broken and the file output was truncated. This was due file structure being outputted with nested codeblocks without a type attached.

When I orginally implemented a fix to this code I noted that there might be some risky edge cases where the nested code blocks didn't have types. As such this PR should solve for that use case by implementing a look ahead feature to check upcoming code blocks to determine if nesting is starting or we are ending the outer code block nesting early.

Significant hardening has been done on all of the places where markdown is used. I've still noted a few issues in chat where markdown might be returned and fences are not balanced (for example an open fence is included indented, and the closing fence happens to be not indented or enclosed in quotes) that could be further improved in a new PR.

Further, I noted that Bedrock Lower cost nova models we want to use for edit and apply were causing problems with outputting markdown content that wasn't compatible with the current prompts and filtering logic. This has been solved for.

Checklist

  • I've read the contributing guide
  • The relevant docs, if any, have been updated or created
  • The relevant tests, if any, have been updated or created

Screenshots

[ For visual changes, include screenshots. Screen recordings are particularly helpful, and appreciated! ]

Tests

[ What tests were added or updated to ensure the changes work as expected? ]

I've added a new function for the look ahead checking, and added new tests for that. Then I added a specfic test for the nested structure output that was failing.

I've added extensive tests for all the changes included functions that have been modified. During manual testing, I have found additional edge cases where the generation was stopping pre-maturely and I've added tests to attempt to solve for those. My goal has been to allow using AI to manage the lineStream.ts, and lineStream.vitetest.ts themselves which has been a problem traditionally becuase they contained stop blocks and fences that broke code generation. I've validated that its now possible to refactor those files sucessfully.


Summary by cubic

Fixed an issue where project structure markdown with nested bare codeblocks would break display and truncate output.

  • Bug Fixes
    • Added logic to detect and handle nested bare codeblocks without a type.
    • Updated tests to cover these edge cases.

@chezsmithy chezsmithy requested a review from a team as a code owner June 27, 2025 23:04
@chezsmithy chezsmithy requested review from sestinj and removed request for a team June 27, 2025 23:04
Copy link

netlify bot commented Jun 27, 2025

👷 Deploy request for continuedev pending review.

Visit the deploys page to approve it

Name Link
🔨 Latest commit 948246d

@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Jun 27, 2025
@sestinj
Copy link
Contributor

sestinj commented Jun 30, 2025

@chezsmithy There is a failing test, but I do like the PR. Will merge once tests are fixed

@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Jul 1, 2025
@chezsmithy chezsmithy marked this pull request as draft July 1, 2025 00:45
@chezsmithy
Copy link
Contributor Author

chezsmithy commented Jul 1, 2025

TODO Items:

  • Passing tests
  • Review all ``` and STOP checking to make sure it is looking only for leading backticks or STOP keywords and not catching examples inside of code strings. Thus tests won't generate using these keywords or truncation can occur.
  • Optimize lookahead function for ``` / STOP checking to not scan the whole remainder of the file if we can.

@chezsmithy chezsmithy force-pushed the fix-nested-bare-codeblocks branch from 179347d to be47720 Compare July 2, 2025 21:47
@chezsmithy chezsmithy marked this pull request as ready for review July 2, 2025 22:05
@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. and removed size:XL This PR changes 500-999 lines, ignoring generated files. labels Jul 2, 2025
@chezsmithy
Copy link
Contributor Author

@sestinj feeling good about this now. I've added a tone of tests and split out the filterCodeBlockLines function so that it's easier to test in issolation as its such a critical component.

Copy link
Collaborator

@RomneyDa RomneyDa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fix will be great!
See comments, I think there's some double-duplicate code.
If possible, let's move all the markdown-patching-related code to core utils, have it in one place, and import it into GUI

@github-project-automation github-project-automation bot moved this from Todo to In Progress in Issues and PRs Jul 3, 2025
@chezsmithy chezsmithy requested a review from RomneyDa July 4, 2025 21:14
@chezsmithy chezsmithy changed the title fix: 🐛 Project structure creation with bare codeblocks is breaking markdown display and creation fix: 🐛 Fix bare markdown fences and premature code generation stops Jul 5, 2025
@chezsmithy
Copy link
Contributor Author

@sestinj with the introduction of the search and replace tool, would it make sense to break out the rendering fixes for the UI with markdown from the apply updates that might become less relevant with the serach and replace changes?

Copy link
Collaborator

@RomneyDa RomneyDa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chezsmithy I think generally safe to say apply won't go anywhere too soon!

I noticed ~30 lines of the logic in shouldChangeLineAndStop and stopAtLines seem duplicate, and the MarkdownPatchStateTrackers in core and gui seem mostly duplicate as well

Might be misunderstanding subtle differences in the implementations, but could these both be united? This PR is solving complicated problems so it's inherently a bit tough to wrap my head around but I think the deduplication (if possible) could help me review more confidently

The tests are great

@chezsmithy chezsmithy force-pushed the fix-nested-bare-codeblocks branch from 1364466 to 361685d Compare July 13, 2025 18:41
@chezsmithy
Copy link
Contributor Author

@chezsmithy I think generally safe to say apply won't go anywhere too soon!

I noticed ~30 lines of the logic in shouldChangeLineAndStop and stopAtLines seem duplicate, and the MarkdownPatchStateTrackers in core and gui seem mostly duplicate as well

Might be misunderstanding subtle differences in the implementations, but could these both be united? This PR is solving complicated problems so it's inherently a bit tough to wrap my head around but I think the deduplication (if possible) could help me review more confidently

The tests are great

@RomneyDa I've also addressed the duplicate lines noted in the the comments above I believe. Just a couple open questions on the raw change for you.

@chezsmithy chezsmithy requested a review from RomneyDa July 13, 2025 22:00
Copy link
Collaborator

@RomneyDa RomneyDa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chezsmithy see comment on performance for patchNestedMarkdown. Let me know if you agree with this. Maybe run a quick n=100000 test locally or something to see if it makes a difference?

I'm less worried about performance for apply since it's not happening within the React app. If it seems performance is not deteriorated at all then lgtm!

Copy link
Collaborator

@RomneyDa RomneyDa Jul 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be especially sensitive about performance in the nested markdown patch because it runs on every single streamed chunk into the GUI.

As far as I can tell (correct me if I'm missing something) the MarkdownBlockStateTracker is not more efficient than the current implementation for patchNestedMarkdown, it adds many array.includes and array.filter operations on top of line trims and regex comparisons. I worry that small increases in operations for this patch could cause some performance issues since it's running on every streamed chunk

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RomneyDa I agree. One of the fixes I applied was a look forward scan to find the next closing fence. Originally it did a scan on every line. The tracker captures them all upfront and then optimizes the like processing by scanning forward to the next fence.

I'm still tempted to split this into two PRs. One for apply and one for the gui just in case a rollback is required.

I'll build a bigger test and do some benchmarking.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One PR seems okay if we gut check on performance, either way works for me

@sestinj sestinj removed their request for review July 22, 2025 02:33
Copy link
Collaborator

@RomneyDa RomneyDa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chezsmithy pulled this down and didn't notice any performance issues. Current state looks good so will go ahead and merge, if you were in the process of making changes I'll look out for another PR!

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Jul 24, 2025
@RomneyDa RomneyDa merged commit a19cf70 into continuedev:main Jul 24, 2025
38 of 39 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in Issues and PRs Jul 24, 2025
@github-actions github-actions bot locked and limited conversation to collaborators Jul 24, 2025
@sestinj
Copy link
Contributor

sestinj commented Jul 24, 2025

🎉 This PR is included in version 1.4.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@sestinj
Copy link
Contributor

sestinj commented Jul 24, 2025

🎉 This PR is included in version 1.1.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lgtm This PR has been approved by a maintainer released size:XXL This PR changes 1000+ lines, ignoring generated files.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants