fix: refresh codebase index on config change #6131

Merged: 3 commits into continuedev:main from the refresh-codebase-index branch, Jul 16, 2025

Conversation

shssoichiro
Contributor

Description

Updates the CodebaseIndexer to refresh the index when the assistant config is changed. The DocsService already does this, so this change copies the approach that the DocsService is using.

The biggest current issue is that codebase indexing begins before Continue has loaded remote assistant configs, so it uses whatever model is available locally (usually Transformers.js in VS Code) or, worse, silently fails. Refreshing after the config loads ensures that the codebase is indexed with the user's configured embed model.

Checklist

  • I've read the contributing guide
  • The relevant docs, if any, have been updated or created
  • The relevant tests, if any, have been updated or created

Screenshots

N/A. This is a backend change.

Tests

Added appropriate tests to CodebaseIndexer.test.ts

@shssoichiro shssoichiro requested a review from a team as a code owner June 15, 2025 19:16
@shssoichiro shssoichiro requested review from RomneyDa and removed request for a team June 15, 2025 19:16
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Jun 15, 2025

netlify bot commented Jun 15, 2025

👷 Deploy request for continuedev pending review.

Visit the deploys page to approve it

Name Link
🔨 Latest commit 634aafb


recurseml bot commented Jun 15, 2025

😱 Found 1 issue. Time to roll up your sleeves! 😱

@shssoichiro shssoichiro force-pushed the refresh-codebase-index branch 2 times, most recently from 0623705 to cc82654 Compare June 17, 2025 04:33
Collaborator

@RomneyDa RomneyDa left a comment

@shssoichiro it would be great to refresh codebase indexing on config change. I think it should only update when the embeddings model or @codebase config changes. Indexing (even just computing potential changes) can be pretty expensive.

Could you add something similar to the DocsService's siteIndexingConfigsAreEqual that checks if the embeddings model config has changed OR the codebase context provider config has changed?
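A minimal sketch of the kind of value-based check being requested here. All names and the config shape (`EmbedModelSketch`, `embedSketchesAreEqual`, `shouldReindexSketch`, and the `provider`/`model`/`apiBase` fields) are hypothetical and chosen for illustration; they are not the actual Continue types or the DocsService helper.

```typescript
// Hypothetical config shape; the real Continue embed model type may differ.
interface EmbedModelSketch {
  provider: string;
  model: string;
  apiBase?: string;
}

// Compare by value, so a freshly reloaded config with identical settings
// is treated as unchanged even though it is a new object.
function embedSketchesAreEqual(
  a: EmbedModelSketch | undefined,
  b: EmbedModelSketch | undefined,
): boolean {
  if (a === undefined || b === undefined) {
    // Equal only when both sides are missing.
    return a === b;
  }
  return (
    a.provider === b.provider &&
    a.model === b.model &&
    a.apiBase === b.apiBase
  );
}

// Re-index only when the embeddings model actually changed.
function shouldReindexSketch(
  prev: EmbedModelSketch | undefined,
  next: EmbedModelSketch | undefined,
): boolean {
  return !embedSketchesAreEqual(prev, next);
}
```

With a check like this, reloading a config whose embed settings are unchanged would skip the expensive indexing pass, while switching models (or loading a model where none was set before) would trigger it.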

@github-project-automation github-project-automation bot moved this from Todo to In Progress in Issues and PRs Jul 1, 2025
@shssoichiro shssoichiro force-pushed the refresh-codebase-index branch 4 times, most recently from 58c86a5 to d222d37 Compare July 3, 2025 05:28
@shssoichiro
Contributor Author

Thanks, I've updated both indexers as requested and fixed the tests which were failing due to caching.

@shssoichiro shssoichiro requested a review from RomneyDa July 3, 2025 05:29
Collaborator

@RomneyDa RomneyDa left a comment

See comment!

@shssoichiro shssoichiro requested a review from RomneyDa July 4, 2025 22:11
@shssoichiro shssoichiro force-pushed the refresh-codebase-index branch from 74f07e6 to 111303a Compare July 5, 2025 01:14
Collaborator

@RomneyDa RomneyDa left a comment

return (
  embedModelsAreEqual(
    config1?.selectedModelByRole.embed,
    config2.selectedModelByRole.embed,
  ) && codebaseProvider1 === codebaseProvider2
);

This will always cause a refresh: codebaseProvider1 and codebaseProvider2 will never be equal, because the object references are reset on every config reload.

@shssoichiro
Contributor Author

What would be a good way to check for this? Looking at the ContextProviderDescription, it seems like the properties there would typically always be the same given that we are always looking at the "codebase" provider.

@RomneyDa
Collaborator

@shssoichiro you are right that changes in nRetrieve and nFinal and useReranking (the only codebase params) should not trigger a refresh. I just mean that the equality expression codebaseProvider1 === codebaseProvider2 will always evaluate to false, because they are objects that are created with each config reload.
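The point about `===` can be illustrated with a small standalone sketch. The `CodebaseProviderSketch` type, the `loadProviderSketch` function, and the default values are hypothetical stand-ins for whatever a config reload produces; only the parameter names `nRetrieve`, `nFinal`, and `useReranking` come from the discussion above.

```typescript
// Hypothetical stand-in for the codebase context provider description.
interface CodebaseProviderSketch {
  title: string;
  nRetrieve: number;
  nFinal: number;
  useReranking: boolean;
}

// Simulates a config reload: every call builds a brand-new object.
function loadProviderSketch(): CodebaseProviderSketch {
  return { title: "codebase", nRetrieve: 25, nFinal: 5, useReranking: true };
}

const provider1 = loadProviderSketch();
const provider2 = loadProviderSketch();

// Strict equality on objects compares references, not contents,
// so two separately created objects are never equal.
const sameReference = provider1 === provider2; // false

// A field-by-field comparison looks at the values instead.
const sameValues =
  provider1.title === provider2.title &&
  provider1.nRetrieve === provider2.nRetrieve &&
  provider1.nFinal === provider2.nFinal &&
  provider1.useReranking === provider2.useReranking; // true

console.log(sameReference, sameValues);
```

Because `sameReference` is false even when nothing changed, a condition of the form `modelsEqual && provider1 === provider2` always evaluates to false, which is why the check would force a refresh on every reload.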

@shssoichiro shssoichiro force-pushed the refresh-codebase-index branch from 111303a to 29536db Compare July 10, 2025 02:01
@shssoichiro
Contributor Author

Thanks, that makes sense. In that case it probably makes sense to remove that check and just check if the embed provider has changed. I've done that and fixed the merge conflicts.
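A rough sketch of the approach described here, refreshing only when the embed model changes. The class `IndexRefresherSketch`, its `handleConfigUpdate` method, the `EmbedSelectionSketch` shape, and the counter standing in for a real re-index are all assumptions made for illustration, not the code in this PR.

```typescript
// Hypothetical minimal description of the selected embeddings model.
interface EmbedSelectionSketch {
  provider: string;
  model: string;
}

class IndexRefresherSketch {
  // Key of the embed model used for the last refresh, if any.
  private lastEmbedKey: string | undefined;

  // Stand-in for real indexing work: counts how often a refresh ran.
  public refreshCount = 0;

  // Intended to be called whenever a (hypothetical) config-update event fires.
  handleConfigUpdate(embed: EmbedSelectionSketch | undefined): void {
    const key = embed ? `${embed.provider}::${embed.model}` : undefined;
    if (key === this.lastEmbedKey) {
      // Embeddings model unchanged: skip the expensive re-index.
      return;
    }
    this.lastEmbedKey = key;
    this.refreshCount += 1;
  }
}
```

In this sketch the first config that supplies an embed model triggers one refresh, repeated reloads with the same model do nothing, and a switch to a different model triggers another refresh.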

@shssoichiro shssoichiro force-pushed the refresh-codebase-index branch 2 times, most recently from 5b4da13 to f4f13db Compare July 10, 2025 02:03
@shssoichiro shssoichiro force-pushed the refresh-codebase-index branch from a50822c to c1f2faa Compare July 11, 2025 13:26
Collaborator

@RomneyDa RomneyDa left a comment

@shssoichiro docConfigsAreEqual feels redundant, since its 3rd param does not do anything. I don't want to nitpick too much, but the docs config logic has been touchy and confusing in the past, and I want to make sure we arrange it so that unused args aren't passed around. Let's rearrange so that embedModelsAreEqual is only used when needed.

@shssoichiro shssoichiro force-pushed the refresh-codebase-index branch from c1f2faa to 634aafb Compare July 14, 2025 14:26
@shssoichiro shssoichiro requested a review from RomneyDa July 14, 2025 14:27
@RomneyDa RomneyDa merged commit c41be7e into continuedev:main Jul 16, 2025
39 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in Issues and PRs Jul 16, 2025
@github-actions github-actions bot locked and limited conversation to collaborators Jul 16, 2025
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Jul 16, 2025
@shssoichiro shssoichiro deleted the refresh-codebase-index branch July 16, 2025 18:53
@sestinj
Contributor

sestinj commented Jul 22, 2025

🎉 This PR is included in version 1.1.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

Labels
lgtm This PR has been approved by a maintainer released size:L This PR changes 100-499 lines, ignoring generated files.
Projects
Status: Done
3 participants