Skip to content

Conversation

domingo2000
Copy link
Contributor

@domingo2000 domingo2000 commented Aug 30, 2025

By doing a walk over the tree of directories in the workpsace we improve the dir Glob time because it only checks for the closest test directory folder, instead of searching in all the files of the workspace.

Motivation

Closes #3327

Implementation

As discussed here the implementation consist in a tree walk over the directory structure. By doing this walk until the workspace path, we only check for test files in the most closer ancestor folder that contains tests, the locality of tests in projects using engines in rails, or Packwerck, or in general in ruby projects, make the search correct most of the times.

By checking fewer directories the performance of the Glob that search for files is improved drastically, making the go to relevant file feels instant in big projects.

Automated Tests

Changed the current test implementation to use tempdirs and create some examples in that tempdir. That way the tests are not implementation specific, as it was when stubbing the filesystem calls.

Manual Tests

  1. Rubocop project
screenrecording-2025-08-30_16-11-24.mp4
  1. Rails project
screenrecording-2025-08-30_16-15-06.mp4
  1. Buk Core project (> 27K ruby files)
screenrecording-2025-08-30_16-14-19.mp4

@domingo2000 domingo2000 requested a review from a team as a code owner August 30, 2025 20:16
Copy link

graphite-app bot commented Aug 30, 2025

How to use the Graphite Merge Queue

Add the label graphite-merge to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

Copy link
Contributor

@jenny-codes jenny-codes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea but left some comments!

Comment on lines 19 to 22
stub_directory_structure("/workspace/lib/ruby_lsp/requests", has_test_dir: false)
stub_directory_structure("/workspace/lib/ruby_lsp", has_test_dir: false)
stub_directory_structure("/workspace/lib", has_test_dir: false)
stub_directory_structure("/workspace", has_test_dir: true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you walk me through why those stubs are needed? The only thing that was changed in the implementation was the Dir.glob pattern no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if you have a better strategy to stub the Globs to actually test it, without being so implementation specific, i am happy to receive feedback.

I did not want to pollute some /fixtures folder to just test this functionality.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do the current tests pass without any modifications? if not they should pass with modifications to the searching pattern. I think for this feature we can create a new test "test_search_in_closest_directory". And let me look at the codebase to see if there are any existing patterns we can use to simulate file structures

@domingo2000 domingo2000 force-pushed the improve-go-to-relevant-file-performance branch from 3041894 to 5d0e220 Compare October 1, 2025 17:47
@domingo2000
Copy link
Contributor Author

domingo2000 commented Oct 1, 2025

Thanks for the feedback @jenny-codes! I corrected all revisions, If you have any more feedback let me know.

Also checked again manually on the rubocop, rails and Buk monorepo to be sure it works well 👍🏻 .


# Move up one level
parent_dir = File.dirname(current_dir)
break if parent_dir == current_dir
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need the break statement here. The loop will already exit when parent_dir == current_dir

candidate_paths = Dir.glob(File.join("**", relevant_filename_pattern))
pattern = File.join(search_root, "**", relevant_filename_pattern)
candidate_paths = Dir.glob(pattern)
.uniq
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't need .uniq here

# This scopes the search to reduce the number of files that need to be checked.
#: -> String
def search_root
current_path = File.join(@workspace_path, @path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second look–why do we need the workspace_path here? This method should be able to function only on the relative path. By removing it we can simplify the code in this method as well as removing the cleanup on line 41.

Comment on lines 19 to 22
stub_directory_structure("/workspace/lib/ruby_lsp/requests", has_test_dir: false)
stub_directory_structure("/workspace/lib/ruby_lsp", has_test_dir: false)
stub_directory_structure("/workspace/lib", has_test_dir: false)
stub_directory_structure("/workspace", has_test_dir: true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do the current tests pass without any modifications? if not they should pass with modifications to the searching pattern. I think for this feature we can create a new test "test_search_in_closest_directory". And let me look at the codebase to see if there are any existing patterns we can use to simulate file structures

@domingo2000 domingo2000 force-pushed the improve-go-to-relevant-file-performance branch from 5d0e220 to ed632b0 Compare October 2, 2025 21:20
@domingo2000
Copy link
Contributor Author

I changed to the relative path implementation. It works fine with manual testing. I will try to change the test to use a fixture folder in the filesystem instead of stubbing the responses to test the implementation without harcoding it.

@domingo2000 domingo2000 force-pushed the improve-go-to-relevant-file-performance branch 3 times, most recently from ddf2ae5 to fc31e36 Compare October 3, 2025 12:32
@domingo2000
Copy link
Contributor Author

@jenny-codes I ended up testing the feature using a temporary directory strategy. That way the stubs are not implementation specific, the test is autocontained, and we don't need a real fixture folder.

By doing a walk over the tree of directories in the workpsace we improve
the dir Glob time because it only checks for the closest test directory
folder, instead of searching in all the files of the workspace.
@domingo2000 domingo2000 force-pushed the improve-go-to-relevant-file-performance branch from fc31e36 to a30f4dd Compare October 3, 2025 12:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve GoToRelevantFile performance
2 participants