-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Target Stages, an improvement of the incremental system #3881
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
The current model for incremental recompilations doesn't share progress between compiler activities, leading to unnecessary rebuilds. This PR introduces two concepts (target stages and stage dependencies) that would help alleviate that problem, via extracting some unstable fields from crate hashing into their own parts of the incremental system, and separating concerns between the different compilation stages. **So, `cargo check` isn't affect by `-Clto=true` and `cargo build` can reuse artifacts produced by `cargo check`.** [Rendered](https://github.com/blyxyas/rfcs/blob/target-stages/text/0000-target-stages.md)
| For example, let's say that the user calls `cargo check` two times, with different `-Cstrip` values. | ||
| `cargo check` (in the local module) will not be affected by this change, as `-Cstrip` is a stage dependency on `linking` and beyond. | ||
| Being that `cargo check` doesn't change behaviour depending on the linking behaviour, it should **not** be recompiled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this talking purely in terms of the incremental compilation cache? For cargo, we write new cache entries for changes in profile settings, see https://doc.rust-lang.org/nightly/nightly-rustc/cargo/core/compiler/fingerprint/index.html#fingerprints-and-unithashs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this talking purely in terms of the incremental compilation cache?
Yes, it's talking from the perspective of rustc's cache specifically.
Is there a reason why fingerprints aren't based purely on the final flags passed to rustc (including RUSTFLAGS)? A rustc invocation with two different profiles but with manually set-up flags to be identical should be ~identical for cache purposes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why fingerprints aren't based purely on the final flags passed to rustc (including RUSTFLAGS)?
Note that "fingerprint" could mean
- fingerprint (no new cache entry on change)
-Cextra-filename(new cache entry on change)-Cmetadata- or the Unit ID
A simple answer is that we want the hashes for the different fingerprints to be independent of the workspace location for build reproducibility and to ensure cache reuse when it is moved (particularly a problem for CI). I could likely come up with more reasons but that us likely sufficient that this is not something we are likely to reconsider.
text/0000-target-stages.md
Outdated
| [reference-level-explanation]: #reference-level-explanation | ||
|
|
||
| There are two parts to this feature, one handled in the compiler itself and another handled by the incremental system | ||
| and heavily interacting with Cargo. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit unclear on what Cargo will do with this new compiler feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The RFC assumes involvement from Cargo for restructuring target as I couldn't get any target/ documentation from rustc, only from Cargo. So I asume that the target directory layout comes from Cargo, and its contents themselves come from rustc_incremental::*.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That says that cargo needs to be involved but doesn't help me understand what that involvement would be.
| Introducing a concept of truly incremental compilation helps for these cases which share a common base and would help with: | ||
|
|
||
| - `target` directory size; artifacts are shared between activities with a common base (e.g. `check` and `build` on the same crate and same flags) | ||
| - Compilation times, as e.g. `check` -> `build` allows building to start right from type-checking, and allows `check` to be skipped altogether. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only if we change check builds to actually store MIR, slowing down check builds in the process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know / remember what the overhead would be?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Supposedly up to 10% faster: rust-lang/rust#49433 (comment). The perf results were wiped when moving to another perf server in 2019 though. In addition it may well be a bit larger nowadays given that we also skip calculating some other things like reachable_non_generics and exported_symbols.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually store MIR? check builds already compute MIR for borrowchecking and CTFE, is it that slow to serialize?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could also be partly from skipping MIR optimization steps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But for debug builds, that shouldn't add that much overhead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also run some MIR optimizations in debug mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know, but their overhead should be rather small, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing a perf run at rust-lang/rust#149457 to find out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The results are in and the perf regression for removing all should_codegen() checks outside of codegen (aside from the one to determine if object files are necessary for dependencies) is pretty bad: https://perf.rust-lang.org/compare.html?start=0ad2cd3f1357da47c3eb4acc8224a8f10dd87d0f&end=fbd25dde0af118f932dcb48e3fb6ff322b35e2ff&stat=instructions:u 70%+ percent in some cases. The perf hit seems to be caused for a large part by collect_and_partition_mono_items being called now for getting the list of generic functions to encode in the crate metadata for use by -Zshare-generics. But even just the extra MIR encoding takes on the order of 100ms extra time out of a 1s compile for the image crate.
text/0000-target-stages.md
Outdated
| ## Non-overlapping cases | ||
|
|
||
| Not all cases overlap, for example, while `-Cdebug-assertions` is marked as "codegen", it's a commonly used option for | ||
| detecting the profile of the compilation in `cfg`s. That is the reason that some stage dependencies are can affect |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/are can/can
|
I've not got much experience with compiler work, so apologies if this question isn't well formed, but will this completely change the contents of the |
|
The contents of the incremental compilation directory created by rustc are already entirely unstable. It is just a bunch of caches of rustc internal data structures and a bunch of cached object files. |
- Fix typo - Clarify Cargo's involvement instead of just noting that it "heavily interacts with Cargo" - Do not handle naymore 3rd party programs, they are now outside the scope of this RFC
The current model for incremental recompilations doesn't share progress between compiler activities, leading to unnecessary rebuilds.
This PR introduces two concepts (target stages and stage dependencies) that would help alleviate that problem, via extracting some unstable fields from crate hashing into their own parts of the incremental system, and separating concerns between the different compilation stages.
So,
cargo checkisn't affected by-Clto=trueandcargo buildcan reuse artifacts produced bycargo check.Rendered