Garbage collect whole target/ #13136

Open
3 tasks
epage opened this issue Dec 8, 2023 · 6 comments · May be fixed by #13846
Assignees
Labels
C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` S-accepted Status: Issue or feature is accepted, and has a team member available to help mentor or review Z-gc Nightly: garbage collection Z-script Nightly: cargo script

Comments

@epage
Contributor

epage commented Dec 8, 2023

Problem

With "cargo script", the target directory is "hidden" from the user, making it easy to leak when you delete your script.

If we move forward with rust-lang/rfcs#3371, a similar situation will happen for regular packages.

If I haven't touched a project in a long while but have run rustup update, there might be nothing of use left in target/, wasting space.

Sometimes I want to cargo clean all projects on my system (see #11305).

Proposed Solution

We should track in the GC database a list of:

  • Root manifests (i.e. the Cargo.toml / cargo script associated with the target directory)
  • Target directory
  • (maybe) The path to the Cargo.lock, to support future work such as "Pin cache entries still in use" (#13137) without having to infer the Cargo.lock location (cargo script needs special logic, and feature requests exist for even weirder situations)

Note that neither the root manifest nor the target directory can serve as a unique / primary key on its own. If people use CARGO_TARGET_DIR=/tmp/cargo, multiple workspaces may point to the same target dir. Likewise, people may end up with multiple target dirs for one workspace.

We need to track the Cargo.toml / cargo script because the workspace root is ambiguous when it comes to cargo scripts.

Example entries for CARGO_TARGET_DIR=/tmp/cargo:

id | workspace-manifest | target dir | timestamp
?  | /foo/Cargo.toml    | /tmp/cargo | ?
?  | /bar/Cargo.toml    | /tmp/cargo | ?
?  | /baz/script.rs     | /tmp/cargo | ?

Example entries for rust-analyzer target dir:

id | workspace-manifest | target dir          | timestamp
?  | /foo/Cargo.toml    | /foo/target         | ?
?  | /foo/Cargo.toml    | /foo/target-ra      | ?
?  | /bar/Cargo.toml    | /bar/target         | ?
?  | /bar/Cargo.toml    | /bar/target-ra      | ?
?  | /baz/script.rs     | ~/.cargo/target/... | ?
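
To make the keying point concrete, here is a minimal sketch of an in-memory tracker keyed on the (manifest, target dir) pair. The names `TargetDirTracker`, `mark_used`, `manifests_for`, and `target_dirs_for` are hypothetical and exist only for illustration; they are not cargo's actual types or schema, and the timestamp is assumed to be plain seconds.

```rust
use std::collections::BTreeMap;
use std::path::{Path, PathBuf};

/// Hypothetical tracker: the key is the (workspace manifest, target dir) pair,
/// because neither column is unique on its own.
#[derive(Default)]
struct TargetDirTracker {
    /// Maps (manifest, target dir) to a last-use timestamp in seconds (assumed unit).
    last_used: BTreeMap<(PathBuf, PathBuf), u64>,
}

impl TargetDirTracker {
    /// Inserts the pair or refreshes its timestamp if it is already tracked.
    fn mark_used(&mut self, manifest: &Path, target_dir: &Path, now_secs: u64) {
        self.last_used
            .insert((manifest.to_path_buf(), target_dir.to_path_buf()), now_secs);
    }

    /// All manifests that have built into `target_dir` (e.g. a shared /tmp/cargo).
    fn manifests_for(&self, target_dir: &Path) -> Vec<&Path> {
        self.last_used
            .keys()
            .filter(|(_, t)| t.as_path() == target_dir)
            .map(|(m, _)| m.as_path())
            .collect()
    }

    /// All target dirs used by `manifest` (e.g. target and target-ra).
    fn target_dirs_for(&self, manifest: &Path) -> Vec<&Path> {
        self.last_used
            .keys()
            .filter(|(m, _)| m.as_path() == manifest)
            .map(|(_, t)| t.as_path())
            .collect()
    }
}
```

With a composite key, marking the same pair twice only refreshes the timestamp, while a shared CARGO_TARGET_DIR or a second rust-analyzer target dir simply produces additional rows.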

Forms of cleanup

  • Delete target/ if unused for X time (this is in the "locally recreatable" category)
  • Delete all target/ (I just upgraded Rust, maybe rustup could suggest this)
  • Delete leaked target/ (workspace doesn't exist)
    • However, it might be transient (on a thumb drive). Should we make this time based?
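
One way these cleanup forms could be combined is sketched below. This is an assumption-heavy illustration, not a proposed implementation: `Row`, `Policy`, `row_is_stale`, and `dirs_to_delete` are hypothetical names, the thresholds are made up, and `manifest_exists` stands in for a filesystem check. The time-based grace period for leaked entries reflects the open question about transient workspaces, and a target dir is only selected when every workspace that uses it is stale, since target dirs may be shared.

```rust
use std::collections::BTreeMap;
use std::path::PathBuf;

/// Hypothetical tracked row; `manifest_exists` would come from a filesystem check.
struct Row {
    target_dir: PathBuf,
    last_used_secs: u64,
    manifest_exists: bool,
}

/// Hypothetical thresholds, in seconds.
struct Policy {
    /// Idle time after which a target dir with a live workspace is considered unused.
    max_unused_secs: u64,
    /// Idle time after which a target dir with a missing workspace is considered leaked.
    leaked_grace_secs: u64,
}

/// A row is stale once it has been idle longer than the threshold for its case.
fn row_is_stale(row: &Row, now_secs: u64, policy: &Policy) -> bool {
    let idle = now_secs.saturating_sub(row.last_used_secs);
    if row.manifest_exists {
        idle >= policy.max_unused_secs
    } else {
        idle >= policy.leaked_grace_secs
    }
}

/// Returns target dirs for which every tracked row is stale, in sorted order.
fn dirs_to_delete(rows: &[Row], now_secs: u64, policy: &Policy) -> Vec<PathBuf> {
    let mut all_stale: BTreeMap<&PathBuf, bool> = BTreeMap::new();
    for row in rows {
        let stale = row_is_stale(row, now_secs, policy);
        // A single fresh row keeps the whole shared target dir alive.
        *all_stale.entry(&row.target_dir).or_insert(true) &= stale;
    }
    all_stale
        .into_iter()
        .filter(|(_, stale)| *stale)
        .map(|(dir, _)| dir.clone())
        .collect()
}
```

"Delete all target/" is then the degenerate case where both thresholds are zero.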

Notes

No response

@epage epage added C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` S-accepted Status: Issue or feature is accepted, and has a team member available to help mentor or review Z-script Nightly: cargo script Z-gc Nightly: garbage collection labels Dec 8, 2023
@baby230211
Contributor

@rustbot claim

@baby230211
Contributor

Three ways for cargo clean to handle target dirs:

  1. Clean target dirs that the workspace hasn't used for a specific time.
  2. Clean target dirs that the workspace doesn't use anymore.
  3. Clean target dirs in all workspaces.

@epage
Contributor Author

epage commented Mar 20, 2024

To clarify, those are use cases for why tracking of whole target/ could be useful.

I'd do a small tweak of wording in case this leads to confusion

  • clean target dir if it and/or its workspace hasn't been used for specific time
  • clean target dir for workspace that is no longer present
  • clean all target dirs

@weihanglo
Member

weihanglo commented Mar 27, 2024

@epage, is this garbage collection specific to target directories under ~/.cargo/target that are generated by -Zscript? From the issue title I cannot tell.

It's for every target directory.

@epage
Contributor Author

epage commented Apr 24, 2024

Currently, we do the batch save in PackageSet::get_many, so we likely want mark-workspace-used to happen somewhere before the get_many call.

pub fn get_many(&self, ids: impl IntoIterator<Item = PackageId>) -> CargoResult<Vec<&Package>> {

get_many gets called as part of PackageSet::download_accessible

self.get_many(to_download.into_iter())?;

which gets called as part of resolving:

pkg_set.download_accessible(

which gets called as part of create_bcx:

let resolve = ops::resolve_ws_with_opts(

So our options are

  • Put mark-workspace-used in resolve (before download_accessible) so every operation gets it recorded
  • Put mark-workspace-used in create_bcx before the resolve so only compiles have mark-workspace-used recorded

@epage
Contributor Author

epage commented Apr 24, 2024

A code path to model off of is

    fn mark_used(&self, size: Option<u64>) -> CargoResult<()> {
        self.gctx
            .deferred_global_last_use()?
            .mark_git_checkout_used(global_cache_tracker::GitCheckout {
                encoded_git_name: self.ident,
                short_name: self.short_id.expect("update before download"),
                size,
            });
        Ok(())
    }
}
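
For illustration only, a workspace equivalent of that pattern might look roughly like the sketch below. `DeferredLastUse` here is a self-contained, hypothetical stand-in and does not mirror cargo's real deferred tracker; `mark_workspace_used` and `save` are assumed names. The point it shows is the ordering: marks are only recorded in memory, so they have to happen before whatever call performs the batch save.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for a deferred last-use tracker; not cargo's real type.
#[derive(Default)]
struct DeferredLastUse {
    /// Timestamp applied to every mark in this session (seconds, assumed unit).
    now_secs: u64,
    /// Pending marks keyed by (manifest, target dir); repeated marks collapse.
    workspace_timestamps: HashMap<(PathBuf, PathBuf), u64>,
}

impl DeferredLastUse {
    /// Records the use in memory only; nothing is persisted until `save`.
    fn mark_workspace_used(&mut self, manifest: &Path, target_dir: &Path) {
        self.workspace_timestamps.insert(
            (manifest.to_path_buf(), target_dir.to_path_buf()),
            self.now_secs,
        );
    }

    /// Drains all pending marks in one batch and returns them sorted; a real
    /// implementation would write these rows to the database instead.
    fn save(&mut self) -> Vec<(PathBuf, PathBuf, u64)> {
        let mut rows: Vec<_> = self
            .workspace_timestamps
            .drain()
            .map(|((manifest, target_dir), ts)| (manifest, target_dir, ts))
            .collect();
        rows.sort();
        rows
    }
}
```

Any mark made after `save` has run would sit in memory until the next batch save, which is why the placement relative to get_many matters.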
