dvc gc: Garbage collect unused data from the cache and from a specific cloud remote storage.
$ dvc gc --all-commits --cloud --remote ${remote_name}

This command runs garbage collection (GC) on the local cache of a DVC project and on one of its remote storages. Here is a breakdown of the command:

  • dvc: The command-line tool of DVC (Data Version Control), used for versioning data and managing data pipelines alongside Git.
  • gc: Stands for garbage collection. It frees up space by deleting data files from the DVC cache that the project no longer references.
  • --all-commits: This option sets the scope of what is kept. Data referenced in any Git commit of the project, as well as in the workspace, is kept; only data that no commit references is removed.
  • --cloud: This option makes the garbage collection also remove unused files from remote storage, in addition to the local cache.
  • --remote ${remote_name}: This option names the remote storage to collect from when --cloud is given. ${remote_name} needs to be replaced with the actual name of a remote configured for the project. The remote storage is typically a cloud service like Amazon S3 or Microsoft Azure Blob Storage.

Overall, this command removes data that is not referenced by any commit in the project's history from both the local cache and the specified remote storage, freeing up storage space.
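Because the command deletes data from remote storage, it can help to preview the result first. The following is a minimal sketch, assuming a DVC version whose gc subcommand supports the --dry option (which lists what would be removed without deleting anything). The function name gc_remote_preview is hypothetical and not part of DVC.

```shell
# Hypothetical wrapper (not part of DVC): preview what
# `dvc gc --all-commits --cloud` would remove for a given remote.
gc_remote_preview() {
    remote_name="${1:-}"

    # Refuse to run without an explicit remote name.
    if [ -z "$remote_name" ]; then
        echo "usage: gc_remote_preview <remote_name>" >&2
        return 2
    fi

    # Assumption: this DVC version accepts --dry for `dvc gc`.
    # With --dry, nothing is deleted from the cache or the remote.
    dvc gc --all-commits --cloud --remote "$remote_name" --dry
}
```

The names of the configured remotes can be listed with dvc remote list. Once the preview looks right, the same command without --dry performs the actual removal.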

This explanation was created by an AI. In most cases such explanations are correct, but always be careful and never run a command unless you are sure it is safe.