

dvc-gc: Garbage collect from the cache, including the default cloud remote storage (if set).
$ dvc gc --all-commits --cloud

The command "dvc gc --all-commits --cloud" removes unused files from the cache of a DVC (Data Version Control) project and from its default remote storage. Data referenced in the current workspace or in any Git commit of the project is kept; everything else is deleted.

Here's what each part of the command means:

  • "dvc gc": This is the main command that triggers garbage collection in DVC; "gc" stands for "garbage collect". It needs at least one option that says which data to keep (here, "--all-commits").

  • "--all-commits": This flag tells DVC which data to keep: everything referenced in any Git commit of the repository, in addition to the current workspace. Cached data that no commit references is removed. A narrower option such as "--workspace" keeps only the data used by the current workspace, so data needed only by older commits would be deleted.

  • "--cloud": This flag also removes the unused data from remote storage, not only from the local cache. DVC supports remotes on cloud storage services such as AWS S3 and Google Cloud Storage. The default remote is used unless another one is named with "--remote". Data deleted from a remote cannot be recovered through DVC, so use this flag with care.
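
The parts described above can be put together in a small wrapper. This is only a sketch: "gc_all_commits" is a hypothetical helper name and "myremote" is a placeholder remote name, neither is part of DVC. It assumes a DVC release that supports the "--dry" option, which only lists what would be removed; older releases may not have it. The helper prints the command instead of running it, so it can be reviewed first.

```shell
# Hypothetical helper (not part of DVC): assemble a gc command for review.
# $1: "dry" to add --dry (preview only); any other value builds a real run.
# $2: optional remote name; when empty, DVC falls back to the default remote.
gc_all_commits() {
  cmd="dvc gc --all-commits --cloud"
  if [ -n "${2:-}" ]; then
    cmd="$cmd --remote $2"
  fi
  if [ "${1:-dry}" = "dry" ]; then
    cmd="$cmd --dry"
  fi
  printf '%s\n' "$cmd"
}

gc_all_commits dry            # prints: dvc gc --all-commits --cloud --dry
gc_all_commits run myremote   # prints: dvc gc --all-commits --cloud --remote myremote
```

Running the printed "--dry" variant first shows what would be deleted before anything is actually removed from the cache or the remote.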

Garbage collection in DVC removes cached data that is no longer referenced by the commits being kept, which frees storage space by deleting obsolete data files. In this command, "--all-commits" makes the set of kept data as broad as the whole Git history, and "--cloud" extends the cleanup from the local cache to remote storage.
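
The idea of deleting whatever no commit references can be illustrated with a toy model. This is a sketch of the concept, not DVC's implementation: the hash values and the function name "unreferenced" are made up for illustration.

```shell
# Toy model of garbage collection (not how DVC is implemented).
# $1: space-separated hashes present in the cache.
# $2: space-separated hashes referenced by the commits being kept.
# Prints the hashes that nothing references, one per line.
unreferenced() {
  for h in $1; do
    case " $2 " in
      *" $h "*) ;;                 # still referenced: keep it
      *) printf '%s\n' "$h" ;;     # not referenced: would be collected
    esac
  done
}

# Hypothetical cache contents and references.
unreferenced "a1 b2 c3 d4" "a1 c3"   # prints b2 and d4, one per line
```

The more commits contribute to the second argument, the fewer hashes are printed, which is why "--all-commits" is the most conservative choice of what to keep.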

This explanation was created by an AI. In most cases such explanations are correct, but always be careful and never run a command unless you are sure it is safe.