-
If you are only tracking directories with DVC (and not individual files), it is possible to recover the structure of those directories via the [...] You could also consider using the new cloud versioning remote format, which does preserve the original directory/filename structure for tracked files (but this requires an object-storage remote that supports cloud versioning). But really, DVC is not intended to function without the corresponding [...]
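To make the directory-recovery idea a bit more concrete, here is a rough sketch, not an official DVC tool. As far as I know, DVC stores each tracked directory as a JSON manifest object whose file name ends in `.dir`, listing an `md5` and a `relpath` for every file inside it, and objects live under `<first two hash chars>/<rest of hash>` (nested below `files/md5/` in DVC 3.x remotes). The function name `restore_directories` and the output layout are made up for illustration, and the script assumes you have a local copy of the remote (for example the offsite backup). The directory's own name is not part of the manifest, so each restored directory is named after its hash.

```python
import json
import shutil
from pathlib import Path


def restore_directories(remote_copy: Path, out: Path) -> int:
    """Rebuild tracked directories from `.dir` manifests found in a local
    copy of a DVC remote. Returns the number of files restored.

    Assumed layout (verify against your own remote before relying on it):
      <store>/<hash[:2]>/<hash[2:]>       regular file objects
      <store>/<hash[:2]>/<hash[2:]>.dir   JSON manifest of one directory
    """
    restored = 0
    for manifest in remote_copy.rglob("*.dir"):
        # Objects referenced by a manifest sit in the same store root,
        # i.e. two levels above the manifest file itself.
        store = manifest.parent.parent
        # The original directory name is not recorded in the manifest,
        # so the directory hash is used as the output folder name.
        dir_hash = manifest.parent.name + manifest.stem
        for entry in json.loads(manifest.read_text()):
            md5 = entry["md5"]
            src = store / md5[:2] / md5[2:]
            if not src.is_file():
                continue  # object missing from this copy of the remote
            dst = out / dir_hash / entry["relpath"]
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
            restored += 1
    return restored
```

Something like `restore_directories(Path("/backup/dvc-remote"), Path("/tmp/recovered"))` (both paths hypothetical) would then give you one folder per tracked directory with the original relative file names inside. Files that were tracked individually rather than as part of a directory have no manifest and cannot be named this way.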
-
Until now, we kept all the datasets (with images) for each project on our server in their respective folders. That whole data folder is backed up to offsite storage, so even in the worst case we still had all our images/datasets for all projects in a secure location and could find every image by project name and subfolder.
Now, with DVC, all the images from the various projects/datasets end up on our DVC remote server. That DVC data store is of course also backed up to the offsite storage. However, since DVC stores every file under its MD5 hash, we can no longer find individual files by their names or file system paths.
Now my question: in the worst-case scenario, where we lose all of our Git history (for whatever reason) and with it the MD5 hashes of the DVC-tracked files, we could never restore the original files from the DVC storage, since (AFAIK) it doesn't know about the file or folder names. What is the "official" approach to handling this problem with DVC (other than "don't lose your Git history")?