This repo implements data versioning using the DVC tool.
This project demonstrates how to use Git and DVC (Data Version Control) to manage code, track data versions, and store large data files in remote storage (such as AWS S3). Git keeps only small pointer files for the data, so datasets can be versioned alongside the code without bloating the repository.
- Code and Data Management: Git tracks the code files and DVC tracks the data.
- Remote Storage: AWS S3 (or any other supported remote storage) can be used to store large datasets. This repo instead uses a local directory named s3 as the remote.
- Versioning: Both the code and the data are versioned, so you can track changes to the data and roll back to an earlier version if necessary.
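
The points above can be sketched as a short shell session. This is a minimal illustration and not necessarily the exact setup used in this repo: the temporary project directory, the file data/sample.csv, the remote name localremote, and the demo commit identity are all hypothetical, and the script assumes git and dvc are installed (it skips the versioning steps otherwise).

```shell
#!/bin/sh
# Sketch of a Git + DVC workflow with a local directory as the DVC remote.
# All paths and names below are hypothetical examples.
set -eu

WORKDIR="$(mktemp -d)"        # throwaway location for the demo
REMOTE_DIR="$WORKDIR/s3"      # local directory standing in for an S3 bucket
PROJECT="$WORKDIR/project"    # the Git/DVC project itself
mkdir -p "$REMOTE_DIR" "$PROJECT/data"
cd "$PROJECT"

# A tiny example dataset to put under DVC control.
printf 'id,value\n1,10\n' > data/sample.csv

if command -v git >/dev/null 2>&1 && command -v dvc >/dev/null 2>&1; then
    git init -q                                   # Git tracks code and .dvc pointer files
    dvc init -q                                   # create the .dvc/ metadata directory
    dvc remote add -d localremote "$REMOTE_DIR"   # -d makes this the default remote
    dvc add data/sample.csv                       # writes data/sample.csv.dvc and data/.gitignore
    git add .
    git -c user.name=demo -c user.email=demo@example.com \
        commit -q -m "Track sample data with DVC"
    dvc push                                      # copy the data into the remote directory
else
    echo "git or dvc not found; skipping the versioning steps" >&2
fi
```

To roll back the data later, one option is to check out an earlier version of data/sample.csv.dvc with git checkout and then run dvc checkout, which restores the matching data file in the workspace.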
Clone this repository to your local machine:
git clone https://github.com/your-username/my-repo.git
cd my-repo