This repository contains a collection of Airflow DAGs for solving various ML problems. The DAGs provide scalable, reliable pipelines for machine learning tasks, orchestrated with Apache Airflow and integrated with MLflow.
The cnn_skin_cancer use case focuses on classifying melanoma as benign or malignant using a TensorFlow model. It sets up a containerized Airflow DAG pipeline with MLflow integration, allowing for efficient and reproducible machine learning workflows.
To use the DAGs in this repository, follow the steps below:
- Install Airflow, Docker, and MLflow
- Clone this repository:
git clone https://github.com/seblum/mlops-airflow-dags.git
- Navigate to the cloned repository:
cd mlops-airflow-dags
- Set up a virtual environment and install the dependencies listed in requirements.txt
- Set the following environment variables:
export AWS_ACCESS_KEY_ID="<AWS-ACCESS-KEY>"
export AWS_SECRET_ACCESS_KEY="<AWS-SECRET-ACCESS-KEY>"
export AWS_ROLE_NAME="<AWS-ROLE-WITH-RELEVANT-ACCESS-TO-S3>"
export AWS_BUCKET="<S3-BUCKET-WITH-DATA>"
export AWS_REGION="<AWS-REGION>"
- Customize the DAGs to fit your specific requirements by modifying the DAG definition files.
- Run MLflow:
mlflow ui -p 5008
- Run the Airflow webserver:
airflow webserver -p 8081
- Run the Airflow scheduler:
airflow scheduler
- Access the Airflow web interface by opening http://localhost:8081 in your web browser.
- Configure and trigger the desired DAGs through the Airflow UI.
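As an optional sanity check before starting MLflow and Airflow, the AWS variables from the setup steps can be validated with a few lines of Python. This is a minimal sketch and not part of the repository: the function name missing_env_vars and the rule that a value still wrapped in <...> counts as an unfilled placeholder are assumptions made for illustration.

```python
import os

# Variable names taken from the setup steps above.
REQUIRED_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_ROLE_NAME",
    "AWS_BUCKET",
    "AWS_REGION",
)


def missing_env_vars(env=None):
    """Return the required names that are unset, empty, or still a <PLACEHOLDER>.

    Hypothetical helper: `env` defaults to the current process environment,
    but any mapping can be passed in, which keeps the check easy to test.
    """
    env = os.environ if env is None else env
    missing = []
    for name in REQUIRED_VARS:
        value = env.get(name, "").strip()
        # Assumption: a value like "<AWS-REGION>" was copied verbatim
        # from the README and never replaced with a real value.
        is_placeholder = value.startswith("<") and value.endswith(">")
        if not value or is_placeholder:
            missing.append(name)
    return missing
```

Calling missing_env_vars() with no arguments checks the current shell environment; an empty list means all five variables are set. The check only inspects whether values are present, it does not verify that the credentials are valid or that the role can access the S3 bucket.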
Contributions to this repository are welcome! If you have any improvements, bug fixes, or new use case suggestions, please submit a pull request. For major changes, please open an issue first to discuss the proposed changes.
This repository is licensed under the Apache License. Feel free to use and modify the code as per your needs.
This repository is related to the Bookdown book MLOps Engineering and to that project's ML platform, which is built on Airflow and MLflow.
I would like to express my gratitude to the authors and contributors of multiple online resources that inspired and helped this project. Their valuable insights and guidance are greatly appreciated.
If you find this repository helpful, consider giving it a ⭐️ to show your appreciation!