Whats-this-rock

This project deploys a Telegram bot that classifies rock images into one of seven types.

This package uses TensorFlow to accelerate deep learning experimentation.

MLOps workflows such as

Experiment Tracking
Model Management
Hyperparameter Tuning

were all handled with Weights & Biases.

Additionally, nbdev was used to

develop the package
produce documentation from a series of notebooks
run continuous integration (CI)
publish to PyPI

Inspiration

The common complaint that you need massive amounts of data to do deep learning can be a very long way from the truth!

You very often don’t need much data at all. A lot of people are looking for ways to share data and aggregate data, but that’s unnecessary. They assume they need more data than they do, because they’re not familiar with the basics of transfer learning, which is this critical technique for needing orders of magnitude less data.

Jeremy Howard

Documentation

Documentation for the project has been created using nbdev, and is available at udaylunawat.github.io/Whats-this-rock.

nbdev is a notebook-driven development platform. Simply write notebooks with lightweight markup and get high-quality documentation, tests, continuous integration, and packaging for free!

Once I discovered nbdev, I couldn’t help but redo the whole project from scratch.

It makes me 10x more productive and makes the whole process more streamlined and enjoyable.

Installation

You can install the package directly with pip:

pip install rocks_classifier

Or install the latest beta version directly from GitHub:

pip install git+https://github.com/udaylunawat/Whats-this-rock.git

Download and process data

%%bash
rocks_process_data  --config-dir configs \
                    remove_bad=True \
                    remove_misclassified=True \
                    remove_duplicates=True \
                    remove_corrupted=True \
                    remove_unsupported=True \
                    sampling=None \
                    train_split=0.8
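
As a rough illustration of what the `remove_duplicates` and `train_split` options imply, the sketch below moves byte-identical files into a separate folder and splits the remaining paths into training and validation lists. It is not the package's actual implementation: the function names `move_duplicates` and `split_files`, the MD5-based comparison, and the fixed seed are all assumptions made for this example.

```python
import hashlib
import random
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the MD5 hex digest of a file's bytes."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def move_duplicates(image_dir: Path, duplicate_dir: Path) -> list[Path]:
    """Move byte-identical copies out of image_dir, keeping the first of each.

    Assumes duplicate_dir lives outside image_dir (hypothetical layout).
    """
    duplicate_dir.mkdir(parents=True, exist_ok=True)
    seen: set[str] = set()
    moved: list[Path] = []
    # Sort so that the copy which is kept is deterministic.
    for path in sorted(p for p in image_dir.rglob("*") if p.is_file()):
        digest = file_digest(path)
        if digest in seen:
            # Prefix with the class folder name to avoid name clashes.
            target = duplicate_dir / f"{path.parent.name}_{path.name}"
            shutil.move(str(path), str(target))
            moved.append(target)
        else:
            seen.add(digest)
    return moved


def split_files(paths, train_split=0.8, seed=42):
    """Shuffle paths reproducibly and split them into train and validation lists."""
    shuffled = sorted(paths)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_split)
    return shuffled[:cut], shuffled[cut:]
```

Hashing file bytes only catches exact copies. Resized or re-encoded duplicates would need a perceptual hash, which this sketch does not attempt.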

Train Model

Train model using default parameters in configs/config.yaml.

rocks_train_model   --config-dir configs

You can try different models and parameters by editing configs/config.yaml. Because the project uses Hydra, you can also override individual parameters from the command line:

rocks_train_model   --config-dir configs \
                    wandb.project=Whats-this-rock \
                    wandb.mode=offline \
                    wandb.use=False \
                    'dataset_id=[1,2]' \
                    epochs=30 \
                    lr=0.005 \
                    augmentation=None \
                    monitor=val_loss \
                    loss=categorical_crossentropy \
                    backbone=resnet \
                    lr_schedule=cosine_decay_restarts \
                    lr_decay_steps=300 \
                    trainable=False
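
To make the override syntax more concrete, here is a simplified sketch of how `key=value` arguments with dotted keys can be merged into a nested configuration. This is not Hydra's own parser (which has a richer grammar). The helpers `parse_value` and `apply_overrides` and the base config shown in the usage note are hypothetical and exist only for illustration.

```python
import ast
from copy import deepcopy


def parse_value(text: str):
    """Turn an override string into a Python value where that is unambiguous."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    if lowered == "null":
        return None
    try:
        return ast.literal_eval(text)  # numbers and lists such as [1,2]
    except (ValueError, SyntaxError):
        return text  # plain strings such as val_loss


def apply_overrides(config: dict, overrides: list[str]) -> dict:
    """Return a copy of config with each 'a.b=value' override applied."""
    merged = deepcopy(config)
    for item in overrides:
        key, _, raw = item.partition("=")
        *parents, leaf = key.split(".")
        node = merged
        # Walk (or create) the nested dictionaries named by the dotted key.
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = parse_value(raw)
    return merged
```

For example, `apply_overrides({"wandb": {"use": True}, "epochs": 10}, ["wandb.use=False", "epochs=30"])` leaves the original dictionary untouched and returns a copy in which `wandb.use` is `False` and `epochs` is `30`.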

Wandb Sweeps (Hyperparameter Tuning)

Edit configs/sweep.yaml

wandb sweep \
--project Whats-this-rock \
--entity udaylunawat \
configs/sweep.yaml

This prints a command containing the sweep ID ($sweepid). Run it to start the sweep:

wandb agent udaylunawat/Whats-this-rock/$sweepid
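
The same two steps can also be driven from Python through `wandb.sweep` and `wandb.agent`. The sketch below is an assumption-laden illustration: the search space in `sweep_config` is hypothetical (the real one lives in configs/sweep.yaml), and `train_once` is a placeholder rather than the project's training entry point.

```python
# Hypothetical search space; the project's real one is defined in configs/sweep.yaml.
sweep_config = {
    "method": "random",
    "metric": {"name": "val_loss", "goal": "minimize"},
    "parameters": {
        "lr": {"values": [0.0005, 0.001, 0.005]},
        "epochs": {"values": [10, 20, 30]},
    },
}


def train_once() -> None:
    """Placeholder training function that wandb.agent calls once per run."""
    import wandb  # imported lazily so the config can be inspected without wandb

    with wandb.init() as run:
        lr = run.config["lr"]  # value chosen by the sweep for this run
        # ... build and fit the model with `lr` here ...
        run.log({"val_loss": 0.0})  # dummy value; log the real metric instead


def launch_sweep(count: int = 5) -> str:
    """Register the sweep and run `count` trials in this process."""
    import wandb

    sweep_id = wandb.sweep(
        sweep=sweep_config, project="Whats-this-rock", entity="udaylunawat"
    )
    wandb.agent(
        sweep_id,
        function=train_once,
        project="Whats-this-rock",
        entity="udaylunawat",
        count=count,
    )
    return sweep_id
```

Calling `launch_sweep()` requires a logged-in wandb account. The metric name in `sweep_config` must match a key that the training function actually logs.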

Telegram Bot

You can try the bot here on Telegram.

Type /help to get instructions in chat.

Or deploy it yourself:

rocks_deploy_bot

Demo

Colab	GitHub	Download
Run in Colab	View Source on GitHub	Download Notebook

Features

& Things I’ve Experimented with

`Feature`		`Feature`
`Wandb`	- Experiment Tracking - System Tracking - Model Tracking - Hyperparameter Tuning	`Datasets`	- Dataset 1 - Dataset 2
`Augmentation`	- Keras-CV - MixUp - CutMix - Normal	`Models`	- ConvNeXtTiny - EfficientNet - ResNet101 - MobileNetV1 - MobileNetV2 - Xception
`Optimisers`	- Adam - Adamax - SGD - RMSProp	`LR Scheduler`	- CosineDecay - ExponentialDecay - CosineDecayRestarts
`Remove Images`	- Duplicate Images - Corrupt Images - Bad Images - Misclassified	`Configuration Management`	- hydra - ml-collections - argparse - google-fire
`Generators`	- tf.data.Dataset - ImageDataGenerator	`Deployment`	- Heroku - Railway
`Evaluation`	- Classification Report - Confusion Matrix	`GitHub Actions` (CICD)	- GitHub Super Linter - Deploy to Telegram - Deploy to Railway - nbdev tests CI - GitHub Pages(Documentation)
`Linting`	- Flake8 - Pydocstyle	`Telegram Bot`	- Greet - Info - Predict Image
`Formatting`	- Black - yapf	`Documentation`	- Code Description - Code comments - Source link - Doclinks
`Badges`	- Build - Issues - Lint Codebase	`Docker`
`Publishing`	- PyPI

Planned Features

Feature		Feature
`Deploy`	- Hugging Face	`Backend`	- FastAPI
`Coding Style`	- Object Oriented	`Frontend`	- Streamlit
`WandB`	- Group Runs - WandB Tables	`Badges`	- Railway

Technologies Used

Directory Tree

├── imgs                              <- Images for skill banner, project banner and other images
│
├── configs                           <- Configuration files
│   ├── configs.yaml                  <- config for single run
│   └── sweeps.yaml                   <- configuration file for sweeps (hyperparameter tuning)
│
├── data
│   ├── corrupted_images              <- corrupted images will be moved to this directory
│   ├── misclassified_images          <- misclassified images will be moved to this directory
│   ├── bad_images                    <- Bad images will be moved to this directory
│   ├── duplicate_images              <- Duplicate images will be moved to this directory
│   ├── sample_images                 <- Sample images for inference
│   ├── 0_raw                         <- The original, immutable data dump.
│   ├── 1_extracted                   <- Extracted data.
│   ├── 2_processed                   <- Intermediate data that has been transformed.
│   └── 3_tfds_dataset                <- The final, canonical data sets for modeling.
│
├── notebooks                         <- Jupyter notebooks. Used to create the source code.
│
├── rocks_classifier                  <- Source code used in this project.
│   │
│   ├── data                          <- Scripts to download or generate data
│   │   ├── download.py
│   │   ├── preprocess.py
│   │   └── utils.py
│   │
│   ├── callbacks                     <- functions that are executed during training at given stages of the training procedure
│   │   └── callbacks.py
│   │
│   ├── models                        <- Scripts to train models and then use trained models to make
│   │   │                                predictions
│   │   ├── evaluate.py
│   │   ├── models.py
│   │   ├── predict.py
│   │   ├── train.py
│   │   └── utils.py
│   │
│   └── visualization                 <- Scripts for visualizations
│
├── .dockerignore                     <- Docker ignore
├── .gitignore                        <- GitHub's excellent Python .gitignore customized for this project
├── LICENSE                           <- Your project's license.
├── README.md                         <- The top-level README for developers using this project.
├── CHANGELOG.md                      <- Release changes.
├── CODE_OF_CONDUCT.md                <- Code of conduct.
├── CONTRIBUTING.md                   <- Contributing Guidelines.
├── settings.ini                      <- configuration.
├── requirements.txt                  <- The requirements file for reproducing the analysis environment, e.g.
│                                        generated with `pip freeze > requirements.txt`
└── setup.py                          <- makes project pip installable (pip install -e .) so rocks_classifier can be imported

Learnings

“Better data is better than better models!”

Bug / Feature Request

If you find a bug (the site couldn’t handle the query and/or gave undesired results), please open an issue here and include your search query and the expected result.

If you’d like to request a new feature, feel free to do so by opening an issue here. Please include sample queries and their corresponding results.

Contributing

Contributions make the open source community such an amazing place to learn, inspire, and create.
Any contributions you make are greatly appreciated.
Check out our contribution guidelines for more information.

License

Whats-this-rock! is licensed under the MIT License - see the LICENSE file for details.

Credits

Support

This project needs a ⭐️ from you. Don’t forget to leave a star ⭐️

Walt might be the one who knocks
but Hank is the one who rocks.
