Using augmentation methods on audio files to increase the size of limited datasets.

s21-team7-project

Members: Grayson Cordell, Jesse Gailbreath, Noah Norrod, Jacob Swindell

Small datasets often produce models with poor accuracy. Sometimes there is simply not enough information for a network to learn without drastic overfitting. So, you have a small dataset and you are having trouble training a network to decent accuracy? Well, do we have a cure for you: data augmentation.

What's in the repository?

Here you will find some resources and a step-by-step demonstration of how incorporating data augmentation can improve your accuracy.

A free dataset of audio samples recorded as .wav files (original source: https://github.com/Jakobovski/free-spoken-digit-dataset)
Documentation on how you can perform augmentations on the audio samples. If you just want to get to the good stuff, feel free to use our augmented dataset located in the Data section.
The code to transform the .wav files into spectrogram images for use in a neural network. Don't have the time? We've got you! A full set of images is housed in the Data section as well.
The code to build a network designed to test out your handiwork (or ours if you just want to see the end result).
You will also find some interesting links and documentation all related to this topic and our final paper sharing our results.
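To make the .wav-to-spectrogram step in the list above concrete, here is a minimal sketch of the general idea. It is not the code from this repository: the function names, the frame length of 256, and the hop of 128 are assumptions chosen for illustration, and it uses only NumPy.

```python
import numpy as np

def spectrogram_db(samples, frame_len=256, hop=128):
    """Return a log-magnitude spectrogram of shape (frame_len // 2 + 1, n_frames).

    Assumes `samples` is a 1-D array with at least `frame_len` values.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(samples) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        samples[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Magnitude of the real FFT of each frame, converted to decibels.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return 20.0 * np.log10(magnitude + 1e-10).T

def to_uint8_image(spec_db):
    """Rescale a spectrogram to 0..255 so it can be stored as a grayscale image."""
    lo, hi = spec_db.min(), spec_db.max()
    scaled = (spec_db - lo) / (hi - lo + 1e-12)
    return np.round(scaled * 255).astype(np.uint8)

# Example input: one second of a 1000 Hz tone sampled at 8000 Hz (arbitrary values).
rate = 8000
tone = np.sin(2 * np.pi * 1000 * np.arange(rate) / rate)
spec = spectrogram_db(tone)
image = to_uint8_image(spec)
```

With Pillow installed, an array like `image` could be written to disk with `Image.fromarray(image).save("digit.png")`, where the file name is only a placeholder.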

Ready to take a look?

We used Jupyter Notebook to write our code. If you are not familiar with it, head here: https://jupyter.org/

We also used Google Colab to execute our notebook. Not familiar with this? No worries, just head over here: https://colab.research.google.com/notebooks/intro.ipynb

Our datasets are fairly large so we utilized Google Drive for storage and access: https://www.google.com/drive/

To run our short demo, use this link to download the unseen_originals dataset to your local machine: https://drive.google.com/drive/folders/156QEK8JuWnFqsHnptc--pRW0lYam8fis?usp=sharing

To run our long demo, you can download the data from our Demo folder or from https://drive.google.com/drive/folders/10ldZfWHfqrv20AnGLO9QQdpr2S9493Tz?usp=sharing

You will also need this file to use our pre-trained model: https://drive.google.com/file/d/1LyMNrvk72AVxLWGEQP-B4DAm9wkoc93b/view?usp=sharing

We utilized TensorFlow and Keras to make the magic happen. They require a few Python dependencies (yep, you have to have Python already installed). Using the pip command makes this fairly simple. Enter the following commands in your command line to get going:

$ pip install --upgrade tensorflow
$ pip install numpy scipy
$ pip install scikit-learn
$ pip install pillow
$ pip install h5py
$ pip install keras

If you are having some issues, you can check out this resource for more details: https://www.tensorflow.org/install/pip
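If you want to confirm that the installs above worked before opening a notebook, a small check like the following can help. It is a sketch rather than part of the project: the `missing_packages` helper is a hypothetical name, and the mapping simply pairs each pip package listed above with the module name used to import it.

```python
import importlib.util

# pip package name -> module name used in an import statement
REQUIRED = {
    "tensorflow": "tensorflow",
    "numpy": "numpy",
    "scipy": "scipy",
    "scikit-learn": "sklearn",
    "pillow": "PIL",
    "h5py": "h5py",
    "keras": "keras",
}

def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]

if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All dependencies found.")
```

Any name it reports can be installed with the matching `pip install` command from the list above.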

Okay, you should be good to go! We have two demo options depending on how much time you have. Open the demo folder above to let the fun begin.

Short Demo: https://colab.research.google.com/drive/1te5e-wfNxUqyPRLCPVMau8Nf3AMlprOB?usp=sharing
UnseenOriginals: https://drive.google.com/drive/folders/156QEK8JuWnFqsHnptc--pRW0lYam8fis?usp=sharing
PreTrained: https://drive.google.com/file/d/1LyMNrvk72AVxLWGEQP-B4DAm9wkoc93b/view?usp=sharing

Long Demo: https://colab.research.google.com/drive/1te5e-wfNxUqyPRLCPVMau8Nf3AMlprOB?usp=sharing
Smaller dataset: https://drive.google.com/drive/folders/10ldZfWHfqrv20AnGLO9QQdpr2S9493Tz?usp=sharing
For the large dataset, use the repo.

If you are looking for a quick way to augment .wav files, check out Grayson's walkthrough: https://colab.research.google.com/drive/1yl3aNLWl1qHWhtzitdaq91GRJhiqfMVD?usp=sharing

Or his Ugly Auto-AudioAugmentor https://colab.research.google.com/drive/1TnD-WGQ8w4O03ktO-SQ-RWZOJpc949yK?usp=sharing
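For a rough idea of what augmenting an audio sample can look like, here is a minimal sketch of three common transformations: added noise, a time shift, and a gain change. It is not taken from the notebooks linked above. The function names and default values are assumptions for illustration, and the samples are assumed to be floats in the range -1.0 to 1.0.

```python
import numpy as np

def add_noise(samples, noise_level=0.005, rng=None):
    """Add Gaussian noise with standard deviation `noise_level`."""
    rng = np.random.default_rng() if rng is None else rng
    return samples + rng.normal(0.0, noise_level, size=samples.shape)

def time_shift(samples, shift):
    """Shift the signal by `shift` samples, padding the gap with silence."""
    shifted = np.zeros_like(samples)
    if shift > 0:
        shifted[shift:] = samples[:-shift]   # delay: silence at the start
    elif shift < 0:
        shifted[:shift] = samples[-shift:]   # advance: silence at the end
    else:
        shifted[:] = samples
    return shifted

def change_gain(samples, gain):
    """Scale the volume, clipping to the valid -1.0..1.0 range."""
    return np.clip(samples * gain, -1.0, 1.0)

# Example: turn one clip into three extra training examples.
rate = 8000  # arbitrary sample rate for the synthetic tone
original = 0.5 * np.sin(2 * np.pi * 440 * np.arange(rate) / rate)
augmented = [
    add_noise(original, rng=np.random.default_rng(0)),
    time_shift(original, 400),
    change_gain(original, 1.5),
]
```

Each augmented clip keeps the label of the original, which is how one recording can become several training examples.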

