As part of my Machine Learning Nanodegree, I undertook and completed a Capstone project to solve a problem of my choosing.
My project was based on the Kaggle competition Invasive Species Monitoring. My proposal outlining how I intended to solve this problem can be read here.
My final report is located here and my coded implementation is located here.
In this capstone project, you will leverage what you’ve learned throughout the Nanodegree program to solve a problem of your choice by applying machine learning algorithms and techniques. You will first define the problem you want to solve and investigate potential solutions and performance metrics. Next, you will analyze the problem through visualizations and data exploration to have a better understanding of what algorithms and features are appropriate for solving it. You will then implement your algorithms and metrics of choice, documenting the preprocessing, refinement, and postprocessing steps along the way. Afterwards, you will collect results about the performance of the models used, visualize significant quantities, and validate/justify these values. Finally, you will construct conclusions about your results, and discuss whether your implementation adequately solves the problem.
This project is designed to prepare you for delivering a polished, end-to-end solution report of a real-world problem in a field of interest. When developing new technology, or deriving adaptations of previous technology, properly documenting your process is critical for both validating and replicating your results.
Things you will learn by completing this project:
- How to research and investigate a real-world problem of interest.
- How to accurately apply specific machine learning algorithms and techniques.
- How to properly analyze and visualize your data and results for validity.
- How to document and write a report of your work.
- The Capstone Project proposal: proposal.pdf.
- The Capstone Project report: capstone_project_report.pdf.
- The implemented code, in the form of an IPython notebook: Capstone Project.ipynb.
- The datasets used for the project can be found on the Kaggle Competition Page.
- The highest scoring submission file is also included in the repository as submit_VGG16_run7.csv.
- The best saved weights for VGG16 could not be included because the file size was ~270 MB.
The neural network was trained using a Jupyter Notebook and the following Python 3 libraries:
- NumPy 1.12.0
- Matplotlib 2.0.0
- TensorFlow 1.4.1
- SciPy 0.19.0
- scikit-learn 0.18.1
- scikit-image (skimage) 0.12.3
- Keras 2.1.2
- Pandas 0.19.2
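To check whether a local environment matches the versions listed above before running the notebook, something like the following sketch may help. It is not part of the original project: the helper names (`installed_version`, `compare_versions`, `format_report`) are hypothetical, and it assumes each library is importable under the module name shown and exposes a `__version__` attribute.

```python
import importlib

# Versions from the list above, keyed by import name (assumed module names;
# scikit-learn imports as "sklearn" and scikit-image as "skimage").
REQUIRED = {
    "numpy": "1.12.0",
    "matplotlib": "2.0.0",
    "tensorflow": "1.4.1",
    "scipy": "0.19.0",
    "sklearn": "0.18.1",
    "skimage": "0.12.3",
    "keras": "2.1.2",
    "pandas": "0.19.2",
}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it is unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def compare_versions(required, lookup=installed_version):
    """Map each module name to a (required, installed, matches) tuple."""
    report = {}
    for name, wanted in required.items():
        found = lookup(name)
        report[name] = (wanted, found, found == wanted)
    return report


def format_report(report):
    """Turn a compare_versions() result into human-readable lines."""
    lines = []
    for name, (wanted, found, matches) in report.items():
        status = "ok" if matches else "differs"
        lines.append(
            "{}: required {}, installed {} ({})".format(name, wanted, found, status)
        )
    return lines
```

Running `print("\n".join(format_report(compare_versions(REQUIRED))))` in a notebook cell prints one line per library. A mismatch does not necessarily mean the notebook will fail, only that the environment differs from the one used for training.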