Exploring the intersection of supervised machine learning algorithms and weather data to drive ClimateWins forward. (CF student project)

kgdatatech/climatewins-ml

CF-Achievement 7 | Machine Learning Specialization | Project Brief: ClimateWins Weather Prediction Data

Objective: Join me on a journey with ClimateWins, a European nonprofit organization dedicated to combating climate change. I'll answer pertinent questions such as: can machine learning be used to predict whether weather conditions will be favorable on a certain day?

I. Context

climatewins cover image

Goal: I'll integrate supervised machine learning to forecast the consequences of climate change, helping ClimateWins address extreme weather events. Using Python, I apply gradient descent, K-Nearest Neighbors (KNN), decision trees, and artificial neural networks (ANN) to derive a data-driven strategy.

II. Key Questions Include

  1. How is machine learning used? Is it applicable to weather data?
  2. ClimateWins has heard of ethical concerns surrounding machine learning and AI. Are there any concerns specific to this project?
  3. Historically, what have the maximums and minimums in temperature been?
  4. Can machine learning be used to predict whether weather conditions will be favorable on a certain day? (If so, it could also be possible to predict danger.)
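
To make question 4 concrete, here is a minimal sketch of a K-Nearest Neighbors classifier that labels a day as favorable (1) or unfavorable (0). It is not the project's actual model: the feature choice (mean temperature and precipitation), the toy values, and the function name `knn_predict` are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy observations: (mean temperature in °C, precipitation in cm) -> label.
# 1 = "favorable" day, 0 = "unfavorable" day. Values are invented for illustration.
TRAIN = [
    ((22.0, 0.0), 1),
    ((24.5, 0.1), 1),
    ((19.0, 0.0), 1),
    ((5.0, 1.2), 0),
    ((8.5, 0.9), 0),
    ((3.0, 2.0), 0),
]


def knn_predict(train, point, k=3):
    """Return the majority label among the k training points nearest to `point`."""
    # Sort training rows by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda row: math.dist(row[0], point))
    # Vote among the labels of the k closest rows.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


warm_dry = knn_predict(TRAIN, (21.0, 0.05))
cold_wet = knn_predict(TRAIN, (4.0, 1.5))
print(warm_dry, cold_wet)
```

With real station data, features on different scales would normally be standardized first, since KNN distances are otherwise dominated by the feature with the largest range.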

Key objectives

  1. Identify weather patterns outside the regional norm in Europe.
  2. Determine if unusual weather patterns are increasing.
  3. Generate possibilities for future weather conditions over the next 25 to 50 years based on current trends.
  4. Determine the safest places for people to live in Europe over the next 25 to 50 years.
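
One simple way to approach the first objective is to flag readings that sit far from a station's own average. The sketch below uses a z-score threshold. The threshold of 2.0, the sample readings, and the function name `flag_unusual` are assumptions for illustration, not the method used in the project.

```python
import statistics


def flag_unusual(temps, threshold=2.0):
    """Return indices of readings more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(temps)
    stdev = statistics.pstdev(temps)
    if stdev == 0:
        return []  # all readings identical: nothing stands out
    return [i for i, t in enumerate(temps) if abs(t - mean) / stdev > threshold]


# Hypothetical daily mean temperatures (°C) for one station; 35.0 is a planted outlier.
readings = [14.0, 15.0, 13.5, 14.5, 15.5, 14.0, 35.0, 13.0, 14.5, 15.0]
unusual = flag_unusual(readings)
print(unusual)
```

Counting the flagged days per year would then give a rough signal for the second objective, whether unusual patterns are becoming more frequent.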

III. Data

Data Sets:

IV. Tools

  • The latest versions of MS Excel, Anaconda, Jupyter Notebook, and Python, used to apply gradient descent, K-nearest neighbors, artificial neural network, and decision tree methods.

Data limitations/challenges:

  • Logistic Regression was not used to predict binary outcomes.
  • Selection bias: only 15-18 weather stations were sampled out of 26,321 total (less than 0.1% of stations), so the sample may not be representative.
  • Fine-tuning the model parameters to reach convergence with minimal loss.
  • Overfitting to the training data.
  • The analysis focused on mean temperature.
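
The tuning challenge above can be illustrated with a small gradient descent loop. This is a minimal sketch on invented data, not the project's notebook code. The learning rate, tolerance, step limit, and the function name `gradient_descent` are assumptions chosen so the toy problem converges.

```python
def gradient_descent(xs, ys, learning_rate=0.05, max_steps=5000, tolerance=1e-10):
    """Fit y ≈ w * x + b by minimizing mean squared error with batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    previous_loss = float("inf")
    loss = previous_loss
    for step in range(1, max_steps + 1):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        loss = sum(e * e for e in errors) / n
        # Treat the run as converged once the loss stops changing meaningfully.
        if abs(previous_loss - loss) < tolerance:
            return w, b, loss, step
        previous_loss = loss
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, loss, max_steps


# Hypothetical, noise-free toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b, loss, steps = gradient_descent(xs, ys)
print(round(w, 3), round(b, 3), steps)
```

The learning rate is the parameter to watch: too small and the loop runs out of steps before converging, too large and the loss grows instead of shrinking.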

V. Final Results

View Full Presentation PDF

View Proposal Strategy PDF

View EDA Results in Tableau Dashboard

View Case Study