This project develops a machine learning model that predicts carbon monoxide (CO) levels in the air from a historical dataset of atmospheric variables. It combines variable selection, a decision-tree ensemble (random forest), and cross-validation to improve the model's robustness and accuracy.
Data taken from: https://www.kaggle.com/datasets/fedesoriano/air-quality-data-set
- Data preprocessing
  - Handling missing values
  - Normalization of time-related data
  - Quantile-based outlier capping
- Variable selection
  - Lasso regression
- Model training
  - Decision trees (RandomForest) with cross-validation
- Model evaluation
  - RMSE as the accuracy measure
  - Variable importance visualization
- Results visualization
  - Prediction plots
  - Trend plots
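
The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is synthetic, the column names (`temperature`, `humidity`, `nox`, `noise`) are hypothetical, and the median imputation, the 1%/99% capping quantiles, `LassoCV`, 50 trees, and 5 shuffled folds are all assumptions chosen for the example. Time-related normalization and the plots are left out because the outline does not say how they are done.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LassoCV
from sklearn.model_selection import KFold, cross_val_score
from sklearn.preprocessing import StandardScaler


def cap_outliers(df, lower=0.01, upper=0.99):
    """Clip every column to its own [lower, upper] quantile range."""
    return df.clip(lower=df.quantile(lower), upper=df.quantile(upper), axis=1)


def select_features(X, y):
    """Keep the columns whose Lasso coefficient is non-zero."""
    X_scaled = StandardScaler().fit_transform(X)  # Lasso is scale-sensitive
    lasso = LassoCV(cv=5, random_state=0).fit(X_scaled, y)
    return [col for col, w in zip(X.columns, lasso.coef_) if abs(w) > 1e-8]


def cv_rmse(X, y, n_splits=5):
    """Mean cross-validated RMSE of a random forest regressor."""
    model = RandomForestRegressor(n_estimators=50, random_state=0)
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    scores = cross_val_score(
        model, X, y, cv=folds, scoring="neg_root_mean_squared_error"
    )
    return -scores.mean()  # scorer returns negated RMSE


# Synthetic stand-in for the air-quality table (hypothetical columns).
rng = np.random.default_rng(0)
n = 400
raw = pd.DataFrame({
    "temperature": rng.normal(18, 8, n),
    "humidity": rng.uniform(10, 90, n),
    "nox": rng.gamma(2.0, 100.0, n),
    "noise": rng.normal(0, 1, n),
})
# Synthetic CO target driven mostly by "nox".
y = 0.01 * raw["nox"] + 0.02 * raw["temperature"] + rng.normal(0, 0.3, n)

# Simulate missing readings in one column.
raw.loc[rng.choice(n, size=20, replace=False), "humidity"] = np.nan

# 1. Preprocessing: impute with the column median, then cap outliers.
filled = raw.fillna(raw.median())
capped = cap_outliers(filled)

# 2. Variable selection with Lasso.
selected = select_features(capped, y)

# 3-4. Random forest training and cross-validated RMSE.
rmse = cv_rmse(capped[selected], y)

# Variable importances from a forest fitted on all rows.
forest = RandomForestRegressor(n_estimators=50, random_state=0)
forest.fit(capped[selected], y)
importances = pd.Series(forest.feature_importances_, index=selected)

print("Selected variables:", selected)
print("Cross-validated RMSE:", round(rmse, 3))
print(importances.sort_values(ascending=False))
```

Because air-quality readings are ordered in time, shuffled folds can leak information between neighbouring hours. Swapping `KFold` for scikit-learn's `TimeSeriesSplit` may give a more honest error estimate, though the outline above does not say which scheme the project uses.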
Copyright 2024 Mattia Bennati
Licensed under the GNU GPL V2: https://www.gnu.org/licenses/old-licenses/gpl-2.0.html