spotify_valence_prediction

Purpose

One of the metrics that Spotify uses to characterise its songs is valence, which measures how happy or sad a particular song sounds.
- The calculation of valence was developed by The Echo Nest, which was acquired by Spotify in 2014.
- Though the valence of each song is publicly available, the way it is calculated is not. Some information has been published, but nothing very specific.
In this project we use machine learning methods to build a predictive model that approximates the valence calculation.

Tools and Packages

The analysis was run in Jupyter (Jupyter Notebook 6.4.4 and Python 3.9.0 are known to work).

Additional packages required for the project to run are:

All the packages above can be installed using the pip install command.

Data

Two data sources were used:

- Spotify's Web API: Spotify offers numerous metrics for every song through its API. Specifically, the Get Tracks' Audio Features and Get Track's Audio Analysis operations were used.
- Spotify-Data 1921-2020 from Kaggle: this dataset was used to get the Spotify ids of many songs.
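As a rough illustration of how the ids from the Kaggle dataset could feed the Audio Features operation, the sketch below groups track ids into batches and builds one request URL per batch. It is not the code from data_preparation; the helper names are hypothetical, and the endpoint URL and the limit of 100 ids per request are assumptions taken from Spotify's public Web API documentation.

```python
from typing import Iterator, List

# Assumptions (from Spotify's public Web API docs, not from this repository):
# the batch endpoint URL and the maximum of 100 ids per request.
AUDIO_FEATURES_URL = "https://api.spotify.com/v1/audio-features"
MAX_IDS_PER_REQUEST = 100


def chunk_ids(track_ids: List[str], size: int = MAX_IDS_PER_REQUEST) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` track ids (hypothetical helper)."""
    for start in range(0, len(track_ids), size):
        yield track_ids[start:start + size]


def build_audio_features_urls(track_ids: List[str]) -> List[str]:
    """Build one request URL per batch of ids (hypothetical helper)."""
    return [
        f"{AUDIO_FEATURES_URL}?ids={','.join(batch)}"
        for batch in chunk_ids(track_ids)
    ]


if __name__ == "__main__":
    # Placeholder ids for illustration only; real Spotify ids look different.
    example_ids = [f"id{i}" for i in range(250)]
    for url in build_audio_features_urls(example_ids):
        print(url[:80])
```

Each URL would then be requested with an OAuth bearer token in the Authorization header; the returned audio features include valence alongside the other metrics.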

Important Note: Some files containing data obtained from Spotify's API were too large to be uploaded. To access these files without executing the corresponding code, please contact the author.

Notebooks

This repository contains 4 notebooks, each with its own purpose:

- data_preparation: contains the code used to collect and prepare the data used by the other notebooks.
- statistics: contains statistical analyses carried out to better understand the correlation between numerous features and valence.
- non_nn_predictive: contains the development of various predictive models (not including neural networks).
- nn_predictive: contains the development of neural network models for valence prediction.

Results

The best results were achieved by the neural network trained on all the data collected and metadata created. Specifically, its Mean Absolute Error on the test set was 0.0846.
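For context, Mean Absolute Error is the average absolute difference between predicted and actual values, so on Spotify's 0.0 to 1.0 valence scale an error of 0.0846 means predictions are off by about 0.08 on average. The sketch below shows the calculation in plain Python; the function and the sample values are illustrative only and are not taken from the notebooks or the project's data.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of |actual - predicted| over all samples."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Made-up valence values, for illustration only.
    actual = [0.10, 0.50, 0.90]
    predicted = [0.20, 0.45, 0.80]
    print(f"MAE: {mean_absolute_error(actual, predicted):.4f}")
```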
