Accountability Accounting, a prominent investment bank, is interested in offering a new cryptocurrency investment portfolio for its customers. The company, however, is lost in the vast universe of cryptocurrencies. To help them, a report will be created that includes what cryptocurrencies are on the trading market and how they could be grouped to create a classification system for this new investment.
The original data cannot be fed to the machine learning models as-is, so it must be preprocessed first. Because there is no known output to predict, unsupervised learning was used.
- Preprocessed the data with pandas and standardized the features with scikit-learn
- Reduced data dimensions using Principal Component Analysis (PCA)
- Performed clustering using K-Means
- Visualized the results in 2-D and 3-D using hvPlot and Plotly Express
- Deliverable 1: Preprocessing the Data for PCA
- Deliverable 2: Reducing Data Dimensions Using PCA
- Deliverable 3: Clustering Cryptocurrencies Using K-means
- Deliverable 4: Visualizing Cryptocurrencies Results
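The steps listed above can be sketched end to end as follows. This is a minimal illustration, not the project notebook: the toy DataFrame stands in for crypto_data.csv, every value and coin label in it is invented, and the `Algorithm` and `ProofType` columns are assumed for the example (only `TotalCoinsMinted` and `TotalCoinSupply` are named in this report).

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Toy stand-in for crypto_data.csv; all values and labels are invented.
crypto_df = pd.DataFrame(
    {
        "Algorithm": ["Scrypt", "SHA-256", "X11", "Scrypt",
                      "SHA-256", "X11", "Scrypt", "Ethash"],
        "ProofType": ["PoW", "PoW", "PoW/PoS", "PoS",
                      "PoW", "PoW/PoS", "PoW", "PoW"],
        "TotalCoinsMinted": [4.2e1, 1.8e7, 9.0e6, 1.1e9,
                             2.9e10, 6.3e7, 1.0e8, 1.07e8],
        "TotalCoinSupply": [4.2e1, 2.1e7, 2.2e7, 5.3e8,
                            3.1e11, 8.4e7, 2.5e8, 0.0],
    },
    index=["CoinA", "CoinB", "CoinC", "CoinD",
           "CoinE", "CoinF", "CoinG", "CoinH"],
)

# 1. Encode the text columns as indicator variables so every feature is numeric.
X = pd.get_dummies(crypto_df, columns=["Algorithm", "ProofType"], dtype=float)

# 2. Standardize each feature to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)

# 3. Reduce the standardized features to three principal components.
pca = PCA(n_components=3)
pcs_df = pd.DataFrame(
    pca.fit_transform(X_scaled),
    columns=["PC 1", "PC 2", "PC 3"],
    index=crypto_df.index,
)

# 4. Cluster the principal components with K-Means (k = 4, as in this report).
model = KMeans(n_clusters=4, n_init=10, random_state=0)
pcs_df["Class"] = model.fit_predict(pcs_df[["PC 1", "PC 2", "PC 3"]])

print(pcs_df)
```

Standardizing before PCA matters here because the coin counts span many orders of magnitude and would otherwise dominate the components.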
Data Sources:
- crypto_data.csv
Software:
- Jupyter Notebook 6.1.4
- Python 3.8.5
The number of clusters needed was not known in advance, so an elbow curve was plotted with hvPlot to find the best value for k.
- The elbow of the curve is at k = 4.
- Four clusters were therefore used in the K-Means model.
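The elbow search described above can be sketched as follows. This is an illustrative example under stated assumptions: the input is synthetic data from `make_blobs` standing in for the principal components, and the range of k from 1 to 10 is an assumed choice, not taken from the project notebook.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for the three principal components (not the real crypto data).
X, _ = make_blobs(n_samples=200, centers=4, n_features=3, random_state=0)

# Fit K-Means for each candidate k and record the inertia
# (sum of squared distances from each point to its cluster center).
k_values = list(range(1, 11))
inertia = []
for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=0)
    model.fit(X)
    inertia.append(model.inertia_)

# Collect the results in a DataFrame ready for plotting.
elbow_df = pd.DataFrame({"k": k_values, "inertia": inertia})
print(elbow_df)
```

After `import hvplot.pandas`, a call such as `elbow_df.hvplot.line(x="k", y="inertia", xticks=k_values, title="Elbow Curve")` would draw the curve; the elbow is the value of k beyond which inertia stops dropping sharply.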
Using the three principal components produced by PCA, and scatter plots built with Plotly Express and hvPlot, four distinct clusters of cryptocurrencies were visualized in 3-D.
A table with the tradable cryptocurrencies was created using hvplot.
A scatter plot was created with TotalCoinsMinted on the x-axis and TotalCoinSupply on the y-axis that shows the CoinName when you hover over a data point.
- This 2-D plot does not show a clear separation between the four clusters.
Based on this analysis, there are 532 tradable cryptocurrencies on the market, and they can be divided into four classification groups.
- This was done by David Supple