Twitter Sentiment Analysis

Binary classification experiments for the Twitter dataset

Notebook viewer

‼️ Because of memory restrictions, GitHub and browsers can't always open large Jupyter notebooks. For this reason, I have linked every notebook to the ✔️ Jupyter nbviewer ✔️ in the table below. If you have any problems opening the notebooks, follow those links.

| Notebook | Link to Jupyter nbviewer | Link to Colab |
| --- | --- | --- |
| BiRNN_LSTM_GRU-BestModel.ipynb | nbviewer | Open In Colab |
| BiRNN_LSTM_GRU-Experiments.ipynb | nbviewer | Open In Colab |
| FeedForwardNN_GloVe.ipynb | nbviewer | Open In Colab |
| FeedForwardNN_TfiDf.ipynb | nbviewer | Open In Colab |
| LogisticRegression.ipynb | nbviewer | Open In Colab |

Logistic regression

Developed a sentiment classifier using logistic regression for the Twitter sentiment classification dataset available at this link. The classifier was implemented with the Scikit-Learn toolkit.

Vectorization: Tf-Idf

Tf-Idf vectorization of the tweets; no pre-trained vectors were used.
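
A minimal sketch of this pipeline with Scikit-Learn (the toy data, `max_features` and the train/test split are illustrative assumptions, not the notebook's exact settings):

```python
# Minimal sketch of the Tf-Idf + logistic regression pipeline with Scikit-Learn.
# The toy data and hyperparameters below are illustrative, not the notebook's settings.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

tweets = ["I love this!", "this is awful...", "great day", "worst service ever"]  # toy examples
labels = [1, 0, 1, 0]                                                             # 1 = positive, 0 = negative

X_train, X_test, y_train, y_test = train_test_split(tweets, labels, test_size=0.25, random_state=42)

vectorizer = TfidfVectorizer(max_features=50_000, ngram_range=(1, 2))
X_train_tfidf = vectorizer.fit_transform(X_train)   # fit the vocabulary on the training split only
X_test_tfidf = vectorizer.transform(X_test)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train_tfidf, y_train)
y_pred = clf.predict(X_test_tfidf)
```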

Evaluation

Model metrics for evaluation: F1 score, Recall and Precision

Visualization: Confusion matrices
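
A minimal Scikit-Learn sketch of this evaluation step; the `y_test`/`y_pred` values shown are placeholders for the predictions produced above:

```python
# Sketch of the evaluation step: F1, recall, precision and a confusion matrix via Scikit-Learn.
# y_test / y_pred would come from the classifier above; dummy values are shown for completeness.
from sklearn.metrics import (f1_score, recall_score, precision_score,
                             confusion_matrix, ConfusionMatrixDisplay)
import matplotlib.pyplot as plt

y_test = [1, 0, 1, 1, 0, 0]   # placeholder labels
y_pred = [1, 0, 0, 1, 0, 1]   # placeholder predictions

print("F1:       ", f1_score(y_test, y_pred))
print("Recall:   ", recall_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))

ConfusionMatrixDisplay(confusion_matrix(y_test, y_pred)).plot()
plt.show()
```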


Feed-Forward Neural Net

Developed two sentiment classifiers using feed-forward neural networks (PyTorch) for the Twitter sentiment analysis dataset (a minimal sketch follows the experiment list below).

Experimented with:

  • the number of hidden layers, and the number of their units
  • the activation functions (only the ones presented in the lectures)
  • the loss function
  • the optimizer, etc
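
A minimal PyTorch sketch of such a feed-forward classifier; the layer sizes, activation, loss and optimizer shown are assumptions, i.e. exactly the knobs the experiments vary:

```python
# Illustrative PyTorch feed-forward classifier; the hidden-layer sizes, activation,
# loss and optimizer shown here are assumptions and exactly the knobs being varied.
import torch
import torch.nn as nn

class FeedForwardClassifier(nn.Module):
    def __init__(self, input_dim, hidden_dims=(256, 64)):
        super().__init__()
        layers, prev = [], input_dim
        for h in hidden_dims:                  # number of hidden layers / units: experiment knob
            layers += [nn.Linear(prev, h), nn.ReLU()]
            prev = h
        layers.append(nn.Linear(prev, 1))      # a single logit for binary sentiment
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x).squeeze(-1)

model = FeedForwardClassifier(input_dim=5000)                  # e.g. Tf-Idf or GloVe feature size
criterion = nn.BCEWithLogitsLoss()                             # binary cross-entropy on the logit
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```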

Vectorization-1: Tf-Idf

Tf-Idf vectorization of the tweets; no pre-trained vectors were used.

Vectorization-2: Pre-trained word embedding vectors - GloVe

Vectorization made with GloVe (Stanford's pre-trained embeddings).
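
One common way to feed GloVe into a feed-forward network is to average the vectors of a tweet's tokens into a fixed-size input; a sketch under that assumption (the file name and 100-d size are also assumptions):

```python
# Sketch: represent each tweet as the mean of its tokens' GloVe vectors.
# The file name "glove.twitter.27B.100d.txt" and the 100-d size are assumptions.
import numpy as np

def load_glove(path):
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            word, *values = line.rstrip().split(" ")
            vectors[word] = np.asarray(values, dtype=np.float32)
    return vectors

glove = load_glove("glove.twitter.27B.100d.txt")
dim = 100

def tweet_to_vector(tweet):
    vecs = [glove[w] for w in tweet.lower().split() if w in glove]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim, dtype=np.float32)
```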

Evaluation

Model metrics for evaluation: F1 score, Recall and Precision

Visualization: ROC curves, Loss vs Epochs, Accuracy vs Epochs and Confusion matrix
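
A sketch of how these plots can be produced with Scikit-Learn and Matplotlib; `y_true`, `y_score` and the per-epoch histories are placeholders for the real training outputs:

```python
# Sketch of the evaluation plots: ROC curve plus loss / accuracy vs. epochs.
# y_true, y_score and the per-epoch histories are placeholders for the training loop's outputs.
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import RocCurveDisplay

y_true = np.array([1, 0, 1, 1, 0])                 # placeholder labels
y_score = np.array([0.9, 0.2, 0.6, 0.8, 0.4])      # placeholder predicted probabilities
train_losses = [0.69, 0.55, 0.48, 0.44]            # placeholder per-epoch values
val_accuracies = [0.61, 0.68, 0.72, 0.74]

RocCurveDisplay.from_predictions(y_true, y_score)  # ROC curve from the model's scores
plt.show()

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(train_losses, label="train loss")
ax1.set(xlabel="epoch", ylabel="loss")
ax2.plot(val_accuracies, label="validation accuracy")
ax2.set(xlabel="epoch", ylabel="accuracy")
ax1.legend(); ax2.legend()
plt.show()
```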


Bidirectional stacked RNN with LSTM/GRU cells

Experimented with:

  • the number of stacked RNNs,
  • the number of hidden layers,
  • type of cells,
  • skip connections,
  • gradient clipping and
  • dropout probability

Used the Adam optimizer with the binary cross-entropy loss function, and transformed the predicted logits into probabilities with a sigmoid function.
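
An illustrative PyTorch sketch of such a model and its training setup; the layer sizes, dropout and learning rate are assumptions, and the cell type toggles between LSTM and GRU:

```python
# Illustrative bidirectional stacked LSTM/GRU classifier in PyTorch.
# Cell type, number of stacked layers, dropout and learning rate are assumptions / experiment knobs.
import torch
import torch.nn as nn

class BiRNNClassifier(nn.Module):
    def __init__(self, embedding, hidden_size=128, num_layers=2, cell="lstm", dropout=0.3):
        super().__init__()
        self.embedding = embedding                          # e.g. a GloVe-initialised nn.Embedding
        rnn_cls = nn.LSTM if cell == "lstm" else nn.GRU     # switch between LSTM and GRU cells
        self.rnn = rnn_cls(embedding.embedding_dim, hidden_size,
                           num_layers=num_layers, batch_first=True,
                           bidirectional=True, dropout=dropout)
        self.fc = nn.Linear(2 * hidden_size, 1)             # 2x hidden size for the two directions

    def forward(self, token_ids):
        out, _ = self.rnn(self.embedding(token_ids))
        return self.fc(out[:, -1, :]).squeeze(-1)           # logit from the last time step

embedding = nn.Embedding(num_embeddings=20_000, embedding_dim=100)  # GloVe-initialised in the next sketch
model = BiRNNClassifier(embedding, cell="lstm")
criterion = nn.BCEWithLogitsLoss()                          # sigmoid + binary cross-entropy in one op
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# inside the training loop: torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
# at inference time: probabilities = torch.sigmoid(model(token_ids))
```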

Vectorization: GloVe

Pre-trained word embeddings (GloVe) were used as the models' initial input embeddings.
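
A minimal sketch of initialising the embedding layer from GloVe; the `vocab` mapping is a toy stand-in, and `glove`/`dim` refer to the GloVe-loading sketch in the feed-forward section above:

```python
# Sketch: build the embedding matrix from GloVe and initialise the nn.Embedding layer with it.
# `vocab` (word -> index) is a toy stand-in; `glove` and `dim` come from the earlier loading sketch.
import numpy as np
import torch
import torch.nn as nn

vocab = {"<pad>": 0, "good": 1, "bad": 2}          # toy vocabulary; built from the dataset in practice

embedding_matrix = np.zeros((len(vocab), dim), dtype=np.float32)
for word, idx in vocab.items():
    if word in glove:
        embedding_matrix[idx] = glove[word]        # out-of-vocabulary words keep the zero vector

embedding = nn.Embedding.from_pretrained(torch.from_numpy(embedding_matrix),
                                         freeze=False)      # allow fine-tuning during training
```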

Evaluation

Model metrics for evaluation: F1 score, Recall and Precision

Visualization: ROC curves, Loss vs Epochs, Accuracy vs Epochs and Confusion matrix


Β© Konstantinos Nikoletos | 2020 - 2021
