TSA

Twitter Sentiment Analyzer using Gensim's Word2Vec + NLTK + Keras.

Requirements

Miner

⚠️ To use miner.py, you must have a Twitter developer account.

Place your authentication details either directly inside miner.py or in a file named creds.py that looks like this:

consumer_key = 'abcd'
consumer_secret = 'efgh'
access_token = '4567'
access_token_secret = '1234'
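
As an illustration of how a script could read these four values, here is a minimal sketch. It is not the code used in miner.py: the helper name load_credentials and its error messages are assumptions made for this example, and only the four variable names come from the creds.py layout above.

```python
import importlib

# The four names expected in creds.py, as listed above.
REQUIRED = (
    "consumer_key",
    "consumer_secret",
    "access_token",
    "access_token_secret",
)


def load_credentials(module_name="creds"):
    """Return the four credentials from a module as a dict.

    Hypothetical helper: raises RuntimeError if the module cannot be
    imported or if any of the four values is missing or empty.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"Could not import {module_name}.py; create it next to miner.py"
        ) from exc

    missing = [name for name in REQUIRED if not getattr(module, name, None)]
    if missing:
        raise RuntimeError(f"Missing credentials: {', '.join(missing)}")

    return {name: getattr(module, name) for name in REQUIRED}
```

Calling load_credentials() with no argument looks for creds.py on the import path. Failing early with a clear message is a design choice that avoids a less readable authentication error later on.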

Analyzer

To use TSA, activate your virtual environment and install the packages listed in requirements.txt with pip3:

pip3 install -r requirements.txt

If you use Anaconda, for example, you can follow these steps:

conda create -n TSA python=3.6

conda activate TSA

pip3 install -r requirements.txt

Usage

You can start the program by running ./analyzer.py

You will then be greeted with a menu:

 ████████ ███████  █████  
    ██    ██      ██   ██ 
    ██    ███████ ███████ 
    ██         ██ ██   ██ 
    ██    ███████ ██   ██                          

---Welcome to TSA---
What would you like to do?

    1) Mine tweets
    2) Validate a json file
    3) Analyze a corpus of tweets
    4) Find the most common words in a corpus of tweets
    5) Predict a string
    6) Quit

Enter your choice (1-6):

You will be prompted to enter a number in the range 1-6.
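
The choice prompt can be sketched as follows. This is only an assumption about how such a menu might validate its input, not the code in analyzer.py; the function names parse_choice and ask_choice are hypothetical.

```python
def parse_choice(raw, low=1, high=6):
    """Return the menu choice as an int, or None if the input is invalid."""
    try:
        choice = int(raw.strip())
    except ValueError:
        # Not a whole number (empty input, letters, "3.5", ...).
        return None
    return choice if low <= choice <= high else None


def ask_choice(read=input):
    """Keep asking until the user enters a number in the range 1-6.

    `read` defaults to the built-in input(); it is a parameter only so
    the loop can be exercised without a terminal.
    """
    while True:
        choice = parse_choice(read("Enter your choice (1-6): "))
        if choice is not None:
            return choice
        print("Please enter a number between 1 and 6.")
```

Separating parsing from prompting keeps the validation rule easy to check on its own.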

1) Mine tweets prompts you for your search queries. miner.py then looks for tweets containing these terms and saves them to a file called tweets.json. You can interrupt the program at any time by pressing Ctrl+C.
2) Validate a json file prompts you for a JSON file name. The file is then reformatted into valid JSON so that you can use it with the analyzer.
3) Analyze a corpus of tweets prompts you for a JSON file name and asks whether you also want to visualize extra data. The program then analyzes the sentiment of your corpus and shows the predicted percentage of positive and negative tweets. If you asked for extra data, you also get 4 charts: one showing the word embedding space in Sentiment140, one for Zipf's law, one for the word frequency distribution and one for bigrams.
4) Find the most common words in a corpus of tweets prompts you for the number N of most common words you want to see and a JSON file name for your corpus. You then see the N most common words in your corpus.
5) Predict a string prompts you for a string whose sentiment you'd like to predict.
6) Quit exits the program.
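
To give an idea of what option 4 computes, here is a minimal sketch using only the standard library. It rests on assumptions: the JSON file is taken to be a list of objects with a "text" field, and a simple regular expression stands in for whatever tokenization and stop-word filtering the program actually applies. The function names are hypothetical.

```python
import json
import re
from collections import Counter


def most_common_words(tweets, n):
    """Return the n most frequent words as (word, count) pairs.

    `tweets` is assumed to be a list of dicts with a "text" key.
    """
    counts = Counter()
    for tweet in tweets:
        # Lowercase, then keep runs of letters and apostrophes as words.
        counts.update(re.findall(r"[a-z']+", tweet["text"].lower()))
    return counts.most_common(n)


def most_common_in_file(path, n):
    """Load a JSON corpus (assumed layout: a list of tweets) and count words."""
    with open(path, encoding="utf-8") as handle:
        tweets = json.load(handle)
    return most_common_words(tweets, n)
```

For example, most_common_in_file("tweets.json", 10) would list the ten most frequent words, provided the file follows the assumed layout.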

Credits

This program has been built with:

Python 3.6
tweepy
gensim
tensorflow
nltk
scikit-learn

⚠️ Python 3.6 is recommended to run this program.
