Created three machine learning models that use data to model, fit, and predict emotion across three modes: text, voice, and photo/video.
The models built for voice, text, and image are described below.
Voice
- Neural network: Multi-layer Perceptron classifier (MLPClassifier)
- Librosa to extract low-level features from audio files
- RAVDESS Emotional Speech Audio Dataset
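A minimal sketch of this pipeline, assuming 40 mean MFCCs per clip as the Librosa features (the repo's exact feature set, network size, and file layout may differ; the paths and labels below are placeholders):

```python
# Sketch: extract MFCC features with Librosa and train an MLPClassifier.
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier

def extract_features(path):
    """Load one audio file and return a fixed-length vector of mean MFCCs."""
    signal, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=40)
    return np.mean(mfcc, axis=1)

# Placeholder file list; real labels come from the RAVDESS file-naming convention.
audio_paths = ["Actor_01/03-01-05-01-01-01-01.wav", "Actor_01/03-01-03-01-01-01-01.wav"]
labels = ["angry", "happy"]

X = np.array([extract_features(p) for p in audio_paths])
y = np.array(labels)

# Feed-forward neural network over the extracted features
# (a real run would hold out a test split to measure accuracy).
clf = MLPClassifier(hidden_layer_sizes=(300,), max_iter=500)
clf.fit(X, y)
print(clf.predict([extract_features("some_clip.wav")]))  # placeholder clip
```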
Text
- Multinomial Naive Bayes (MultinomialNB)
- TfidfVectorizer
- Natural Language Toolkit (NLTK) for tokenization and stopword removal
- Twitter Text Sentiment Data
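A minimal sketch of the text pipeline, assuming NLTK handles tokenization and stopword removal before TfidfVectorizer and MultinomialNB (the sample tweets and labels are placeholders, not the Twitter sentiment dataset):

```python
# Sketch: NLTK preprocessing + TF-IDF features + Multinomial Naive Bayes.
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Requires the one-time NLTK downloads described under the NLTK library note below.
stop_words = set(stopwords.words("english"))

def preprocess(text):
    """Tokenize with NLTK and drop English stopwords and non-alphabetic tokens."""
    tokens = word_tokenize(text.lower())
    return " ".join(t for t in tokens if t.isalpha() and t not in stop_words)

tweets = ["i am so happy today", "this is the worst day ever"]  # placeholder data
labels = ["joy", "sadness"]                                     # placeholder labels

model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit([preprocess(t) for t in tweets], labels)
print(model.predict([preprocess("feeling really great")]))
```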
Image
- Deep Learning Model
  - TensorFlow - Keras
  - Facial shape landmark predictor from Italo José's GitHub
  - Python OpenCV
  - Dlib
  - Imutils (future implementations)
  - Labeled image dataset FER-2013
- Deep Learning Model
  - Python OpenCV
  - Trained facial recognition model
  - Dlib
  - Keras
  - TensorFlow
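A minimal sketch of a Keras CNN for 48x48 grayscale FER-2013 faces; the layer sizes and seven-class label set are assumptions, not the repo's exact architecture, and the training batch is random placeholder data:

```python
# Sketch: small Keras CNN for FER-2013-style 48x48 grayscale face images.
import numpy as np
from tensorflow.keras import layers, models

num_classes = 7  # FER-2013 emotions: angry, disgust, fear, happy, sad, surprise, neutral

model = models.Sequential([
    layers.Input(shape=(48, 48, 1)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

# X would be normalized FER-2013 pixel arrays, y one-hot emotion labels.
X = np.random.rand(8, 48, 48, 1).astype("float32")              # placeholder batch
y = np.eye(num_classes)[np.random.randint(0, num_classes, 8)]   # placeholder labels
model.fit(X, y, epochs=1, batch_size=4)
model.save("model.h5")  # hypothetical path reused by the live-video sketch below
```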
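And a minimal sketch of applying a trained model to live video with OpenCV; for brevity it uses OpenCV's built-in Haar cascade for face detection instead of the Dlib detector listed above, and "model.h5" is a placeholder path:

```python
# Sketch: live webcam emotion prediction with OpenCV and a trained Keras model.
import cv2
import numpy as np
from tensorflow.keras.models import load_model

emotions = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]
model = load_model("model.h5")  # placeholder path to the trained facial recognition model
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
        # Crop the face, resize to the model's 48x48 input, and predict an emotion.
        face = cv2.resize(gray[y:y + h, x:x + w], (48, 48)) / 255.0
        pred = model.predict(face.reshape(1, 48, 48, 1), verbose=0)
        label = emotions[int(np.argmax(pred))]
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(frame, label, (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
    cv2.imshow("emotion", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
        break
cap.release()
cv2.destroyAllWindows()
```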
Python libraries used:
- NLTK
  - Due to the size of this library, a further download is needed after you pip install it. The first time you run it, uncomment the line `# nltk.download()`; once you run the cell, a pop-up window will appear asking you to complete the download, after which you can run the script successfully.
- Imutils
- Librosa
- Numpy
- Pandas
- Flask
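For the NLTK note above, the one-time setup looks roughly like this (the specific packages shown are examples; the interactive downloader lets you pick what you need):

```python
import nltk
# nltk.download()           # opens the NLTK downloader window for the extra data
nltk.download("stopwords")  # or fetch specific packages directly, no pop-up needed
nltk.download("punkt")
```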
Due to their size, these datasets were not included in our repo. Please refer to them using the links below.