- Python 3.6
- Pandas, NumPy
- NLTK
Annotated and raw documents must be present in the required directories and in the required format.
- Run `setup.sh` to download pretrained GloVe [1] word vectors: `$ bash setup.sh`
- Run `model.py` with Python 3.6: `$ python model.py`. This script performs 4-fold cross-validation on the training data (~200 documents) to choose the best model among `SVM`, `Logistic Regression`, `Decision Tree` and `Random Forest`. It then evaluates the best model on the test data (~100 documents) and reports its precision, recall and F1 score.
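The selection procedure described for `model.py` can be illustrated with a minimal sketch. It is not the actual contents of `model.py`: it assumes scikit-learn is available (it is not in the requirements list above), uses synthetic features in place of the real document features, and picks the winner by mean F1, which is an assumed criterion. All variable names are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_fscore_support
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real features (assumption: the project derives
# its features from the annotated documents and the GloVe vectors).
X, y = make_classification(n_samples=300, n_features=20, random_state=0)

# Mirror the approximate split sizes: ~200 training and ~100 test documents.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=100, random_state=0, stratify=y
)

# The four candidate model families named in the README, default settings.
candidates = {
    "SVM": SVC(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
}

# Mean F1 over 4 folds of the training data for each candidate
# (assumption: F1 is the selection metric).
cv_scores = {
    name: cross_val_score(model, X_train, y_train, cv=4, scoring="f1").mean()
    for name, model in candidates.items()
}
best_name = max(cv_scores, key=cv_scores.get)

# Refit the winner on all training data, then score it once on the test set.
best_model = candidates[best_name].fit(X_train, y_train)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, best_model.predict(X_test), average="binary"
)
print(f"best: {best_name}  P={precision:.3f}  R={recall:.3f}  F1={f1:.3f}")
```

Keeping the test set out of the cross-validation loop means the reported precision, recall and F1 are measured on documents that played no part in choosing the model.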
[1] Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global Vectors for Word Representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP).