A sentiment categorization system for tweets, built using classical machine learning algorithms (no deep learning). The dataset comprises 1.6M tweets (available here) that were labeled automatically and are therefore noisy. This project was part of the Natural Language Processing course taught by Prof. Mausam.
The model uses an ensemble learning approach: an ensemble of 5 classifiers is designed for the prediction task.
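As a rough illustration of how five classifiers could be combined, here is a minimal sketch that assumes hard majority voting. This README does not state the actual combination rule or which classifiers are used, so `majority_vote`, `ensemble_predict`, and the keyword-based toy classifiers below are hypothetical stand-ins, not the project's code.

```python
from collections import Counter
from typing import Callable, List, Sequence, Set

# Hypothetical type: a classifier maps a tweet to a label ("pos" or "neg").
Classifier = Callable[[str], str]


def majority_vote(predictions: Sequence[str]) -> str:
    """Return the label predicted by the largest number of classifiers."""
    return Counter(predictions).most_common(1)[0][0]


def ensemble_predict(classifiers: List[Classifier], tweet: str) -> str:
    """Ask every classifier for a label and combine the answers by majority vote."""
    return majority_vote([clf(tweet) for clf in classifiers])


def keyword_classifier(positive_words: Set[str]) -> Classifier:
    """Toy stand-in for a trained model: "pos" if any keyword appears in the tweet."""
    def classify(tweet: str) -> str:
        tokens = tweet.lower().split()
        return "pos" if any(t in positive_words for t in tokens) else "neg"
    return classify


# Five toy classifiers (assumption: real ones would be trained models).
classifiers = [
    keyword_classifier({"love", "great"}),
    keyword_classifier({"love", "happy"}),
    keyword_classifier({"great", "awesome"}),
    keyword_classifier({"good"}),
    keyword_classifier({"love"}),
]

if __name__ == "__main__":
    for tweet in ["I love this phone", "worst day ever"]:
        print(tweet, "->", ensemble_predict(classifiers, tweet))
```

With an odd number of classifiers and two labels, a majority vote can never tie, which is one common reason to pick an ensemble of five.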
Training
bash run-train.sh <data_directory> <model_directory>
Testing
bash run-test.sh <model_directory> <input_file_path> <output_file_path>