Testing typical machine learning algorithms on a dataset of Russian texts for sentiment (tonality) analysis.
- Clone this repo
- Unzip the dataset (data.zip) into the root folder
- Run in the command line: python3 classifier.py --model [CHOOSE MODEL] --max_features [0..27000] --stop_words [None, Russian] --vertorization [TYPE]
Types of word vectorization:
- Word frequency (how many times each word occurs in a text)
- TF-IDF
- Boolean vector (whether each word occurs in a text at all)
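
The three vectorization types listed above can be illustrated with a small pure-Python sketch. This is not the code from classifier.py: the helper names (build_vocabulary, count_vector, bool_vector, tfidf_vectors) and the two toy texts are made up for illustration, tokenization is a plain whitespace split, and the TF-IDF weighting shown (raw count times log(N / df)) is only one of several common variants.

```python
import math
from collections import Counter


def build_vocabulary(docs, max_features=None):
    """Map words to column indices, optionally keeping only the most frequent ones."""
    counts = Counter(word for doc in docs for word in doc.lower().split())
    # Rank by descending corpus frequency; ties are broken alphabetically.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    if max_features is not None:
        ranked = ranked[:max_features]
    # Columns are assigned in alphabetical order of the kept words.
    return {word: i for i, word in enumerate(sorted(ranked))}


def count_vector(doc, vocab):
    """Word frequency: number of occurrences of each vocabulary word."""
    vec = [0] * len(vocab)
    for word in doc.lower().split():
        if word in vocab:
            vec[vocab[word]] += 1
    return vec


def bool_vector(doc, vocab):
    """Boolean vector: 1 if the word occurs in the text, else 0."""
    return [1 if c > 0 else 0 for c in count_vector(doc, vocab)]


def tfidf_vectors(docs, vocab):
    """TF-IDF (assumed variant): raw count * log(n_docs / document frequency)."""
    n_docs = len(docs)
    counts = [count_vector(doc, vocab) for doc in docs]
    # Document frequency: in how many texts each word appears.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) if d else 0.0 for d in df]
    return [[c * w for c, w in zip(row, idf)] for row in counts]


# Toy corpus (hypothetical, not taken from data.zip).
docs = ["хороший фильм хороший", "плохой фильм"]
vocab = build_vocabulary(docs)

print(vocab)
print(count_vector(docs[0], vocab))
print(bool_vector(docs[0], vocab))
print(tfidf_vectors(docs, vocab))
```

In this sketch a word that occurs in every text (here "фильм") gets a TF-IDF weight of zero, which is the main difference from the plain frequency vector. The max_features argument of build_vocabulary mirrors the idea of the --max_features option: only the most frequent words are kept as columns.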