(EmotCMT stands for Emotion annotated Code-Mixed Text data)
This dataset contains 825 Indonesian-English code-mixed tweets, each assigned exactly one emotion label. The labels used in this dataset are: love, fear, sadness, joy, and anger. The dataset was built for our work analysing the effect of Indonesian-English code-mixed text normalisation on the emotion classification task. The dataset creation process is described in detail in our IJACSA paper below.
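As a quick sanity check after loading the data, the label set described above can be verified with a few lines of Python. This is only a minimal sketch: the function name `summarise_labels` is hypothetical, and it assumes you have already read the emotion labels into a list of strings (the file layout is not described here, so loading is left out). Stripping whitespace and lowercasing are also assumptions about how the labels might be stored.

```python
from collections import Counter

# The five emotion labels listed in the dataset description.
EMOTION_LABELS = {"love", "fear", "sadness", "joy", "anger"}


def summarise_labels(labels):
    """Count each emotion label; raise ValueError for labels outside the five listed.

    Hypothetical helper: `labels` is any iterable of label strings that you
    have already extracted from the dataset files.
    """
    # Assumption: labels may carry stray whitespace or capitalisation.
    normalised = [label.strip().lower() for label in labels]
    unknown = sorted(set(normalised) - EMOTION_LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {unknown}")
    return Counter(normalised)
```

If the full dataset has been loaded, the counts returned by this helper should sum to 825, one label per tweet.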
Please cite the following paper if you use this dataset:
Yulianti, E., Kurnia, A., Adriani, M., & Duto, Y. S. (2021). Normalisation of Indonesian-English Code-Mixed Text and its Effect on Emotion Classification. International Journal of Advanced Computer Science and Applications, 12(11).
This dataset is free to use, and you do not need our permission to use it. If you use our data in a publication, please cite the paper above. However, you may not copy this dataset and share it publicly in your own repository without our permission.
evi.y [at] cs.ui.ac.id