A dataset for emotion classification on Indonesian-English code-mixed text

ir-nlp-csui/emotcmt

EmotCMT: A dataset for emotion classification on Indonesian-English Code-Mixed Text

(EmotCMT stands for Emotion annotated Code-Mixed Text data)

Introduction

This dataset contains 825 Indonesian-English code-mixed tweets. Each tweet is assigned exactly one emotion label from the following set: love, fear, sadness, joy and anger. The dataset was built for our work analysing the effect of normalising Indonesian-English code-mixed text on the emotion classification task. The dataset creation process is explained in detail in our IJACSA paper, cited below.
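
As a quick sanity check after obtaining the data, the label set described above can be verified with a few lines of Python. This is only a minimal sketch: the CSV layout, the column names `tweet` and `label`, and the helper names `read_rows` and `label_distribution` are assumptions made for illustration, not the documented format of this repository, and the sample rows are placeholders rather than real dataset entries.

```python
from collections import Counter
import csv
import io

# The five emotion labels listed in this README.
EMOTION_LABELS = {"love", "fear", "sadness", "joy", "anger"}


def label_distribution(rows):
    """Count tweets per emotion label, rejecting labels outside the README's set.

    `rows` is any iterable of (tweet, label) pairs. This pair layout is an
    assumption for illustration, not the repository's documented format.
    """
    counts = Counter()
    for tweet, label in rows:
        label = label.strip().lower()
        if label not in EMOTION_LABELS:
            raise ValueError(f"unexpected label {label!r} for tweet {tweet!r}")
        counts[label] += 1
    return counts


def read_rows(handle, tweet_column="tweet", label_column="label"):
    """Yield (tweet, label) pairs from a CSV file object with a header row.

    The column names are hypothetical; adjust them to the actual file.
    """
    for record in csv.DictReader(handle):
        yield record[tweet_column], record[label_column]


# Placeholder rows standing in for the real data file (not actual tweets).
sample = io.StringIO(
    "tweet,label\n"
    "placeholder tweet A,joy\n"
    "placeholder tweet B,fear\n"
    "placeholder tweet C,love\n"
    "placeholder tweet D,joy\n"
)
counts = label_distribution(read_rows(sample))
print(counts.most_common())
```

Run against the real file, the counts should sum to 825 if every tweet carries exactly one of the five labels.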

References

Please cite the following paper if you use this dataset:

Yulianti, E., Kurnia, A., Adriani, M., & Duto, Y. S. (2021). Normalisation of Indonesian-English Code-Mixed Text and its Effect on Emotion Classification. International Journal of Advanced Computer Science and Applications, 12(11).

Licence

You may use this dataset free of charge, and you do not need our permission to do so. If you use the data in a publication, please cite our paper. Please note that you may not copy this dataset and share it publicly in your own repository without our permission.

Contact

evi.y [at] cs.ui.ac.id
