dialog-intent-induction/data at master · asappresearch/dialog-intent-induction

Files in this directory:

LICENSE
README.md
airlines_500onlyb.csv.bz2
airlines_processed.csv.bz2
airlines_raw.csv.bz2
askubuntu_processed.csv.bz2
askubuntu_raw.csv.bz2

Data

Training vs test data

We train on all of the data, without using labels. The labels are used only to evaluate the resulting clusters.
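As an illustration of label-based cluster evaluation, here is a minimal sketch that scores cluster assignments against gold labels using purity. Purity is used only as a simple stand-in; this README does not say which metrics the authors report, and the function name `cluster_purity` and the toy intent labels are hypothetical.

```python
from collections import Counter, defaultdict


def cluster_purity(cluster_ids, gold_labels):
    """Fraction of examples whose gold label is the majority label of their cluster."""
    if len(cluster_ids) != len(gold_labels):
        raise ValueError("cluster_ids and gold_labels must have the same length")
    if not cluster_ids:
        return 0.0
    # Count how often each gold label occurs inside each cluster.
    labels_by_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, gold_labels):
        labels_by_cluster[cluster][label] += 1
    # Each cluster contributes the size of its most frequent label.
    majority_total = sum(
        counts.most_common(1)[0][1] for counts in labels_by_cluster.values()
    )
    return majority_total / len(cluster_ids)


# Toy data (hypothetical): clusters come from an unsupervised model,
# labels are only consulted at evaluation time.
clusters = [0, 0, 0, 1, 1, 2]
labels = ["baggage", "baggage", "refund", "refund", "refund", "delay"]
print(cluster_purity(clusters, labels))
```

The labels never enter the clustering step here; they are only compared against the finished assignments.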

Twitter Airlines Customer Support

The data is available in two versions:

raw: minimal redaction (company and customer Twitter IDs), no preprocessing: airlines_raw.csv.bz2
redacted and preprocessed: airlines_processed.csv.bz2
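The files are bzip2-compressed CSVs, so they can be read without decompressing them to disk first. The following is a minimal sketch using only the Python standard library. It assumes the files are UTF-8 encoded and start with a header row, neither of which is stated in this README, and the helper name `read_bz2_csv` is hypothetical.

```python
import bz2
import csv


def read_bz2_csv(path, encoding="utf-8"):
    """Read a bzip2-compressed CSV into a list of dicts keyed by the header row.

    Assumption: the first row of the file is a header.
    """
    # newline="" lets the csv module handle quoted fields that contain line breaks.
    with bz2.open(path, mode="rt", encoding=encoding, newline="") as handle:
        return list(csv.DictReader(handle))
```

Calling `read_bz2_csv("airlines_processed.csv.bz2")` from this directory should return one dict per row; inspecting the keys of the first row is a quick way to discover the actual column names.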

We sampled 500 examples and annotated them. Eight examples were rejected because they were not in English, leaving 492 labeled examples. All other examples are labeled UNK.
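Since only the annotated examples carry a real label, evaluation code will typically need to set the UNK rows aside. Below is a minimal sketch of that split. The column names `label` and `text` and the example intents are hypothetical; check the header of the CSV for the real names.

```python
def split_labeled(rows, label_key="label", unk="UNK"):
    """Split rows into (labeled, unlabeled) according to the UNK marker.

    Assumption: each row is a dict and `label_key` names the label column.
    """
    labeled = [row for row in rows if row[label_key] != unk]
    unlabeled = [row for row in rows if row[label_key] == unk]
    return labeled, unlabeled


# Toy rows (hypothetical content) standing in for the parsed CSV.
rows = [
    {"text": "my bag never arrived", "label": "baggage"},
    {"text": "flight was late again", "label": "UNK"},
    {"text": "i want my money back", "label": "refund"},
]
labeled, unlabeled = split_labeled(rows)
print(len(labeled), "labeled,", len(unlabeled), "unlabeled")
```

All rows, labeled or not, can still be used for training; only the labeled subset is meaningful for scoring clusters.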

AskUbuntu

raw, no preprocessing: askubuntu_raw.csv.bz2
preprocessed: askubuntu_processed.csv.bz2
