This repository contains resources for question answering in multiple languages.
In the data folder, there are 9 monolingual or cross-lingual datasets, based on the SQuAD v1.1 dev set (https://rajpurkar.github.io/SQuAD-explorer/explore/1.1/dev/) and its translations into French and Japanese from https://github.com/AkariAsai/extractive_rc_by_runtime_mt . In the SQuAD dataset, the input is a question-paragraph pair in English and the output is the location of the answer in the paragraph. The 9 datasets correspond to the 9 combinations where the paragraph is in one language (French, Japanese or English) and the question is in any of the same three languages, including the same language as the paragraph.
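As a minimal sketch of the layout described above, the snippet below enumerates the 9 (context language, question language) combinations and walks a SQuAD v1.1-style dictionary. The language codes, the helper name `iter_examples` and the toy example are assumptions made for illustration; the actual file names in the data folder are not shown here, and the files are assumed to follow the standard SQuAD v1.1 JSON structure.

```python
from itertools import product

# Hypothetical language codes; the real file naming may differ.
LANGUAGES = ["en", "fr", "ja"]

# Every (context language, question language) pair: 3 x 3 = 9 datasets,
# of which 3 are monolingual and 6 are cross-lingual.
COMBINATIONS = list(product(LANGUAGES, repeat=2))


def iter_examples(squad):
    """Yield (question, context, answer_text, answer_start) tuples
    from a dictionary in the SQuAD v1.1 JSON structure."""
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                for answer in qa["answers"]:
                    yield (qa["question"], context,
                           answer["text"], answer["answer_start"])


# Made-up cross-lingual example: French context, English question.
toy_context = "La tour Eiffel se trouve à Paris."
toy = {
    "data": [{
        "title": "toy",
        "paragraphs": [{
            "context": toy_context,
            "qas": [{
                "id": "toy-1",
                "question": "Where is the Eiffel Tower?",
                "answers": [{
                    "text": "Paris",
                    # Character offset of the answer inside the context.
                    "answer_start": toy_context.index("Paris"),
                }],
            }],
        }],
    }]
}

for question, context, text, start in iter_examples(toy):
    # The answer is a span of the context, located by its character offset.
    print(question, "->", context[start:start + len(text)])
```

To read one of the real files, load it with `json.load` and pass the resulting dictionary to `iter_examples`.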
If you use these datasets, please cite our paper (https://arxiv.org/abs/1910.04659) and the references mentioned above.
In our paper "Multilingual Question Answering from Formatted Text applied to Conversational Agents", we trained multilingual BERT (https://github.com/google-research/bert) on the English SQuAD v2.0 training set and tested it on the 9 datasets mentioned above.
We compare the performance of mBERT on the monolingual French and Japanese test sets with a previously published baseline (https://github.com/AkariAsai/extractive_rc_by_runtime_mt):
| Model | French F1 | French EM | Japanese F1 | Japanese EM |
| --- | --- | --- | --- | --- |
| Baseline | 61.88 | 40.67 | 52.19 | 37.00 |
| Multilingual BERT | 76.65 | 61.77 | 61.83 | 59.94 |
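We also discuss the impressive results of mBERT on the cross-lingual datasets, where each row gives the language of the context and each column group the language of the question: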
| Context / Question | En F1 | En EM | Fr F1 | Fr EM | Jap F1 | Jap EM |
| --- | --- | --- | --- | --- | --- | --- |
| En | 90.57 | 81.96 | 78.55 | 67.28 | 66.22 | 52.91 |
| Fr | 81.10 | 65.14 | 76.65 | 61.77 | 60.28 | 42.20 |
| Jap | 58.95 | 57.49 | 47.19 | 45.26 | 61.83 | 59.93 |
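
The F1 and EM columns in the tables above are the usual extractive question answering metrics: exact match between the predicted and reference answers, and token-overlap F1. The sketch below is a simplified illustration, not necessarily the script used to produce the numbers above. It assumes whitespace tokenization and lowercasing only, whereas the official SQuAD evaluation script also strips punctuation and English articles, and Japanese text would need a different tokenization. The function names are hypothetical.

```python
from collections import Counter


def normalize(text):
    """Lowercase and collapse whitespace (simplified normalization)."""
    return " ".join(text.lower().split())


def exact_match(prediction, reference):
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Token-level F1 between a predicted and a reference answer."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# A prediction with one extra token: EM is 0, F1 gives partial credit.
print(exact_match("in Paris", "Paris"))
print(round(f1_score("in Paris", "Paris"), 4))
```

Dataset-level scores are typically obtained by averaging these per-question values, taking the maximum over the reference answers when a question has several, and reporting the result as a percentage.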