pillars-of-gec

This repository provides code, state-of-the-art predictions, and links to the pretrained Grammatical Error Correction (GEC) models for the paper "Pillars of Grammatical Error Correction: Comprehensive Inspection Of Contemporary Approaches In The Era of Large Language Models", which was accepted for publication at BEA-2024 (19th Workshop on Innovative Use of NLP for Building Educational Applications, co-located with NAACL 2024).

Structure

The scripts directory contains the code required to reproduce some of the baselines and to build ensembles.
The data directory contains the outputs of single systems and ensembles on the 3 main GEC benchmarks.
The table below contains single-system scores and links to the trained models available for download.

Pretrained models and results

Model name	CoNLL-2014 (test)			BEA-2019 (dev)			BEA-2019 (test)
	Precision	Recall	F0.5	Precision	Recall	F0.5	Precision	Recall	F0.5
CTC-copy [repo]	72.6	47.0	65.5	58.2	38.0	52.7	71.7	59.9	69.0
GECToR-2024 [link]	75.0	44.7	66.0	64.6	37.2	56.3	77.7	59.0	73.1
EditScorer [repo]	78.5	39.4	65.5	67.3	36.1	57.4	81.0	56.1	74.4
T5-11B [link]	70.9	56.5	67.5	60.9	51.1	58.6	73.2	71.2	72.8
UL2-20B [link]	73.8	50.4	67.5	60.5	48.6	57.7	75.2	70.0	74.1
Chat-LLaMa-2-7B-FT [link]	75.5	46.8	67.2	58.3	46.0	55.3	72.3	67.4	71.2
Chat-LLaMa-2-13B-FT [link]	77.2	45.6	67.9	59.8	46.1	56.4	74.6	67.8	73.1
Majority-voting ensemble (best 7)	83.7	45.7	71.8	71.7	42.2	62.9	87.3	64.1	81.4
MAJORITY-VOTING ✚[ majority-voting(best 7), GRECO-rank-w(best 7), GPT-4-rank-a(clust 3) ]	83.9	47.5	72.8	70.6	43.5	62.8	86.1	65.6	81.1
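
The F0.5 columns weight precision more heavily than recall. As a rough sanity check, the score can be recomputed from the precision and recall columns. The sketch below assumes the standard F-beta formula with beta = 0.5; the helper name `f05` is hypothetical and not part of this repository. The official scorers compute F0.5 from raw edit counts, so a value recomputed from rounded percentages may differ from the table in the last digit.

```shell
# Hypothetical helper (not part of this repository):
# F0.5 from precision and recall, both given in percent.
# F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), with beta = 0.5.
f05() {
  awk -v p="$1" -v r="$2" 'BEGIN {
    denom = 0.25 * p + r
    # Avoid division by zero when both precision and recall are 0
    if (denom == 0) { print "0.0"; exit }
    printf "%.1f\n", 1.25 * p * r / denom
  }'
}

# CTC-copy on CoNLL-2014 (test): precision 72.6, recall 47.0
f05 72.6 47.0   # prints 65.5
```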

Evaluation

We use 3 evaluation sets for GEC:

CoNLL-2014 (nucle14-2a; the m2 file is available; m2scorer is the official scorer)
BEA19-dev (bea-dev; the m2 file is available; ERRANT is the official scorer)
BEA19-test (bea-test; the m2 file is NOT available; scores can be obtained only through a CodaLab submission)

Examples of evaluation

Evaluation sets directory: data/evaluation_sets.

Example of evaluation with ERRANT

ERRANT_SCORER=path_to_errant_scorer_directory
INPUT_FILE=data/evaluation_sets/bea-dev.txt
M2_FILE=data/evaluation_sets/bea-dev.m2
PRED_FILE=YOUR_PRED_FILE.txt
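TMP_FILE=YOUR_TMP_FILE.m2

# Convert the source/prediction pair into an m2 file of hypothesis edits
python "$ERRANT_SCORER/parallel_to_m2.py" -orig "$INPUT_FILE" -cor "$PRED_FILE" -out "$TMP_FILE"
# Score the hypothesis edits against the reference m2 file
python "$ERRANT_SCORER/compare_m2.py" -hyp "$TMP_FILE" -ref "$M2_FILE" >> {{result}}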
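
The conversion step takes the source and prediction files as parallel text, so it can help to confirm that both files have the same number of lines before scoring. The following is a minimal sketch of such a check; the function name `check_aligned` is hypothetical and not part of this repository or of ERRANT, and it assumes one sentence per line in both files.

```shell
# Hypothetical helper (not part of this repository or of ERRANT):
# fail early when the source and prediction files have different line counts.
check_aligned() {
  # awk's NR also counts a final line that lacks a trailing newline
  src_lines=$(awk 'END { print NR }' "$1")
  pred_lines=$(awk 'END { print NR }' "$2")
  if [ "$src_lines" -ne "$pred_lines" ]; then
    echo "line count mismatch: $1 has $src_lines, $2 has $pred_lines" >&2
    return 1
  fi
}
```

It could be called as `check_aligned "$INPUT_FILE" "$PRED_FILE"` right before the conversion command.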

Example of evaluation with m2scorer

M2_SCORER=path_to_m2scorer
M2_FILE=data/evaluation_sets/nucle14-2a.m2
PRED_FILE=YOUR_PRED_FILE.txt
$M2_SCORER "$PRED_FILE" "$M2_FILE" >> {{result}}

Citation

[to be updated once proceedings are published]

@misc{omelianchuk2024pillars,
      title={Pillars of Grammatical Error Correction: Comprehensive Inspection Of Contemporary Approaches In The Era of Large Language Models}, 
      author={Kostiantyn Omelianchuk and Andrii Liubonko and Oleksandr Skurzhanskyi and Artem Chernodub and Oleksandr Korniienko and Igor Samokhin},
      year={2024},
      eprint={2404.14914},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
