The official code for the "Unsupervised Grammatical Error Correction Rivaling Supervised Methods" paper, published in EMNLP 2023.

nusnlp/ugec

Unsupervised Grammatical Error Correction Rivaling Supervised Methods

Hannan Cao, Liping Yuan, Yuchen Zhang, Hwee Tou Ng. Unsupervised Grammatical Error Correction Rivaling Supervised Methods. In EMNLP 2023.

Training Data & Checkpoints

Download the GEC training data and the GEC model checkpoints before following the steps below.

English GEC

Flan-T5-xxl

  1. Store all downloaded checkpoints and data for Flan-T5-xxl in this folder: en_flan_t5/llm_finetune
  2. Install the dependencies listed in requirement.txt inside the en_flan_t5 folder
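
The install step can be done with pip. The following is a minimal sketch, assuming a pip-based Python environment; the `install_reqs` function and the `DRY_RUN` variable are hypothetical names introduced here and are not part of the repository. The function takes a folder, checks that `requirement.txt` exists in it, and then installs from that file (or only prints the command when `DRY_RUN=1`).

```shell
# Hypothetical helper (not part of the repository): install the packages
# listed in <folder>/requirement.txt, assuming pip is available.
install_reqs() {
  folder="$1"
  req="$folder/requirement.txt"
  # Fail early if the requirements file is not where the README says it is.
  if [ ! -f "$req" ]; then
    echo "not found: $req" >&2
    return 1
  fi
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Dry run: print the command instead of running it.
    echo "python -m pip install -r $req"
  else
    python -m pip install -r "$req"
  fi
}
```

For example, `install_reqs en_flan_t5` run from the repository root; the same call should work for the other folders that ship their own requirement.txt.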

Train:

bash train.sh

Inference: go to the en_flan_t5/llm_inference folder and run:

bash eval_gec.sh your/ckpt/name

BART-base

  1. Store all downloaded checkpoints and data for BART-base in this folder: en_fairseq_train
  2. Install the dependencies listed in requirement.txt inside the en_fairseq_train folder

Train:

cd gec
bash train.sh path/to/the/model/to/be/restored path/to/data-bin/folder output_path

Inference:

bash new_generate.sh path/to/model/ckpt testing/input/path

Chinese GEC

  1. Store all downloaded checkpoints and data for the Chinese BART model in this folder: chinese_bart_large
  2. Install the dependencies listed in requirement.txt inside the chinese_bart_large folder

Train:

cd gec
bash train_ch.sh

Inference:

cd gec
bash test_ch.sh
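
Each setup above expects downloaded files in a specific folder, so it can help to confirm the layout before training or inference. The following is a minimal sketch; `check_layout` is a hypothetical helper, and it assumes the folder names given in this README are relative to the repository root. It prints one status line per folder and returns a non-zero status if any folder is missing.

```shell
# Hypothetical helper (not part of the repository): check that the folders
# named in this README exist under the given root (default: current directory).
check_layout() {
  root="${1:-.}"
  missing=0
  for d in en_flan_t5/llm_finetune en_flan_t5/llm_inference \
           en_fairseq_train chinese_bart_large; do
    if [ -d "$root/$d" ]; then
      echo "ok       $d"
    else
      echo "missing  $d"
      missing=1
    fi
  done
  # Exit status 0 only when every folder was found.
  return "$missing"
}
```

Run `check_layout` from the repository root, or pass the path to the checkout as the first argument.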

Citation

If you find our paper or code useful, please cite:

@inproceedings{cao-etal-2023-unsupervised,
    title = "Unsupervised Grammatical Error Correction Rivaling Supervised Methods",
    author = "Cao, Hannan  and
      Yuan, Liping  and
      Zhang, Yuchen  and
      Ng, Hwee Tou",
    editor = "Bouamor, Houda  and
      Pino, Juan  and
      Bali, Kalika",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.emnlp-main.185",
    doi = "10.18653/v1/2023.emnlp-main.185",
    pages = "3072--3088",
}

If you encounter any problems with the code, please contact [email protected].
