lex-simple

Repo for the paper Unsupervised Simplification of Legal Texts https://arxiv.org/pdf/2209.00557

Dataset

We have gathered a new dataset for legal text simplification. To that end, we selected 1000 random legal sentences from the Caselaw Access Project of Harvard Law School. Then, in collaboration with the faculty and students of Bilkent Law School, we produced 3 different simplified reference files for these 1000 sentences. We hope that this dataset can serve as a benchmark for future legal text simplification studies.

Code

To run the algorithm proposed in the paper, use the commands below. Python 3.6 or above is required.

conda create -n uslt python=3.10
conda activate uslt
git clone https://github.com/koc-lab/lex-simple.git
cd lex-simple
pip install -r requirements.txt
python -m spacy download en_core_web_sm
python -m spacy download en
cd scripts
python run_uslt.py

After running the code above, you will have a .txt file with lexical simplifications. To apply structural simplification on top of lexical simplification, follow the steps in https://github.com/Lambda-3/DiscourseSimplification/tree/master. In particular, run

cd .. #make sure you are in the main directory
git clone https://github.com/koc-lab/SentenceSplitting.git
cd DiscourseSimplification
mvn clean install -DskipTests

First, create a directory under DiscourseSimplification at edu/stanford/nlp/models/pos-tagger/english-left3words, and move the Stanford NLP taggers from this drive link into it: https://drive.google.com/drive/folders/1GQerFiPgzFnS2lawIfAz8C_NsLbdQUJG?usp=share_link Then, create an empty file called 'input.txt' inside this directory and paste in the lexically simplified document generated by run_uslt.py. Then, run

mvn clean compile exec:java
cd ..
python decode_sentence_splitting.py

You have now generated the final .txt file!
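
The manual preparation before the Maven run (creating the tagger directory and filling input.txt) could also be scripted. The following is a minimal sketch, not part of this repository: the function name `prepare_structural_input` is hypothetical, the name of the file written by run_uslt.py is not given here and must be supplied by you, and placing input.txt at the top of the DiscourseSimplification checkout is an assumption that can be overridden with `input_dir`. The tagger files themselves still have to be downloaded from the drive link by hand.

```python
import shutil
from pathlib import Path

# Relative path of the tagger directory named in the instructions above.
TAGGER_SUBDIR = Path("edu/stanford/nlp/models/pos-tagger/english-left3words")


def prepare_structural_input(discourse_dir, lexical_output, input_dir=None):
    """Create the tagger directory and copy the lexical output to input.txt.

    discourse_dir  : path to the DiscourseSimplification checkout
    lexical_output : .txt file produced by run_uslt.py (file name is an assumption)
    input_dir      : directory that should hold input.txt; defaults to
                     discourse_dir, which is an assumption about the layout
    """
    discourse_dir = Path(discourse_dir)

    # Create the nested tagger directory; existing directories are left untouched.
    tagger_dir = discourse_dir / TAGGER_SUBDIR
    tagger_dir.mkdir(parents=True, exist_ok=True)

    # Copy the lexically simplified text into input.txt.
    target_dir = Path(input_dir) if input_dir is not None else discourse_dir
    target_dir.mkdir(parents=True, exist_ok=True)
    input_file = target_dir / "input.txt"
    shutil.copyfile(lexical_output, input_file)

    return tagger_dir, input_file
```

Calling `prepare_structural_input("DiscourseSimplification", "path/to/lexical_output.txt")` returns the tagger directory (where the downloaded tagger files belong) and the path of the written input.txt.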

Evaluation

Evaluation requires easse. To install it, follow the guide at https://github.com/feralvam/easse

After gathering the text outputs, run

python eval.py
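
For readers who want to score outputs directly, here is a minimal sketch of how the source sentences, a system output, and the three reference files could be loaded and passed to EASSE's `corpus_sari`. It is not a description of what eval.py does, which is not documented here. All file paths are supplied by the caller, the helper names `read_lines`, `load_parallel`, and `sari_score` are hypothetical, and each file is assumed to contain one sentence per line in the same order.

```python
from pathlib import Path


def read_lines(path):
    """Read one sentence per line, stripping surrounding whitespace."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines()]


def load_parallel(source_path, system_path, reference_paths):
    """Load source, system, and reference sentences and check they are aligned."""
    orig_sents = read_lines(source_path)
    sys_sents = read_lines(system_path)
    # EASSE expects one inner list per reference file, each aligned with the sources.
    refs_sents = [read_lines(p) for p in reference_paths]

    named = [("system output", sys_sents)]
    named += [(f"reference {i}", r) for i, r in enumerate(refs_sents, start=1)]
    for name, sents in named:
        if len(sents) != len(orig_sents):
            raise ValueError(
                f"{name} has {len(sents)} lines, expected {len(orig_sents)}"
            )
    return orig_sents, sys_sents, refs_sents


def sari_score(orig_sents, sys_sents, refs_sents):
    """Corpus-level SARI via EASSE; easse is imported lazily and must be installed."""
    from easse.sari import corpus_sari

    return corpus_sari(
        orig_sents=orig_sents, sys_sents=sys_sents, refs_sents=refs_sents
    )
```

The length check is there because a missing or extra line in any file would silently misalign every sentence pair after it and distort the score.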
