lex-simple

Repository for the paper "Unsupervised Simplification of Legal Texts": https://arxiv.org/pdf/2209.00557

Dataset

We have gathered a new dataset for legal text simplification. To that end, we selected 1000 random legal sentences from the Caselaw Access Project of Harvard Law School. Then, in collaboration with the faculty and students of Bilkent Law School, we produced 3 different simplified reference files for these 1000 sentences. We hope that this dataset can serve as a benchmark for future legal text simplification studies.
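As a minimal sketch of how the source sentences and the 3 reference files could be read together, the snippet below loads them and checks that they are aligned line by line. The one-sentence-per-line layout and the file names in the usage note are assumptions for illustration, not something this README specifies.

```python
from pathlib import Path


def read_lines(path):
    """Return the lines of a UTF-8 text file (assumed: one sentence per line)."""
    return Path(path).read_text(encoding="utf-8").splitlines()


def load_benchmark(source_path, reference_paths):
    """Load source sentences and their parallel simplified references.

    Assumption: line i of every reference file simplifies line i of the
    source file. A ValueError is raised if the line counts differ.
    """
    sources = read_lines(source_path)
    references = [read_lines(p) for p in reference_paths]
    for path, ref in zip(reference_paths, references):
        if len(ref) != len(sources):
            raise ValueError(
                f"{path} has {len(ref)} lines, expected {len(sources)}"
            )
    return sources, references
```

With hypothetical file names, a call would look like `load_benchmark("sources.txt", ["ref_1.txt", "ref_2.txt", "ref_3.txt"])`; substitute the actual paths of the dataset files in this repository.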

Code

To run the algorithm proposed in the paper, use the commands below. Python 3.6 or above is required.

conda create -n uslt python=3.10
conda activate uslt
git clone https://github.com/koc-lab/lex-simple.git
cd lex-simple
pip install -r requirements.txt
python -m spacy download en_core_web_sm
python -m spacy download en
cd scripts
python run_uslt.py

Running the code above produces a .txt file containing the lexical simplifications. To apply structural simplification on top of lexical simplification, follow the steps in https://github.com/Lambda-3/DiscourseSimplification/tree/master. In particular, run

cd .. #make sure you are in the main directory
git clone https://github.com/koc-lab/SentenceSplitting.git
cd DiscourseSimplification
mvn clean install -DskipTests

First, create a directory under DiscourseSimplification at edu/stanford/nlp/models/pos-tagger/english-left3words, and move the Stanford NLP taggers from this drive link into it: https://drive.google.com/drive/folders/1GQerFiPgzFnS2lawIfAz8C_NsLbdQUJG?usp=share_link

Next, create an empty file called 'input.txt' inside this directory and paste in the lexically simplified document generated by run_uslt.py. Then, run

mvn clean compile exec:java
cd ..
python decode_sentence_splitting.py

You have now generated the final .txt file!
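The manual setup that precedes the Maven run (creating the tagger directory and filling input.txt) can also be scripted. The sketch below is only an illustration under two assumptions: that 'input.txt' belongs in the root of the DiscourseSimplification directory, and that the tagger files have already been downloaded locally. The function name and its parameters are hypothetical, and files are copied rather than moved so the originals stay in place.

```python
import shutil
from pathlib import Path

# Directory named in the instructions above, relative to DiscourseSimplification.
TAGGER_SUBDIR = Path("edu/stanford/nlp/models/pos-tagger/english-left3words")


def prepare_sentence_splitting(project_dir, tagger_files, lexical_output):
    """Create the tagger directory, copy the taggers in, and fill input.txt.

    project_dir:    path to the DiscourseSimplification directory
    tagger_files:   locally downloaded Stanford NLP tagger files
    lexical_output: the .txt file produced by run_uslt.py
    """
    project_dir = Path(project_dir)
    tagger_dir = project_dir / TAGGER_SUBDIR
    tagger_dir.mkdir(parents=True, exist_ok=True)
    for tagger in tagger_files:
        shutil.copy(tagger, tagger_dir / Path(tagger).name)
    # Assumption: input.txt lives at the root of project_dir.
    input_file = project_dir / "input.txt"
    shutil.copyfile(lexical_output, input_file)
    return tagger_dir, input_file
```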

Evaluation

Evaluation requires EASSE. To install it, follow the guide at https://github.com/feralvam/easse

After gathering the text outputs, run

python eval.py
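For readers who want to score outputs directly from Python, the sketch below shows one way to call EASSE's corpus-level SARI. It is an illustration only: this README does not say which metrics eval.py reports, and SARI is used here simply as one metric that EASSE provides. The helper names are hypothetical. EASSE is imported inside the function, so the alignment check can be used even before EASSE is installed.

```python
def check_alignment(sources, system_outputs, references):
    """Raise ValueError unless every list has one entry per source sentence."""
    n = len(sources)
    if len(system_outputs) != n:
        raise ValueError(
            f"system output has {len(system_outputs)} lines, expected {n}"
        )
    for i, ref in enumerate(references):
        if len(ref) != n:
            raise ValueError(
                f"reference set {i} has {len(ref)} lines, expected {n}"
            )


def corpus_sari_score(sources, system_outputs, references):
    """Compute corpus-level SARI with EASSE.

    references is a list of reference sets (for example 3 of them), each a
    list of sentences aligned with sources.
    """
    check_alignment(sources, system_outputs, references)
    # Imported lazily; requires EASSE to be installed as described above.
    from easse.sari import corpus_sari

    return corpus_sari(
        orig_sents=sources,
        sys_sents=system_outputs,
        refs_sents=references,
    )
```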
